mirror of
https://github.com/tursodatabase/libsql.git
synced 2025-01-08 18:49:04 +00:00
261 lines
9.4 KiB
Markdown
261 lines
9.4 KiB
Markdown
# `sqld` User Guide
|
|
|
|
Welcome to the `sqld` user guide!
|
|
|
|
## Table of Contents
|
|
|
|
* [Overview](#overview)
|
|
* [Replication](#replication)
|
|
* [TLS configuration](#tls-configuration)
|
|
* [Launching a primary server](#launching-a-primary-server)
|
|
* [Launching a replica server](#launching-a-replica-server)
|
|
* [Client Authentication](#clientauthentication)
|
|
* [Deployment](#deployment)
|
|
* [Deploying with Docker](#deploying-with-docker)
|
|
* [Deploying on Fly](#deploying-on-fly)
|
|
|
|
## Overview
|
|
|
|
The `sqld` program provides libsql over HTTP and supports transparent replication.
|
|
|
|
![libsql cluster overview.](sqld-overview.png)
|
|
<p align="center">
|
|
Figure 1. Overview of libsql clustering.
|
|
</p>
|
|
|
|
Figure 1 shows an overview of libsql cluster where clients execute SQL remotely over HTTP against `sqld` instances.
|
|
In the middle, there is the _primary_ instance, which is responsible for accepting writes and servicing replicas for write-ahead log (WAL) updates.
|
|
If a client performs a write operation such as `INSERT` statement in SQL, replicas delegate the write to a primary node.
|
|
Read operations, such as `SELECT` statements, however, are executed on the replica directly.
|
|
The replicas poll the primary instance for WAL updates periodically over a gRPC connection.
|
|
|
|
## Replication
|
|
|
|
In this section, we will walk you through how to set up a libsql cluster.
|
|
|
|
### TLS configuration
|
|
|
|
The nodes in a `sqld` cluster communicate over gRPC with TLS. To set up a `sqld` cluster, you need the following TLS configuration:
|
|
|
|
* Certificate authority (CA) certificate and private key
|
|
* Primary server certificate and private key
|
|
* Replica server certificates and private keys
|
|
|
|
In TLS speak, the primary server is the server and the replica servers are the clients.
|
|
|
|
For *development and testing* purposes, you can generate TLS keys and certificates with:
|
|
|
|
```console
|
|
python scripts/gen_certs.py
|
|
```
|
|
|
|
The script generates the following files:
|
|
|
|
* `ca_cert.pem` -- certificate authority certificate
|
|
* `ca_key.pem` -- certificate authority private key
|
|
* `server_cert.pem` -- primary server certificate
|
|
* `server_key.pem` -- primary server private key
|
|
* `client_cert.pem` -- replica server certificate
|
|
* `client_key.pem ` -- replica server private key
|
|
|
|
### Launching a primary server
|
|
|
|
To start a `sqld` server in primary mode, run:
|
|
|
|
```console
|
|
sqld \
|
|
--http-listen-addr 127.0.0.1:8081 \
|
|
--grpc-listen-addr 127.0.0.1:5001 \
|
|
--grpc-tls \
|
|
--grpc-ca-cert-file ca_cert.pem \
|
|
--grpc-cert-file server_cert.pem \
|
|
--grpc-key-file server_key.pem
|
|
```
|
|
|
|
You now have a `sqld` primary server listening to SQL over HTTP at `127.0.0.1:8081` and gRPC with TLS at `127.0.0.1:5001`.
|
|
|
|
### Launching a replica server
|
|
|
|
To start a a `sqld` server in replica mode, run:
|
|
|
|
```console
|
|
sqld \
|
|
--http-listen-addr 127.0.0.1:8082 \
|
|
--primary-grpc-url https://127.0.0.1:5001 \
|
|
--primary-grpc-tls \
|
|
--primary-grpc-ca-cert-file ca_cert.pem \
|
|
--primary-grpc-cert-file client_cert.pem \
|
|
--primary-grpc-key-file client_key.pem
|
|
```
|
|
|
|
You now have a `sqld` replica server listening to SQL over HTTP at `127.0.0.1:8082`, which is connected to a primary server at `127.0.0.1:5001`.
|
|
|
|
You can add more replicas to the cluster by just starting more `sqld` processes. However, it's recommended that you generate a different TLS configuration for every replica.
|
|
|
|
To test the cluster, you can, for example, create a table and insert rows in the replica:
|
|
|
|
```console
|
|
curl -d '{"statements": ["CREATE TABLE IF NOT EXISTS users (username)", "INSERT INTO users VALUES (\"alice\")"]}' 127.0.0.1:8082
|
|
```
|
|
|
|
and query the results from the primary:
|
|
|
|
```console
|
|
curl -d '{"statements": ["SELECT * FROM users"]}' 127.0.0.1:8081
|
|
```
|
|
|
|
## Client Authentication
|
|
|
|
You can configure client authentication by passing the `--auth-jwt-key-file FILENAME` command line option to `sqld`.
|
|
The key is either a PKCS#8-encoded Ed25519 public key in PEM, or just plain bytes of the Ed25519 public key in URL-safe base64.
|
|
|
|
## Deployment
|
|
|
|
### Deploying with Docker
|
|
|
|
You can pull the official Docker image for `sqld` with
|
|
|
|
```console
|
|
docker pull ghcr.io/libsql/sqld:main
|
|
```
|
|
|
|
You can find more information about the Docker image [here](./DOCKER.md).
|
|
|
|
### Deploying on Fly
|
|
|
|
You can use the existing `fly.toml` file from this repository.
|
|
|
|
Just run
|
|
```console
|
|
flyctl launch
|
|
```
|
|
... then pick a name and respond "Yes" when the prompt asks you to deploy.
|
|
|
|
You now have `sqld` running on Fly listening for HTTP connections.
|
|
|
|
Give it a try with this snippet, replacing `$YOUR_APP` with your app name:
|
|
```
|
|
curl -X POST -d '{"statements": ["create table testme(a,b,c)"]}' $YOUR_APP.fly.dev
|
|
curl -X POST -d '{"statements": ["insert into testme values(1,2,3)"]}' $YOUR_APP.fly.dev
|
|
curl -X POST -d '{"statements": ["select * from testme"]}' $YOUR_APP.fly.dev
|
|
```
|
|
```
|
|
[{"b":2,"a":1,"c":3}]
|
|
```
|
|
|
|
## Incremental snapshots
|
|
|
|
The `sqld` generates incremental snapshots of the database file, which you can apply to a local libSQL replica.
|
|
For example, suppose you have an application that is not always connected over the network and can't rely on the `sqld` gRPC replication method. In that case, you can configure `sqld` to notify of generated incremental snapshots, sync the snapshot files to another machine, and apply them.
|
|
|
|
You can use the `--snapshot-exec` command line option to specify a file, such as a shell script, to execute on snapshot generation. You can also use the `--max-log-duration SECS` command line option
|
|
on to control how often `sqld` generates the snapshot files to ensure the freshness of the data on local replicas.
|
|
|
|
To use incremental snapshots, first, create a shell script with the name `snapshot.sh`:
|
|
|
|
```bash
|
|
#!/bin/bash
|
|
|
|
SNAPSHOT_FILE="$1"
|
|
NAMESPACE="$2"
|
|
|
|
echo "Generated incremental snapshot $SNAPSHOT_FILE for namespace $NAMESPACE"
|
|
|
|
# At this point we can ship the snapshot file to whereever we would like but we
|
|
# must delete it from its location on disk or else sqld will panic.
|
|
rm $SNAPSHOT_FILE
|
|
```
|
|
|
|
and then configure `sqld` to generate an incremental snapshot every 5 seconds and invoke the shell script when `sqld` generates a snapshot:
|
|
|
|
```console
|
|
sqld --snapshot-exec ./snapshot.sh --max-log-duration 5
|
|
```
|
|
|
|
When you write to the `sqld` database, you will eventually see log line such as:
|
|
|
|
```console
|
|
2023-08-11T08:21:04.183564Z INFO sqld::replication::snapshot: snapshot `e126f594-90f4-45be-9350-bc8a01160de9-0-2.snap` successfully created
|
|
Generated incremental snapshot data.sqld/dbs/default/snapshots/e126f594-90f4-45be-9350-bc8a01160de9-0-2.snap
|
|
```
|
|
|
|
The first line is logging from `sqld` and the second line is `sqld` executing `snapshot.sh` script.
|
|
You can now, for example, `rsync` the snapshot file to another machine, to apply the changes to a local replica with the `Database::sync_frames()` method of the `libsql` crate:
|
|
|
|
```rust
|
|
use libsql::Database;
|
|
use libsql_replication::{Frames, TempSnapshot};
|
|
|
|
#[tokio::main]
|
|
async fn main() {
|
|
tracing_subscriber::fmt::init();
|
|
|
|
let opts = libsql::Opts::with_sync();
|
|
let db = Database::open_with_opts("test.db", opts).await.unwrap();
|
|
let conn = db.connect().unwrap();
|
|
|
|
let args = std::env::args().collect::<Vec<String>>();
|
|
if args.len() < 2 {
|
|
println!("Usage: {} <snapshot path>", args[0]);
|
|
return;
|
|
}
|
|
let snapshot_path = args.get(1).unwrap();
|
|
let snapshot = TempSnapshot::from_snapshot_file(snapshot_path.as_ref()).unwrap();
|
|
|
|
db.sync_frames(Frames::Snapshot(snapshot)).unwrap();
|
|
|
|
let rows = conn
|
|
.query("SELECT * FROM sqlite_master", ())
|
|
.unwrap()
|
|
.unwrap();
|
|
while let Ok(Some(row)) = rows.next() {
|
|
println!(
|
|
"| {:024} | {:024} | {:024} | {:024} |",
|
|
row.get::<&str>(0).unwrap(),
|
|
row.get::<&str>(1).unwrap(),
|
|
row.get::<&str>(2).unwrap(),
|
|
row.get::<&str>(3).unwrap(),
|
|
);
|
|
}
|
|
}
|
|
```
|
|
|
|
When applying snapshots the format of the file name gives certain information.
|
|
The format is `{namespace}:{log_id}:{start_frame_no:020x}-{end_frame_no:020x}.snap` where log_id represents the unqiue write ahead log and then
|
|
for each unique log_id there will be snapshots starting at frame `0` up until
|
|
the end. Snapshots must be applied sequentially for each log_id starting at
|
|
frame 0.
|
|
|
|
## Multitenancy
|
|
|
|
The `sqld` server supports more than one database. To create a database, send a create namespace request to the [admin API](ADMIN_API.md).
|
|
For example, to create a database named `db1`, send the following HTTP request:
|
|
|
|
```shell
|
|
curl -X POST http://localhost:8080/v1/namespaces/db1/create
|
|
```
|
|
|
|
The name of the database is determined from the `Host` header in the HTTP request.
|
|
|
|
For example, if you have the following entries in your `/etc/hosts` file:
|
|
|
|
```console
|
|
127.0.0.1 db1.local
|
|
127.0.0.1 db2.local
|
|
```
|
|
|
|
You can access `db1` with the `http://db1.local:8080`URL and `db2` with `http://db2.local:8080`.
|
|
The database files for the databases are stored in `<data dir>/dbs/db1` and `<data dir/dbs/db2`, respectively.
|
|
|
|
### Path based routing
|
|
|
|
For local development and testing, you can also connect to your databases by specifying the database namespace in the URL path.
|
|
|
|
You can access namespace `db1` with the `http://local:8080/dev/db1` URL and `db2` with `http://local:8080/dev/db2`, respectively.
|
|
|
|
### Wildcard domain for development
|
|
|
|
If you don't feel like editing `/etc/hosts` each time you want to test a new namespace, you can use any domain
|
|
that returns `127.0.0.1` for all subdomains. One of such domains is `*.db.sarna.dev`, which you can use to access
|
|
your local `db1` and `db2` as `http://db1.db.sarna.dev` and `http://db2.db.sarna.dev` respectively.
|