0
0
mirror of https://github.com/tursodatabase/libsql.git synced 2024-12-15 14:29:44 +00:00
libsql/libsql-sqlite3/doc/libsql_extensions.md
Alexandr Jeliuc (a2xchip) e60a4e33f8
Updated libsql_extensions.md fix broken link (#1044)
Fixed broken link to `wal.h`
2024-02-23 05:15:27 +00:00

14 KiB

libSQL extensions

This document describes extensions to the library provided by libSQL, not available in upstream SQLite at the time of writing.

libsql_close_hook

It was reported that extensions have issues cleaning up after themselves, e.g. with regard to prepared statements they cache, because of the shutdown order incurred by libSQL. In order to allow users to customize what happens before a connection is closed, libsql_close_hook is introduced. There can only be one close hook per connection. The hook is executed before any internal cleanup routines are applied, which in particular makes it possible to terminate any outstanding cached prepared statements.

void *libsql_close_hook(
  sqlite3 *db,              /* Attach the hook to this connection */
  void(*xCallback)(         /* Callback function */
    void*,sqlite3*),
  void *pArg                /* First callback argument */
);

Altering columns

Foreign keys

Foreign keys in SQLite and libSQL require an indexed column in one table (e.g. its primary key) and a foreign key constraint on another table.

For example, if you have the following table:

libsql> CREATE TABLE users (id INT PRIMARY KEY);

You can then create another table:

libsql> CREATE TABLE emails (user_id INT, email TEXT);

and add a foreign key constraint from the user_id column to the id column of the users table:

libsql> ALTER TABLE emails ALTER COLUMN user_id TO user_id INT REFERENCES users(id);

and now you have the following schema in your database:

libsql> .schema
CREATE TABLE users (id INT PRIMARY KEY);
CREATE TABLE emails (user_id INT REFERENCES users(id), email TEXT);

To remove a foreign constraint, you do the following:

libsql> ALTER TABLE emails ALTER COLUMN user_id TO user_id INT;
libsql> .schema
CREATE TABLE users (id INT PRIMARY KEY);
CREATE TABLE emails (user_id INT, email TEXT);

Other attributes

All kind of column attributes, like type affinity, CHECK constraints, DEFAULT values, and so on, can be amended with ALTER TABLE ALTER COLUMN as well:

libsql> CREATE TABLE t(id, v);
libsql> ALTER TABLE t ALTER COLUMN v TO v NOT NULL CHECK(v < 42);
libsql> .schema t
CREATE TABLE t(id, v NOT NULL CHECK(v < 42));
libsql> ALTER TABLE t ALTER COLUMN v TO v TEXT DEFAULT 'hai';
libsql> .schema t
CREATE TABLE t(id, v TEXT DEFAULT 'hai');
libsql> ALTER TABLE t ALTER COLUMN v TO v;
libsql> .schema t
CREATE TABLE t(id, v);

Caveats

Please note that altering constraints via ALTER TABLE ALTER COLUMN only applies to newly inserted or updated data - existing rows are not rewritten or revalidated.

It's also important to notice that foreign key constraints are disabled by default, and can be enabled with a PRAGMA foreign_keys=ON statement at runtime.

RANDOM ROWID

Regular tables use an implicitly defined, unique, 64-bit rowid column as its primary key. If rowid value is not specified during insertion, it's auto-generated with the following heuristics:

  1. Find the current max rowid value.
  2. If max value is less than i64::max, use the next available value
  3. If max value is i64::max: a. pick a random value b. if it's not taken, use it c. if it's taken, go to (a.), rinse, repeat

Based on this algorithm, the following trick can be used to trick libSQL into generating random rowid values instead of consecutive ones - simply insert a sentinel row with rowid = i64::max.

The newly introduced RANDOM ROWID option can be used to explicitly state that the table generates random rowid values on insertions, without having to insert a dummy row with special rowid value, or manually trying to generate a random unique rowid, which some user applications may find problematic.

Usage

RANDOM ROWID keywords can be used during table creation, in a manner similar to its syntactic cousin, WITHOUT ROWID:

CREATE TABLE shopping_list(item text, quantity int) RANDOM ROWID;

On insertion, pseudorandom rowid values will be generated:

CREATE TABLE shopping_list(item text, quantity int) RANDOM ROWID;
INSERT INTO shopping_list(item, quantity) VALUES ('bread', 2);
INSERT INTO shopping_list(item, quantity) VALUES ('butter', 1);
.mode column
SELECT rowid, * FROM shopping_list;
rowid                item    quantity
-------------------  ------  --------
1177193729061749947  bread   2       
4433412906245401374  butter  1  

Restrictions

RANDOM ROWID is mutually exclusive with WITHOUT ROWID option, and cannot be used with tables having an AUTOINCREMENT primary key.

WebAssembly-based user-defined functions (experimental)

In addition to being able to define functions via the C API (http://www.sqlite.org/c3ref/create_function.html), it's possible to enable experimental support for CREATE FUNCTION syntax allowing users to dynamically register functions coded in WebAssembly.

Once enabled, CREATE FUNCTION and DROP FUNCTION are available in SQL. They act as syntactic sugar for managing data stored in a special internal table: libsql_wasm_func_table(name TEXT, body TEXT). This table can also be inspected with regular tools - e.g. to see which functions are registered and what's their source code.

How to enable

This feature is experimental and opt-in, and can be enabled by the following configure:

./configure --enable-wasm-runtime

Then, in your source code, the internal table for storing WebAssembly source code can be created via libsql_try_initialize_wasm_func_table(sqlite3 *db) function.

You can also download a pre-compiled binary from https://github.com/libsql/libsql/releases/tag/libsql-0.1.0, or use a docker image for experiments:

docker run -it piotrsarna/libsql:libsql-0.1.0-wasm-udf ./libsql

Configurations

WebAssembly runtime can be enabled in multiple configurations:

  1. Based on Wasmtime, linked statically (default)
./configure --enable-wasm-runtime
  1. Based on Wasmtime, linked dynamically
./configure --enable-wasm-runtime-dynamic
  1. Based on WasmEdge, linked dynamically with libwasmedge
./configure --enable-wasm-runtime-wasmedge

NOTE: WasmEdge backend comes without the ability to translate WebAssembly text format (WAT) to Wasm binary format. In this configuration, user-defined functions can only be defined with their source code passed as a compiled binary blob. In libSQL bindgen you can produce it by checking the "as a binary blob" checkbox. NOTE2: WasmEdge backend depends on libwasmedge compatible with their 0.11.2 release. If your package manager does not have it available, download it from the official release page.

If you're interested in a setup that links libwasmedge.a statically, let us know, or, better yet, send a patch!

shell support

In order to initialize the internal WebAssembly function lookup table in libsql shell (sqlite3 binary), one can use the .init_wasm_func_table command. This command is safe to be called multiple times, even if the internal table already exists.

CREATE FUNCTION

Creating a function requires providing its name and WebAssembly source code (in WebAssembly text format). The ABI for translating between WebAssembly types and libSQL types is to be standardized soon.

Example SQL:

CREATE FUNCTION IF NOT EXISTS fib LANGUAGE wasm AS '
(module 
 (type (;0;) (func (param i64) (result i64))) 
 (func $fib (type 0) (param i64) (result i64) 
 (local i64) 
 i64.const 0 
 local.set 1 
 block ;; label = @1 
 local.get 0 
 i64.const 2 
 i64.lt_u 
 br_if 0 (;@1;) 
 i64.const 0 
 local.set 1 
 loop ;; label = @2 
 local.get 0 
 i64.const -1 
 i64.add 
 call $fib 
 local.get 1 
 i64.add 
 local.set 1 
 local.get 0 
 i64.const -2 
 i64.add 
 local.tee 0 
 i64.const 1 
 i64.gt_u 
 br_if 0 (;@2;) 
 end 
 end 
 local.get 0 
 local.get 1 
 i64.add) 
 (memory (;0;) 16) 
 (global $__stack_pointer (mut i32) (i32.const 1048576)) 
 (global (;1;) i32 (i32.const 1048576)) 
 (global (;2;) i32 (i32.const 1048576)) 
 (export "memory" (memory 0)) 
 (export "fib" (func $fib)))
';

[1] WebAssembly source: 55b69d8d08/src/lib.rs (L68-L75) .

Drop function

Dropping a dynamically created function can be done via a DROP FUNCTION statement.

Example:

DROP FUNCTION IF EXISTS fib;

How to implement user-defined functions in WebAssembly

This paragraph is based on our blog post which describes the process in more detail.

In order for a WebAssembly function to be runnable from libSQL, it must follow its ABI - which in this case can be reduced to "how to translate libSQL types to WebAssembly and back". Fortunately, both projects have a very small set of supported types, so the whole mapping fits in a short table:

libSQL type Wasm type
INTEGER i64
REAL f64
TEXT i32*
BLOB i32*
NULL i32*

where i32 represents a pointer to WebAssembly memory. Underneath, indirectly represented types are encoded as follows:

libSQL type representation
TEXT [1 byte with value 3 (SQLITE_TEXT)][null-terminated string]
BLOB [1 byte with value 4 (SQLITE_BLOB)][4 bytes of size][binary string]
NULL [1 byte with value 5 (SQLITE_NULL)

The compiled module should export at least the function that is supposed to be later used as a user-defined function, and its memory instance.

Encoding type translation manually for each function can be cumbersome, so we provide helper libraries for languages compilable to WebAssembly. Right now the only implementation is for Rust: https://crates.io/crates/libsql_bindgen

With libsql_bindgen, a native Rust function can be annotated with a macro:

#[libsql_bindgen::libsql_bindgen]
pub fn decrypt(data: String, key: String) -> String {
  use magic_crypt::MagicCryptTrait;
  let mc = magic_crypt::new_magic_crypt!(key, 256);
  mc.decrypt_base64_to_string(data)
      .unwrap_or("[ACCESS DENIED]".to_owned())
}

Compiling the function to WebAssembly will produce code that can be registered as a user-defined function in libSQL.

cargo build --release --target wasm32-unknown-unknown

For quick experiments, our playground application can be used: https://bindgen.libsql.org

After the function is compiled, it can be registered via SQL by:

CREATE FUNCTION your_function LANGUAGE wasm AS <source-code>

, where <source-code> is either a binary .wasm blob or text presented in WebAssembly Text format.

See an example in CREATE FUNCTION paragraph above.

Virtual WAL

Write-ahead log is a journaling mode which enables nice write concurrency characteristics - it not only allows a single writer to run in parallel with readers, but also makes BEGIN CONCURRENT transactions with optimistic locking possible. In SQLite, WAL is not a virtual interface, it only has a single file-based implementation, with an additional WAL index kept in shared memory (in form of another mapped file). In libSQL, akin to VFS, it's possible to override WAL routines with custom code. That allows implementing pluggable backends for write-ahead log, which opens many possibilities (again, similar to the VFS mechanism).

API

In order to register a new set of virtual WAL methods, these methods need to be implemented. This is the current API: https://github.com/tursodatabase/libsql/blob/main/libsql-sqlite3/src/wal.h

Registering WAL methods

After the implementation is ready, the following public functions can be used to manage it:

  libsql_wal_methods_find
  libsql_wal_methods_register
  libsql_wal_methods_unregister

, and they are quite self-descriptive. They also work similarly to their sqlite3_vfs* counterparts, which they were modeled after.

It is important to note that wal_methods in themselves should be stateless. There are registered globally, and accessible from every connection. When state needs to be accessed from the WAL methods, state can be passed as the 7th argument to libsql_open_v2. This state will then become accessible in the pMethodData field of the libsql_wal struct passed to the WAL methods.

Using WAL methods

Custom WAL methods need to be declared when opening a new database connection. That can be achieved either programatically by using a new flavor of the sqlite3_open* function:

int libsql_open(
  const char *filename,   /* Database filename (UTF-8) */
  sqlite3 **ppDb,         /* OUT: SQLite db handle */
  int flags,              /* Flags */
  const char *zVfs,       /* Name of VFS module to use, NULL for default */
  const char *zWal        /* Name of WAL module to use, NULL for default */
)

... or via URI, by using a new wal parameter:

.open file:test.db?wal=my_impl_of_wal_methods

Example

An example implementation can be browsed in the Rust test suite, at test/rust_suite/src/virtual_wal.rs

xPreparedSql virtual table callback

Virtual tables provide an interface to expose different data sources. Several callbacks are already defined via function pointers on struct sqlite3_module, including xConnect and xBestIndex. libSQL introduces a new callback named xPreparedSql that allows a virtual table implementation to receive the SQL string submitted by the application for execution.

How to enable

In order to enable this callback, applications must define a value for iVersion >= 700 on struct sqlite3_module.

Usage

The following C99 snippet shows how to declare a virtual table that implements the new callback.

static int helloPreparedSql(sqlite3_vtab_cursor *cur, const char *sql) {
    printf("Prepared SQL: %s\n", sql);
    return SQLITE_OK;
}

static sqlite3_module helloModule = {
    .iVersion     = 700,
    .xCreate      = helloCreate,
    .xConnect     = helloConnect,
    // ...
    .xPreparedSql = helloPreparedSql,
};