mirror of
https://github.com/tursodatabase/libsql.git
synced 2024-12-16 03:19:00 +00:00
4b21878187
* add benchmark scripts * remove unnecessary srand
93 lines
3.7 KiB
Markdown
## benchmark tools

Simple benchmark tools, intentionally written in C for faster feedback loops (no need to wait for Rust builds).

Some scripts require `numpy`. You can install it globally or in a virtual environment:
```sh
$> python -m venv .env
$> source .env/bin/activate
$> pip install -r requirements.txt
```

### benchtest

Simple generic tool which takes an SQL file and a DB file and runs all queries against the provided DB file.
To generate the SQL file, you can use or extend the `workload.py` script.

Take a look at the example:
```sh
$> LD_LIBRARY_PATH=../.libs/ ./benchtest queries.sql data.db
open queries file at queries.sql
open sqlite db at 'data.db'
executed simple statement: 'CREATE TABLE t ( id INTEGER PRIMARY KEY, emb FLOAT32(4) );'
executed simple statement: 'CREATE INDEX t_idx ON t ( libsql_vector_idx(emb) );'
prepared statement: 'INSERT INTO t VALUES ( ?, vector(?) );'
inserts (queries.sql):
insert: 710.25 micros (avg.), 4 (count)
size : 0.2695 MB
reads : 1.00 (avg.), 4 (total)
writes: 1.00 (avg.), 4 (total)
prepared statement: 'SELECT * FROM vector_top_k('t_idx', vector(?), ?);'
search (queries.sql):
select: 63.25 micros (avg.), 4 (count)
size : 0.2695 MB
reads : 1.00 (avg.), 4 (total)
```

`benchtest` is linked against `liblibsql.so`, which resides in the `../.libs/` directory and must be built explicitly from the `libsql-sqlite3` sources:
```sh
$> basename $(pwd)
libsql-sqlite3
$> make # this command will generate libs in the .libs directory
$> cd benchmark
$> make bruteforce
open queries file at bruteforce.sql
open sqlite db at 'test.db'
executed simple statement: 'PRAGMA journal_mode=WAL;'
executed simple statement: 'CREATE TABLE x ( id INTEGER PRIMARY KEY, embedding FLOAT32(64) );'
prepared statement: 'INSERT INTO x VALUES (?, vector(?));'
inserts (bruteforce.sql):
insert: 46.27 micros (avg.), 1000 (count)
size : 0.2695 MB
reads : 1.00 (avg.), 1000 (total)
writes: 1.00 (avg.), 1000 (total)
prepared statement: 'SELECT id FROM x ORDER BY vector_distance_cos(embedding, vector(?)) LIMIT ?;'
search (bruteforce.sql):
select: 329.32 micros (avg.), 1000 (count)
size : 0.2695 MB
reads : 2000.00 (avg.), 2000000 (total)
```
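
For intuition about what the brute-force `ORDER BY vector_distance_cos(...) LIMIT ?` query computes, here is a minimal pure-Python sketch. It assumes cosine distance is defined as `1 - cosine similarity` and that no vector is all zeros; the helper names `cosine_distance` and `exact_top_k` are hypothetical and are not part of the benchmark tools.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors.

    Assumption: this matches the usual definition of cosine distance.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(rows, query, k):
    """Scan every (id, embedding) pair and return the ids of the k nearest rows."""
    ranked = sorted(rows, key=lambda row: cosine_distance(row[1], query))
    return [row_id for row_id, _ in ranked[:k]]


# Tiny illustrative dataset (hypothetical values).
rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [1.0, 1.0])]
print(exact_top_k(rows, [1.0, 0.1], 2))  # prints [1, 3]
```

A full scan like this has to visit every row for each query, which is consistent with the much higher `reads` count reported for the search phase of the `bruteforce` run.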

### anntest

Simple tool which takes a DB file with two tables, `data (id INTEGER PRIMARY KEY, emb FLOAT32(n))` and `queries (emb FLOAT32(n))`, and executes a vector search for every vector in the `queries` table against the `data` table using the provided SQL statements.

To generate the DB file, you can use `benchtest` together with `workload.py`. Take a look at the example:
```sh
$> python workload.py recall_uniform 64 1000 1000 > recall_uniform.sql
$> LD_LIBRARY_PATH=../.libs/ ./benchtest recall_uniform.sql recall_uniform.db
$> # ./anntest [db path] [test name (used only for printed stats)] [ann query] [exact query]
$> LD_LIBRARY_PATH=../.libs/ ./anntest recall_uniform.db 10-recall@10 "SELECT rowid FROM vector_top_k('data_idx', ?, 10)" "SELECT id FROM data ORDER BY vector_distance_cos(emb, ?) LIMIT 10"
open sqlite db at 'recall_uniform.db'
ready to perform 1000 queries with SELECT rowid FROM vector_top_k('data_idx', ?, 10) ann query and SELECT id FROM data ORDER BY vector_distance_cos(emb, ?) LIMIT 10 exact query
88.91% 10-recall@10 (avg.)
```
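
The reported percentage can be read as an average recall over all queries. The sketch below shows one common way to compute it, assuming recall is the fraction of exact-query ids that also appear in the ANN result. Whether `anntest` uses exactly this formula is an assumption, and `recall_at_k` and `average_recall` are hypothetical names used only for illustration.

```python
def recall_at_k(ann_ids, exact_ids):
    """Fraction of the exact result ids that the ANN query also returned."""
    exact = set(exact_ids)
    if not exact:
        return 0.0
    return len(exact.intersection(ann_ids)) / len(exact)


def average_recall(pairs):
    """Average recall over (ann_ids, exact_ids) pairs, one pair per query."""
    scores = [recall_at_k(ann_ids, exact_ids) for ann_ids, exact_ids in pairs]
    return sum(scores) / len(scores)


# Two hypothetical queries with k = 3: the first misses one exact neighbour.
pairs = [([1, 2, 3], [1, 2, 4]), ([5, 6, 7], [5, 6, 7])]
print(f"{100 * average_recall(pairs):.2f}% 3-recall@3 (avg.)")  # prints 83.33% 3-recall@3 (avg.)
```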

### blobtest

Simple tool which aims to show that the `sqlite3_blob_reopen` API can substantially speed up reads.

Take a look at the example:
```sh
$> LD_LIBRARY_PATH=../.libs/ ./blobtest blob-read-simple.db read simple 1000 1000
open sqlite db at 'blob-read-simple.db'
blob table: ready to prepare
blob table: prepared
time: 3.76 micros (avg.), 1000 (count)
$> LD_LIBRARY_PATH=../.libs/ ./blobtest blob-read-reopen.db read reopen 1000 1000
open sqlite db at 'blob-read-reopen.db'
blob table: ready to prepare
blob table: prepared
time: 0.31 micros (avg.), 1000 (count)
```