Hacker News

Sqlite3 WebAssembly

501 points by whatever3 - 144 comments

simonw [3 hidden]5 mins ago

Something that would be really fun would be to run SQLite in-memory in a browser but use the same tricks as Litestream and Cloudflare Durable Objects (https://simonwillison.net/2024/Oct/13/zero-latency-sqlite-st...) to stream a copy of the WAL log to a server (maybe over a WebSocket, though intermittent fetch() POST would work too).

Then on subsequent visits use that server-side data to rehydrate the client-side database.

From https://sqlite.org/forum/info/50a4bfdb294333eec1ba4749661934... is looks like WAL mode is excluded from the default SQLite WASM build so you would have to go custom with that.

ncruces [3 hidden]5 mins ago

There are many layers of that's not how it works at play here.

In-memory SQLite databases don't use WAL. Wasm (and browser Wasm, in particular) doesn't support anything like the shared memory APIs SQLite wants for its WAL mode.

Litestream requires a very precise WAL setup to work (which just so happens to work with the default native SQLite setup, but is hard to replicate with Wasm).

Cloudflare Durable Objects may have been inspired by Litestream but works very differently (as do LiteFS, Turso, etc…)

The general idea of streaming changes from SQLite would work, but it's a lot of work, and the concurrency model of in-browser Wasm will make it challenging to implement.

(I wrote that forum post some time ago, and have WAL working in a server side Wasm build of SQLite, but none of the options to make it work would make much sense, or be possible, in browser)

digdugdirk [3 hidden]5 mins ago

As someone who uses sqlite fairly regularly, but doesn't understand what most of those paragraphs mean, do you have any recommendations for learning resources?

I'm gathering that I need to learn about: - WAL - Shared Memory APIs - Concurrency models - Durable Objects?

throwaway2037 [3 hidden]5 mins ago

WAL - Shared Memory APIs: https://sqlite.org/wal.html

concurrency model of in-browser Wasm: https://medium.com/@quincarter/wasm-is-awsm-and-multi-thread...

Cloudflare Durable Objects: https://www.cloudflare.com/developer-platform/durable-object...

wyager [3 hidden]5 mins ago

WAL: Write ahead log, common strategy for DBs (sqlite, postgres, etc.) to improve commit performance. Instead of fsync()ing every change, you just fsync() a log file that contains all the changes and then you can fsync() the actual changes at your leisure

Shared memory API: If you want to share (mutable) data between multiple processes, you need some kind of procedure in place to manage that. How do you get a reference to the data to multiple processes, how do you make sure they don't trample each other's writes, etc.

Concurrency model: There are many different ways you can formalize concurrent processes and the way they interact (message passing, locking, memory ordering semantics, etc.). Different platforms will expose different concurrency primitives that may not work the same way as other platforms and may require different reasoning or code structure

Durable objects - I think this is some Cloudflare service where they host data that can be read or modified by your users

This is all from memory, but IME, GPT is pretty good for asking about concepts at this level of abstraction

digdugdirk [3 hidden]5 mins ago

Thank you!

And side note on your last point - I've been burned too many times by confident hallucinations to trust my foundational learning to GPT. I hope someday that will improve, but for now ChatGPT is as trustworthy as an evening chat with someone at the bar.

... Someone who has been drinking since happy hour.

globular-toast [3 hidden]5 mins ago

If you'd like a trustworthy overview, the book Designing Data-Intensive Applications by Martin Kleppmann is a classic. I really hope we get an updated version, but the fundamentals all still hold anyway.

infogulch [3 hidden]5 mins ago

I was curious and Grok 2 seemed to do pretty good: https://x.com/i/grok/share/c2qCdF2wwIx7AHz0U1f2u8dTO

jchanimal [3 hidden]5 mins ago

If you like, solving these sort of problems, we are tackling them at Fireproof.

Our database API is modeled on CouchDB and MongoDB, but our storage abstractions are along the lines of what you need to build the multi writer WAL you describe.

More details here https://jsr.io/@fireproof/encrypted-blockstore

jauntywundrkind [3 hidden]5 mins ago

Then don't use in-memory sqlite? Use file backed sqlite but have your wasm implementation of those "system calls" just be to memory?

I dunno, feels like you're coming down too hard.

ncruces [3 hidden]5 mins ago

File backed SQLite in a browser? Do you mean like OPFS?

https://sqlite.org/wasm/doc/trunk/persistence.md#opfs-wal

Again, just because the all the Lego pieces sound like they should all just fit together, doesn't mean that they will.

The VFS mechanism was primarily designed to make SQLite easy to port to multiple OSes. WAL mode is hard to port everytime you step away from a more traditional OS.

“We have SQLite in the browser, let's just stream the WAL like the Litestream hack” does not add up.

It's not impossible, but it surely took a lot of effort at Cloudflare (or at Fly, or Turso) to get there. And neither of them opened it up sufficiently to help us do the same. In a browser.

andrewf [3 hidden]5 mins ago

I haven't looked but I bet a lot of the WAL complexity comes down to supporting consistency and durability guarantees, neither of which you necessarily need for your in-browser use case.

ncruces [3 hidden]5 mins ago

Not really, or it depends. That complexity is dealt with by SQLite.

The complexity for you comes from trying to reuse their (battle tested) implementation when (as I wrote above) the primitives they depend upon were not meant to make porting to a browser sandbox easy.

And the problems there are the specific concurrency model they depend upon: communicate by sharing memory.

Then, you're either working at the wrong abstraction level (and it shows), or you're patching and gutting SQLite.

SQLite is meant to work with files, file locks, shared memory, fsync and mmap.

It also doesn't work out great if you try to persist to an object store, to a KV store, or…

I'm repeating myself, but yeah. You can make it work. Others have made it work. But it's still a lot of work, and you're throwing away a lot of what makes SQLite… SQLite.

matharmin [3 hidden]5 mins ago

Unlike other opinions here I do think it is technically feasible to stream a copy of the WAL - it just has to be implemented in the VFS. "Shared memory" could be a SharedArrayBuffer, or just a normal buffer if you only have one database connection open at a time (for example in a SharedWorker, which is already common). It may not be simple to implement, but definitely possible.

The biggest actual issue is that it will capture block-level changes, not row-level changes. This means it can work to replicate a complete database, but partial sync (e.g. sharing some rows with other users) won't be feasible.

To get row-level changes for partial sync, you need to use something like triggers or the SQLite session extension [1]. For PowerSync we just use triggers on the client side. I find that works really well, and I haven't found any real downsides to that except perhaps for the work of maintaining the triggers.

[1]: https://sqlite.org/sessionintro.html

billywhizz [3 hidden]5 mins ago

hi simon. i direct messaged you on twitter about a PoC i did of this in aug 2022, but never heard back - i thought you might have been interested. my twitter handle is justjs14.

i have some code i would have to dig out that did this very thing - it allows you to open a SQLite db in browser using sqlite (with a VFS) compiled to wasm (not the official WASM build), make changes and both push and pull WALs to and from a server (or indeed browser to browser would be possible both manually or over WebRTC). it even works with github pages if you give the browser client a github token to work with.

if you are interested, feel free to ping me and i can see if i can get this up and running from scratch again. i did a ton of experiments with this approach around then and i think it could be useful for a subset of applications at least.

there's also a working demo of the pull functionality only here: https://just.billywhizz.io/sqlite/demo/#https://just.billywh...

dustinchilson [3 hidden]5 mins ago

Are you thinking something like https://electric-sql.com/

PUSH_AX [3 hidden]5 mins ago

What’s the catch with this thing?

T-Winsnes [3 hidden]5 mins ago

The security model is challenging, as it relies on Postgres users for iam. Your users essentially log directly into your db

infogulch [3 hidden]5 mins ago

You can see what this means specifically from the docs: https://electric-sql.com/docs/guides/auth

dumbo-octopus [3 hidden]5 mins ago

Isn’t Postgres a fairly capable IAM provider, all things considered? I’d their access control mechanisms at least as much as a run of the mill external backend’s.

T-Winsnes [3 hidden]5 mins ago

For basic auth it works well, but the challenge comes when you need to integrate with oidc, need to enforce mfa, enable sso etc. session invalidation is also quite complicated.

You need an identity middle man in front of the Postgres identity to tackle these and validate that the session is still active. Last time I looked at electric it was a big challenge to integrate such a service. This might have improved since then however

Fabricio20 [3 hidden]5 mins ago

I think OrbitDB [1] is the closest we have to a functional version of this right now? Unsure, but the concept is really cool!

[1]: https://github.com/orbitdb/orbitdb/blob/main/docs/GETTING_ST...

conradev [3 hidden]5 mins ago

LiteVFS can be compiled in a browser and sync with LiteFS cloud: https://github.com/superfly/litevfs

It needs to be run from a worker, though

hiccuphippo [3 hidden]5 mins ago

This sounds like what meteor was 10+ years ago but with sqlite instead of mongodb.

nesarkvechnep [3 hidden]5 mins ago

Nothing like it, actually.

fcanesin [3 hidden]5 mins ago

Isn't a more advanced/production version of this what the combo of couchdb and pouchdb do since several years ago?

simonw [3 hidden]5 mins ago

Yeah this kind of thing has certainly been tried before, I feel like SQLite WASM plus WAL might be an interesting twist on the idea.

simonw [3 hidden]5 mins ago

Slight point of confusion: that page says:

> These components were initially released for public beta with version 3.40 and will tentatively be made API-stable with the 3.41 release, pending community feedback.

But the most recent release of SQLite is 3.46.1 (from 2024-08-13)

Presumably they are now "API-stable" but the page hasn't been updated yet.

It would be great if the SQLite team published an official npm package bundling the WASM version, could be a neat distribution mechanism for them. (UPDATE: They do, see replies to this post.)

My favourite version of SQLite-in-WASM remains the Pyodide variant, which has been around since long before the official SQLite implementation. If you use Pyodide you get a WASM SQLite for free as part of the Python standard library - I use that for https://lite.datasette.io/ and you can also try it out on https://pyodide.org/en/stable/console.html

    import sqlite3
    print(sqlite3.connect(':memory:').execute(
        'select sqlite_version()'
    ).fetchall())

That returns 3.39.0 from 2022-06-25 so Pyodide could do with a version bump. Looks like it inherits that version from emscripten: https://github.com/emscripten-core/emscripten/blob/main/tool...

sgbeal [3 hidden]5 mins ago

> Presumably they are now "API-stable" but the page hasn't been updated yet.

That's correct. i'll try my best to remember to update that reference the next time i'm back on the computer.

> It would be great if the SQLite team published an official npm package

Not a chance. We publish only vanilla JS and adamantly refuse to go down the rabit hole of supporting out-of-language tools (none of which any of our project members use). We support an "officially sanctioned" npm build, maintained by Thomas Steiner, but do not actively develop for any JS frameworks.

Direct support for any given framework (npm included) would give the impression that we endorse that framework, and endorsement of third-party projects is something we actively avoid.

rezonant [3 hidden]5 mins ago

> We publish ONLY vanilla JS and adamantly refuse to go down rabit hole of supporting the frameworks du jour

A bit confused at this, NPM is just a package manager / distribution mechanism, not a framework. Totally fair if you don't want to publish for all the package managers, though for Javascript there's only a few that are relevant. NPM has been around for a decade.

sgbeal [3 hidden]5 mins ago

> A bit confused at this, NPM is just a package manager / distribution mechanism, not a framework

It's an out-of-language packaging/distribution framework (and it's not the only one). It's not part of the JS standards.

My comments above have been edited to reframe our stance on npm and frameworks in general.

rezonant [3 hidden]5 mins ago

I don't think there will ever be a package manager dictated by the Ecmascript standards.

syndicatedjelly [3 hidden]5 mins ago

Then ES will continue to remain an outlier among major language implementations

samatman [3 hidden]5 mins ago

I think the communication barrier here is that in JavaScript, framework very distinctly means things like React, Vue, Angular, and so on. It definitely does not refer to projects like Node/npm/Bun/Deno, those are toolchains, sometimes called ecosystems for obscure reasons.

If you changed the word "framework" to "toolchain" in your post I think it would make a lot more sense to people.

sgbeal [3 hidden]5 mins ago

> If you changed the word "framework" to "toolchain" in your post I think it would make a lot more sense to people.

Fair point but the edit window has passed ;). For the sake of clarity for those still following along: "framework," in the context of my above comments, includes any non-formally-standardized tools or APIs which are built atop the standardized core.

Ciantic [3 hidden]5 mins ago

> It would be great if the SQLite team published an official npm package bundling the WASM version, could be a neat distribution mechanism for them.

I think they've been doing that for a while, in JS script you can already do this:

    import sqlite3InitModule from "https://cdn.jsdelivr.net/npm/@sqlite.org/sqlite-wasm/sqlite-wasm/jswasm/sqlite3-bundler-friendly.mjs";

    const sqlite3 = await sqlite3InitModule({
        locateFile(file: string) {
            return "https://cdn.jsdelivr.net/npm/@sqlite.org/sqlite-wasm/sqlite-wasm/jswasm/sqlite3.wasm";
        },
    });

    // SQLite's C API
    const capi = sqlite3.capi;
    console.log("sqlite3 version", capi.sqlite3_libversion(), capi.sqlite3_sourceid());

    // OO API example below oo1 docs https://sqlite.org/wasm/doc/tip/api-oo1.md
    const oo = sqlite3.oo1;

    const db = new oo.DB();
    const createPersonTableSql = `
    CREATE TABLE IF NOT EXISTS person (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        age INTEGER NOT NULL
    );
    `;
    db.exec([createPersonTableSql]);

It works in regular old script tag with type=module, or Deno. I have example HTML here:

https://github.com/Ciantic/experimenting-sqlite-wasm/blob/ma...

sgbeal [3 hidden]5 mins ago

> > It would be great if the SQLite team published an official npm package

> I think they've been doing that for a while,

Kinda: <https://sqlite.org/wasm/doc/trunk/npm.md>

We in the sqlite project neither use nor require npm in any capacity whatsoever, so it would be kinda silly for us to attempt to support it. We instead leave that level of code/tools to folks who use and/or care about them.

There _is_ an "officially sanctioned/blessed" npm repo, and we actively support its maintainer (e.g. we participate the issue tracker and make patches in the core distribution where they're strictly needed), but we otherwise keep a "hands off" policy when it comes to non-standardized APIs and toolchains.

We _like_ to see people to plug the sources into their tools of choice, but we cannot feasibly take on the burden of doing that plugging-in for them, especially given how fluid the JavaScript ecosystem is when it comes to frameworks and tools.

Sidebar: we rely heavily on Emscripten because there is, for all practical purposes, it has no substitute, but we also actively go out of our way to ensure that the sources can be easily plugged in to an alternative should one ever appear.

rblank [3 hidden]5 mins ago

https://github.com/sqlite/sqlite-wasm

sqlite-wasm loads much faster than Pyodide, so if you don't need Python, then the former is a better choice.

simonw [3 hidden]5 mins ago

Amazing!

    npm install @sqlite.org/sqlite-wasm

simonw [3 hidden]5 mins ago

I built my own little demo page here: https://tools.simonwillison.net/sqlite-wasm

With the help of Claude, though it incorrectly hallucinated some of the details despite me pasting in documentation: https://gist.github.com/simonw/677c3794051c4dfeac94e514a8e5b...

wg0 [3 hidden]5 mins ago

This would run on client side I presume? Where the data would go?

Okay that's listed here: https://sqlite.org/wasm/doc/trunk/persistence.md

EDIT: Self answered.

jarpineh [3 hidden]5 mins ago

You can use DuckDB WASM independently of Pyodide and can extend it with SQLite.

Though it seems to be somewhat limited. I couldn't even check what version it has, since sqlite_version() was missing. Version in the repository [1] is 3.38.1, which is from quite a ways ago.

At the moment DuckDB web shell can't load SQLite extension, since that hasn't been released for yesterday's 1.1.2. Earlier version does work using recently updated WASM edition. That can be extended with spatial including GDAL, vector search etc [2]. Making your own "SQL web shell" wasn't too hard, though docs weren't quite complete enough for me.

[1] https://github.com/duckdb/sqlite_scanner/blob/main/src/sqlit... [2] https://github.com/duckdb/duckdb-wasm/releases/tag/v1.29.0

CodeWriter23 [3 hidden]5 mins ago

> It would be great if the SQLite team published an official npm package bundling the WASM version, could be a neat distribution mechanism for them.

You may benefit from perusing the FAQ on that page.

k__ [3 hidden]5 mins ago

Half-OT:

What's your opinion on SQLite in-memory vs plain objects/arrays?

When would you use which and why?

koeng [3 hidden]5 mins ago

For use in Golang, I really like ncruces wasm SQLite package - https://github.com/ncruces/go-sqlite3 . Unlike cznic's go package (which is great, btw), the wasm version works well on OpenBSD and the like.

ncruces [3 hidden]5 mins ago

Author here. If you're interested, do ask questions.

infogulch [3 hidden]5 mins ago

Hey how's the mmap-based shared memory WAL approach working out? It's been about half a year since you finished the implementation and I only see one issue about it on the tracker, a good sign?

ncruces [3 hidden]5 mins ago

It's turning out great, I guess.

The current approach is not portable to Windows, but it works fine on Linux, macOS, BSD and illumos. In general, portability is hindered more by file locking (I hate POSIX locks) than mmap.

The currently open GitHub issue is more bad default configuration than anything else. Configuring connections to use less memory by default should fix it.

I already have a PR ready for the next release that also opens this up for 32-bit platforms.

TN1ck [3 hidden]5 mins ago

Very cool project! Do you know if this would be possible for duckdb? Is there something about sqlites APIs and wasm build that made it feasible?

Context: Currently using go-duckdb and while it's working for us, getting rid of cgo would be a huge help. Would be quite interested myself to attempt this.

ncruces [3 hidden]5 mins ago

I don't know much about DuckDB's architecture.

Wasm is fine for compute (though concurrency is still a somewhat open question).

To have Wasm talk to the outside world, you need “host calls” where the guest calls the host.

On a browser that's Wasm calling JavaScript. On my Go driver, it's Wasm calling Go.

For server side, there's also a standard set of “host calls” modeled around POSIX/Linux syscalls called WASI.

I could've build my project around WASI, but WASI is rather limited (and SQLite support for WASI was more limited even, it's improved a bit since). DuckDB might work out-of-the-box this way.

I, instead, took advantage of SQLite's architecture and replaced its VFS layer with one in Go: https://sqlite.org/vfs.html

So SQLite in Wasm is just doing compute, and I do all the OS level stuff in Go. No need for Wasm concurrency, cause I can load multiple instances of my Wasm which act like independent OS processes that communicate through the filesystem (SQLite excels at this).

As I said, I dunno how well all those decisions would map to DuckDB.

koeng [3 hidden]5 mins ago

> So SQLite in Wasm is just doing compute, and I do all the OS level stuff in Go. No need for Wasm concurrency, cause I can load multiple instances of my Wasm which act like independent OS processes that communicate through the filesystem (SQLite excels at this).

Interesting. So when I am running concurrent readers using your package, it is just loading multiple instances of the wasm code? (I bottleneck to a single writer in the application)

chrysoprace [3 hidden]5 mins ago

I've been really interested in the local-first landscape lately but embedding SQLite seems really heavy-weight compared to using the browser's built-in storage APIs (in particular, IndexedDB) and it seems to be what most of the main open source libraries do. I'm interested to see a open-source solution (with sync) which provides an SQLite-like API but for the browser's native storage rather than trying to embed another executable in Web Assembly.

gavmor [3 hidden]5 mins ago

SQL is, arguably, more ergonomic than IndexedDB APIs, and may take up less RAM/CPU when, eg, using a `WHERE` clause rather than an `if() ` filter.

chrysoprace [3 hidden]5 mins ago

It's definitely more ergonomic, but the browser doesn't have an SQL API where as it does have IndexedDB. I'm hoping for tooling that makes IndexedDB more ergonomic while having a local-first syncing solution.

Dexie gives an ORM-like experience but their syncing solution is not FOSS.

jamesgpearce [3 hidden]5 mins ago

Disclaimer: I'm the author. But you might be interested in TinyBase.

chrysoprace [3 hidden]5 mins ago

Thanks I'll give it a look! I guess I really was fishing for someone to make a recommendation and I see you have a lot of backend persistence options which I'm very excited about.

Another issue I have with a lot of the new local first products is that they tend to lock you into a particular database type on the backend, so this is refreshing.

arunaugustine [3 hidden]5 mins ago

TinyBase looks very promising! Is there a doc or reference you can guide me to, for using TinyBase with Preact.js instead of React?

parhamn [3 hidden]5 mins ago

WebSQL should've just been Sqlite and the whole offline-first (and general app storage) ecosystem would've been so much nicer.

Is there any hope of that happening? Instead of abstracting and over specifying sqlite, can the spec just specify a version of the SQLite API browsers should support and roll the version periodically?

emn13 [3 hidden]5 mins ago

"Rolling the version periodically" is probably quite problematic for browsers. Kind of a key point of the web is that stuff if at all possible keeps working. Breaking changes like that are hard.

Even if the spec just listed occasional version and the webpage could choose which one; that means a potentially tricky maintenance burden on browser to support old versions of a potentially no longer supported sqlite, and each version is another megabyte.

Why not then just choose this solution, and let each website pick its own poison?

If the concern is the repeated downloads of common resources, well, we've accepted that for other CDN's too, and a solution for shared caching of common dependencies would in any case be more valuable than merely for sqlite.

The current approach seems better than a browser-provided version.

simonw [3 hidden]5 mins ago

I for one am glad WebSQL didn't establish itself.

Now we get the most recent version of SQLite when we need it as a 410KB compressd WASM blob, as opposed to being stuck on browser-mandated versions of SQLite that might even be a decade old at this point.

xyc [3 hidden]5 mins ago

They are not mutually exclusive though. WebSQL doesn't prevent anyone from loading a WASM blob. And while moving slowly, the browsers does deprecate old stuff and update implementation.

brandonpollack2 [3 hidden]5 mins ago

I was trying to get this working in a rust ecosystem some time ago but none of the blessed.rs sql (rusqlite, sqlx) wrappers seem to take advantage of it yet and wrapping it yourself is a bit tricky since when I was trying I couldn't figure out a way to to get emscripten wasm code to play nice with wasm32-unknown-unknown without some kind of JS wrapper which then requires implementing the interface those crates expect and exposing it from JS. Once that is done in rust itll be great there too!

aabhay [3 hidden]5 mins ago

I have been working on one. If you're interested in working on it or contributing, feel free to chime in here:

https://github.com/rhashimoto/wa-sqlite/discussions/154

This essentially requires that we import the sqlite emscripten build via an extern C header in wasm bindgen, and then we need to re-implement the VFS in rust while compiling it in multi-threaded mode to allow for shared array buffer access. After that is all done, we will be able to access SQLite rows as raw wasm bytes. That gives us the ability to implement a rust-sqlite style wrapper or integration. There would still not be some of the niceties such as connection pooling, but in wasm you likely want to use the db in exclusive mode.

azakai [3 hidden]5 mins ago

> I couldn't figure out a way to to get emscripten wasm code to play nice with wasm32-unknown-unknown

There is good news there, some people plan to get Emscripten and Rust to work well together in Wasm:

https://github.com/rustwasm/wasm-bindgen/pull/4014#issuecomm...

insipx [3 hidden]5 mins ago

Got this SQLite build working with rusts diesel sqlite here: https://github.com/xmtp/diesel-wasm-sqlite

I'm gearing up for a 0.2 release this week which should iron out a few kinks, but otherwise you can use the diesel ORM as if its native

tonygiorgio [3 hidden]5 mins ago

Yeah I’ve been waiting awhile for this myself. A few PRs with work pending for a year or so. I’ve seen some proof of concepts but nothing anywhere close to usable.

insipx [3 hidden]5 mins ago

you should check out https://github.com/xmtp/diesel-wasm-sqlite

reliable so far, being dogfooded in production as we speak

catapart [3 hidden]5 mins ago

I wasn't able to tell from a quick look through the page: could someone help me understand the use cases here?

More specifically, would this be able to be a "replacement" for indexedDB? Does the data persist, or do I need to keep the sqlite file in the filesytemAPI (or indexedDB/localstorage) myself?

azangru [3 hidden]5 mins ago

From the about page:

> Specific Goals of this Project

> Insofar as possible, support persistent client-side storage using available JS APIs. As of this writing, that includes the Origin-Private FileSystem (OPFS) and (very limited) storage via the window.localStorage and window.sessionStorage backend.

catapart [3 hidden]5 mins ago

Right but, to my eyes, that's vague?

What I'm asking is if I need to manage the sqlite file, as I would on an OS's file system, or if accessing the sqlite library will automatically persist that data to those web-native storages, like the way indexedDB doesn't require me to load an "idb" file and then "save" or "commit" that save. I just access it and write.

To be clear: I'm not asking academically. I wrote a whole library for managing data in indexedDB for local-first apps, and while it works well enough for what I need, it's iDB so it's subject to data deletion (not common, but allowed in the spec if necessary), and it's a pain to work with just because of its nature and API. So I've been waiting to move to sqlite for a while with the only holdbacks being "is it too heavy?", and "how much has to change?". With WASM, I think we're about as lightweight as its going to get. So I'm just curious if this aims to be a drop-in replacement, or if it still expects you to use it like sqlite on a native platform.

sgbeal [3 hidden]5 mins ago

> Right but, to my eyes, that's vague?

We (the sqlite project, where the "vague" description comes from) do not define the use cases. Similarly, in the docs for the C library you won't find any more than passing references to specific use cases, and those are typically contrived for the sake of example. (One notable exception: <https://sqlite.org/appfileformat.html>)

> What I'm asking is if I need to manage the sqlite file, as I would on an OS's file system, or if accessing the sqlite library will automatically persist that data to those web-native storages, like the way indexedDB doesn't require me to load an "idb" file and then "save" or "commit" that save. I just access it and write.

That's all covered in the docs (of which there are well more than 100 lovingly-hand-written pages), but the short answer is "it just works." You have the _option_ of importing and exporting databases from the browser-native storage, but you don't have to.

For starters, see: <https://sqlite.org/wasm/doc/tip/persistence.md>

asdfman123 [3 hidden]5 mins ago

> do not define the use cases

I genuinely don't mean to sound rude, and maybe I misunderstand, but how do you build software if you're not doing it with use cases in mind?

manmal [3 hidden]5 mins ago

Usage is laid out well in the docs, I‘m not sure GP has read them. IMO it’s obvious that many use cases have been kept in mind. Defining a use case != accommodating a use case

TiredGuy [3 hidden]5 mins ago

So after downloading from the official downloads page and stripping away all the mjs files and "bundler-friendly" files, a minimal sqlite wasm dependency will be about 1.3MB.

For an in-browser app, that seems a bit much but of course wasm runs in other places these days where it might make more sense.

jsheard [3 hidden]5 mins ago

It's pretty compressible at least, sqlite3.js+wasm are 1.3MB raw but minifying the JS and then compressing both files with Brotli gets them down to 410KB.

rmbyrro [3 hidden]5 mins ago

A lot of HTML's nowadays have 100 - 300 kb. That's only the HTML (!!).

Adding 400 for such a high quality piece of DB actually borders reasonability.

And makes me think: what the hell are frontend devs thinking!? Multiple MB's in JS for a news website. Hundreds of KB's for HTML. It's totally unreasonable.

jsheard [3 hidden]5 mins ago

> A lot of HTML's nowadays have 100 - 300 kb. That's only the HTML (!!).

I think you can probably blame Tailwind for that.

yoavm [3 hidden]5 mins ago

Why? More often than not the classes are combined during post-processing to the most reusable unified classes, with very short classes names.

wahern [3 hidden]5 mins ago

> what the hell are frontend devs thinking!? Multiple MB's in JS for a news website. Hundreds of KB's for HTML. It's totally unreasonable

They're thinking, "adding [some fraction of existing total payload] for such a high quality [feature] actually borders reasonability". Wash. Rinse. Repeat.

Dylan16807 [3 hidden]5 mins ago

> They're thinking, "adding [some fraction of existing total payload] for such a high quality [feature] actually borders reasonability". Wash. Rinse. Repeat.

Context makes all the difference here. If you're considering a big chunk of size for a relational database engine, you need to ask: are you making a complex application, or a normal web page? If it's the latter, then it's not reasonable at all.

And anything that makes the HTML itself that big is almost certainly bloat, not "high quality", and shouldn't be used in any context.

coder543 [3 hidden]5 mins ago

1.3MB seems perfectly reasonable in a modern web app, especially since it will be cached after the first visit to the site.

If you’re just storing user preferences, obviously don’t download SQLite for your web app just to do that… but if you’re doing something that benefits from a full database, don’t fret so much about 1MB that you go try to reinvent the wheel for no reason.

If the other comment is correct, then it won’t even be 1.3MB on the network anyways.

telotortium [3 hidden]5 mins ago

A megabyte here, a megabyte there, pretty soon you’re talking about a really heavyweight app.

Dylan16807 [3 hidden]5 mins ago

By the time you have a good reason to add this library, I think you're already in heavyweight app territory.

zdragnar [3 hidden]5 mins ago

Given how hefty images are, a full database doesn't seem too bad for the purpose of an "app" that would benefit from it, especially when compression can being the size down even lower.

littlecranky67 [3 hidden]5 mins ago

We are past the stage where every piece of JS has to be loaded upfront and delay the first meaningful paint. Modern JS frameworks and module are chunked and can be eager/lazy loaded. Unless you make the sqlite DB integral part for your first meaningful page load, preloading those 1.3MB in the background/upon user request is easy.

pdyc [3 hidden]5 mins ago

That's correct, people in this thread are comparing single compressed dependency of sqlite+wasm of 400KB to the total size of web pages which run in MB. I did some actual tests while trying to use sqlite and it does adds noticeable delay on first page load on mobile due to big size+decompression+ additional scaffolding of wasm. Pages that run into MB have small files that are downloaded concurrently so the delay is not noticeable. I wrote about this and my other expriments with in browser db in my last article but it did not get any traction here.

flockonus [3 hidden]5 mins ago

It's a good consideration, together with the fact browsers already have IndexedDB embedded. One use case still for in-browser apps like Figma / Photoshop-like / ML apps, where the application code and data is very big anyway, 1.3Mb may not add that much

Also worth considering parsing of wasm is significantly faster than JS (unfortunately couldn't find the source for this claim, there is at lease one great article on the topic)

https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_A...

aidos [3 hidden]5 mins ago

When we built our frontend sync system we tried a few different options. We had a fairly simple case of just trying to store entities so we could pull incremental updates since you were last online. The one we ran in production for a while was IndexedDB but found the overhead wasn’t worth it.

I played around with warm sqlite too. That was really nice but I decided against it due to the fact that it was totally unsupported.

ncruces [3 hidden]5 mins ago

For server side, you'll likely need a different build of Wasm SQLite, that handles concurrency (and file locking) differently.

Also, WASI is very far from answer (so far). The SQLite amalgamation builds fine for WASI but concurrency is an unsolved issue.

I had to build a VFS from scratch to get my Wasm based SQLite driver into a usable shape.

https://github.com/ncruces/go-sqlite3/blob/main/vfs/README.m...

jt2190 [3 hidden]5 mins ago

The thing to keep in mind is that the WebAssembly sandbox model means that in theory the program (SqlLite in this case) can run wherever it makes sense to run it. That might mean running it locally or it might mean running on a central server or it might mean running nearby on the “edge”.

deskr [3 hidden]5 mins ago

Sadly, 1.3 MB is nothing on the modern web, especially for a static file. BBC's frontpage loads 3.78 MB.

https://www.bbc.co.uk/

sgbeal [3 hidden]5 mins ago

> BBC's frontpage loads 3.78 MB.

FWIW: Google Drive just downloaded 15.4mb to boot up for me and imdb dot com hit some 7+mb before it started auto-loading videos on top of that.

hawski [3 hidden]5 mins ago

Is there a way to statically compile an application with SQLite and the result WASM was smaller. So for example I have an app that would use only a specific subset of SQLite. Could the SQLite's WASM be built with this in mind cutting down on code that is not used? Or is there a way to prune it having the used API surface?

In a regular compiler/linker scenario it would just be a static compilation. Here we have a JS app and WASM library.

sgbeal [3 hidden]5 mins ago

> Could the SQLite's WASM be built with this in mind cutting down on code that is not used?

The pending 3.47 release has some build-side tweaks which enable a user to strip it down to "just the basics," but we've not yet been able to get it smaller than about 25-30% less than it otherwise is:

    cd ext/wasm
    make barebones=1 ; # requires GNU Make and the Emscripten SDK

Doing that requires building it yourself - there are no plans to publish deliverables built that way.

The build process also supports including one's own C code, which could hypothetically be used to embed an application and the wasm part of the library (as distinct from the JS part) into a single wasm file. Its primary intended usage is to add SQLite extensions which are not part of the standard amalgamation build.

> Or is there a way to prune it having the used API surface?

Not with the provided JS pieces. Those have to expose essentially the whole C library, so they will not be pruned from the wasm file.

However, you could provide your own JS bindings which only use a small subset of the API, and Emscripten is supposedly pretty good about stripping out C-side code which neither explicitly exported nor referenced anywhere. You'd be on your own - that's not something we'll integrate into the canonical build process - but we could provide high-level support, via the project's forum, for folks taking that route.

hoten [3 hidden]5 mins ago

Since SQL takes arbitrary strings as input, this would require explicit compiler flags to disable the knobs you don't want. Can't rely on excluding unused symbols really.

bhelx [3 hidden]5 mins ago

I used the wasm build of sqlite and the Chicory runtime to create a pure JVM executed sqlite library: https://github.com/dylibso/sqlite-zero

It's more of an experiment than an attempt to make something production ready, though I could see it being useful to bring dependency-less sqlite tooling to the JVM ecosystem.

ncruces [3 hidden]5 mins ago

What's the file system access like, WASI?

bhelx [3 hidden]5 mins ago

Chicory has some partial wasip1 support. https://github.com/dylibso/chicory/tree/main/wasi. We use jimfs to keep things simple and secure (and not worry about exposing the real filesystem): https://github.com/google/jimfs

When I did this experiment a few months ago, what we could accomplish was pretty limited. I could load and query databases, but not write to them. However the Chicory wasip1 implementation is advancing.

BTW, we've borrowed a few ideas from wazero so thanks for your work there :)

ncruces [3 hidden]5 mins ago

If the goal is to improve Chicory WASI support, this is the way.

If the goal was pure Java SQLite¹, a VFS from scratch would be better.

I think since I started my Go/wazero effort, WASI+SQLite improved a bunch. I had to start with the demo VFS; the Unix VFS now builds. But custom VFS is still the way to go, IMO.

And thanks! My contributions to wazero were tiny. Best of luck with Chicory!

1: strong NestedVM vibes here; 11 years ago… gosh, I feel old now. https://stackoverflow.com/questions/18186507/pure-java-vs-na...

bhelx [3 hidden]5 mins ago

> If the goal was pure Java SQLite¹, a VFS from scratch would be better.

agreed, though this was more an experiment to test Chicory once we built initial wasi support. I'd love to see it picked up and improved. I think that's the direction I'd go if i want some kind of production ready library.

jjcm [3 hidden]5 mins ago

As a general question, in what scenarios is it more beneficial to send the full DB and let the browser handle the queries? Maybe phrased a better way - when would I use this to improve a user experience over the traditional server-hosted db model?

ThatPlayer [3 hidden]5 mins ago

Personally I'm using it for a statically hosted website, so a server-hosted database was never an option. Also with the right driver, it's possible to stream the chunks of the database as needed rather than sending the full database: https://github.com/mmomtchev/sqlite-wasm-http

I can even do Sqlite's full text search without downloading the entire FTS database. Just most of it, if the search term is short enough.

bryanrasmussen [3 hidden]5 mins ago

>when would I use this to improve a user experience over the traditional server-hosted db model?

just my intuition when I read the headline of this post - something like the interplay between PouchDB and CouchDB for offline first apps

https://medium.com/offline-camp/couchdb-pouchdb-and-hoodie-a...

pdyc [3 hidden]5 mins ago

i am creating host of dashboards which directly talk to different services with very little data on my own server that is used for access control and token management only so actual data never comes to my servers. This kind of app is a good candidate for client side embedded db.

harrisi [3 hidden]5 mins ago

For offline use it can be good when dealing with large amounts of data. Anything from like an audio library to 3D modeling software. Changes can be made locally and persisted and then you can sync things server side regularly or when online again.

koolala [3 hidden]5 mins ago

The CORS restrictions / needing SharedArrayBuffer support kinda stinks.

There is no way to use Sqlite3 off-thread without memory sharing? Couldn't postMessage work to pass data to the sqlite thread by using the third Transfer argument?

Would postMessage transfer allow memory to be stored in a sqlite wasm database running a worker off-thread?

Refering to this implementation's docs: https://github.com/sqlite/sqlite-wasm

sgbeal [3 hidden]5 mins ago

> The CORS restrictions / needing SharedArrayBuffer support kinda stinks.

We have no CORS restrictions but one specific (and optional) VFS requires COOP/COEP for SharedArrayBuffer. If SharedArrayBuffer isn't available, that VFS won't load, but the rest of the library will plod along just fine:

https://sqlite.org/wasm/doc/tip/persistence.md

baudaux [3 hidden]5 mins ago

I definitely have to put sqlite in https://exaequOS.com

outlore [3 hidden]5 mins ago

i’ve been looking for a Tanstack Query style library that is backed by Sqlite (backed by OPFS or some other browser storage) and syncs with an API in the background. Does anything like that exist? i’ve seen ElectricSQL and other sync engines but they are a bit opinionated. I’m pretty new to local-first but i feel like the developer ergonomics are not quite there yet

Meanwhile for “local-only” it would be great to use sqlite in the browser + native file system API so that the db could be stored on the user’s file system and we wouldn’t have to worry about browser storage eviction. i think that could really open up a whole world of privacy preserving offline software delivered through the browser

gagik_co [3 hidden]5 mins ago

I have working to replicate TanStack query experience by writing my own queries wrapped around PowerSync, although actually on Flutter (using Flutter Hooks! which was cool to use coming from React). It’s a very internal design meant for my app tetr[1] right now (and actually being migrated over from Realm). I am hoping to potentially standardize it and publish a package once it’s mature enough but not too wrap to make your own hooks around them for your needs.

[1] https://tetr.app

outlore [3 hidden]5 mins ago

very cool product page!

gagik_co [3 hidden]5 mins ago

thank you!

ochiba [3 hidden]5 mins ago

Not sure if you've looked at PowerSync yet: https://www.powersync.com/ (I'm on the team)

For the read path it hooks into Postgres logical replication or MongoDB change streams (and MySQL binlog soon). It supports partial syncing using declarative rules. For the write path, it allows writing to the local SQLite database and also places writes into an upload queue, and then uses a developer-defined function to upload writes to the backend API.

We did a deep dive on current options for SQLite on the web, and are currently using an IndexedDB-based VFS, and looking to move to OPFS: https://www.powersync.com/blog/sqlite-persistence-on-the-web

We recently released an integration with TanStack Query to allow leveraging some of its features in conjunction with PowerSync: https://docs.powersync.com/client-sdk-references/js-web/java...

> Meanwhile for “local-only” it would be great to use sqlite in the browser + native file system API so that the db could be stored on the user’s file system and we wouldn’t have to worry about browser storage eviction. i think that could really open up a whole world of privacy preserving offline software delivered through the browser

Agreed. This is a limitation of IndexedDB and OPFS as persistent browser storage currently

outlore [3 hidden]5 mins ago

ooo i haven’t! will check PowerSync out :)

matlin [3 hidden]5 mins ago

I've made just this!

Docs for it: https://www.triplit.dev/docs/frameworks/tanstack-router#exam...

It by default uses IndexedDB but can also use SQLite but does real time, relational querying and (optionally) syncs with you server.

Re: developer ergonomics, this is our primary focus so I don't love to get your feedback on!

outlore [3 hidden]5 mins ago

thanks for sharing! will take a look :)

netghost [3 hidden]5 mins ago

ElectricSQL and friends seem to be the best option so far, but they all come with a lot of caveats. It feels like local-first is near, and it's so tantalizing, but I haven't seen anything that feels like it's done enough to build on just yet.

me551ah [3 hidden]5 mins ago

After years of being able to run SQLite on my mobile phone, my tv, my router and gaming consoles, I can finally run it on my browser. Which also happens to be running on the most powerful machine I own

ibash [3 hidden]5 mins ago

surprise! it's been there for decades: https://en.wikipedia.org/wiki/Web_SQL_Database

adregan [3 hidden]5 mins ago

It was there for a decade: https://caniuse.com/sql-storage

joemi [3 hidden]5 mins ago

I wonder why it was unmaintained/dropped. Was there something wrong with it, and if so, would that also apply to this kind of wasm implementation?

debugnik [3 hidden]5 mins ago

Mozilla refused to support it because then every implementation would have simply used SQLite, which would have promoted any implementation details to a de facto standard. (Even caniuse erroneously describes the feature as "allows SQLite database queries".)

From the latest spec [1]:

> The specification reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path.

[1]: https://www.w3.org/TR/webdatabase/

This won't be a problem for wasm SQLite because it isn't a standard being shipped by browsers, just another dependency.

xyc [3 hidden]5 mins ago

i have a feeling that it set back the web by a decade

https://x.com/chxy/status/1822858746307170640

akira2501 [3 hidden]5 mins ago

Did they really assume that they were going to be able to _restandardize_ SQL? No wonder IndexDB is hot useless garbage.

The standardization issues around SQL already exist, are already widely known, and where common workarounds are already in practice. It's also an open source project that could have _easily_ incorporated compatibility code for this specific use case anyways.

They made blind fealty to process more important than the outcome to end users.

What a waste.

sgbeal [3 hidden]5 mins ago

> Did they really assume that they were going to be able to _restandardize_ SQL?

The core issue was not the SQL, but the underlying storage engine.

Microsoft's office doc formats aside, standards bodies, as a general rule, require that a standard be built around multiple independent, conformant implementations. WebSQL did not measure up there because all vendors implementing it chose the only viable option they had for the underlying storage: SQLite.

akira2501 [3 hidden]5 mins ago

I've seen and understood that in hardware. I'm less convinced it's reasonable in this context. If there wasn't an _independent_ implementation it might barely make sense, but given that sqlite exists separately and can be installed separately, this is more of a "package management" or "system library access" problem than it is one of actual standardization by an independent body.

Which if you standardized this generic interface you could have multiple independent database engines available not just ones that are SQL based, although, SQL in particular was readily available through SQLite, and would clearly be exceptionally popular.

Huge miss when organizations stick the the routine rather then take an opportunity to explore and examine new ways forward. Meanwhile everyone is being held hostage by V2 to V3 manifest changes by _one_ vendor.

gnarbarian [3 hidden]5 mins ago

How long until we see WebAssembly/WebGPU become a platform independent choice for deploying server side code as well?

stackskipton [3 hidden]5 mins ago

SRE here, it's currently happening in a few parts but overall, it's not as attractive on server side. Server Side code running is mostly a solved problem and for very few organizations, the benefits of WASM don't outweigh any difficulties in getting it running.

6gvONxR4sf7o [3 hidden]5 mins ago

> Server Side code running is mostly a solved problem

I know what you mean here, but I think we're very limited in what we tend to run. Polyglot programming still isn't really a thing, and with things like WASI standardized (someday soon I hope), I could imagine it becoming a lot nicer.

stackskipton [3 hidden]5 mins ago

I feel like Polyglot programming in single app sounds like a nightmare but as non dev, meh, whatever.

evacchi [3 hidden]5 mins ago

Shameless plug for Dylibso's latest beta launch :) https://www.getxtp.com/blog/meet-xtp

disclaimer: I work there

ruined [3 hidden]5 mins ago

as soon as wasi is settled

gnarbarian [3 hidden]5 mins ago

https://wasi.dev/

wow I didn't know this was a thing. thanks for filling me in!

Thaxll [3 hidden]5 mins ago

It's kind of dying on the server, some people thought it would replace containers.

paulddraper [3 hidden]5 mins ago

Yesterday?

There's a number of WASM platforms/tools: Wasmer, wasmCloud, a few others that escape my memory.

jt2190 [3 hidden]5 mins ago

https://wasmer.io/

https://wasmcloud.com/

https://wasmtime.dev/

baq [3 hidden]5 mins ago

(Previously discussed 7 months ago: https://news.ycombinator.com/item?id=39477457)

benthecarman [3 hidden]5 mins ago

Whats needed is a rust-wasm compatible library that can use this.

aabhay [3 hidden]5 mins ago

If you're interested in contributing to that, here is a good place to start (in early discussion stages):

https://github.com/rhashimoto/wa-sqlite/discussions/154

runarberg [3 hidden]5 mins ago

I’m working on a hobby-project that uses IndexedDB for persistent client-side storage, and it really feels like W3C made some very bad design decision and than instead of fixing they they have just given up on the standard. Issues like not being able to index values in objects in arrays [1] (not even in fixed position e.g. "key.path.[0].value") despite almost a decade of developers asking for it, a very limited query syntax, and even the documentation on MDN seems of very lower quality than the rest of the web docs.

I’m happy that we are actually be able to use SQL in the browser now (although I would rather skip the MBs of the bundle bloat). But I feel like the standards committee will now have even less of a reason to fix the very broken state of IndexedDB.

1: https://github.com/w3c/IndexedDB/issues/35

dang-lover [3 hidden]5 mins ago

I like it

bsimpson [3 hidden]5 mins ago

And we loved you in Lethal Weapon.

akutlay [3 hidden]5 mins ago

Thankss for sharing

szundi [3 hidden]5 mins ago

We like that you like it