A Hands-on Exploration of SQLite for Production
SQLite is a versatile database, known for its simplicity and serverless architecture. It’s the go-to choice for countless applications, from mobile apps to small-scale web servers. However, deploying SQLite in a production environment, especially under concurrent workloads, requires a deeper understanding of its configuration. Simply using default settings can lead to significant performance bottlenecks or unexpected behavior.
This blog post will take you on a hands-on journey through various SQLite configurations, demonstrating their impact on throughput, concurrency, and backup strategies. We’ll explore why certain settings are considered “production-ready” and others are not, backing our insights with experiments.
The Throughput Experiment
The goal of this experiments is to demonstrate why PRAGMA journal_mode = WAL;
combined with PRAGMA synchronous = NORMAL; is the widely recommended
production default for high write throughput.
I performed 10,000 individual INSERT statements into a SQLite database under
different configurations of journal_mode and synchronous settings. Each
INSERT was executed in its own autocommitted transaction.. The key metric I
observed was “Inserts per second” (TPS).
| Journal Mode | Synchronous | Inserts/Sec (TPS) | Notes |
|---|---|---|---|
| DELETE | FULL | ~279 | Very bad performance. Each write forces a disk flush. High CPU overhead from constant I/O management. |
| DELETE | NORMAL | ~243 | Similar to FULL; still high disk I/O. |
| DELETE | OFF | ~1139 | Faster, but dangerous (risk of corruption/data loss on crash). |
| WAL | FULL | ~442 | Better than DELETE, but FULL sync overhead negates WAL benefits. Unnecessary CPU overhead. |
| WAL | NORMAL | ~33,135 | Optimal. Much faster. High performance, good durability, efficient OS batching of writes. |
| WAL | OFF | ~61,994 | Fastest, but dangerous (risk of corruption/data loss on crash). |
Default DELETE mode with synchronous=FULL is slow. It forces a physical disk
flush for every single write, leading to very low Transaction Per Second (TPS)
and high CPU overhead as the system constantly manages I/O.
WAL mode significantly boosts performance. In this mode, changes are appended
to a separate .wal file, which allows writers to proceed without blocking
readers.
In addition to the main .db file, two auxiliary files are created:
read_write_test.db-wal (the write-ahead log) and read_write_test.db-shm (the
shared-memory index). These files enable separate read/write paths and managing
concurrent access. Their size dynamically adjusts based on write activity and
checkpointing.
WAL with synchronous=NORMAL is the sweet spot. This combination achieved
~33,135 inserts/sec. That’s over 100 times faster than the default! NORMAL
synchronicity allows the operating system to batch disk writes, reducing CPU
overhead and providing a balance of high performance and data durability.
With synchronous=NORMAL, the database remains consistent after crashes, but
the most recent transactions may be lost during power failure.
synchronous=OFF offers the highest throughput but at an unacceptable risk of
data loss for most production scenarios.
So, the experiment proved that PRAGMA journal_mode = WAL; combined with
PRAGMA synchronous = NORMAL; is the optimal configuration for achieving high
write throughput in SQLite, with a big performance boost while still maintaining
a reasonable level of data safety. This combination takes advantage of the
append-only feature of WAL for writers and allows the OS to better manage disk
flushes. There’s lower CPU overhead compared to synchronous=FULL, making it
the industry standard for production deployments.
The Concurrency Experiment
The goal of this experiment is to understand the “database is locked” error and learn effective strategies to mitigate it in concurrent applications.
SQLite implements a “single writer, multiple readers” concurrency model. My experiments used two separate Python processes: one writer holding a lock, and another process (either a writer or a reader-then-writer) contending for access.
Baseline
In DELETE journal mode, when a writer had an active transaction, any other
process attempting to write to the database failed with an
sqlite3.OperationalError: database is locked. SQLite’s strict single-writer
policy means that writers obtain an exclusive lock, blocking other writers.
Patient Applications with PRAGMA busy_timeout
I configured the process with a busy_timeout of 3000ms.
Instead of failing instantly, the contending process hung for approximately 3
seconds before finally reporting the database is locked error. This 3-second
wait shows how busy_timeout impacts tail latency for contended write
operations.
busy_timeout instructs SQLite to retry acquiring a lock for the specified
duration. This is important for applications to handle transient locks,
preventing immediate errors during brief periods of high load.
Proactive Transactions with BEGIN IMMEDIATE vs. BEGIN DEFERRED
The default is BEGIN DEFERRED. Transactions acquire a lock only on the first
write operation. This allows other connections to read or even acquire locks if
the initial writer hasn’t yet performed a write.
BEGIN IMMEDIATE changes that. It acquires a RESERVED lock immediately upon
execution, signaling an intent to write. This prevents other connections from
starting their own write transactions, though read operations are still allowed.
My tests showed that a primary writer using BEGIN IMMEDIATE would cause a
contending reader-then-writer to successfully read, but then fail when
attempting to write (after a timeout).
BEGIN IMMEDIATE forces an earlier decision regarding locks. If a write lock
cannot be acquired at the very beginning of a transaction, the application knows
immediately, avoiding wasted computation on reads that would ultimately come
before a failed write.
You have to be thoughtful in your application design. Implementing PRAGMA busy_timeout allows applications to be “patient” and manage transient
contention, which controls tail latency. Alternatively, using BEGIN IMMEDIATE
makes applications proactive, claiming the write lock early and failing fast if
contention exists. Both are important for performant SQLite applications.
The Read-While-Writing Experiment
The goal of this experiment was to compare reader behavior during active writes
in DELETE versus WAL mode. It involved a writer continuously updating a
counter while a reader rapidly queried it.
In the test environment (Windows, Python sqlite3), readers did not raise
sqlite3.OperationalError: database is locked in either mode. However, further
analysis reveals that the mechanism for avoiding these errors differs
fundamentally between the two modes.
In DELETE mode, I didn’t get any errors, but I strongly suspect that was
because of the default behavior of the Python sqlite3 driver. The driver
includes a default timeout (5 seconds).
When the writer held an EXCLUSIVE lock to commit changes to the main database
file, concurrent readers received a SQLITE_BUSY status. Instead of failing
immediately, the Python driver silently paused execution until the writer
released the lock.
Readers were blocked during commit (although not throughout the entire write
transaction). They did not run in parallel with the writer; they waited their
turn. Readers saw a consistent “old” value only because they were forced to wait
until the writer finished, or they managed to slip in a read before the writer
committed. DELETE mode provides consistency through serialization. It ensures
valid data by preventing readers and writers from accessing the file at the
exact same time.
True Concurrency Through WAL Mode
In this mode, when the writer committed changes, it appended them to the -wal
file rather than locking the main database file. Concurrent readers were able to
continue reading the main database (and shared memory) without being blocked by
the writer.
Readers were non-blocking. They continued to execute reads simultaneously
while the writer was committing. Readers operated on a “snapshot” of the
database established at the start of their transaction. Even though the writer
was appending new data to the WAL, the readers ignored it, seeing a consistent
view of the past data without needing to wait. The separation of read and write
paths (via the .wal and .shm files) allows multiple readers and a single
writer to operate concurrently.
While both DELETE and WAL modes returned consistent data without throwing
exceptions, DELETE achieves consistency by stopping everything. On the other
hand, WAL achieves consistency by decoupling.
The Backup & Recovery Experiment
The goal of this experiment was to learn how to perform reliable “hot backups” of an SQLite database without stopping the application.
This involved a Python writer script continuously updating a database configured
with WAL journal mode, simulating a live application. While this writer was
active, I attempted two backup methods and verified their integrity.
Method 1: Operating System File Copy
A direct OS file copy (read_write_test.db to backup_os.db) passed PRAGMA integrity_check;.
Warning: Despite this result, direct OS file copying of a live SQLite database is still considered unreliable and unsafe. I wouldn’t run the risk of creating a corrupted backup if the copy, which might occur during active modification, especially without proper file system snapshotting.
In WAL mode, a correct backup requires copying the .db, .db-wal, and
.db-shm files together, ideally using filesystem snapshots. What I mean is,
you can do an OS file copy, but all three files must be copied atomically.
Method 2: SQLite’s VACUUM INTO Command
Executing VACUUM INTO 'backup_sqlite.db' on the live database resulted in
backup_sqlite.db also passing PRAGMA integrity_check;.
VACUUM INTO is a great built-in SQLite command specifically designed for
creating consistent copies of a database. It works by creating a new database
file and transactionally copying all content from the source, guaranteeing a
consistent snapshot even with ongoing writes.
For reliable hot backups of a live SQLite database, always use SQLite’s native
backup API or the VACUUM INTO command. It ensures transactional integrity and
guarantee a consistent snapshot of your database even while it’s actively being
modified. Avoid direct operating system file copies for live databases in
production.
Production-Ready SQLite Configuration Snippet
Here’s a snippet of recommended PRAGMA settings you can integrate into your application’s startup code. These should be run once when the application connects to the database.
-- Run these once when the app starts
PRAGMA journal_mode = WAL; -- Use Write-Ahead Logging for concurrency and performance
PRAGMA synchronous = NORMAL; -- Balance performance and durability
PRAGMA busy_timeout = 5000; -- Wait up to 5 seconds for locked tables (controls tail latency)
PRAGMA cache_size = -20000; -- Allocate 20MB for page cache (Negative = KiB, positive = number of pages)
PRAGMA foreign_keys = ON; -- Enforce relational integrity
Final Thoughts
My hands-on exploration showed me how powerful and flexible SQLite can be with the right configuration. Using the right journal modes, synchronous settings, concurrency models, and backup strategies means achieving high throughput, concurrency, and reliable data management. SQLite is more capable than some people think. It can definitely be used for many production applications.
Cover photo by Pawel Czerwinski on Unsplash.
Cover photo by Olli Kilpi on Unsplash .
Travis Horn