SQLite with Python
SQLite is a commonly misunderstood database. I myself first heard of
the database when I was experimenting with android app development. My
thoughts were that since the database was a single file and its name
had lite
in it, that it could not be taken seriously. But I could
not have been more wrong. Now I love sqlite and use it for much of my
scientific research. Sqlite is everywhere. It is on every android
phone, airplanes, and the storage for many computer applications. The
sqlite website details when using sqlite would be appropriate to use
and can be summarized with these three questions.
- Is the data separated from the application by a network? (nfs, glusterfs, ceph)
- Many concurrent writers? (> 100 writes / second)
- Big data? (> 140 Terabytes)
If you answered no to all of these questions then sqlite is a perfect
solution for you. Often times (1) is not an issue for a small number
of writers and I use it quite successfully with ceph. Now that you
have considered SQLite I want to show you how python makes it even
more awesome. SQLite is included with every python install so it is as
simple as import sqlite3
.
|
|
Next we connect to an sqlite database. The file does not have to
already exist. SQLite also allows an in memory database with the
filename :memory:
. The database is not persistent but will read and
write significantly faster. My tests show writes to be about 1000x
faster taking $10 \mu s$. The reason for the fast writes is that
the system call fsync does not need to be called. It has an overhead
of between $1 - 10 ms$.
|
|
Now lets create a simple table for demonstration. IF NOT EXISTS
is
an sqlite specific feature.
|
|
Insert myself the author into the database.
|
|
Next we might want to insert many people at once into the database under one transaction.
|
|
We can query using a simple select statement. But notice that it only returns tuples. This is not very efficient as we would like to get the column names as well.
|
|
('chris', 'ostrouchov', 27.0)
('john', 'smith', 32.0)
('jane', 'doe', 15.0)
To add column names to the cursor we can use the included row factory for sqlite. The python documentation states that it should be more performant than a custom implementation.
|
|
['first_name', 'last_name', 'age']
('chris', 'ostrouchov', 27.0)
['first_name', 'last_name', 'age']
('john', 'smith', 32.0)
['first_name', 'last_name', 'age']
('jane', 'doe', 15.0)
Once you are comfortable inserting, deleting, and querying rows of the
database the next step is to perform transactions. The python sqlite
interface provides a convenient way to do this with a context manager
for starting, committing, rolling back a transaction. Notice how
executing the failing sql statement this should fail
leads to bob
smith not being inserted into the database.
|
|
SQL statment failed with Operational error
('chris', 'ostrouchov', 27.0)
('john', 'smith', 32.0)
('jane', 'doe', 15.0)
Now finally to advanced features specific to SQLite and the python
interface. SQLite allows has five basic types NULL
, INTEGER
,
REAL
, TEXT
, and BLOB
. Suppose we would like to add datetime and
json support to the database. There are conversion functions you can
declare to and from the database.
For datetimes we declare this interface and add a birthday column to
person. Note that it is important to connect to the sqlite database
with the PARSE_DECLTYPES
option.
|
|
We can easily check that the columns have be added by using sqlite’s
special pragma statements. Here we print every column of the sqlite
table person
.
|
|
(0, 'first_name', 'TEXT', 0, None, 0)
(1, 'last_name', 'TEXT', 0, None, 0)
(2, 'age', 'FLOAT', 0, None, 0)
(3, 'birthday', 'DATETIME', 0, None, 0)
Lets look what happens when we insert a person with a birthday into the table.
|
|
('chris', 'ostrouchov', 27.0, None)
('john', 'smith', 32.0, datetime.datetime(1977, 4, 23, 0, 0))
('jane', 'doe', 15.0, None)
A commonly asked question is if it is possible to store json in an sql database. While this is a highly debated topic on if it is the right thing to do, it is certainly possible in sqlite. In fact using similar code with the adapters and converters we can create a simple json interface.
|
|
(0, 'first_name', 'TEXT', 0, None, 0)
(1, 'last_name', 'TEXT', 0, None, 0)
(2, 'age', 'FLOAT', 0, None, 0)
(3, 'birthday', 'DATETIME', 0, None, 0)
(4, 'interests', 'JSON', 0, None, 0)
Let us insert some interests for myself into the database. We now see
how adaptable sqlite is! However one thing to note is that sqlite has
no easy way of querying the json
column because to the library it is
just a binary blob. If you would like support directly from SQLite see
the json1 extension. Note that the json1 extension will have better
performance and allows expressions on paths within the json column. We
will show next how we can achieve the same scalabilty with Python and
perform complex expressions on paths within the json column.
|
|
(<class 'str'>, <class 'str'>, <class 'float'>, <class 'NoneType'>, <class 'dict'>)
('chris', 'ostrouchov', 27.0, None, {'books': ['C Programming Language', "Daniels' Running Formula"], 'hobbies': ['running', 'programming']})
(<class 'str'>, <class 'str'>, <class 'float'>, <class 'datetime.datetime'>, <class 'NoneType'>)
('john', 'smith', 32.0, datetime.datetime(1977, 4, 23, 0, 0), None)
(<class 'str'>, <class 'str'>, <class 'float'>, <class 'NoneType'>, <class 'NoneType'>)
('jane', 'doe', 15.0, None, None)
SQLite allow you to create functions and aggregate functions. The python interface to SQLite provides a convenient method of creating functions that will be directly executed within sqlite. These functions will run directly inside of sqlite.
Here we create a simple json path function to return subpath of a json data.
|
|
Using sqlite custom functions can really help performance for queries
that would normally return a large amount of data. In this example we
query what could be large json files. If perform json_path
on each
value we reduce the amount of data the query returns. Since a json
path can return another json object see SQLite CAST expression. This
allow you to state the type of the return expression. In this case
CAST json_path("books.1", interests) AS JSON
.
|
|
("Daniels' Running Formula",)
(None,)
(None,)
Overall python exposes a beautiful interface to sqlite. I use it on many of my projects including the advanced features such as the adapter functions and custom python functions. I hope this tutorial has been informative. Do share your views and corrections if I made a mistake.