October 20, 2021

Using SQLite with Haskell

As a Haskell programmer, it’s quite natural to think about type safety almost always. Even when it comes to doing simplest of the things. There’s a whole bunch of debate on why dead simple database packages like postgres-simple or sqlite-simple should be avoided generally. The inlcination is towards libraries which offer more type safety (not to mention complexity comes along with it) like Persistent. It offers a lot of features which guarantees type safety with SQL queries. Under the hood, it makes a ton of template haskell usage to produce a large amount of types and type classes to simplify serialization to and from the database. But the reality is, persistent is not as easy to use as *-simple libraries mentioned earlier. And it’s not always worth it to make things complex for a bit of type safety. If our use case is simple enough (and even in most of the code bases with heavy database usage, it really is) we can probably get away with writing plain SQL queries with enough type safety for things to not break at runtime.

All that being said, it’s a nice idea to have type safe queries floating around in your codebase. Or you can do what I do, start with something simple and then iterate to make things better.

The idea of this blog is the present a nice and concise way of writing SQL queries using Haskell without having to fight the type system unnecessarily. And these queries do not deviate a lot from writing standard SQL either.

Let’s first create a new experimental table in an existing database schema. We’ll define a few data types based on top of this and carry on from there.

{-# LANGUAGE OverloadedStrings  #-}
{-# LANGUAGE QuasiQuotes        #-}
{-# LANGUAGE TemplateHaskell    #-}

import Database.SQLite.Simple (Connection (..), execute_)
import Database.SQLite.Simple.QQ (sql)

import qualified Data.Text as T

createPersonTable :: Connection -> IO ()
createPersonTable conn = execute_ conn [sql|
    CREATE TABLE IF NOT EXISTS person
        ( id           INTEGER  PRIMARY KEY AUTOINCREMENT
        , name         TEXT     NOT NULL
        , email        TEXT     UNIQUE NOT NULL
        , phone_number TEXT     UNIQUE NOT NULL
        , role         TEXT     NOT NULL
        -- ^ This could even be an enum type at the DB layer
        )
    |]

This essentially captures some basic information of a person. The catch here would be to define role in our haskell code base as a sum type. The reasoning behind this is we don’t want random user roles to corrupt our database. We need to set a bound on the the values allowed on it. Once we have it, the real trouble is inserting and fetching the data itself. That is, navigation of data to and from the database.

If we were to define data type Person which corresponds to the person table, it would look like:

import qualified Data.Text as T

data Person = Person
    { pUUID      :: !Integer
    , pName      :: !T.Text
    , pEmail     :: !T.Text
    , pRole      :: !Role
    , pNumber    :: !T.Text
    } deriving stock (Eq, Show)

Note that the above is not going to compile just yet because we still need to go ahead and implement Role type. So let’s do that first.

{-# LANGUAGE LambdaCase #-}

-- We are going to limit to only 3 user roles
data Role
    = Hashira
    | Kakushi
    | DemonLord
    deriving (Eq, Show, Read, Enum, Bounded)

roleToText :: Role -> T.Text
roleToText = \case
    Hashira    -> "hashira"
    Kakushi    -> "kakushi"
    DemondLord -> "demon lord

Cool! What now? Well, we’ve only setup the required data types that represent a person. We haven’t implemented anything that can let person communicate with the database itself. We need something (function, instance, miracle etc) in order to convert a sequence of fields (that we will fetch from the DB) to our haskell data type. Similary, we’ll need something to convert our haskell data type to a sequnce of fields that sqlite can understand while inserting data into the person table. Fortunately, sqlite-simple offers FromRow and ToRow typeclasses which is build exactly for this purpose!

import Control.Applicative
import Database.SQLite.Simple.FromRow (FromRow (..))
import Database.SQLite.Simple.Internal (Field (..))
import Database.SQLite.Simple.ToRow (ToRow (..))

-- Why so many @field@ though?
-- According to the docs,
-- The number of calls to @field@ must match the number of fields returned
-- in a single row of the query result. Otherwise, a @ConversionFailed@
-- exception will be thrown.
instance FromRow Person where
    fromRow = Person <$> field <*> field <*> field <*> field <*> field

-- We ignore pUUID as it is set to auto increment at the DB layer and is already
-- managed by SQLite. Ideally, you should let UUIDs be handled at the DB layer
-- itself unless you have a very strong reason not to
instance ToRow Person where
    toRow (Person _pUUID pName pEmail pRole pNumber) =
        toRow (pName, pEmail, pRole, pNumber)

At this point, we should have an understanding of how data conversions are taking place between the code and database. Let’s write a function real quick to see if we are able insert and fetch a person or not.

import Database.SQLite.Simple (Connection (..), NamedParam (..), executeNamed_, query_)

createNewPerson :: Connection -> IO ()
createNewPerson conn = executeNamed conn [sql|
    INSERT INTO person
        (name, email, phone_number, role)
    VALUES
        (:name, :email, :phoneNumber, :userRole)
    |] [ ":name"        := ("Rengoku" :: T.Text)
       , ":phoneNumber" := ("+91234567899" :: T.Text)
       , ":email"       := ("rengoku@demonslayer.com" :: T.Text)
       , ":userRole"    := Hashira
       ]

getAllPerson :: Connection -> IO [Person]
getAllPerson conn = query_ conn [sql|
    SELECT id
         , name
         , email
         , role
         , phone_number
      FROM person
|]

getAllPerson is very straight forward. But what in the name of sweet type classes is going on inside createNewPerson !?

Let’s go step by step. There are 3 main components to the function:

Connection type in the type signature
Named Parameters ( Weird looking equalities)
Structure of the query itself

And this is going to be the case for most of the queries you write which needs some sort of parameter substitution. At least, I’d recommend writing queries using named parameters in order to keep things clean and much more readable. We could’ve written createNewPerson without named parameters like so:

import Database.SQLite.Simple (execute)

createNewPerson :: Connection -> IO ()
createNewPerson conn = execute conn [sql|
    INSERT INTO person
        (name, email, phone_number, role)
    VALUES
        (?, ?, ?, ?)
    |] ( "Rengoku" :: T.Text
       , "+91234567899" :: T.Text
       , "rengoku@demonslayer.com" :: T.Text
       , Hashira
       )

But the readability reduces as soon as the query grows and there are bunch of substitution happening. Also, there is a slight chance of messing up the order of the arguments (eg: type safety won’t save you if two consecute Text type substitutions are happening) leading to a corrupt DB state. Using named parameters remove these shortcomings.

Alright enough talk, let’s actually see if this works or not. We’ll write a main method and fire up the repl to test and spam the code.

import Database.SQLite.Simple (close, open)

flushPersonTable :: Connection -> IO ()
flushPersonTable conn =
    execute_ conn [sql|
        DROP TABLE IF EXISTS person;
    |]

-- Drops the person table and creates a fresh new one
-- This is done in order to play with fresh data every time we
-- run @main@ in the repl. Not that it's a rule.
resetDb :: Connection -> IO ()
resetDb conn = do
    flushPersonTable conn
    createPersonTable conn

main :: IO ()
main = do
  -- Open up a connection to the database and this will be passed to each query we run.
  -- In practice, we should maintain a pool of connections so that we can run SQl
  -- queries in parallel. Also, this practice enables you to use queries inside a
  -- DB transaction.
  conn <- open "test.db"
  resetDb conn
  createNewPerson conn
  rows <- getAllPerson conn
  mapM_ print rows
  close conn

Aw snap! We’ve run into errors. Out of all the things GHC is complaining, we focus ourselves on the following two first

 src/SqliteSimple.hs:54:56: error:
    • No instance for (FromField Role) arising from a use of ‘field’
    • In the second argument of ‘(<*>)’, namely ‘field’
      In the first argument of ‘(<*>)’, namely
        ‘Person <$> field <*> field <*> field <*> field’
      In the expression:
        Person <$> field <*> field <*> field <*> field <*> field
   |
54 |     fromRow = Person <$> field <*> field <*> field <*> field <*> field
   |                                                        ^^^^^

src/SqliteSimple.hs:58:56: error:
    • No instance for (ToField Role) arising from a use of ‘toRow’
    • In the expression: toRow (pName, pEmail, pRole, pNumber)
      In an equation for ‘toRow’:
          toRow (Person _pUUID pName pEmail pRole pNumber)
            = toRow (pName, pEmail, pRole, pNumber)
      In the instance declaration for ‘ToRow Person’
   |
58 |     toRow (Person _pUUID pName pEmail pRole pNumber) = toRow (pName, pEmail, pRole, pNumber)
   |                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, we defined a FromRow and ToRow instance when we wanted to convert a collection of Haskell types to and from a collection of SQL types. But what happens when we introduce a new data type which is not understood by GHC natively. In our case, when we defined Role, we were good as GHC knew how to interpret sum types but it doesn’t know the intentions behind it’s usage. In other words, how we want Role to intereact with other data types (internally and from other packages) is left entirely to us. We tell GHC our intentions by impementing different instances for it. Therefore, we’ll have to define ToField and FromField instances so that the compiler knows how to convert Role to an SQL data type. Also, these implementations are used by ToRow and FromRow. What we had earlier was just a step in the right direction but it was incomplete.

import Database.SQLite.Simple.FromField (Field (..), FromField (..), returnError)
import Database.SQLite.Simple.Ok (Ok (..))
import Database.SQLite.Simple.ToField (ToField (..))

-- This prepares a value for substitution into a query string.
instance ToField Role where
    toField = SQLText . roleToText

instance FromField Role where
    fromField (Field (SQLText "hashira") _) = Ok Hahsira
    fromField (Field (SQLText "kakushi") _) = Ok Kakushi
    fromField (Field (SQLText "demon lord") _) = Ok DemonLord
    fromField f = returnError ConversionFailed f "role does not exist"

One of the question from above can be regarding the definition of ToField and FromField instances of Role. How did we come up with that? Well, the secret ingredient here is reading the docs. According to the docs, The way ToField is defined for Text types are:

instance ToField T.Text where
    toField = SQLText
    {-# INLINE toField #-}

instance FromField T.Text where
    fromField (Field (SQLText txt) _) = Ok txt
    fromField f = returnError ConversionFailed f "need a text"

It is quite clear from the above on how we can define our own version of instance ToField Role given that we want a text type in the database. If we try reloading our REPL now and run main, we can see that we have a successful transaction between Haskell layer and DB layer:

*SqliteSimple> main
Person {pUUID = 1, pName = "Rengoku", pEmail = "rengoku@demonslayer.com", pRole = Hashira, pNumber = "+91234567899"}
*SqliteSimple>

This was a high level tutorial on how to use sqlite-simple (or any library like postgres-simple and mysql-simple) for that matter. Note that postgres-simple does not ship named parameters natively. postgres-simple-named would be a suitable choice for that as it is built on top of postgres-simple. Anyways, we’ve only seen how to play with very basic data types. Let’s get a bit more creative. We can play with –

timestamps
Joins
JSON structures

to see the usefulness and ease of writing SQL queries with haskell.

Let’s introduce a created_at value for a person. Let’s make the following changes,

import Data.Time (UTCTime (..))

createPersonTable :: Connection -> IO ()
createPersonTable conn = execute_ conn [sql|
    CREATE TABLE IF NOT EXISTS person
        ( id           INTEGER  PRIMARY KEY AUTOINCREMENT
        , name         TEXT     NOT NULL
        , email        TEXT     UNIQUE NOT NULL
        , phone_number TEXT     UNIQUE NOT NULL
        , role         TEXT     NOT NULL -- This could even be an enum type at the DB layer
        -- Changed --
        , created_at   IMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
        ------------
        )
    |]

data Person = Person
    { pUUID      :: !Integer
    , pName      :: !T.Text
    , pEmail     :: !T.Text
    , pRole      :: !Role
    , pNumber    :: !T.Text
    -- Changed --
    , pCreatedAt :: !UTCTime
    -------------
    } deriving stock (Eq, Show)

-- Now we need to handle the added field in our instances
-- Compiler will scream at you that certain fields are missing
instance FromRow Person where
    fromRow = Person <$> field <*> field <*> field <*> field <*> field <*> field

-- when inserting a new person, ignore UUID and created_at (Handled by SQLite).
instance ToRow Person where
    toRow (Person _pUUID pName pEmail pRole pNumber _pJoinedAt) =
        toRow (pName, pEmail, pRole, pNumber)

Now if we reload the REPL and run main, we get:

*SqliteSimple> :r
[1 of 1] Compiling SqliteSimple     ( src/SqliteSimple.hs, interpreted )
Ok, one module loaded.
*SqliteSimple> main
Person {pUUID = 1, pName = "Rengoku", pEmail = "rengoku@demonslayer.com", pRole = Hashira, pNumber = "+91234567899", pJoinedAt = 2021-10-25 07:21:54 UTC}
*SqliteSimple>

Notice the pJoinedAt at the end and the fact that we didn’t have to write a custom FromField instance for a timestamp value. That’s because sqlite-simple ships instance FromField UTCTime which gets imported while importing FromField (..). This reasoning holds true even for other Text and Int types.

Now let’s create another entity so we can experiment with SQL joins.

-- A data type that represent the favourite anime and character of a user
data FavouriteAnime = FavouriteAnime
    { faUUID               :: !Integer
    , faAnimeName          :: !T.Text
    , faFavouriteCharacter :: !T.Text
    , faUserName           :: !T.Text
    } deriving (Eq, Show)

instance FromRow FavouriteAnime where
    fromRow = FavouriteAnime <$> field <*> field <*> field <*> field

createAnimeTable :: Connection -> IO ()
createAnimeTable conn = execute_ conn [sql|
    CREATE TABLE IF NOT EXISTS favourite_anime
        ( id                  INTEGER PRIMARY KEY AUTOINCREMENT
        , anime_name          TEXT    NOT NULL
        , favourite_character TEXT    UNIQUE
        , person_id           INTEGER NOT NULL

        -- We reference the person_id but we need to fetch the
        -- person name as per our Haskell data type
        , FOREIGN KEY(person_id) REFERENCES person(id)
        )
    |]

insertFavouriteAnime :: Connection -> IO ()
insertFavouriteAnime conn = executeNamed conn [sql|
    INSERT INTO favourite_anime
        (anime_name, favourite_character, person_id)
    VALUES
        (:animeName, :favouriteCharacter, :personId)
    |] [ ":animeName"          := ("Attack on titans" :: T.Text)
       , ":favouriteCharacter" := ("Levi Ackerman" :: T.Text)
       , ":personId"           := (1 :: Int)
       ]

-- This example demonstrates just one LEFT JOIN for simplicity purposes.
-- But, if you can write a plain SQL query in SQLite, you can write it here.
-- So it can be as complicated as you want it to be.
-- RECOMMENDED: Keep things simple.
getAllFavAnime :: Connection -> IO [FavouriteAnime]
getAllFavAnime conn = query_ conn [sql|
    SELECT fa.id
         , fa.anime_name
         , fa.favourite_character
         , p.name
    FROM favourite_anime fa
    LEFT JOIN person p ON p.id = fa.person_id
    |]

flushAnimeTable :: Connection -> IO ()
flushAnimeTable conn =
    execute_ conn [sql|
        DROP TABLE IF EXISTS favourite_anime;
    |]

-- And the corresponding changes to @resetDb@ and @main@
resetDb :: Connection -> IO ()
resetDb conn = do
    -- drop all the tables
    flushPersonTable conn
    flushAnimeTable conn

    -- create fresh ones
    createPersonTable conn
    createAnimeTable conn

main :: IO ()
main = do
    conn <- open "test.db"
    resetDb conn

    createNewPerson conn
    insertFavouriteAnime conn

    pRows <- getAllPerson conn
    faRows <- getAllFavAnime conn
    mapM_ print pRows
    mapM_ print faRows
    close conn

Let’s reload the REPL and run the main function,

*SqliteSimple> :r
[1 of 1] Compiling SqliteSimple     ( src/SqliteSimple.hs, interpreted )
Ok, one module loaded.
*SqliteSimple> main
Person {pUUID = 1, pName = "Rengoku", pEmail = "rengoku@demonslayer.com", pRole = Hashira, pNumber = "+91234567899", pJoinedAt = 2021-10-25 10:46:47 UTC}
FavouriteAnime {faUUID = 1, faAnimeName = "Attack on titans", faFavouriteCharacter = "Levi Ackerman", faUserName = "Rengoku"}

Just as expected!! These are just some basic stuff I wanted to write about. These can surely be extended to do a lot of complicated stuff as well if it fits the relational data model. So feel free to experiment as much as you want.

One concern now is that these queries are unsafe in nature. These will result into runtime errors and we want to avoid that. One of the ways of doing that is to wrap the queries provided by sqlite-simple into some Maybe or Either types to catch these kind of errors when running tests.

p.s. Always write tests.

← → Top