Database Setup So Easy, Your Cat Could Do It: Docker and Flyway Edition

https://easy-peasy.ai/ai-image-generator/images/curious-black-persian-cat-exploring-python-coding

Image source

Alright, folks, unless you’re one of those rare people who own a genius cat that can code (and if you are, we need to talk), setting up a local database might seem like a daunting task. Fear not! With Docker and Flyway, it’s so straightforward that even your cat could do it — well, theoretically. So let’s dive into it!

The need

If an application is using a database for persistence, then it will need one which it can connect to locally, in order to run itself or its (integration) tests. The question is, what’s a convenient and efficient way to set a database up like that?

Ideally we would have a database setup which:

  • is only used locally
  • has the same schema and data every time
  • can be built up and broken down whenever we want
  • is easy to re-create every time

Let’s take a closer look at these statements:

Only used locally

It is important that the tasks we perform in local development do not affect our other environments (like staging or production). Data of each environment should only come from that environment to avoid pollution and potential confusion.

Has the same schema and data every time

The local database needs to be a reliable representation of our real database. The code expects a certain state and we need to guarantee it will find that state every time our database is created. Otherwise we can have anything from compilation failures to broken tests.

Can be built up and broken down whenever you want

The more control we have over this, the cooler the things we can do. How nice would it be if we could easily fire up the setup before a build and then break it down? And how nicer would it be if that was automatically happening by simply running the build?

Easy to re-create every time

The easier it is to re-create, the more likely we are to use it. I’m sure many of us have the experience of avoiding to run that terrible app locally because it’s just too much hassle.

Now, if only there was a setup that could guarantee all of the above…

https://easy-peasy.ai/ai-image-generator/images/cat-flying-cape-image

Image source

Flyway and Docker to the rescue!

Fortunately, we live at a time when tools such as these exist, that can help us in our valiant efforts to create a convenient local database setup.

Docker will help us set up a container that will host our database without hassle.

Flyway is going to make sure that every time we need to use our database, it will have the state we need it to.

Runthrough of the example project

I have created an example project using Kotlin-SpringBoot, with a simple database layer and an integration test for it.

We have a straightforward Person class

package com.example.lps.domain

import java.util.UUID

data class Person(
    val id: UUID,
    val name: String,
    val age: Int
)

and a Person repository with just two methods: insert and findById

package com.example.lps.repo

import com.example.lps.domain.*
import org.springframework.jdbc.core.*
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate
import org.springframework.stereotype.Repository
import org.springframework.transaction.annotation.*
import java.sql.*
import java.util.*

@Repository
class PersonRepository(
    private val jdbcTemplate: NamedParameterJdbcTemplate
) {

    fun insertPerson(person: Person) {
        val sql = "INSERT INTO Person (id, name, age) VALUES (:id, :name, :age)"
        jdbcTemplate.update(
            sql,
            MapSqlParameterSource(
                mapOf(
                    "id" to person.id,
                    "name" to person.name,
                    "age" to person.age
                )
            )
        )
    }

    fun findPersonById(id: UUID): Person? {
        val sql = "SELECT * FROM Person WHERE id = :id"
        return jdbcTemplate.query(sql, mapOf("id" to id), PersonRowMapper).firstOrNull()
    }

    private object PersonRowMapper : RowMapper<Person> {
        override fun mapRow(resultSet: ResultSet, rowNum: Int): Person = with(resultSet) {
            Person(
                id = UUID.fromString(getString("id")),
                name = getString("name"),
                age = getInt("age")
            )
        }
    }
}

And then, we have the integration test which will attempt to insert a Person by a certain id and then find it:

package com.example.lps.repo

import com.example.lps.domain.*
import org.junit.jupiter.api.Test
import org.springframework.beans.factory.annotation.*
import org.springframework.boot.test.context.*
import org.springframework.transaction.annotation.*
import java.util.*
import kotlin.test.*

@SpringBootTest
class PersonRepositoryIT {

    @Autowired
    private lateinit var personRepository: PersonRepository

    @Test
    @Transactional // using transactional so that the data is rolled back after the test is done
    fun `insert and find by id`() {
        val id = UUID.randomUUID()
        val person = Person(id, "John", 35)
        personRepository.insertPerson(person)

        val result = personRepository.findPersonById(id)
        assertEquals(person, result)
    }
}

Since this is a SpringBootTest, it will need to start up the app and create a real connection to a database based on our configuration.

Let’s see how we can configure that database locally.

Configuring the docker-maven plugin

For this guide, we will use the docker-maven plugin. The plugin (code part below) acts a bit like a docker-compose file which allows you to select the image for your database (1) , define a database name (2), method for authentication (3) and the port which the container will expose for the database (4).

<plugin>
    <groupId>io.fabric8</groupId>
    <artifactId>docker-maven-plugin</artifactId>
    <configuration>
     <images>
      <image>
       <name>postgres</name> (1)
       <run>
        <env>
         <POSTGRES_DB>postgres</POSTGRES_DB> (2)
         <POSTGRES_USER>my_user</POSTGRES_USER> (3)
         <POSTGRES_HOST_AUTH_METHOD>trust</POSTGRES_HOST_AUTH_METHOD> (3)
        </env>
        <ports>
         <port>5432:5432</port> (4)
        </ports>
        <wait>
         <log>database system is ready to accept connections</log>
        </wait>
       </run>
      </image>
     </images>
    </configuration>
    <executions>
     <execution>
      <id>start</id> (5)
      <phase>pre-integration-test</phase>
      <goals>
       <goal>stop</goal>
       <goal>start</goal>
      </goals>
     </execution>
     <execution>
      <id>stop</id> (6)
      <phase>post-integration-test</phase>
      <goals>
       <goal>stop</goal>
      </goals>
     </execution>
    </executions>
   </plugin>

The above configuration makes it so that when we trigger a maven build that includes an integration step (like verify or validate), then a container will be created before the tests (5), the tests will run, and then the container will be destroyed. (6)

It’s handy to also include a docker-compose file in case we want to connect to the database locally ourselves. The equivalent docker-compose file is included in the project and looks like this:

services:
  postgres:
    image: postgres
    restart: always
    environment:
      POSTGRES_USER: my_user
      POSTGRES_HOST_AUTH_METHOD: trust
      POSTGRES_DB: postgres
    ports:
      - "5432:5432"

I’ll quickly mention that since we are using SpringBoot, the app needs to have the following configuration in the application.yml so that it can connect to the database we are firing up:

spring:
  datasource:
    url: jdbc:postgresql://localhost:${pg.port:5432}/postgres
    username: my_user
    driver-class-name: org.postgresql.Driver

Configuring Flyway

Now we have the container set up, but we still need to make sure that every time it gets created, the database has the same schema and data. This is where we utilise Flyway migrations.

In a nutshell, Flyway works like this:

  • We define migration scripts which will run on the database one by one and in order
  • We have a flyway migrations table in our database that keeps track of the migration scripts that have already ran
  • When a migration is triggered, Flyway will check if the migration scripts in our code are matching the ones its aware of in the Flyway table.
  • If the migrations match, it will do nothing (up to date!)
  • If they don’t (and we have not messed up with the order or made changes to files already ran), it will run the new scripts in order

To achieve the above in our setup, we need to have the following configuration. First of, the Flyway dependency itself.

<dependency>
   <groupId>org.flywaydb</groupId>
   <artifactId>flyway-core</artifactId>
   <version>${flyway.version}</version>
  </dependency>
  <dependency>
   <groupId>org.flywaydb</groupId>
   <artifactId>flyway-database-postgresql</artifactId>
   <version>${flyway.version}</version>
   <scope>runtime</scope>
  </dependency>

We then need to define a table for Flyway to use to keep track of the migrations (1). Also, we need to tell Flyway to start a migration every time the app boots up (2). Both of these are done in our application.yml :

spring:
  datasource:
    url: jdbc:postgresql://localhost:${pg.port:5432}/postgres
    username: my_user
    driver-class-name: org.postgresql.Driver
  flyway:
    table: schema_version (1)
    enabled: true (2)

Lastly, we need to place migration files in the source/main/resources/db/migrations directory in an ascending manner.

Picture2

Each file name needs to start with an indication of the file order (could be date, could be versions, could be both or anything that makes sense).

We have two simple migration scripts:

  • One that creates the person table:
create table person
(
    id                    UUID       not null primary key,
    name                  text,
    age                   integer
);
  • And one that populates it with some initial data (not really necessary):
INSERT INTO person (id, name, age) VALUES (gen_random_uuid(), 'John Doe', 30);
INSERT INTO person (id, name, age) VALUES (gen_random_uuid(), 'Jane Smith', 25);
INSERT INTO person (id, name, age) VALUES (gen_random_uuid(), 'Alice Johnson', 28);
INSERT INTO person (id, name, age) VALUES (gen_random_uuid(), 'Bob Brown', 35);
INSERT INTO person (id, name, age) VALUES (gen_random_uuid(), 'Charlie Davis', 22);

Once the migrations are ran, we will see the schema_version table get populated:

Picture3

With this configuration, we should have achieved that every time the app starts (manually or in integration tests during builds), Flyway will make sure our database has the right schema and data by running all the migration files in order.

Now let’s see this all in action!

The result

First, let’s run the build with something that will trigger the integration tests:

mvn clean verify

We can take a look at some logs to see the effect of our setup:

Picture4

As we can see, before the ITs are ran, a container starts up to host our postgres database:

Picture5

Since for ITs the app will boot up, Flyway gets triggered and runs the migrations:

Picture6

After the tests are done, the container is brought down.

Success!

Final Words

Thank you very much for reading this article through to the end. I hope you learned something new and/or have been triggered to try this setup out for yourself.

Special thanks to Eryk Lewinson for inspiring me to write this and for reviewing it.

As a next step, I will look into test containers and share my findings, so stay tuned!