Running integration tests with docker-compose in Google Cloud Build

This post is a direct follow-up to Microservices test architecture where I’ve introduced new kinds of tests to our example project.

Wild Workouts uses Google Cloud Build as CI/CD platform. It’s configured in a continuous deployment manner, meaning the changes land on production as soon as the pipeline passes. If you consider our current setup, it’s both brave and naive. We have no tests running there that could save us from obvious mistakes (the not-so-obvious mistakes can rarely be caught by tests, anyway).

In this article I will show how to run integration, component, and end-to-end tests on Google Cloud Build using docker-compose.


This is not just another article with random code snippets.

This post is part of a bigger series where we show how to build Go applications that are easy to develop, maintain, and fun to work with in the long term. We are doing it by sharing proven techniques based on many experiments we did with teams we lead and scientific research.

You can learn these patterns by building with us a fully functional example Go web application – Wild Workouts.

We did one thing differently – we included some subtle issues to the initial Wild Workouts implementation. Have we lost our minds to do that? Not yet. 😉 These issues are common for many Go projects. In the long term, these small issues become critical and stop adding new features.

It’s one of the essential skills of a senior or lead developer; you always need to keep long-term implications in mind.

We will fix them by refactoring Wild Workouts. In that way, you will quickly understand the techniques we share.

Do you know that feeling after reading an article about some technique and trying implement it only to be blocked by some issues skipped in the guide? Cutting these details makes articles shorter and increases page views, but this is not our goal. Our goal is to create content that provides enough know-how to apply presented techniques. If you did not read previous articles from the series yet, we highly recommend doing that.

We believe that in some areas, there are no shortcuts. If you want to build complex applications in a fast and efficient way, you need to spend some time learning that. If it was simple, we wouldn’t have large amounts of scary legacy code.

Here’s the full list of 14 articles released so far.

The full source code of Wild Workouts is available on GitHub. Don’t forget to leave a star for our project! ⭐

The current config

Let’s take a look at the current cloudbuild.yaml file. While it’s pretty simple, most steps are being run several times as we keep 3 microservices in a single repository. I focus on the backend part, so I will skip all config related to frontend deployment now.

  - id: trainer-lint
    name: golang
    entrypoint: ./scripts/
    args: [trainer]
  - id: trainings-lint
    name: golang
    entrypoint: ./scripts/
    args: [trainings]
  - id: users-lint
    name: golang
    entrypoint: ./scripts/
    args: [users]

  - id: trainer-docker
    entrypoint: ./scripts/
    args: ["trainer", "$PROJECT_ID"]
    waitFor: [trainer-lint]
  - id: trainings-docker
    entrypoint: ./scripts/
    args: ["trainings", "$PROJECT_ID"]
    waitFor: [trainings-lint]
  - id: users-docker
    entrypoint: ./scripts/
    args: ["users", "$PROJECT_ID"]
    waitFor: [users-lint]

  - id: trainer-http-deploy
    entrypoint: ./scripts/
    args: [trainer, http, "$PROJECT_ID"]
    waitFor: [trainer-docker]
  - id: trainer-grpc-deploy
    entrypoint: ./scripts/
    args: [trainer, grpc, "$PROJECT_ID"]
    waitFor: [trainer-docker]
  - id: trainings-http-deploy
    entrypoint: ./scripts/
    args: [trainings, http, "$PROJECT_ID"]
    waitFor: [trainings-docker]
  - id: users-http-deploy
    entrypoint: ./scripts/
    args: [users, http, "$PROJECT_ID"]
    waitFor: [users-docker]
  - id: users-grpc-deploy
    entrypoint: ./scripts/
    args: [users, grpc, "$PROJECT_ID"]
    waitFor: [users-docker]

Notice the waitFor key. It makes a step wait only for other specified steps. Some jobs can run in parallel this way.

Here’s a more readable version of what’s going on:

We have a similar workflow for each service: lint (static analysis), build the Docker image, and deploy it as one or two Cloud Run services.

Since our test suite is ready and works locally, we need to figure out how to plug it in the pipeline.

Docker Compose

We already have one docker-compose definition, and I would like to keep it this way. We will use it for:

  • running the application locally,
  • running tests locally,
  • running tests in the CI.

These three targets have different needs. For example, when running the application locally, we want to have hot code reloading. But that’s pointless in the CI. On the other hand, we can’t expose ports on localhost in the CI, which is the easiest way to reach the application in the local environment.

Luckily docker-compose is flexible enough to support all of these use cases. We will use a base docker-compose.yml file and an additional file with overrides just for the CI. You can run it by passing both files using the -f flag (notice there’s one flag for each file). Keys from the files will be merged in the provided order.

docker-compose -f docker-compose.yml -f up -d


Typically, docker-compose looks for the docker-compose.yml file in the current directory or parent directories. Using the -f flag disables this behavior, so only specified files are parsed.

To run it on Cloud Build, we can use the docker/compose image.

- id: docker-compose
  name: 'docker/compose:1.19.0'
  args: ['-f', 'docker-compose.yml', '-f', '', 'up', '-d']
  waitFor: [trainer-docker, trainings-docker, users-docker]

Since we filled waitFor with proper step names, we can be sure the correct images are present. This is what we’ve just added:

The first override we add to makes each service use docker images by the tag instead of building one from docker/app/Dockerfile. This ensures our tests check the same images we’re going to deploy.

Note the ${PROJECT_ID} variable in the image keys. This needs to be the production project, so we can’t hardcode it in the repository. Cloud Build provides this variable in each step, so we just pass it to the docker-compose up command (see the definition above).

    image: "${PROJECT_ID}/trainer"

    image: "${PROJECT_ID}/trainer"

    image: "${PROJECT_ID}/trainings"

    image: "${PROJECT_ID}/users"

    image: "${PROJECT_ID}/users"


Many CI systems use Docker today, typically running each step inside a container with the chosen image. Using docker-compose in a CI is a bit trickier, as it usually means running Docker containers from within a Docker container.

On Google Cloud Build, all containers live inside the cloudbuild network. Simply adding this network as the default one for our is enough for CI steps to connect to the docker-compose services.

Here’s the second part of our override file:

      name: cloudbuild

Environment variables

Using environment variables as configuration seems simple at first, but it quickly becomes complex considering how many scenarios we need to handle. Let’s try to list all of them:

  • running the application locally,
  • running component tests locally,
  • running component tests in the CI,
  • running end-to-end tests locally,
  • running end-to-end tests in the CI.

I didn’t include running the application on production, as it doesn’t use docker-compose.

Why component and end-to-end tests are separate scenarios? The former spin up services on demand and the latter communicate with services already running within docker-compose. It means both types will use different endpoints to reach the services.


For more details on component and end-to-end tests see the previous article. The TL;DR version is: we focus coverage on component tests, which don't include external services. End-to-end tests are there just to confirm the contract is not broken on a very high level and only for the most critical path. This is the key to decoupled services.

We already keep a base .env file that holds most variables. It’s passed to each service in the docker-compose definition.

Additionally, docker-compose loads this file automatically when it finds it in the working directory. Thanks to this, we can use the variables inside the yaml definition as well.

      context: docker/app
      # The $PORT variable comes from the .env file
      - "$PORT"
      # All variables from .env are passed to the service
      - .env
    # (part of the definition omitted)

We also need these variables loaded when running tests. That’s pretty easy to do in bash:

source .env
# exactly the same thing
. .env

However, the variables in our .env file have no export prefix, so they won’t be passed further to the applications running in the shell. We can’t use the prefix because it wouldn’t be compatible with the syntax docker-compose expects.

Additionally, we can’t use a single file with all scenarios. We need to have some variable overrides, just like we did with the docker-compose definition. My idea is to keep one additional file for each scenario. It will be loaded together with the base .env file.

Let’s see what’s the difference between all scenarios. For clarity, I’ve included only users-http, but the idea will apply to all services.

ScenarioMySQL hostFirestore hostusers-http addressFile
Running locallylocalhostlocalhostlocalhost:3002.env
Local component testslocalhostlocalhostlocalhost:5002.test.env
CI component
Local end-to-end tests--localhost:3002.e2e.env
CI end-to-end

Services ran by docker-compose use ports 3000+, and component tests start services on ports 5000+. This way, both instances can run at the same time.

I’ve created a bash script that reads the variables and runs tests. Please don’t try to define such a complex scenario directly in the Makefile. Make is terrible at managing environment variables. Your mental health is at stake.

The other reason for creating a dedicated script is that we keep 3 services in one repository and end-to-end tests in a separate directory. If I need to run the same command multiple times, I prefer calling a script with two variables rather than a long incantation of flags and arguments.

The third argument in favor of separate bash scripts is they can be linted with shellcheck.

set -e

readonly service="$1"
readonly env_file="$2"

cd "./internal/$service"
env $(cat "../../.env" "../../$env_file" | grep -Ev '^#' | xargs) go test -count=1 ./...

The script runs go test in the given directory with environment variables loaded from .env and the specified file. The env / xargs trick passes all variables to the following command. Notice how we remove comments from the file with grep.


Testing cache

go test caches successful results, as long as the related files are not modified.

With tests that use Docker, it may happen that you will change something on the infrastructure level, like the docker-compose definition or some environment variables. go test won’t detect this, and you can mistake a cached test for a successful one.

It’s easy to get confused over this, and since our tests are fast anyway, we can disable the cache. The -count=1 flag is an idiomatic (although not obvious) way to do it.

Running tests

I have the end-to-end tests running after tests for all the services passed. It should resemble the way you would usually run them. Remember, end-to-end tests should work just as a double-check, and each service’s own tests should have the most coverage.

Because our end-to-end tests are small in scope, we can run them before deploying the services. If they were running for a long time, this could block our deployments. A better idea in this scenario would be to rely on each service’s component tests and run the end-to-end suite in parallel.

- id: trainer-tests
  name: golang
  entrypoint: ./scripts/
  args: ["trainer", ""]
  waitFor: [docker-compose]
- id: trainings-tests
  name: golang
  entrypoint: ./scripts/
  args: ["trainings", ""]
  waitFor: [docker-compose]
- id: users-tests
  name: golang
  entrypoint: ./scripts/
  args: ["users", ""]
  waitFor: [docker-compose]
- id: e2e-tests
  name: golang
  entrypoint: ./scripts/
  args: ["common", ""]
  waitFor: [trainer-tests, trainings-tests, users-tests]

The last thing we add is running docker-compose down after all tests have passed. This is just a cleanup.

- id: docker-compose-down
  name: 'docker/compose:1.19.0'
  args: ['-f', 'docker-compose.yml', '-f', '', 'down']
  waitFor: [e2e-tests]

The second part of our pipeline looks like this now:

Here’s how running the tests locally looks like (I’ve introduced this make target in the previous article). It’s exactly the same commands as in the CI, just with different .env files.

    @./scripts/ common .e2e.env
    @./scripts/ trainer .test.env
    @./scripts/ trainings .test.env
    @./scripts/ users .test.env

Separating tests

Looking at the table from the previous article, we could split tests into two groups, depending on whether they use Docker.

Docker databaseNoYesYesYes

Unit tests are the only category not using a Docker database, while integration, component, and end-to-end tests do.

Even though we made all our tests fast and stable, setting up Docker infrastructure adds some overhead. It’s helpful to run all unit tests separately, to have a first guard against mistakes.

We could use build tags to separate tests that don’t use Docker. You can define the build tags in the first line of a file.

// +build docker

We could now run unit tests separately from all tests. For example, the command below would run only tests that need Docker services:

go test -tags=docker ./...

Another way to separate tests is using the -short flag and checking testing.Short() in each test case.

In the end, I decided to not introduce this separation. We made our tests stable and fast enough that running them all at once is not an issue. However, our project is pretty small, and the test suite covers just the critical paths. When it grows, it might be a good idea to introduce the build tags in component tests.

Digression: A short story about CI debugging

While I was introducing changes for this post, the initial test runs on Cloud Build kept failing. According to logs, tests couldn’t reach services from docker-compose.

So I started debugging and added a simple bash script that would connect to the services via telnet.

To my surprise, connecting to mysql:3306 worked correctly, but firestore:8787 didn’t, and the same for all Wild Workouts services.

I thought this is because docker-compose takes a long time to start, but any number of retries didn’t help. Finally, I decided to try something crazy, and I’ve set up a reverse SSH tunnel from one of the containers in docker-compose.

This allowed me to SSH inside one of the containers while the build was still running. I then tried using telnet and curl, and they worked correctly for all services.

Finally, I spotted a bug in the bash script I’ve used.

readonly host="$1"
readonly port="$1"

# (some retries code)

telnet "$host" "$port"

The typo in variables definition caused the telnet command to run like this: telnet $host $host. So why it worked for MySQL? It turns out, telnet recognizes ports defined in /etc/services. So telnet mysql mysql got translated to telnet mysql 3306 and worked fine, but it failed for any other service.

Why the tests failed though? Well, it turned out to be a totally different reason.

Originally, we connected to MySQL like this:

config := mysql.Config{
   Addr:      os.Getenv("MYSQL_ADDR"),
   User:      os.Getenv("MYSQL_USER"),
   Passwd:    os.Getenv("MYSQL_PASSWORD"),
   DBName:    os.Getenv("MYSQL_DATABASE"),
   ParseTime: true, // with that parameter, we can use time.Time in mysqlHour.Hour

db, err := sqlx.Connect("mysql", config.FormatDSN())
if err != nil {
    return nil, errors.Wrap(err, "cannot connect to MySQL")

I looked into environment variables, and all of them were filled correctly. After adding some fmt.Println() debugs, I’ve found out the config’s Addr part is completely ignored by the MySQL client because we didn’t specify the Net field. Why it worked for local tests? Because MySQL was exposed on localhost, which is the default address.

The other test failed to connect to one of Wild Workouts services, and it turned out to be because I’ve used incorrect port in the .env file.

Why I’m sharing this at all? I think it’s a great example of how working with CI systems can often look like. When multiple things can fail, it’s easy to come to wrong conclusions and dig deep in the wrong direction.

When in doubt, I like to reach for basic tools for investigating Linux issues, like strace, curl, or telnet. That’s also why I did the reverse SSH tunnel, and I’m glad I did because it seems like a great way to debug issues inside the CI. I feel I will use it again sometime. 😄


We managed to keep a single docker-compose definition for running tests locally and in the pipeline. The entire Cloud Build run from git push to production takes 4 minutes.

We’ve used a few clever hacks, but that’s just regular stuff when dealing with CI. Sometimes you just can’t avoid adding some bash magic to make things work.

In contrast to the domain code, hacks in your CI setup shouldn’t hurt you too much, as long as there are only a few of them. Just make sure it’s easy to understand what’s going on, so you’re not the only person with all the knowledge.

With this post, we wrap up the first refactoring session of Wild Workouts. We’re going to move on to strategic patterns now. Thanks for reading, and see you soon! 👋
Last update:
  • April 1, 2021
comments powered by Disqus