Automatic Semantic Versioning in GitLab CI

In the previous post I showed how to keep all the scripts used in the CI in one repository. Let’s see what more advanced scripts you could put in there.

This time I’d like to show how to add automatic versioning to your pipeline. You will also see how to push commits to your repository within the CI jobs. But first, let’s start with some background.

Picking your flow

One of the things I love about GitLab is its flexibility for setting up your own CI workflow. Whether you’re doing one release each week or every other month, following classic gitflow, or practicing continuous deployment, it’s just a matter of configuration.

As a proponent of continuous delivery principles, I usually stick to a protected master branch and short-lived feature branches that are merged after code review. Each commit on master is automatically deployed to some sort of staging environment (preferably identical to prod) and can then be manually promoted to production. If you have a solid suite of multi-layer tests, this workflow will let you deploy multiple times a day.

Versioning

Whatever flow you pick, versioning of your application becomes essential in many steps of the pipeline. For example, it is crucial to know precisely which version you should roll back to if something goes wrong. You should always be able to find the commit for a given version number quickly.

The versioning system is up to you and will depend on your workflow. I usually stick with Semantic Versioning, as it feels natural and is already a standard for many projects, especially libraries. So the first commit on the master branch would get version 1.0.0, the next one 1.0.1, then 1.0.2, and so on (alternatively, you can start with 0.0.1 and treat 1.0.0 as the first public release).

Where to keep the versions? Some good practices from my experience:

Don’t keep it in the code or a file committed to the repository.
Git tags are a perfect fit for keeping versions tied to commits.
Automate it. Don’t waste developers’ time wondering which version number to pick.

So basically, you want to have some automated version tagging wired into your CI system.

Implementation

The essential piece handling versioning is a script that runs in every pipeline on the master branch. This script should take the most recent version, bump it, add a tag and push it to the remote repository.

This time I’m going to use Python and the python-semver library. If you’d prefer to stick with bash, take a look at the semver-tool.

Let’s see what the script should do.

1. Pick the most recent version and bump it.

This is simple enough. You can extract the most recent tag by running git describe --tags. The semver library can then be used to bump the version.

Our main function might look something like this (full source):

def main():
    try:
        latest = git("describe", "--tags").decode().strip()
    except subprocess.CalledProcessError:
        # No tags in the repository
        version = "1.0.0"
    else:
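        # Skip already tagged commits: git describe prints the bare tag when
        # HEAD is tagged and "<tag>-<n>-g<sha>" otherwise, so no dash means no bump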
        if '-' not in latest:
            print(latest)
            return 0

        version = bump(latest)

    tag_repo(version)
    print(version)

    return 0
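
The git helper called in main is not shown here. Below is a minimal sketch of what it might look like, assuming it simply wraps subprocess.check_output (that would match the .decode() call and the CalledProcessError handling above). It also includes a hypothetical parse_describe function that makes the "already tagged" check more explicit than looking for a dash, which matters if your tags can themselves contain dashes (for example 1.0.0-rc1).

```python
import re
import subprocess

# Output of "git describe --tags" for an untagged commit: <tag>-<n>-g<sha>
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def git(*args):
    """Run a git command and return its raw stdout as bytes (assumed helper shape)."""
    return subprocess.check_output(["git"] + list(args))


def parse_describe(described):
    """Split describe output into (tag, commits_since_tag).

    commits_since_tag is 0 when the commit itself carries the tag.
    """
    match = DESCRIBE_RE.match(described)
    if match is None:
        return described, 0
    return match.group("tag"), int(match.group("count"))
```

With this, main could skip the bump whenever the second element of the result is 0, instead of relying on the presence of a dash.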

There is one catch, though - how will you decide whether to bump patch, minor or major version?

def bump(latest):
    # TODO decide what to bump
    # Then use bump_patch, bump_minor or bump_major
    return semver.bump_patch(latest)

You will have to mark it in the commit somehow. I won’t implement this in the example to keep it simple, but here are some ideas that could work:

Make bumping patch the default behavior, as it is the most common operation.
An intention to bump minor or major version can be marked in the commit message, e.g., by using some phrase like #minor or bump-minor.
Similarly, you could attach labels to the Merge Requests called bump-minor and bump-major.
You could come up with a script that detects whether any breaking changes were introduced or new functionalities added.
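
The first two ideas can be sketched as follows. This is only an illustration under a few assumptions: the markers are spelled #minor and #major, the commit message is passed in by the caller (for instance from git log -1 --pretty=%B), and the version arithmetic is done by hand so the snippet has no dependencies. In the real script, the bump_patch, bump_minor and bump_major functions mentioned above could replace the manual arithmetic.

```python
import re


def pick_bump(commit_message):
    """Return 'major', 'minor' or 'patch' based on markers in the commit message."""
    if re.search(r"#major\b", commit_message):
        return "major"
    if re.search(r"#minor\b", commit_message):
        return "minor"
    # Patch is the default, as it is the most common operation
    return "patch"


def bump(latest, commit_message):
    """Bump a 'git describe' style version such as '1.0.2-3-gabcdef1'."""
    # Keep only MAJOR.MINOR.PATCH and drop the "-<n>-g<sha>" suffix
    core = latest.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))

    part = pick_bump(commit_message)
    if part == "major":
        return "{}.0.0".format(major + 1)
    if part == "minor":
        return "{}.{}.0".format(major, minor + 1)
    return "{}.{}.{}".format(major, minor, patch + 1)
```

For example, bump("1.0.2-3-gabcdef1", "Add export endpoint #minor") returns "1.1.0", while a message without any marker yields "1.0.3".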

2. Add a new tag and push it to the remote repository.

Authentication

This step requires the CI job to have write access to the repository. Sadly, as of now, GitLab still doesn’t support pushing changes back to the repository out of the box. Deploy tokens shown in the previous post can’t be used here, since they grant read-only access. So we’re left with Deploy Keys.

First, generate a new key on your local machine (no passphrase):

ssh-keygen -t rsa -b 4096

Add the public part as a new Deploy Key in the Settings -> Repository section. Make sure to check the “Write access allowed” option.

Add the private part as a new Variable in the CI/CD section. The name is up to you; I’ll stick with SSH_PRIVATE_KEY.

After you’ve saved the keys in GitLab, it is a good idea to delete the private key file (or better yet, shred it).

All that’s left is to add the SSH key to the CI definition. There are several ways to do it; one of them looks like this (replace gitlab.com with your hostname if you’re using self-hosted GitLab):

script:
  - mkdir -p ~/.ssh && chmod 700 ~/.ssh
  - ssh-keyscan gitlab.com >> ~/.ssh/known_hosts && chmod 644 ~/.ssh/known_hosts
  - eval $(ssh-agent -s)
  - ssh-add <(echo "$SSH_PRIVATE_KEY")

Pushing the tag

A simple git push won’t work right away, as the repository is cloned using HTTPS, not SSH. The simplest solution is to transform the remote push URL by using a regex.

def tag_repo(tag):
    url = os.environ["CI_REPOSITORY_URL"]

    # Transforms the repository URL to the SSH URL
    # Example input: https://gitlab-ci-token:xxxxxxxxxxxxxxxxxxxx@gitlab.com/threedotslabs/ci-examples.git
    # Example output: git@gitlab.com:threedotslabs/ci-examples.git
    push_url = re.sub(r'.+@([^/]+)/', r'git@\1:', url)

    git("remote", "set-url", "--push", "origin", push_url)
    git("tag", tag)
    git("push", "origin", tag)
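
To see what the regex does in isolation, here is the same substitution wrapped in a small function (to_ssh_url is a name made up for this example). Note that it assumes a host without an explicit port; a URL like https://...@host:8443/group/repo.git would need different handling.

```python
import re


def to_ssh_url(https_url):
    """Turn the HTTPS clone URL provided by the CI into an SSH push URL."""
    # Everything up to and including "@" is dropped; the host is kept and
    # the first "/" after it becomes ":".
    return re.sub(r".+@([^/]+)/", r"git@\1:", https_url)


example = "https://gitlab-ci-token:xxxxxxxxxxxxxxxxxxxx@gitlab.com/threedotslabs/ci-examples.git"
print(to_ssh_url(example))
```

Nested groups are handled as well, since only the first slash after the host is replaced.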

3. Pass information about the version to the next build steps.

It’s very likely that more than one job in your pipeline will need to know the generated version. The idiomatic way to achieve this is to pass a file with the version as an artifact.

version:
  image: python:3.7-stretch
  stage: version
  script:
    - pip install semver
    - $SCRIPTS_DIR/common/gen-semver > version
  artifacts:
    paths:
      - version
  only:
    - branches

build:
  image: golang:1.11
  stage: build
  script:
    - export VERSION="unknown"
    - "[ -f ./version ] && export VERSION=$(cat ./version)"
    - $SCRIPTS_DIR/golang/build-semver . example-server main.Version "$VERSION"
  artifacts:
    paths:
      - bin/
  only:
    - branches

The next steps can now read the ./version file. You can also make it more convenient with a one-liner placed in before_script:

before_script:
  - "[ -f ./version ] && export VERSION=$(cat ./version)"
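
If a later job runs a Python script rather than shell commands, the same fallback logic can live in the script itself. A minimal sketch, assuming the artifact is named version as above and that "unknown" is an acceptable default (read_version is a hypothetical helper, not part of the original scripts):

```python
def read_version(path="version", default="unknown"):
    """Return the version stored in the artifact file, or a default if it is missing."""
    try:
        with open(path) as handle:
            # The file is produced with "> version", so strip the trailing newline
            return handle.read().strip() or default
    except FileNotFoundError:
        return default
```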

Don’t run on tags

Remember to put a proper only setting in your job definition, or your automatic tag pushes will trigger new pipelines. Limiting it to branches should be easy enough:

only:
  - branches
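
As an extra safety net, the versioning script itself could refuse to run in a tag pipeline. This sketch relies on the predefined CI_COMMIT_TAG variable, which GitLab sets only for pipelines triggered by a tag; the is_tag_pipeline helper is an illustrative name.

```python
import os


def is_tag_pipeline(environ=os.environ):
    """Return True when the current pipeline was triggered by a tag push."""
    return bool(environ.get("CI_COMMIT_TAG"))


if is_tag_pipeline():
    print("Tag pipeline detected, skipping versioning")
```

The script could then exit early before calling git at all, so a misconfigured job never produces a second tag for the same commit.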

What about changelogs?

Some workflows generate the version based on a CHANGELOG file in the repository. I don’t recommend this approach, as it forces developers to come up with versions, duplicates the commit messages and is prone to merge conflicts.

Instead, you can treat the git log itself as a changelog (focus on great commit messages). With the automated versioning set up, you have everything you need in the commit - author, date, version, and message. Have the CI generate a changelog file for you and upload it somewhere with each new version.
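
A rough sketch of such a generator is shown below. It assumes the changelog for a release is simply the list of commits between the previous tag and the new one, and that a plain-text list is good enough; the function names and the output layout are made up for this example.

```python
import subprocess

# Short hash, author, date and subject, separated by tabs
LOG_FORMAT = "%h%x09%an%x09%ad%x09%s"


def git_log_lines(previous_tag, new_tag):
    """Return one tab-separated line per commit between two tags."""
    out = subprocess.check_output([
        "git", "log", "--date=short",
        "--pretty=format:" + LOG_FORMAT,
        "{}..{}".format(previous_tag, new_tag),
    ])
    return out.decode().splitlines()


def render_changelog(version, log_lines):
    """Render the changelog section for a single version as plain text."""
    entries = []
    for line in log_lines:
        sha, author, date, subject = line.split("\t", 3)
        entries.append("- {} ({}, {}, {})".format(subject, author, date, sha))
    return "{}\n{}\n".format(version, "\n".join(entries))
```

A CI job could call render_changelog("1.0.3", git_log_lines("1.0.2", "1.0.3")) and upload the result as an artifact alongside the build.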

Summary

This is just a basic setup that can be tweaked for more specific cases. Hit me up on Twitter if you have any questions or if you’d like to share your own versioning process.

See the full examples here:

External links:

Semantic versioning
Take a look at the semantic-release tool to compare some ideas.
