A former coworker recently told me that a few weeks ago Docker Hub dropped its Autobuild service from the free tier due to cryptominer abuse. Having maintained a repository of personal Docker images on GitLab with an Autobuild-like pipeline for about four years, I thought I could write a guide for people looking for other options. Let’s hope they’re not the next ones to drop CI/CD from their free tier.

Project Structure

Code-wise, I recommend having a directory for each Dockerfile so that each image gets its own build context. The root directory will obviously hold the .gitlab-ci.yml file that describes the pipeline.

my-dockerfiles
├── .gitlab-ci.yml
├── alpine
│   └── Dockerfile
├── php
│   ├── 8.0
│   │   └── Dockerfile
│   └── master
│       └── Dockerfile
├── postgresql
│   └── Dockerfile
├── sqlite
│   └── Dockerfile
...

Since GitLab runs all the jobs in a stage in parallel, you’ll want to run as many build jobs as possible in the same stage. My approach is to build all the images that don’t depend on other images of my repository in a “Tier 1” stage. The images in the “Tier N” stage then depend on images from “Tier N-1”, and so on, which gives maximum parallelism while respecting those relationships.

The .gitlab-ci.yml file

stages:
  - tier1
  - tier2
  - tier3
  - tierN...

services:
  - docker:dind

image: docker:latest

variables:
  DOCKER_TLS_CERTDIR: /certs

sanity-check:
  stage: .pre
  script:
    - docker info
    - docker version
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS

This is the minimal pipeline definition to get up and running. It defines all the stages upfront, the default image that all jobs will run on (docker:latest) and the “services” they need, in this case docker:dind. This is because the docker:latest container only has the command-line part of Docker; the bit that will actually build the images is the docker:dind service. The DOCKER_TLS_CERTDIR variable is there to force the Docker client and server to communicate over TLS; otherwise you’ll get warnings on every job. You’ll also need to store your Docker Hub user name and password as CI/CD variables in your project’s settings.

That first sanity-check job in the built-in .pre stage is useful to abort the pipeline if it is not properly set up, or if Docker Hub happens to be down. It’s also useful to output the info and version commands, as these images get updated from time to time and you want to know exactly which version ran each time the pipeline triggers. I can remember at least one instance when a new version of Docker broke my pipeline and a few jobs started hanging randomly. The fact that I could see the version bump in the sanity-check log helped me pinpoint the problem.
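By the way, docker login prints a warning when the password is passed with -p on the command line. If that bothers you, one option (not something the pipeline above strictly needs) is to pipe the password through --password-stdin instead:

    - echo "$DOCKER_HUB_PASS" | docker login -u "$DOCKER_HUB_USER" --password-stdin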

Once the sanity-check job works we can start adding the real build jobs. Notice how each docker build command uses a different build context (the last argument, which is a relative path from the root of the project).

alpine:
  stage: tier1
  script:
    - docker build -t 1maa/alpine:3.14 -f alpine/Dockerfile alpine
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/alpine:3.14

php-8.0:
  stage: tier2
  script:
    - docker build -t 1maa/php:8.0 -f php/8.0/Dockerfile php/8.0
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/php:8.0
...

That’s basically all there is to it, but there are still a few neat tricks I’ve learned over the years:

Allow failures

Maybe some of your images are not very “deterministic”. For instance, one of mine builds the latest commit of PHP at the time, whatever that might be. In this case you don’t want these jobs to stop your entire pipeline when they fail from time to time. To do that you can use the allow_failure: true modifier on these jobs.

php-master:
  stage: tier2
  allow_failure: true
  script:
    - docker build -t 1maa/php:latest -f php/master/Dockerfile php
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/php:latest

The needs keyword

By default GitLab won’t start the jobs from a stage until all the jobs from the previous stage have finished. This can be bothersome if some job takes a long time to complete and delays the next batch.

With the needs keyword you can define explicit dependencies between jobs of different stages. When you do that GitLab will start these next jobs as soon as their dependencies finish. In this example we start building 1maa/php:8.0 on Tier 2 as soon as 1maa/alpine:3.14 from Tier 1 finishes (even if other jobs of Tier 1 take a long time):

alpine:
  stage: tier1
  script:
    - docker build -t 1maa/alpine:3.14 -f alpine/Dockerfile alpine
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/alpine:3.14

php-8.0:
  stage: tier2
  needs:
    - alpine
  script:
    - docker build -t 1maa/php:8.0 -f php/8.0/Dockerfile php/8.0
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/php:8.0

Another nicety is that GitLab renders these dependencies as a graph in the pipeline view of its interface.

Only build changes

By default GitLab will launch all jobs on each push. This is fine if you don’t have many images, or if they build very fast. But if that’s not the case you can leverage the only keyword to run only the jobs whose underlying code has changed, like this (beware of the weird wildcard syntax):

php-8.0:
  stage: tier2
  needs:
    - alpine
  only:
    changes:
      - php/**/*
  script:
    - docker build -t 1maa/php:8.0 -f php/8.0/Dockerfile php/8.0
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/php:8.0

Scheduled runs

Code “rots” when it doesn’t run, and your pipeline may break due to updates somewhere else. To avoid having your pipeline inactive for long stretches of time you can instruct GitLab to trigger it on a certain schedule. I do it once a week.
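For reference, these schedules are not defined in .gitlab-ci.yml but in the project’s CI/CD settings, under Schedules, using cron syntax. A weekly run could be expressed with something like this (the day and time are just an example):

0 6 * * 1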

Source

Here is my own repo if you want to check out anything else.

Addenda (2021-08-10)

Just as I hoped, making the effort of writing this post led to a few discussions that yielded a couple more tricks.

Enable BuildKit

Docker BuildKit is described as a “new backend” for the docker build command, and has been available (though not turned on by default) since Docker 18.09. It’s supposed to run marginally faster, and it has at least a couple of interesting features: the --secret flag, to handle secrets securely while building images, and external image cache sources, which are covered in the next section.

At the time of writing turning BuildKit on is simply a matter of setting a new environment variable for all build jobs:

variables:
  DOCKER_BUILDKIT: 1
  DOCKER_TLS_CERTDIR: /certs

I expect this to be unnecessary at some point in the future. You can check that BuildKit is enabled by peeking at the log of any build job; the output format should be quite different.
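Once BuildKit is active you can also start using the --secret flag mentioned above. The following is only a sketch (the netrc secret, the example directory and the 1maa/example tag are made up for illustration): the secret is mounted for a single RUN instruction and never stored in an image layer. In the Dockerfile:

# syntax=docker/dockerfile:1
FROM 1maa/alpine:3.14
# The file is available at /run/secrets/<id> only while this RUN step executes
RUN --mount=type=secret,id=netrc \
    cp /run/secrets/netrc /root/.netrc \
    && echo "fetch something private here" \
    && rm -f /root/.netrc

And in the corresponding build job:

    # NETRC_FILE could be a file-type CI/CD variable holding the path to the secret
    - docker build --secret id=netrc,src=$NETRC_FILE -t 1maa/example:latest -f example/Dockerfile example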

Leverage external cache sources

This feature requires BuildKit, and it is another (I believe neater) way to accomplish roughly the same thing as the only keyword: the jobs still run on every pipeline, but the layers whose inputs haven’t changed are reused from the registry instead of being rebuilt.

Instead of using only as described above, pass the arguments --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from $IMAGE_NAME to all build commands. The first one embeds cache metadata into the image, and the second one instructs Docker to use that metadata while building. This lets Docker treat the registry as a layer cache, much like when you build the same image repeatedly on your local machine and Docker reuses some layers.

alpine:
  stage: tier1
  script:
    - docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from 1maa/alpine:3.14 -t 1maa/alpine:3.14 -f alpine/Dockerfile alpine
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/alpine:3.14

The time savings when building a handful of (mostly) unchanged images can be huge.

Note that the first commit only added --build-arg ... and pushed a fresh set of images with the cache metadata embedded; the last one added the --cache-from ... argument to actually use that metadata.

In essence this behaves exactly as if you had pulled the old image from the registry before doing the build, but without even paying the price of downloading the layers that the build process cannot reuse.
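For comparison, the pre-BuildKit way to get a similar effect (a sketch, not something the pipeline above needs) was to pull the previous image explicitly so the classic builder could reuse its layers, paying for the full download up front:

alpine:
  stage: tier1
  script:
    # Pull the previous image so its layers are available locally;
    # "|| true" keeps the job alive the very first time, when the tag does not exist yet
    - docker pull 1maa/alpine:3.14 || true
    - docker build --cache-from 1maa/alpine:3.14 -t 1maa/alpine:3.14 -f alpine/Dockerfile alpine
    - docker login -u $DOCKER_HUB_USER -p $DOCKER_HUB_PASS
    - docker push 1maa/alpine:3.14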