Symbiosis: Testing in gitlab

Imported from https://www.github.com/BytemarkHosting/symbiosis/issues/53

At the moment the install / dist-upgrade / upgrade tests get weirdly far in gitlab-ci and then fail. Here's a quick summary of how the tests used to work on maker2 (as I understand it); after that I'll go into detail on why the tests fail in gitlab-ci.

The Current Situation

- autotest sets up a VM using schroot magic I don't fully understand
- the VM boots up with systemd and all that jazz, uses DHCP & SLAAC to configure its networking, and automatically runs all the scripts in the autotest folder, keeping all the output as a log file. Once done, it shuts down
- autotest cracks open the VM's filesystem and reads the log file. Somehow it detects failures in the log and, if there were any, exits with a nonzero exit code so that maker2 knows

Some Feelings About The Current Situation

Patrick said something about autotest using the console to talk to the tests, and something else much scarier about the VM sshing into the host to run something.

This doesn't work on gitlab-ci, and is also kinda hacky, for a few reasons.

- the scripts in the autotest folder aren't particularly focussed. In addition to actually running tests, they also do these things, and probably others:
  - add an admin user
  - install all the packages needed by symbiosis from a big list of packages
  - install symbiosis
- opening up the filesystem of the VM so you can prod it is pretty gross

On the plus side it works, and it would only take a bit of effort to port the whole schroot setup over to gitlab-ci (though it would have to run on a shell runner).

Why the tests fail in gitlab-ci

When gitlab-ci runs a container it starts bash in the context of the container. Effectively, bash is PID 1 for the container. There's no init-system to talk to to get stuff going. I believe the apt-get install step for some packages starts them using /etc/init.d (probably something about the package detecting a lack of systemd and putting a proper sysvinit script in) which would explain why a lot of the tests actually succeed. BUT SOME OF THEM FAIL, and we should really be doing a much more realistic test than running our symbiosis full-system tests in a docker container that isn't a full symbiosis system.
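A quick way to confirm which situation a test environment is in is to look at PID 1 directly. The following is a minimal sketch, not part of autotest: the function names are made up for illustration, and it assumes a Linux system with `/proc` mounted. The `/run/systemd/system` check is the same one `sd_booted(3)` documents.

```python
from pathlib import Path


def pid1_name(proc_root="/proc"):
    """Return the command name of PID 1, or None if it cannot be read."""
    try:
        return (Path(proc_root) / "1" / "comm").read_text().strip()
    except OSError:
        return None


def booted_with_systemd(run_root="/run"):
    """Same check as sd_booted(3): systemd creates this directory at boot."""
    return (Path(run_root) / "systemd" / "system").is_dir()


def describe_environment(proc_root="/proc", run_root="/run"):
    """Summarise whether the tests have an init system to talk to."""
    name = pid1_name(proc_root)
    if booted_with_systemd(run_root):
        return f"PID 1 is {name}; systemd is managing this system"
    return f"PID 1 is {name}; no systemd to talk to here"
```

Calling `describe_environment()` at the start of a test run would make it obvious from the log whether the job landed in a plain container or a full system.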

With that in mind:

A More Realistic Test Proposal

We're still going to want to run symbiosis in a VM, I think. To do a realistic full-install / dist-upgrade test we need to have a realistic system, which the docker container environment isn't. We need a systemd to talk to so we can schedule restarts, that sort of thing.

We will need some test-specific configurations (particularly repo URLs) too. And we'll need to be able to orchestrate the testing and fail the build when the tests fail.

We could create an image prior to the testing which would have a user account with passwordless sudo and a .ssh/authorized_keys file. The private key would be kept in the secret variables section of the project on gitlab, and so would be presented to the gitlab-ci script as an env var.
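Since ssh wants the key in a file rather than an env var, the job would need to write it out first. Here is a rough sketch of that step; the variable name `SYMBIOSIS_TEST_SSH_KEY` and the function name are hypothetical, nothing like them exists in the project yet.

```python
import os
from pathlib import Path

KEY_ENV_VAR = "SYMBIOSIS_TEST_SSH_KEY"  # hypothetical secret variable name


def write_private_key(dest, environ=os.environ, var=KEY_ENV_VAR):
    """Write the private key held in an env var to dest with mode 0600."""
    key = environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; cannot log in to the test VM")
    dest = Path(dest)
    # Create the file as 0600 from the start; ssh rejects keys others can read.
    fd = os.open(dest, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # A missing trailing newline can make ssh reject the key, so add one.
        handle.write(key.rstrip("\n") + "\n")
    # O_CREAT's mode is ignored if the file already existed, so enforce it.
    os.chmod(dest, 0o600)
    return dest
```

Failing loudly when the variable is missing should make a misconfigured project obvious straight away, rather than showing up later as an ssh authentication error.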

In the gitlab-ci script we'd start the VM with qemu, as we do for bytemark/bytemark-packer-templates, then use ansible to copy over the tests, install the symbiosis packages, and run the tests. We could write our ansible playbook so that it captures the logs and copies them back to the runner and have the gitlab-ci script spit the logs out, then exit with ansible's exit code.
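To make that concrete, here's a sketch of what the orchestration could look like. Everything specific in it is an assumption: the port, memory size, qcow2 image format, `admin` user and the function names are placeholders, and it is not how bytemark/bytemark-packer-templates actually does it. It uses qemu's user-mode networking with a forwarded ssh port, which comes with its own built-in DHCP.

```python
import subprocess

SSH_PORT = 2222  # assumed to be free on the runner


def qemu_command(image, ssh_port=SSH_PORT):
    """Boot the image headless in the background, forwarding guest port 22."""
    return [
        "qemu-system-x86_64",
        "-m", "1024",  # assumed memory size
        "-display", "none",
        "-daemonize",
        "-snapshot",  # discard disk writes so the base image stays clean
        "-drive", f"file={image},format=qcow2,if=virtio",
        "-netdev", f"user,id=net0,hostfwd=tcp:127.0.0.1:{ssh_port}-:22",
        "-device", "virtio-net-pci,netdev=net0",
    ]


def ansible_command(playbook, key_path, user="admin", ssh_port=SSH_PORT):
    """Run the playbook against the VM through the forwarded port."""
    return [
        "ansible-playbook",
        "-i", "127.0.0.1,",  # trailing comma marks an inline host list
        "-u", user,
        "--private-key", str(key_path),
        "-e", f"ansible_port={ssh_port}",
        "--ssh-common-args",
        "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null",
        playbook,
    ]


def run_tests(image, playbook, key_path, run=subprocess.run):
    """Start the VM, run the playbook, and return ansible's exit code."""
    run(qemu_command(image), check=True)
    result = run(ansible_command(playbook, key_path))
    return result.returncode
```

The gitlab-ci script would finish by exiting with whatever `run_tests(...)` returns, so a failed playbook fails the build. Waiting for sshd to come up and shutting the VM down afterwards are left out of the sketch.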

This would make our test output shorter and more readable, be less weird than the current autotest setup on maker2, and probably not require us to also run a DHCP server.

The work we'd need to do:

- add an ansible layer to docker-images/layers
- rewrite the autotest/ scripts as ansible playbooks
- make a base VM image with the necessary networking & ssh setup

Thoughts @pcherry , @jcarter ?

Edited Apr 14, 2019 by Paul Cammish