(At least everyone who is using the snap package…)

A spilled mug of tea

I’ve been using Gitea to host most of my private and public Git repositories for some time. I have been using the Snap package, which is listed as “Unofficial” in the install guide on their website, but is packaged in-tree in the official repository. The Snap package is convenient: it can be installed with a single command; it is kept up-to-date automatically; and when it works, it “Just Works”. But, when it breaks, it “Really Breaks”.

What are Snap packages?

Snap is a packaging format developed by Canonical for the Ubuntu operating system. The original plan was for it to become a universal packaging format for all Linux distributions, but that goal has not been realised. Snaps are compressed folders, similar in concept to macOS’s “.app” application format, which bundle the programs, supporting libraries, resources and metadata needed to run one or more applications. They are downloaded from a central, proprietary “Snap Store” using specific tools which install the package and set up a secure, confined environment in which to run it. The Snap Store supports multiple release channels for a package; where many users may prefer the “Stable” channel, testers and developers may install from the “Edge” channel. Whilst these channels are hierarchical to some extent, they are arbitrarily defined - a packager may employ different channels as they see fit.

Most Snaps are created by the Snapcraft tool, which automates much of the process of building the program and bundling the appropriate libraries and metadata into the Snap package. It follows a recipe in a snapcraft.yaml file which is conceptually similar to a Dockerfile. It is common for the build section of the snapcraft.yaml to apply various hacks to the build process, environment, binary or source to make the application run happily under the secure Snap confinement.

These packages can be built and uploaded to the Snap Store by the upstream developers, Canonical employees or third parties. When new versions are uploaded to a specific channel, they will be automatically downloaded and installed on users’ machines. There are perils to this: services can be interrupted at inopportune times; data may require migration; breakages propagate quickly; and the window for mischief is wide open. But when it works well, the Snap distribution model obviates the burdens of legacy versions with unpatched bugs and vulnerabilities.

My recent experience with the Gitea Snap

Up until the 29th of April, I had no complaints about the Gitea Snap at all. It was hosting my Git repositories and a couple of container packages without a hiccough. On that date, however, I pinged a friend to test a new container package, and he replied to say my repository was refusing unauthenticated access to the package. Previously this had not been a problem. After a few hours of combing online documentation, configuration files and logs I stumbled across the release notes for version 1.26.1, which included “Fix container auth for public instance”. This was a bug which had crept in with version 1.26.0 and had now been fixed.

Running snap info gitea I saw I was tracking the stable channel which was on version 1.26.0, whereas the edge channel had 1.26.1. Aha! I had my culprit. I switched to the edge channel and, lo and behold, my package repository was working normally again. I made a mental note to switch back to stable when 1.26.1 was promoted to that channel. Snaps are brilliant!
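For anyone following along, the check and the channel switch can be sketched roughly as below. The helper names are hypothetical, invented for this post, and the latest/edge channel name assumes the Gitea Snap publishes on the default "latest" track.

```shell
# Hypothetical helpers, named for illustration only.

# Print the channel the installed gitea snap is tracking, e.g. "latest/stable".
gitea_tracking() {
    snap info gitea | awk '$1 == "tracking:" { print $2 }'
}

# Move the installed snap to another channel and refresh to it immediately.
# Usage: gitea_use_channel latest/edge
gitea_use_channel() {
    sudo snap refresh gitea --channel="$1"
}
```

Calling gitea_tracking before and after gitea_use_channel is a quick way to confirm that the switch has taken effect.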

On the 7th of May my Gitea instance disappeared without warning.

Getting stuck back into the logs, I saw failures in the gitea-web component related to a database issue. What had changed? snap info gitea showed I was still tracking the edge channel with version 1.26.1 and that stable was still stuck on 1.26.0. Looking closer, I saw that the edge channel had been updated that day, although the v1.26.1 label had not changed. An automatic update had broken my install. Reluctantly, I switched back to the stable channel and filed a bug about the current state of affairs.

It turns out that edge was tracking HEAD, despite being labelled with the v1.26.1 tag, and that HEAD was currently broken upstream. This was fair enough - it is reasonable that an edge channel moves fast and breaks from time to time. It would have been nice if the packagers had labelled the package in that channel with a git reference rather than a release tag, but packaging is a thankless task and nothing is perfect in life.

As fixes were being proposed in that bug report, it became clear that the core developers were uncomfortable with the hacks required in the Snapcraft build instructions to get the application running properly as a Snap package. This looked as if it was going to take some consideration to fix properly, and consideration takes time.

How I broke it for everyone

I was left in the situation where I was running version 1.26.0 from the stable channel, which meant I had a largely-functioning application but a broken container repository. I asked, in the bug report, whether it would be possible to promote the version of 1.26.1 which had been in the edge channel prior to 7th May to the stable channel. The packagers responded by (mistakenly) promoting the broken version of 1.26.1 which had been released on the 7th May.

Silently but violently, this broken package was installed - automatically - on the machines of everyone who was using the Gitea Snap. All of a sudden, their service had been removed from them without their knowledge or consent. When Snap breaks, it “Really Breaks”.

Fortunately all was not lost. Snap has a rollback function, using the snap revert command. By default, Snap keeps both the current and the prior installed version of the Snap package on disk to facilitate rollbacks. I could switch back to the 1.26.0 version and switch off updates to prevent the faulty version being reinstalled.
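A minimal sketch of that recovery is below. The function name is hypothetical, and the --hold flag assumes a reasonably recent snapd (it arrived in 2.58, as far as I know); on older versions a different mechanism for deferring refreshes would be needed.

```shell
# Hypothetical helper: roll back to the previously installed revision, then
# hold automatic refreshes so the faulty revision is not pulled in again.
# Assumes snapd 2.58 or later for `snap refresh --hold`.
gitea_rollback_and_hold() {
    sudo snap revert gitea &&
    sudo snap refresh --hold gitea
}
```

The hold stays in place until it is lifted with sudo snap refresh --unhold gitea, which is worth remembering once a fixed package is available.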

At the time of writing, this is the only recourse for anyone who has been affected by this update. It looks as if the packagers are not going to produce a working Snap package until the 1.26.2 release. Sorry. I should have kept my bug-reporting mouth shut.

(As an aside, it would have been nice if Snap kept a few more prior versions on disk. That would have allowed me to snap revert back to the working version of 1.26.1. Sadly, it defaults to only 2 versions and that, to some extent, is my fault, too.

In the old days, Snap used to keep more old versions around. Now, I used to publish a Snap of an action game, which had almost a gigabyte of assets. Every time I released a new version, the old 1GB Snap would be filed away and a new 1GB version downloaded and installed. I had complaints from users that my game was eating away their disk capacity with old versions, so I mentioned this to the Snap developers and it was one of the triggers to reduce the default archive to only 2 versions. I didn’t realise this would come back and bite me. Once again, sorry!)
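For anyone who wants a deeper archive than the default, the number of retained revisions is a system-wide snapd setting. The sketch below uses hypothetical helper names; the choice of how many revisions to keep is yours, and snapd accepts values from 2 to 20.

```shell
# Hypothetical helpers, named for illustration only.

# Keep more old revisions of every snap on disk (snapd accepts 2 to 20).
# Usage: snap_retain_revisions 5
snap_retain_revisions() {
    sudo snap set system refresh.retain="$1"
}

# List every gitea revision still on disk, including disabled ones.
gitea_revisions() {
    snap list --all gitea
}

# Revert to one specific retained revision rather than just the previous one.
# Usage: gitea_revert_to <revision number from gitea_revisions>
gitea_revert_to() {
    sudo snap revert gitea --revision="$1"
}
```

The trade-off is the one my game's users complained about: every retained revision occupies disk space, so a higher number suits small, critical Snaps better than large ones.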

What I’ve learned

Snaps are convenient, but their convenience is also their downfall. Automatic updates will break things without warning and for everyone. The road to hell is paved with good intentions.

If you’re using Snaps to provide a service in production, switch off automatic updates. Set a routine to manually check for new versions and test them thoroughly in the lab before pushing them to production. Keep an archive of versions for rollback. Treat them as you would any other package, and trust yourself rather than upstream.
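The "manually check for new versions" step could be sketched as below. The function name is hypothetical, the latest/stable channel name is an assumption, and the parsing relies on the column layout that snap info prints at the time of writing. It compares revisions as well as version labels because, as I found out, a channel can receive a new build without its version label changing.

```shell
# Hypothetical helper: report when the stable channel offers a different
# version or revision from the one installed. Prints nothing when they match.
gitea_pending_update() {
    snap info gitea | awk '
        # Return the "(N)" revision field of the current line, if any.
        function rev(   i) {
            for (i = 2; i <= NF; i++) if ($i ~ /^\([0-9]+\)$/) return $i
            return ""
        }
        $1 == "installed:"     { have = $2 " " rev() }
        $1 == "latest/stable:" { offered = $2 " " rev() }
        END {
            if (have != "" && offered != "" && have != offered)
                print "installed " have ", stable offers " offered
        }'
}
```

Run from a scheduled job, any output is the cue to test the new revision in the lab before refreshing production by hand.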

Me? I’ll switch back to tracking stable with automatic updates when 1.26.2 is released and get on with my life because convenience is quick and life is too short. I learn nothing!