An Opinionated Guide to Operating a BookWyrm Instance

Everything mentioned in this article is open-sourced as an Infrastructure as Code repository at chdorner/bookwyrm-infra.

I’ve been running my BookWyrm instance for some time now, mainly as a means to get operating and admin experience while contributing to the development of the project. BookWyrm is a social reading and reviewing application, a sort of federated GoodReads alternative which is interoperable with other ActivityPub software like Mastodon.

The recommended guide to running your own BookWyrm instance uses containers via a Docker Compose setup, but I wanted to go back to basics as much as possible. Dealing with cloud and Kubernetes infrastructure all day at work, I wanted to keep this fun, even though you can’t really compare Kubernetes to a small Docker Compose setup.

The guide to running without containers works well, but I wasn’t a big fan of some of the choices it makes, first and foremost running all three services that make up BookWyrm (web, background job processing, scheduler) in a single systemd service via a bash script. systemd services are cheap, and I like having the option to configure them separately and to stop and restart them individually.

Requirements

As I was saying, the main theme is back to basics, using cloud-based features only where they make sense. So my requirements are:

  • Run all of it on one VPS, until it makes sense to separate
  • Use Infrastructure as Code to set everything up, and avoid manual one-off configuration on the server. The tool of choice here is good old Ansible (the community one)
  • Use cloud-based object storage for assets, anything S3-compatible
  • Daily database backups to offsite object storage

Installation

First things first

A few things are worth doing right away when starting a new server, namely turning off SSH access for the root user and, more generally, turning off password authentication for SSH. DigitalOcean’s initial server setup guides, available for many different Linux operating systems, are a great resource for this; additionally, have a read through this guide about disabling SSH access for root.
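As a minimal sketch of what that boils down to (assuming a Debian/Ubuntu server running OpenSSH), the relevant settings live in /etc/ssh/sshd_config:

# /etc/ssh/sshd_config (excerpt)
PermitRootLogin no          # no direct SSH logins as root
PasswordAuthentication no   # key-based authentication only

# apply the changes (the unit is called sshd on some distributions)
sudo systemctl restart ssh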

Drawing the rest of the owl

I will talk through a few highlights here, but a complete Infrastructure as Code setup is available in this repository: chdorner/bookwyrm-infra.
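To give a rough idea of what applying the setup looks like, a run boils down to a single ansible-playbook invocation; the inventory and playbook file names below are illustrative, the repository’s README has the exact ones:

# dry run first to see what would change, then apply for real
ansible-playbook -i inventory.yml playbook.yml --check
ansible-playbook -i inventory.yml playbook.yml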

S3-compatible external assets storage
External object storage services are usually fairly cheap, so cheap in fact that it’s usually not worth the hassle of handling local asset storage yourself. BookWyrm supports any S3-compatible service, of which there are plenty of non-AWS ones to choose from. Because there are so many services available, I didn’t include bucket creation in the infrastructure setup. You will need to create two buckets yourself: one for static and image assets, which needs to be publicly accessible, and one for database backups, which should not be publicly accessible.
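The storage settings themselves end up in BookWyrm’s .env file. The excerpt below is illustrative; the exact variable names for your version are listed in the .env.example that ships with BookWyrm:

# object storage settings in BookWyrm's .env (illustrative excerpt)
USE_S3=true
AWS_ACCESS_KEY_ID=<access key from your storage provider>
AWS_SECRET_ACCESS_KEY=<secret key from your storage provider>
AWS_STORAGE_BUCKET_NAME=my-bookwyrm-assets   # the publicly accessible assets bucket
AWS_S3_ENDPOINT_URL=https://<your provider's S3 endpoint>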

Single, passwordless Redis
The Docker Compose setup uses two separate Redis containers, one for cache/feed storage and a separate one for the background job queue. Redis usually struggles with available memory way before CPU becomes a problem, so as long as we logically separate these two use cases into different Redis databases, there is no need to run multiple servers.
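In BookWyrm’s .env this separation shows up as two connections that differ only in their database index, roughly like the excerpt below (variable names may differ between versions; again, .env.example is the source of truth):

# both point at the same local Redis server, just at different logical databases
REDIS_ACTIVITY_HOST=127.0.0.1
REDIS_ACTIVITY_PORT=6379
REDIS_ACTIVITY_DB_INDEX=0   # cache and activity feeds
REDIS_BROKER_HOST=127.0.0.1
REDIS_BROKER_PORT=6379
REDIS_BROKER_DB_INDEX=1     # celery background job queue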
Additionally, since we block outside access to the Redis port via the firewall, there is little to gain from Redis authentication: the password would end up in a plain-text file on the same machine anyway.
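Assuming an Ubuntu server with ufw, a minimal firewall setup looks something like this; the Redis port (6379) is simply never opened to the outside:

sudo ufw allow OpenSSH        # keep SSH reachable
sudo ufw allow 'Nginx Full'   # HTTP and HTTPS for the web server
sudo ufw enable               # everything else, including Redis, stays blocked from outside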

SSL certificates via Let’s Encrypt
EFF’s certbot handles SSL certificates automatically, including their renewal. If you plan to put something like Cloudflare in front of BookWyrm, you might want to configure its origin server certificates manually instead.
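With the Nginx plugin, requesting a certificate (and letting certbot take care of renewals) is a single command; replace the domain with your instance’s hostname:

sudo certbot --nginx -d bookwyrm.example.com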

systemd services
All three services that make up a BookWyrm instance can be managed separately via systemd, for example to restart each of them:

systemctl restart bookwyrm-web
systemctl restart bookwyrm-worker
systemctl restart bookwyrm-scheduler
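For reference, the unit behind bookwyrm-web can be as small as the sketch below; the user, paths, and exact gunicorn invocation are assumptions for illustration, the units in the repository are the authoritative versions:

# /etc/systemd/system/bookwyrm-web.service (sketch)
[Unit]
Description=BookWyrm web server (gunicorn)
After=network.target postgresql.service redis-server.service

[Service]
User=bookwyrm
WorkingDirectory=/opt/bookwyrm
ExecStart=/opt/bookwyrm/venv/bin/gunicorn bookwyrm.wsgi:application --bind 127.0.0.1:8000
Restart=on-failure

[Install]
WantedBy=multi-user.target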

Prioritizing web processes over background workers
I’ve noticed that when running all three services on the same machine, the celery worker process can sometimes experience sustained CPU usage spikes to the point where the web process stops responding. I’d rather have background job processing be slightly slower but keep a snappy web server handling requests.
You can use systemd to limit how much CPU the worker process is allowed to use; feel free to play around with this value. I give the worker only 50% of the CPU on a 2-vCPU machine.
In the Ansible setup, you can easily configure this with the bookwyrm_worker_cpu_quota variable, for example by setting it to "50%".
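Outside of Ansible, the same limit is just a systemd drop-in override for the worker unit:

# opens an editor for an override file of the worker unit
sudo systemctl edit bookwyrm-worker

# add the following two lines, save, then restart the worker
[Service]
CPUQuota=50%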

Considerations

Upgrading BookWyrm

This infrastructure setup doesn’t automatically update BookWyrm to newer releases. Having gone through a few upgrades myself, I think this should be possible to handle automatically with Ansible. But I prefer doing it manually; this way I don’t end up accidentally upgrading when I didn’t want to.

It’s also worth mentioning that things like new celery background queues or Nginx configuration updates need to be applied manually when upgrading.
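For reference, a manual upgrade roughly follows the steps below; the paths match the illustrative unit sketch from earlier and will differ if you installed BookWyrm elsewhere, and the release notes of the new version are always the first thing to check:

cd /opt/bookwyrm                                       # wherever BookWyrm is checked out
sudo systemctl stop bookwyrm-web bookwyrm-worker bookwyrm-scheduler
git fetch --tags && git checkout <new release tag>
./venv/bin/pip install -r requirements.txt             # new and updated dependencies
./venv/bin/python manage.py migrate                    # apply database migrations
./venv/bin/python manage.py collectstatic --no-input   # push new static assets to object storage
sudo systemctl start bookwyrm-web bookwyrm-worker bookwyrm-scheduler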

Finding logs

All services run via systemd, so inspecting logs should just work via journalctl. For example, to view the logs of the gunicorn web server you can run journalctl -u bookwyrm-web (the value of -u is the service name).
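A couple of variations I find myself reaching for; both are standard journalctl options:

journalctl -u bookwyrm-worker -f                    # follow the worker logs live
journalctl -u bookwyrm-web --since "1 hour ago"     # only recent web server logs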

Observability & alerting

It’s a good idea to add at least a few ways of monitoring the instance; this, however, is a very personal choice with numerous solutions available. I would advise at least having alerting in place for failed database backups (e.g. cronitor) and for the usual system resources like CPU/load, memory, and disk space.
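To illustrate the backup side (this is a simplified sketch, not the exact script from the repository), a nightly job can be as small as a pg_dump piped through gzip and uploaded to the private backups bucket; the database name, bucket name, and endpoint are placeholders:

# dump, compress, and ship the database to the private backups bucket
pg_dump bookwyrm | gzip > /tmp/bookwyrm-backup.sql.gz
aws s3 cp /tmp/bookwyrm-backup.sql.gz s3://my-bookwyrm-backups/ --endpoint-url https://<your provider's S3 endpoint>

Wire that into a daily cron job and have your monitoring service of choice alert you when it doesn’t complete.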

What’s next

It’s time to pat yourself on the back, well done setting up a well-managed BookWyrm instance! You’ve been sitting in front of the computer for a while now, go and enjoy a cup of tea and read a book (and comment, quote, and review it on your BookWyrm instance after!).


P.S. while I’m not planning to run a big public BookWyrm instance, I’m happy to run an instance for friends-of-friends. Feel free to reach out or request an invite directly to the Secret Bear Library.