Notes
In this episode, we baked the Celery worker and beat scheduler tool into the Shiv app. This is one more step on the path to simplifying the set of tools on the production server.
I started the stream by reviewing the refactoring that I did to conductor/main.py. The main file is used to dispatch to different tools within the Shiv bundle. The refactored version can pass control to Gunicorn, the Django management tools, or Celery.
To make Celery work, we hooked worker_main to the Celery dispatch function.
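As a rough sketch, the hook can look something like this (the project.celery module path and the dispatch structure here are assumptions, not conductor's actual code):

import sys

def run_celery(argv):
    # Import here so the other dispatch branches don't pay Celery's import cost.
    from project.celery import app  # hypothetical Celery app module

    # worker_main runs the Celery worker inside this process,
    # forwarding the remaining command line options to it.
    app.worker_main(argv=["worker"] + argv)

def main():
    tool, rest = sys.argv[1], sys.argv[2:]
    if tool == "celery":
        run_celery(rest)
    # ... other branches dispatch to Gunicorn and the Django management tools ...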
Once worker_main was set, I generated a new local Shiv app to test things out.
Since testing could require building the app multiple times, we took a detour to make package.sh easier to use locally. Before we started, the packaging script only worked in Continuous Integration because it assumed CI-specific paths. I changed the script so that those paths are only set when CI runs. Remembering how to write Bash if statements is surprisingly difficult for me.
# Prefix commands with the virtual environment's bin/ only in CI.
if [ -n "${CI}" ]; then
    VENV=venv/bin/
fi
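The -n test checks that the CI environment variable is set to a non-empty string, so the venv/bin/ prefix only applies on the CI server; locally, where CI isn't set, the prefix stays empty and the tools resolve from my active environment.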
To create a new version of the package to test, we ran:
$ rm -f conductor-abcd.pyz && CIRCLE_SHA1=abcd package.sh
By chaining the commands together, it's fast to pull that command from shell history and run all the steps in a single line.
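CIRCLE_SHA1 is the commit SHA that CircleCI provides during real builds; supplying a dummy value like abcd gives the local build a predictable conductor-abcd.pyz file name to delete and recreate.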
With our testable package available, we started Celery with a variety of options. By the end, we settled on an invocation that looked like:
$ /srv/apps/conductor.pyz celery \
--loglevel INFO \
--concurrency 2 \
--beat \
--schedule /tmp/celerybeat-schedule \
--pidfile /tmp/celerybeat.pid
This set of options lets us run a Celery worker along with the beat scheduler tool.
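The --beat flag embeds the beat scheduler inside the worker process, which saves us from managing a separate beat process on the server. The --schedule and --pidfile options point the schedule database and PID file at writable paths in /tmp.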
To finish the GitHub issue, we deployed this to staging and checked on the processes. We looked at the Celery processes with:
$ ps -ef | grep celery
From that output, I explained how process identifiers (PIDs) and parent process identifiers (PPIDs) work, and I covered how Celery creates multiple processes to increase concurrency.
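Celery's prefork pool does this with real worker subprocesses; as a toy illustration of the parent/child relationship (plain Python, not Celery's code):

import multiprocessing
import os

def report():
    # Each child's PPID matches the PID of the parent that spawned it.
    print(f"child: pid={os.getpid()} ppid={os.getppid()}")

if __name__ == "__main__":
    print(f"parent: pid={os.getpid()}")
    children = [multiprocessing.Process(target=report) for _ in range(2)]
    for child in children:
        child.start()
    for child in children:
        child.join()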
Next time, we need to go through a similar process with the Python-based Postgres backup tool, wal-e. We'll call wal-e from the Shiv app to eliminate the last usage of the installed virtual environment.