In my last post, I talked about how a modern web app needs background
worker processes. One way or another, you’ll have some things you need
to do that are slower than you can do in a request/response cycle, and
so you’ll want to handle them out of band. Have the API return a simple
202 ACCEPTED and move on with your life, right?
Well, sometimes you want to tell users about the state of those background processes. You might want to say “it’s done!”, or “it’s failed!”, or even just to acknowledge that it’s taking a long time, but still going. And just saying “they can refresh the page” isn’t always enough. (Though, sometimes it is!)
I’m going to talk about different ways you can do this.
First I’ll talk about how I would do it using Celery. But
django-channels provides some cool new options for handling background
processes, so I’ll cover that too.
Imagine we have a long-running background process, something that can take up to a minute under normal circumstances, maybe more under exceptional load. We make a Celery task to handle it, and now we want to let the user know what state it’s in.
The simplest (or most simplistic) approach is to use the database to store state. You can do this with Celery’s database result backend, or a custom task state model that you periodically update. Imagine something like this:
# some unique identifier, that you can recover outside the task:
task_id = get_task_id_based_on_arg(some_arg)
state, _ = TaskState.objects.get_or_create(task_id=task_id)
total = len(some_arg)
for i, elem in enumerate(some_arg):
    # ... do the actual work on elem here ...
    # Every 100 elements, update the percentage processed:
    if i % 100 == 0:
        state.percent_done = (i / float(total)) * 100
        # Persist the new value so a view can read it:
        state.save()
Then in your views, you can retrieve the appropriate TaskState and show how much has been processed. Sometimes that’s a good approach, but usually I think that’s pretty clunky. It can thrash the database, leave records lying around if things die halfway through, and still doesn’t give you a smooth experience; your user has to refresh to see updates.
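To get a feel for the trade-off, here’s a rough sketch of that update-every-N loop that runs without a database. `InMemoryTaskState` and `process` are names I’ve made up for illustration; the stand-in just counts how often `save()` is called, which is roughly how many writes a real model would cost you.

```python
class InMemoryTaskState:
    """Hypothetical stand-in for a TaskState row; save() only counts writes."""

    def __init__(self):
        self.percent_done = 0.0
        self.saves = 0

    def save(self):
        self.saves += 1


def process(items, state, every=100):
    """Walk through items, recording progress once per `every` elements."""
    total = len(items)
    for i, _elem in enumerate(items):
        # ... the real per-element work would happen here ...
        if i % every == 0:
            state.percent_done = (i / total) * 100
            state.save()
    # Record completion, so readers see 100% rather than the last checkpoint.
    state.percent_done = 100.0
    state.save()
    return state


state = process(list(range(250)), InMemoryTaskState())
print(state.percent_done, state.saves)
```

For 250 elements that’s four writes (three checkpoints plus the final one) instead of 250. Lowering `every` gives smoother progress at the cost of more writes, which is where the database thrashing comes from.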
As an aside, it might be tempting to do something like this with the Django messages framework. However, adding and retrieving messages requires the request object, so you would have to pass the request into your background task. Even the pickle serializer can fail to serialize the request object. I would strongly recommend saving yourself the time and trouble, and using anything but messages for this.
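If you want to see the kind of failure I mean, here’s a small sketch. `FakeRequest` is a made-up class, not Django’s request; I’m assuming the trouble comes from an object holding some live resource (here a lock), which is one common reason pickling fails. What actually trips you up with a real request will depend on your server and middleware.

```python
import pickle
import threading


class FakeRequest:
    """Hypothetical stand-in for a request: plain data plus one live resource."""

    def __init__(self, path):
        self.path = path
        # Locks (like sockets and open streams) cannot be pickled.
        self._lock = threading.Lock()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


print(can_pickle({"user_id": 42, "path": "/reports/"}))
print(can_pickle(FakeRequest("/reports/")))
```

The plain dict pickles fine and the request-like object doesn’t, which is why it’s usually easier to hand a task simple identifiers (a user id, a task id) and look up the rest on the other side.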
So what if you want real-time updates? What if that page refresh is bumming you out?
A nice option is something like Pusher. They provide a service that you can push to from inside your app (in the request/response cycle, or in a background task) using a nice Python library, and some JS to get your users talking to their realtime websocket-y servers, to get those updates. Their JS library even includes sensible fallbacks for when websockets aren’t available. The one caution is that their prices take a curve that can be a bit steep for some situations; if developer time is cheaper than ongoing service costs, then it might be worth rolling your own websocket solution. Which brings us to our next section.
If you are using the newer Django Channels package for background tasks, you get an added benefit: you can make and manage your own websocket connections.
I find it helps to think of Channels as a generalization of Django’s view system. Instead of a urls.py with a urlpatterns attribute, you have a routing.py with a channel_routing attribute. Instead of mapping paths to views, it maps channel types to consumers. Channel types can include well-known ones like websocket events, or ad-hoc ones like custom background tasks.
(All your views and URLs can still be there in your project, untouched, too. This isn’t instead of all that, it’s in addition.)
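To make the analogy concrete, here’s a toy dispatcher in plain Python. It is not how Channels is implemented, and `toy_routing`, `dispatch`, and the `"long-task"` channel name are all invented for this sketch; it only shows the idea of looking up a consumer by channel name, the way a URL resolver looks up a view by path.

```python
def ws_connect(message):
    return "connected: " + message["client"]


def long_task(message):
    return "processing job " + str(message["job_id"])


# Channel name -> consumer, much as urlpatterns maps a path to a view.
toy_routing = {
    "websocket.connect": ws_connect,
    "long-task": long_task,
}


def dispatch(channel_name, message):
    """Look up the consumer registered for a channel and hand it the message."""
    consumer = toy_routing[channel_name]
    return consumer(message)


print(dispatch("websocket.connect", {"client": "browser-1"}))
print(dispatch("long-task", {"job_id": 7}))
```

Note that a consumer here takes a message rather than a request, and the same lookup handles both a websocket event and an ad-hoc background job.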
Because Channels operate outside the usual request/response cycle, sending a reply on a channel is a little harder. It can’t operate simply through a function’s return. Instead, you have to .send on Channels, or more flexibly, Groups. (A Group just allows you to send to multiple consumers at once, if necessary.)
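If the Group idea feels abstract, this in-memory model might help. `ToyChannel` and `ToyGroup` are stand-ins I’ve written for illustration, not the Channels classes; the only thing they share with the real ones is the shape of the idea: a Group is a named collection that fans a `.send` out to every member.

```python
class ToyChannel:
    """Hypothetical stand-in for a reply channel: collects what is sent to it."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, content):
        self.inbox.append(content)


class ToyGroup:
    """Hypothetical stand-in for a Group: a named collection of channels."""

    def __init__(self, name):
        self.name = name
        self._members = []

    def add(self, channel):
        if channel not in self._members:
            self._members.append(channel)

    def discard(self, channel):
        if channel in self._members:
            self._members.remove(channel)

    def send(self, content):
        # Fan the same content out to every current member.
        for channel in self._members:
            channel.send(content)


tab_one, tab_two = ToyChannel("tab-1"), ToyChannel("tab-2")
group = ToyGroup("user-alice")
group.add(tab_one)
group.add(tab_two)
group.send({"text": "it's done!"})
group.discard(tab_two)  # e.g. the second tab was closed
group.send({"text": "second update"})
print(len(tab_one.inbox), len(tab_two.inbox))
```

The sender never needs to know how many browser tabs are listening, which is what makes Groups the more flexible option.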
So, for our purposes, your channel_routing should have, at a minimum:
channel_routing = [
The first three are consumers for handling basic websocket operations. The last one is whatever long-running task you want to run in the background.
You can then call the background task in a view:
Be sure that there’s some stable way to identify the Group that you need to send to. It might be as simple as passing in the username of the logged-in user who kicked off the task, or it might be based on a process UUID that’s in the view’s path, or something else. Whatever it is, when the user’s browser makes a websocket connection on page load, you’ll want to add that reply channel to the Group:
# Accept connection
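One way to keep the Group name stable is to derive it from the same identifier in both places. Here’s a small sketch assuming you key on a process UUID; `group_name_for_task` and the `task-` prefix are my own invention, so adapt them to whatever naming your project uses.

```python
import uuid


def group_name_for_task(task_uuid):
    """Build a group name from a task UUID (hypothetical 'task-' prefix).

    Round-tripping through uuid.UUID rejects malformed input and normalizes
    the formatting, so the view and the websocket consumer agree on the name.
    """
    return "task-" + uuid.UUID(str(task_uuid)).hex


# The view might have the UUID as a string from the URL path...
view_side = group_name_for_task("12345678-1234-5678-1234-567812345678")
# ...while the websocket side might hold a UUID object.
socket_side = group_name_for_task(uuid.UUID("12345678123456781234567812345678"))
print(view_side == socket_side)
```

Because both sides go through the same helper, differences in dashes or letter case can’t leave the task sending to a Group that nobody joined.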
On the front-end, you should have something like this:
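var socket = new WebSocket("ws://" + window.location.host);
// show_some_toast_for is your own UI helper; message.data holds the text payload.
socket.onmessage = function (message) {
    show_some_toast_for(message.data);
};
socket.onopen = function () { /* connection is ready; nothing to send yet */ };
// Call onopen directly if socket is already open
if (socket.readyState === WebSocket.OPEN) socket.onopen();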
And now you can push messages to users yourself:
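As a sketch of what such a message could look like, here’s a helper that packs a progress update into JSON. `build_progress_message` and its field names are hypothetical. Wrapping the JSON string in a dict under a `"text"` key reflects my understanding of how websocket text frames are sent in the Channels releases this post was written against; treat that as an assumption and check the docs for your version.

```python
import json


def build_progress_message(task_id, percent_done, status="running"):
    """Wrap a progress update as JSON text inside a dict with a "text" key."""
    payload = {
        "task_id": task_id,
        "status": status,
        "percent_done": round(percent_done, 1),
    }
    # sort_keys keeps the output stable, which makes it easier to test and log.
    return {"text": json.dumps(payload, sort_keys=True)}


message = build_progress_message("abc123", 42.0)
print(message["text"])
```

The background task would hand the resulting dict to the Group’s .send, and the front-end handler would call JSON.parse on message.data before deciding which toast to show.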
(Header image from Tekniska Museet.)