Simplifying Local Prefect 2 Testing

One thing I don't like about a lot of modern orchestration systems is that they rarely come with easy instructions for setting up local testing. Prefect 2 is no exception. One thing Prefect does have going for it is a nice Docker Hub image. Though that image is not very well documented, it at least gives us a starting point for a local Prefect Docker setup.

Recently, I was working on a FastAPI app that uses prefect-client to talk to the Prefect Cloud API, and I needed a way to verify locally that my app could kick off a Flow Run, wait for the Flow to finish, and then retrieve the results from the Prefect API. The best way to test this would be to have a local Prefect 2 server running with a dummy flow deployed to it that I could test against.
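
For concreteness, here is a rough sketch of the client side of that interaction, using the flow and deployment names that show up later in this post. run_deployment and State.result are the prefect-client pieces I'm referring to, but treat the helper name and the exact deployment name as illustrative:

from prefect.deployments import run_deployment

async def run_flow_and_fetch_result():
    # With no timeout set, run_deployment waits until the flow run
    # reaches a terminal state before returning.
    flow_run = await run_deployment(
        name="return-a-df/return a df",  # "<flow name>/<deployment name>"
    )
    # fetch=True pulls the persisted result back from storage, which is
    # why the app needs access to the same .prefect folder as the worker.
    return await flow_run.state.result(fetch=True)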

Usually when there is a lack of documentation online for a solution, everyone comes up with their own bespoke way of doing it. That was true here as well. I have some colleagues who also needed a local Prefect setup, and they created a nice README on how to set up a local Prefect server running in a Python venv. Their tutorial had me spinning up a local Kubernetes cluster inside Docker Desktop, and Prefect would run the flows on it, just like it does in production. It was a really cool local setup, and I'm sure it's worth it when you need to test compute-heavy, complicated workflows.

However, that was not my situation, and the Kubernetes approach was highly complicated. It also had the drawback of not being containerized: you had to run a bunch of different commands, and if something got messed up, you had to run more commands to try to reset it. This didn't jibe with how I like to work. When I write an app, I want my fellow developers to be able to just pull my repo, docker compose up, and boom, working local app. So I set out to make my own bespoke way of running Prefect 2 locally, with the intention of making it as simple as possible.

The approach I describe below is available on my GitLab page here.

Docker Compose

I love Docker Compose. It's the ultimate way to set it and forget it. So the first thing I did when looking to solve this problem was Google whether there were any existing Prefect 2 Docker Compose setups online. Indeed there are, and I found this one, by rpeden, to be particularly helpful. You'll find that the components I use in my docker-compose.yml are almost identical to that example, but there are some notable differences that I should explain.

The FastAPI App

web:
    build:
      context: .
      target: web  # the "web" stage of the multi-stage Dockerfile
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
    ports:
      - 8001:8000
    env_file:
      - .env
    volumes:
      - ./:/app
      - ./.prefect:/root/.prefect  # shared with the Prefect containers below

This looks pretty much like your typical FastAPI Docker Compose service. We use the Dockerfile to install our Python dependencies, and uvicorn to host the server. The only thing to note here is that we mount the .prefect folder into this container; you will see that we mount this same folder into all of the Prefect containers as well. Since we want to keep this setup as simple as possible, I opted to just use the local disk for all storage. If you have bucket storage readily available, that would work too, but this tutorial assumes that you are working with nothing other than your local system. The other option would be to run an S3-compatible container like MinIO, as you'll see in the rpeden example, but again, that adds complexity to our setup that we don't really need.

The Prefect 2 Server

prefect-server:
    image: prefecthq/prefect:2-python3.11
    ports:
      - 4200:4200  # the Prefect dashboard and API
    command: prefect server start
    volumes:
      - ./.prefect:/root/.prefect  # the same shared folder again
    environment:
      - PREFECT_UI_URL=http://127.0.0.1:4200/api
      - PREFECT_API_URL=http://127.0.0.1:4200/api
      - PREFECT_SERVER_API_HOST=0.0.0.0  # listen on all interfaces inside the container

The prefecthq Docker Hub organization has a bunch of images, but most of them have gone years without updates. There is now only one image that can be used as a server, a worker, and even a client, and that's prefecthq/prefect. Be sure to choose the tag wisely, though, as latest may not point to what you're looking for. Here we are using 2-python3.11, which will pick up any 2.* Prefect version but pin the Python version, since everything else in our project is on Python 3.11.

Exposing port 4200 lets us access the Prefect 2 dashboard while the app is running. Again, we mount the same .prefect volume, and we set a few extra environment variables for the server. This configuration is pretty much a copy of the rpeden example.

The Prefect 2 Worker

prefect-worker:
    build:
      context: .
      target: worker  # the "worker" stage installs the flow's Python dependencies
    command: prefect worker start --pool test-process-pool
    depends_on:
      - prefect-server
    environment:
      - PREFECT_API_URL=http://prefect-server:4200/api  # reach the server by service name
    volumes:
      - ./flows:/root/flows  # the flow code itself
      - ./.prefect:/root/.prefect  # flow output lands in .prefect/storage
    restart: always  # keep retrying until prefect-server is actually up

Same volume mount here, but we also mount the flow code, because we are using the local disk as the storage for the flow code as well as for the output. The flow code lives in the flows directory, and the flow output gets dumped into .prefect/storage. That's why .prefect needs to be available to the FastAPI app: so the output can be read by prefect-client. The reason we have a build step here is just to install any Python dependencies that are needed inside the flow.
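
I won't reproduce the whole repo here, but for reference, the dummy flow can be as small as the sketch below. The one detail that matters for this setup is result persistence, since the FastAPI app fetches the output through prefect-client. The exact contents of my test_flows.py may differ:

# flows/test_flows.py (sketch)
import pandas as pd
from prefect import flow

@flow(persist_result=True)  # write the return value to .prefect/storage
def return_a_df() -> pd.DataFrame:
    # A trivial payload, just enough to prove that results round-trip
    # from the worker, through shared local storage, back to the client.
    return pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})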

Two Commands and That's It

I wanted to be able to just docker compose up, and maybe that is still possible, but for now two commands will suffice. Just like provisioning a database, you don't really want to redeploy your flow every time you docker compose up, so I wrote a short bash script that spins up the worker pool and deploys the flow.

# create a process-type work pool; the worker runs flows as local subprocesses
echo creating worker pool
prefect work-pool create -t process test-process-pool

# deploy the dummy flow from its entrypoint (file path + function name)
echo deploying test flow
prefect --no-prompt deploy \
  --name "return a df" \
  --pool test-process-pool \
  /root/flows/test_flows.py:return_a_df

The prefect deploy command is pretty nice. It will infer most settings, so you don't have to explicitly define too much, especially for a basic setup like this.

So then, all one needs to do to run this whole thing is:

docker compose run prefect-worker bash /root/flows/deploy.sh
docker compose up

That's just how I like it. Nice and simple. No Kubernetes, no giant YAML files, and no "it works on my machine" issues.

Conclusion

I'm not saying that this is the right setup for everybody. If you are testing a large system that spins up a bunch of nested flows and tasks on k8s, then you probably want to replicate that locally. But if you are testing something that just talks to Prefect 2, then I would argue that this is all you really need. Just add those two Prefect containers, make sure everything shares that .prefect directory, and you're pretty much good to go.

Feel free to reach out if you found this helpful, have questions, or if you want to share an even better setup :)