Running Dagster+ agents on Kubernetes
This page provides instructions for running the Dagster+ agent on a Kubernetes cluster.
Installation
Prerequisites
You'll need a Kubernetes cluster. This can be a self-hosted Kubernetes cluster or a managed offering like Amazon EKS, Azure AKS, or Google GKE.
You'll also need access to a container registry to which you can push images and from which pods in the Kubernetes cluster can pull images. This can be a self-hosted registry or a managed offering like Amazon ECR, Azure ACR, or Google GCR.
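Before installing, it can help to confirm your local tooling can reach the cluster. This is an optional sanity check and assumes `kubectl` and Helm 3 are already installed and your kubeconfig points at the target cluster:

```shell
# Confirm kubectl can reach the cluster's API server
kubectl cluster-info

# Confirm Helm is installed (the agent chart requires Helm 3)
helm version
```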
We recommend installing the Dagster+ agent using Helm.
Step 1: create a Kubernetes namespace
kubectl create namespace dagster-cloud
Step 2: Create an agent token secret
Generate an agent token and set it as a Kubernetes secret:
kubectl --namespace dagster-cloud create secret generic dagster-cloud-agent-token --from-literal=DAGSTER_CLOUD_AGENT_TOKEN=<token>
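To confirm the secret was created before moving on (an optional check; the secret name matches the one used in the command above):

```shell
# The secret should appear with one data entry (the agent token)
kubectl --namespace dagster-cloud get secret dagster-cloud-agent-token
```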
Step 3: Add the Dagster+ agent Helm chart repository
helm repo add dagster-cloud https://dagster-io.github.io/helm-user-cloud
helm repo update
Step 4: Install the Dagster+ agent Helm chart
helm --namespace dagster-cloud upgrade --install agent dagster-cloud/dagster-cloud-agent
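After installing the chart, you can verify that the agent pod started successfully:

```shell
# The agent pod should reach Running status within a minute or two
kubectl --namespace dagster-cloud get pods
```

If the pod stays in a pending or error state, check its events with `kubectl describe pod`.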
Upgrading
You can use Helm to perform rolling upgrades of your Dagster+ agent:
# values.yaml
dagsterCloudAgent:
  image:
    tag: latest
helm --namespace dagster-cloud upgrade agent \
dagster-cloud/dagster-cloud-agent \
--values ./values.yaml
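To watch the rolling upgrade complete, you can check the rollout status of the agent's deployment. The deployment name below is a placeholder; use the first command to find the actual name the chart created:

```shell
# Find the deployment the chart manages
kubectl --namespace dagster-cloud get deployments

# Wait for the rollout to finish (substitute the real deployment name)
kubectl --namespace dagster-cloud rollout status deployment/<deployment-name>
```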
Common configurations
You can customize your Dagster+ agent using Helm values. Some common configurations include:
Configuring your agents to serve branch deployments
Branch deployments are lightweight staging environments created for each code change. To configure your Dagster+ agent to manage them:
# values.yaml
dagsterCloud:
  branchDeployments: true
helm --namespace dagster-cloud upgrade agent \
dagster-cloud/dagster-cloud-agent \
--values ./values.yaml
High availability configurations
You can configure your Dagster+ agent to run with multiple replicas. Work will be load balanced across all replicas.
# values.yaml
dagsterCloudAgent:
  replicas: 2
helm --namespace dagster-cloud upgrade agent \
dagster-cloud/dagster-cloud-agent \
--values ./values.yaml
Work load balanced across agents isn't sticky; there's no guarantee that the agent that launched a run will be the one that receives the instruction to terminate it. This is fine if both replicas run on the same Kubernetes cluster, because either agent can terminate the run. But if your agents are physically isolated (for example, they run on two different Kubernetes clusters), you should configure:
# values.yaml
isolatedAgents:
  enabled: true
helm --namespace dagster-cloud upgrade agent \
dagster-cloud/dagster-cloud-agent \
--values ./values.yaml
Troubleshooting tips
You can see basic health information about your agent in the Dagster+ UI.
View logs
kubectl --namespace dagster-cloud logs -l deployment=agent
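If the agent pod is failing to start, describing the pod and reviewing recent cluster events often surfaces the cause (image pull errors, a missing agent token secret, insufficient resources, and so on):

```shell
# Show pod details and recent events for the agent pod
kubectl --namespace dagster-cloud describe pod -l deployment=agent

# List recent events in the namespace, oldest first
kubectl --namespace dagster-cloud get events --sort-by=.metadata.creationTimestamp
```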