# deployment-kubernetes
q
Hi everyone, I'm testing the following deployment strategy:
• local: 4 services deployed with docker-compose, using the DockerRunLauncher. This lets me quickly test Dagster code syntax and the parsing of resource definitions.
• dev: 4 services deployed via Helm chart on k8s, using the K8sRunLauncher. This lets me configure, at both the job and the op level, the image to use and the resources to allocate.
In order to have a nice abstraction between local and remote deployments (i.e. docker-compose vs Kubernetes), I define the image to use through the docker_executor and k8s_job_executor at job level, via definitions such as:
from dagster import AssetSelection, define_asset_job

# get_executor_config, github_assets, and executor (docker_executor locally,
# k8s_job_executor on k8s) are defined elsewhere in the project.
job_config = get_executor_config(
    image_name="dagster-user-code",
    image_tag="0.5",
    resources=dict(
        requests=dict(
            cpu="200m",
            memory="2000Mi",
        ),
        limits=dict(
            cpu="500m",
            memory="2500Mi",
        ),
    ),
)

extract_github = define_asset_job(
    name="extract_github",
    description="Extract data from Github API for demonstration purposes.",
    selection=AssetSelection.assets(*github_assets),
    executor_def=executor.configured(job_config),
)
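A minimal sketch of what a helper like get_executor_config could look like follows (this is an assumption, not the actual code: the DAGSTER_DEPLOYMENT environment variable is made up for illustration, and the config keys have to match the schema of whichever executor is passed to .configured(); job_image/resources follow the k8s_job_executor side, and the local docker branch is left almost empty):
import os


def get_executor_config(image_name: str, image_tag: str, resources: dict) -> dict:
    """Return executor config selecting the image (and k8s resources) per environment."""
    image = f"{image_name}:{image_tag}"
    # DAGSTER_DEPLOYMENT is an illustrative env var, not something Dagster defines.
    if os.getenv("DAGSTER_DEPLOYMENT", "local") == "dev":
        # Keys assumed from the k8s_job_executor config schema.
        return {"job_image": image, "resources": resources}
    # Local docker-compose deployment: the docker_executor config schema is
    # smaller (network, registry, env_vars, container_kwargs, ...), so the
    # image may instead have to come from the code location / DockerRunLauncher.
    return {}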
For Kubernetes executions, this is an alternative to the tag method specified in the per-job/per-op k8s configuration. The rationale is that, by using executor configuration rather than dagster-k8s/config job tags, we get a single interface for defining job images in both the local docker-compose deployment and the remote k8s one.
However, I am currently facing a dependency conflict between meltano (any version requires jsonschema>=4.0) and the latest dbt-semantic-interfaces (which requires jsonschema<4.0). I wrapped the generation of both the meltano and the dbt assets in factory functions that use, respectively, the multi_asset and dbt_assets decorators to generate bundles of SDAs with minimal code. To get around the dependency conflict, I thought I could build a dedicated image for my meltano projects and another for my dbt projects, and specify these images from within the factory functions. Although this could work with dagster-k8s/config op tags on Kubernetes (see the sketch below), it seems it wouldn't work in the local Docker deployment because we can't specify images at the op level. It does seem, however, that the _launch_container_with_command method takes in a docker_image. Therefore:
1. Can you think of any way to specify the image to be used for multi_asset and/or dbt_assets?
2. Is it even smart to use a local Docker deployment for short test feedback loops, rather than a local k8s deployment via kind or K3D?
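For reference, the Kubernetes-only version of that idea (the sketch mentioned above) could look roughly like this; the factory name, the group/image parameters, and the registry path are all illustrative and the op body is elided, the relevant part being the dagster-k8s/config op tag that overrides container_config.image for just that step:
from dagster import AssetOut, Output, multi_asset


def build_meltano_assets(group: str, image: str):
    """Illustrative factory: a meltano multi_asset whose k8s step runs in a dedicated image."""

    @multi_asset(
        name=f"meltano_{group}",
        outs={f"{group}_raw": AssetOut()},
        # Per-op image override, picked up on the Kubernetes side only.
        op_tags={"dagster-k8s/config": {"container_config": {"image": image}}},
    )
    def _meltano_assets():
        # ... run `meltano run ...` for this group here, then hand back outputs ...
        yield Output(None, output_name=f"{group}_raw")

    return _meltano_assets


meltano_github_assets = build_meltano_assets("github", "my-registry/meltano-image:0.1")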
Having full control over the image configuration for jobs and ops/assets, both on Kubernetes and on Docker, would also allow me to define custom lightweight images (pex-based, built via the Pants build system) per pipeline or per op type (meltano, dbt, datahub pull-based, ...).
p
Yeah, this is interesting. You'd probably need to implement your own version of the docker executor that lets you plug in a custom implementation of the DockerStepHandler, varying the image used based on the step context. I think using fast feedback loops locally is good engineering practice, as long as you keep in mind the operational differences between local/prod.
q
Would you mind pointing me to the permalink of where you see the modifications happening in the DockerStepHandler? Looking through the docker executor library, implementing a tag-based op configuration system similar to what's been done for the k8s_job_executor with the get_user_defined_k8s_config() that is used in every launch_step call would basically mean adding a UserDefinedDagsterDockerConfig class to parse op tags, and then calling it here to merge that tag config into the container context, no?
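To make that idea concrete: the class below is purely hypothetical (it does not exist in dagster-docker), and the dagster-docker/config tag name and its fields are made up by analogy with dagster-k8s/config. A customized DockerStepHandler could call something like this from launch_step and merge the result into its container context.
import json
from typing import Any, Mapping, Optional


class UserDefinedDagsterDockerConfig:
    """Hypothetical per-op Docker overrides parsed from a 'dagster-docker/config' op tag."""

    def __init__(
        self,
        image: Optional[str] = None,
        container_kwargs: Optional[Mapping[str, Any]] = None,
    ):
        self.image = image
        self.container_kwargs = dict(container_kwargs or {})

    @classmethod
    def from_tags(cls, tags: Mapping[str, Any]) -> "UserDefinedDagsterDockerConfig":
        # Op tags may arrive as JSON strings or as plain dicts; handle both.
        raw = tags.get("dagster-docker/config")
        if not raw:
            return cls()
        cfg = json.loads(raw) if isinstance(raw, str) else raw
        return cls(image=cfg.get("image"), container_kwargs=cfg.get("container_kwargs"))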
p
Hi Quentin (apologies for the delayed response)… I was thinking that you could override the behavior here: https://github.com/dagster-io/dagster/blob/7474560962e0486e7e320fad95e0260f5a568c2[…]ules/libraries/dagster-docker/dagster_docker/docker_executor.py
q
Thank you @prha, I'll put that on the back burner for now, but I'll be sure to let you know if I make any progress on tweaking that docker executor.