# deployment-kubernetes
q
Hi everyone, I'm testing the following deployment strategy:
• local: 4 services deployed with docker-compose, using the DockerRunLauncher. This lets me quickly test Dagster code syntax and the parsing of resource definitions.
• dev: 4 services deployed via Helm chart on k8s, using the K8sRunLauncher. This lets me configure, at both the job and the op level, the image to use and the resources to allocate.
In order to have a nice abstraction between local and remote deployments (i.e. docker-compose vs Kubernetes), I define the image to use through the docker_executor and k8s_job_executor at job level, via definitions such as:
from dagster import AssetSelection, define_asset_job

# get_executor_config, github_assets, and executor (docker_executor locally,
# k8s_job_executor on k8s) are defined elsewhere in the project.
job_config = get_executor_config(
    image_name="dagster-user-code",
    image_tag="0.5",
    resources=dict(
        requests=dict(
            cpu="200m",
            memory="2000Mi",
        ),
        limits=dict(
            cpu="500m",
            memory="2500Mi",
        ),
    ),
)

extract_github = define_asset_job(
    name="extract_github",
    description="Extract data from Github API for demonstration purposes.",
    selection=AssetSelection.assets(*github_assets),
    executor_def=executor.configured(job_config),
)
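A minimal sketch of what a helper like get_executor_config could look like follows (this is an assumption, not the actual code: the DAGSTER_DEPLOYMENT environment variable is made up for illustration, and the config keys have to match the schema of whichever executor is passed to .configured(); job_image/resources follow the k8s_job_executor side, and the local docker branch is left almost empty):
import os


def get_executor_config(image_name: str, image_tag: str, resources: dict) -> dict:
    """Return executor config selecting the image (and k8s resources) per environment."""
    image = f"{image_name}:{image_tag}"
    # DAGSTER_DEPLOYMENT is an illustrative env var, not something Dagster defines.
    if os.getenv("DAGSTER_DEPLOYMENT", "local") == "dev":
        # Keys assumed from the k8s_job_executor config schema.
        return {"job_image": image, "resources": resources}
    # Local docker-compose deployment: the docker_executor config schema is
    # smaller (network, registry, env_vars, container_kwargs, ...), so the
    # image may instead have to come from the code location / DockerRunLauncher.
    return {}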
For Kubernetes executions, this is an alternative to the tag method specified in the per-job/per-op k8s configuration. The rationale is that, by using executor configuration rather than dagster-k8s/config job tags, we get a single interface for defining job images in both the local docker-compose deployment and the remote k8s one.
However, I am currently facing a dependency conflict between meltano (any version requires jsonschema>=4.0) and the latest dbt-semantic-interfaces (which requires jsonschema<4.0). I wrapped the generation of both the meltano and the dbt assets in factory functions that use, respectively, the multi_asset and dbt_assets decorators to generate bundles of SDAs with minimal code. To get around the dependency conflict, I thought I could build a dedicated image for my meltano projects and another for my dbt projects, and specify these images from within the factory functions. Although this could work with dagster-k8s/config op tags on Kubernetes (see the sketch below), it seems it wouldn't work in the local Docker deployment because we can't specify images at the op level. It does seem, however, that the _launch_container_with_command method takes in a docker_image. Therefore:
1. Can you think of any way to specify the image to be used for multi_asset and/or dbt_assets?
2. Is it even smart to use a local Docker deployment for short test feedback loops, rather than a local k8s deployment via kind or K3D?
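For reference, the Kubernetes-only version of that idea (the sketch mentioned above) could look roughly like this; the factory name, the group/image parameters, and the registry path are all illustrative and the op body is elided, the relevant part being the dagster-k8s/config op tag that overrides container_config.image for just that step:
from dagster import AssetOut, Output, multi_asset


def build_meltano_assets(group: str, image: str):
    """Illustrative factory: a meltano multi_asset whose k8s step runs in a dedicated image."""

    @multi_asset(
        name=f"meltano_{group}",
        outs={f"{group}_raw": AssetOut()},
        # Per-op image override, picked up on the Kubernetes side only.
        op_tags={"dagster-k8s/config": {"container_config": {"image": image}}},
    )
    def _meltano_assets():
        # ... run `meltano run ...` for this group here, then hand back outputs ...
        yield Output(None, output_name=f"{group}_raw")

    return _meltano_assets


meltano_github_assets = build_meltano_assets("github", "my-registry/meltano-image:0.1")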
Having full control over the image configuration for jobs and ops/assets, both on Kubernetes and on Docker, would also allow me to define custom lightweight images (pex-based, built via the Pants build system) per pipeline or per op type (meltano, dbt, datahub pull-based, ...).
p
Yeah, this is interesting. You'd probably need to implement your own version of the docker executor that lets you plug in a custom implementation of the DockerStepHandler, varying the image used based on the step context. I think using fast feedback loops locally is good engineering practice, as long as you keep in mind the operational differences between local/prod.
q
Would you mind pointing me to the permalink of where you see the modifications happening in the DockerStepHandler? Looking through the docker executor library, implementing a tag-based op configuration system similar to what's been done for the k8s_job_executor with the get_user_defined_k8s_config() that is used in every launch_step call would basically mean adding a UserDefinedDagsterDockerConfig class to parse op tags, and then calling it here to merge that tag config into the container context, no?
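To make that idea concrete: the class below is purely hypothetical (it does not exist in dagster-docker), and the dagster-docker/config tag name and its fields are made up by analogy with dagster-k8s/config. A customized DockerStepHandler could call something like this from launch_step and merge the result into its container context.
import json
from typing import Any, Mapping, Optional


class UserDefinedDagsterDockerConfig:
    """Hypothetical per-op Docker overrides parsed from a 'dagster-docker/config' op tag."""

    def __init__(
        self,
        image: Optional[str] = None,
        container_kwargs: Optional[Mapping[str, Any]] = None,
    ):
        self.image = image
        self.container_kwargs = dict(container_kwargs or {})

    @classmethod
    def from_tags(cls, tags: Mapping[str, Any]) -> "UserDefinedDagsterDockerConfig":
        # Op tags may arrive as JSON strings or as plain dicts; handle both.
        raw = tags.get("dagster-docker/config")
        if not raw:
            return cls()
        cfg = json.loads(raw) if isinstance(raw, str) else raw
        return cls(image=cfg.get("image"), container_kwargs=cfg.get("container_kwargs"))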
p
Hi Quentin (apologies for the delayed response)… I was thinking that you could override the behavior here: https://github.com/dagster-io/dagster/blob/7474560962e0486e7e320fad95e0260f5a568c2[…]ules/libraries/dagster-docker/dagster_docker/docker_executor.py
q
Thank you @prha, I'll put that on the back burner for now, but I'll be sure to let you know if I make any progress on tweaking that docker executor.