
Oren Lederman

04/28/2023, 4:34 PM
Good timing 🙂 We are starting to use Dagster for our ML pipelines, and one of the questions that came up was whether we should implement them as Jobs or Assets. In many cases we run various experiments and control the run parameters manually. I can easily see how to use Jobs for that, but I can't see how to use assets, since the experiments aren't pre-defined partitions or anything like that. Any thoughts on this?

Félix Tremblay

04/28/2023, 5:38 PM
Hello @Oren Lederman, If I understand correctly, you would like to have one partition per experiment (where each experiment is defined/configured with multiple parameter values)? If that's the case, you can check out this GitHub Discussion to find some workarounds. Feel free to comment if you would like Dagster to provide better support for this use case.

Oren Lederman

04/28/2023, 6:17 PM
Thanks Felix! I'll take a look. And I should probably read more about dynamic partitions. I don't think these existed when I first read about assets

Daniel Gafni

04/29/2023, 9:32 AM
Both assets and jobs can take runtime configs
👍 1
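A minimal sketch of what runtime config on an asset can look like, assuming Dagster's Pythonic config API; the asset name, config fields, and values below are illustrative, not a prescribed setup:

```python
from dagster import Config, asset, materialize

# Illustrative experiment parameters -- field names are hypothetical.
class TrainConfig(Config):
    learning_rate: float = 0.01
    max_depth: int = 6

@asset
def trained_model(config: TrainConfig) -> dict:
    # Parameters arrive from whatever run config was supplied at launch time
    # (the Dagit launchpad, or the Python call below).
    return {"learning_rate": config.learning_rate, "max_depth": config.max_depth}

if __name__ == "__main__":
    # Launch a one-off materialization with overridden parameters.
    materialize(
        [trained_model],
        run_config={"ops": {"trained_model": {"config": {"learning_rate": 0.1, "max_depth": 8}}}},
    )
```

The same values can be typed into the launchpad in Dagit when launching the run manually, which matches the "control the run parameters manually" workflow described above.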

Charles Lariviere

05/02/2023, 9:09 PM
We create an asset representing the model, and then create a job that includes that asset (and optionally more steps, such as preprocessing, evaluation, etc.)
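A sketch of that shape, with hypothetical asset names standing in for the preprocessing, model, and evaluation steps:

```python
from dagster import Definitions, asset, define_asset_job

@asset
def training_data():
    """Hypothetical preprocessing step."""
    ...

@asset
def model(training_data):
    """The asset representing the trained model."""
    ...

@asset
def evaluation_report(model):
    """Hypothetical evaluation step downstream of the model."""
    ...

# A job that materializes the model together with the surrounding steps.
train_and_evaluate = define_asset_job(
    name="train_and_evaluate",
    selection=["training_data", "model", "evaluation_report"],
)

defs = Definitions(
    assets=[training_data, model, evaluation_report],
    jobs=[train_and_evaluate],
)
```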

Oren Lederman

05/02/2023, 9:51 PM
@Charles Lariviere is it a simple asset, or partitioned? For some of the models we try different settings (hyperparameters, or the data we choose for the model). If I understand the idea of dynamic partitions correctly, it means I can store each experiment as a partition. There is some overlap here with other, more MLOps-focused tools we are reading about/trying/evaluating -- for example, Weights & Biases, ClearML, etc. In those tools, everything revolves around managing experiments and datasets.
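For reference, a rough sketch of the dynamic-partitions idea (one partition per experiment, registered at runtime); the partition set name and asset are hypothetical:

```python
from dagster import DynamicPartitionsDefinition, asset

# Partition keys are added at runtime (e.g. from a sensor or a small script)
# rather than declared up front -- one key per experiment.
experiments = DynamicPartitionsDefinition(name="experiments")

@asset(partitions_def=experiments)
def experiment_result(context):
    experiment_id = context.partition_key
    # ... train/evaluate with the parameters associated with this experiment ...
    return {"experiment": experiment_id}

# Registering a new experiment partition against a DagsterInstance, e.g. inside
# a sensor or an ad-hoc script:
#   instance.add_dynamic_partitions("experiments", ["exp-2023-05-24-lr-0.1"])
```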

Charles Lariviere

05/02/2023, 10:59 PM
Simple asset! The way we handle experimentation is through branching in the version-control system -- if someone wants to experiment with hyperparameters, the dataset, preprocessing, and so on, that goes through a new commit which gets deployed to an isolated environment for that branch. The model gets logged to the experiment tracking and model registry tools with the commit hash as metadata, which gives us direct visibility into the state of the code, dataset, and parameters that produced the artifact. Once an experiment is reviewed/approved, it gets merged to our main branch.
🤔 1
👍 1
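One way to surface the commit hash on the materialization itself -- a sketch, not necessarily how Charles's team does it; the GIT_COMMIT environment variable is hypothetical and assumed to be injected by the branch deployment:

```python
import os

from dagster import Output, asset

@asset
def model():
    trained = {"weights": [0.1, 0.2]}  # placeholder for a real training step

    # Tag the materialization with the commit that produced it, so the asset's
    # materialization history shows which code/data/parameters the artifact came
    # from. GIT_COMMIT is assumed to be set at deploy time for the branch.
    commit = os.environ.get("GIT_COMMIT", "unknown")
    return Output(trained, metadata={"git_commit": commit})
```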

martin o leary

05/09/2023, 8:27 AM
Hi @Charles Lariviere - that sounds like a really elegant solution! Could you share a little about the architecture of your deployment that enables this type of approach? I love that! It is super clean to have experiments associated with a commit for repeatability!

Charles Lariviere

05/24/2023, 3:53 PM
Hey @martin o leary 👋 Glad to hear! This approach works well for situations where each new model iteration involves having a developer in the loop. Though, you could also have automated retraining on your main branch and include something like a timestamp in addition to the commit hash to identify models. In terms of deployment, we use Kubernetes. Within a branch, we can trigger a deployment that starts up its own instance of Dagit, Daemon, and our Dagster project on our Kubernetes cluster. Developers can then connect to that specific Dagit instance and run their experimentation (which still gets logged to our main experiment tracking and model registry tools). Once a branch is merged, that deployment is removed from our cluster.
🙌 1
🙏 2

Oren Lederman

05/24/2023, 4:22 PM
Thanks @Charles Lariviere, this is very helpful. Do you spin up a full-blown Dagster deployment for each branch (with Helm charts, code location deployments, etc.) or run more of a self-contained instance, similar to a local run?
🙌 1

Charles Lariviere

05/24/2023, 4:32 PM
It's a full-blown deployment, with the addition of a Postgres instance (unlike our main deployment, which uses a database deployed outside of k8s). Those pods are pretty lightweight, so it doesn't make a big difference as far as I can tell.
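As a rough sketch (names and images are illustrative, and the exact keys depend on the version of the Dagster Helm chart), a per-branch values override along those lines might look like:

```yaml
# Branch deployment: chart-managed Postgres plus a single user code deployment.
postgresql:
  enabled: true   # in-cluster Postgres just for this branch

dagster-user-deployments:
  enabled: true
  deployments:
    - name: ml-pipelines-feature-branch      # illustrative
      image:
        repository: my-registry/ml-pipelines   # illustrative
        tag: feature-branch
      dagsterApiGrpcArgs:
        - "-m"
        - "ml_pipelines"
      port: 3030
```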

Oren Lederman

05/24/2023, 4:38 PM
I'm less worried about the resources, more about the complexity 🙂 I was thinking it'd be easier to pack the code and everything into a container as-is and just use `dagster dev` (and maybe set it to execute jobs as k8s jobs). Are you using anything fancy for managing these deployments (ArgoCD?), or just some scripts for setup and teardown?
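For reference, a sketch of that lighter-weight setup -- a container running `dagster dev` with an instance config (dagster.yaml) that launches each run as its own Kubernetes Job. All values are illustrative, and the required fields depend on the cluster setup:

```yaml
run_launcher:
  module: dagster_k8s
  class: K8sRunLauncher
  config:
    job_namespace: dagster-branch          # illustrative namespace
    service_account_name: dagster          # illustrative service account
    job_image: my-registry/ml-pipelines:feature-branch   # image used for run pods
    image_pull_policy: Always
```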