Mycchaka Kleinbort
03/09/2023, 4:17 PM
from .repository import my_dagster_project
and repository.py
was:
from dagster import load_assets_from_package_module, repository
from my_dagster_project import assets
@repository
def my_dagster_project():
    return [load_assets_from_package_module(assets)]
I changed that to
from .repository import my_dagster_project
and
from dagster import Definitions, load_assets_from_package_module
from my_dagster_project import assets
all_assets = load_assets_from_package_module(assets)
defs = Definitions(
    assets=all_assets,
    schedules=[assets.dg_test_schedule.update_job_schedule],
)
And now things work
However, this took me far too long 😂 - it should have been flagged as a gotcha somewhere IMO
Daniel Gafni
03/09/2023, 7:04 PM
Brian Pohl
03/09/2023, 8:52 PM
dagit_base_url to generate a link to the job run (which is great!), but this link only works with whatever the default deployment is. I use Dagster Hybrid, so the URL mycompany.dagster.cloud/instance/runs/abcd... redirects to our one non-branch deployment, called prod, so the URL becomes mycompany.dagster.cloud/prod/runs/abcd... There doesn't seem to be a way to link directly to my branch deployment.
josh
03/09/2023, 9:37 PM
Stephen Bailey
03/12/2023, 11:38 AM
None default but that still results in a requirement to specify a string.
from dagster import Config, asset, RunConfig, materialize

class AssetConfig(Config):
    name: str = "stephen"
    optional_adverb: str = None

@asset
def foo(config: AssetConfig) -> str:
    msg = f"{config.name} is {config.optional_adverb} cool."
    print(msg)
    return msg

# these should be valid configs
materialize([foo], run_config=RunConfig({"foo": AssetConfig(optional_adverb="really")}))
materialize([foo], run_config=RunConfig({"foo": AssetConfig()}))
error
Error 1: Value at path root:ops:foo:config:optional_adverb must not be None. Expected "(String | { env: String })"
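One possible reading of the error above: the field is annotated as a plain str, so a schema derived from the annotation has no room for None even though the default is None. The sketch below uses only the standard library to show that difference between str and Optional[str]. The class names and the helpers allows_none and nullable_fields are hypothetical, and this is not Dagster's actual schema derivation; whether annotating the field as Optional[str] resolves the Dagster error is an assumption to verify.

```python
from typing import Optional, Union, get_args, get_origin, get_type_hints


class StrictConfig:
    # Mirrors the config in the message: the annotation says "always a str",
    # even though the default value is None.
    name: str = "stephen"
    optional_adverb: str = None


class NullableConfig:
    # Same fields, but the annotation itself admits None.
    name: str = "stephen"
    optional_adverb: Optional[str] = None


def allows_none(annotation) -> bool:
    """Return True if the annotation is a Union that includes NoneType."""
    return get_origin(annotation) is Union and type(None) in get_args(annotation)


def nullable_fields(cls) -> dict:
    """Map each annotated field name to whether its annotation admits None."""
    return {name: allows_none(tp) for name, tp in get_type_hints(cls).items()}


print(nullable_fields(StrictConfig))
print(nullable_fields(NullableConfig))
```

A schema built only from annotations would treat optional_adverb in StrictConfig as a required string, which would be consistent with the "must not be None" message.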
Chris Histe
03/13/2023, 12:50 PM
Mark Fickett
03/13/2023, 2:06 PM
Matt Clarke
03/13/2023, 2:50 PM
❯ poetry add --group dev black
Using version ^23.1.0 for black
Updating dependencies
Resolving dependencies... (0.2s)
Because no versions of black match >23.1.0,<24.0.0
and black (23.1.0) depends on packaging (>=22.0), black (>=23.1.0,<24.0.0) requires packaging (>=22.0).
And because dagster-cloud (1.2.1) depends on dagster-cloud-cli (1.2.1) which depends on packaging (>=20.9,<22), black (>=23.1.0,<24.0.0) is incompatible with dagster-cloud (1.2.1).
So, because ***** depends on both dagster-cloud (1.2.1) and black (^23.1.0), version solving failed.
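The resolver output above comes down to two requirements on packaging that cannot both hold: >=22.0 (from black 23.1.0) and >=20.9,<22 (from dagster-cloud-cli 1.2.1). A minimal sketch of that check, assuming plain dotted numeric versions; parse and overlaps are hypothetical helpers, and this is not Poetry's solver or full PEP 440 handling.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]
# A range is (inclusive lower bound, exclusive upper bound or None for "no upper bound").
Range = Tuple[Version, Optional[Version]]


def parse(text: str, width: int = 3) -> Version:
    """Parse '22.0' into (22, 0, 0), padding with zeros so tuples compare cleanly."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def overlaps(a: Range, b: Range) -> bool:
    """Return True if some version satisfies both ranges."""
    lower = max(a[0], b[0])
    uppers = [r[1] for r in (a, b) if r[1] is not None]
    return not uppers or lower < min(uppers)


black_needs = (parse("22.0"), None)        # packaging >=22.0
cli_needs = (parse("20.9"), parse("22"))   # packaging >=20.9,<22

print(overlaps(black_needs, cli_needs))
```

The two ranges meet exactly at 22, which the upper bound excludes, so no single packaging version satisfies both.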
I can get around it by pinning to the final 22.* release of black, but am wondering if there is any reason that the packaging module in dagster-cloud-cli needs to be pinned to <22?
Leo Qin
03/13/2023, 4:25 PM
Matt Clarke
03/14/2023, 1:52 AM
WORKDIR /opt/dagster/app
COPY . /opt/dagster/app
which copies the contents of the repo into the Docker image. If this is used alongside the GitHub Actions for building/deploying, it results in the actions-repo path being included in the image (verified using dive).
In my case, adding that folder to .dockerignore knocked about 90MB off the total image size.
Matt Clarke
03/15/2023, 3:53 PM
cloud-branch-deployments-action is deprecated, but the guide at https://docs.dagster.io/guides/dagster/branch_deployments refers to it when setting up the GitHub Actions for cloning of the production db,
Jordan Wolinsky
03/15/2023, 9:20 PM
Thomas
03/16/2023, 2:11 PM
Mycchaka Kleinbort
03/16/2023, 2:13 PM
Vytautas Mickus
03/16/2023, 3:02 PM
Guillaume Onfroy
03/17/2023, 2:31 PM
MultithreadedExecutor which would allow executing steps in parallel within the same process. Currently, when running on Kubernetes using the MultiprocessExecutor, there's a massive cold start delay for each subprocess being started because, from what I understand, it has to reload the entire project and assets, which can lead to extremely long jobs even when the tasks are very lightweight.
Daniel Gafni
03/18/2023, 8:33 AM
AssetIn behave like pydantic’s Field?
It would be possible to make assets more readable:
@asset
def my_asset(upstream: pd.DataFrame = AssetIn(metadata={"columns": ["A", "B"]})):
    ...
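A rough sketch of how a decorator could pick up such defaults, in the way pydantic reads Field defaults. InputSpec and collect_input_specs are hypothetical stand-ins written for this illustration; they are not Dagster's AssetIn or any existing Dagster API, and a list is used in place of pd.DataFrame to keep it dependency-free.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class InputSpec:
    """Hypothetical stand-in for AssetIn: carries per-input metadata."""
    metadata: Dict[str, Any] = field(default_factory=dict)


def collect_input_specs(fn: Callable) -> Dict[str, InputSpec]:
    """Return the parameters of fn whose default value is an InputSpec marker."""
    specs = {}
    for name, param in inspect.signature(fn).parameters.items():
        if isinstance(param.default, InputSpec):
            specs[name] = param.default
    return specs


def my_asset(upstream: list = InputSpec(metadata={"columns": ["A", "B"]})):
    return upstream


print(collect_input_specs(my_asset))
```

The marker default never reaches the function body in a real framework; the decorator would read it from the signature and inject the loaded input instead.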
@sandy @schrockn
Rubén Briones
03/18/2023, 1:11 PM
Casper Weiss Bang
03/20/2023, 12:18 PM
Tobias Pankrath
03/20/2023, 2:47 PM
Leo Qin
03/20/2023, 4:52 PM
__ASSET_JOB_0, not very clear. Reporting the asset names or something similar (I understand there can be a lot of assets having runs initiated by the sensor) would be clearer.
Pablo Beltran
03/20/2023, 11:22 PM
Alfie Johnson
03/21/2023, 4:48 PM
Guillaume Onfroy
03/21/2023, 4:59 PM
monitored_jobs param, sensors are triggered for all jobs across all deployments.
• When specifying the monitored_jobs, if several jobs have the same name (e.g. daily) across different deployments, then the sensor will be triggered for all of those jobs.
When specifying
Pablo Beltran
03/21/2023, 7:44 PM
Tobias Pankrath
03/22/2023, 1:36 PM
Lucas Gabriel
03/22/2023, 2:45 PM
Mycchaka Kleinbort
03/28/2023, 1:22 PM
"-" & " " as characters in the group names. Sorry to be a pain 😅
Emir Karamehmetoglu
03/28/2023, 11:18 PM
dbt tests as bona-fide Dagster Assets. We've thought about it and they seem ideal as assets, with a catch. I want them to materialize the test result on failure (ignore_handled_error in dbt_resource works for this). But currently tests are hard coded in dagster_dbt to not be assets, and my hacky workarounds have shortcomings 😛.
But it is really nice to be able to put tests, which are really just sql models like dbt, into an asset lineage graph, see the dependencies, assign freshness policies and have a reconciliation sensor handle it all, plus the occasional asset job for batch tests.
Vinnie
03/30/2023, 2:52 PM
required_resource_keys parameter, and experimenting a lot with the build_*_context functions. What triggered this message was trying to add resources to a few schedules I have running and having to go through the process again.
Maybe I’m trying to adopt everything a little too early, but I feel like doing that would drastically reduce the hurdle others might feel to make the jump.