#ask-community
m

Male

06/27/2023, 1:36 PM
Hi team, I expect my assets to be materialized using the job defined below. However, when I manually materialize my asset from the Dagit UI, the run shows the job name as "__ASSET_JOB".
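from dagster import AssetSelection, Definitions, define_asset_job

# Job that targets every asset in the "my_group" group
my_job = define_asset_job(
    name="my_job",
    selection=AssetSelection.groups("my_group"),
)
# all_assets is the list of asset definitions, defined elsewhere
defs = Definitions(
    assets=all_assets,
    jobs=[my_job],
)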
However, when I create a schedule for asset materialization using the same job and the asset is materialized by that schedule, the run shows the correct job name. Is this expected behavior in Dagster?
b

Brian Ren

06/27/2023, 1:43 PM
We also encountered this. I’m curious to know the answer
s

sean

06/27/2023, 1:53 PM
Hi Salman, __ASSET_JOB is an implementation detail. You are not required to specify any jobs in your definitions; you can provide just a list of assets, which will be visible in the global asset graph. However, internally Dagster requires all runs to use a job. __ASSET_JOB is a “hidden” job created by Dagster internals that includes all the assets passed to your Definitions object. This is the job that will be used when materializing from the global asset graph. The reason we don’t use a job that you have included the asset in is that an asset can be included in multiple jobs, so it wouldn’t be clear which one to use from the global asset graph. If you want to materialize in Dagit using my_job instead of __ASSET_JOB, you should select the assets in the asset graph that is shown when viewing my_job instead of the global asset graph.
m

Male

06/27/2023, 3:00 PM
Thank you! That makes a lot of sense.
b

Brian Ren

06/27/2023, 3:50 PM
@Gerben van der Huizen ^^
However, internally Dagster requires all runs to use a job. __ASSET_JOB is a “hidden” job created by Dagster internals that includes all the assets passed to your definitions object. This is the job that will be used when materializing from the global asset graph. The reason we don’t use a job that you have included the asset in is that an asset can be included in multiple jobs, so it wouldn’t be clear which one to use from the global asset graph.
Hi @sean, I have another question regarding this. Currently we have some tags on our defined jobs, and these tags are passed to the runs if we materialise a partition from the job page. But if I materialise a partition from the asset graph, __ASSET_JOB doesn’t seem to have these tags and so won’t pass them to the run. All of our assets only have a single job attached. How can we solve this? Can we pass tags to assets?
s

sean

06/28/2023, 2:06 PM
Hey Brian, you can set the op_tags param on @asset. I’m not sure if that will support all the tags you need though (as some tags are scoped to runs and need to be defined at the job level).
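A minimal sketch of the two places tags can be set, assuming a single asset in the group "my_group" from the original question. The asset name `my_asset` and the tag key/value `team: analytics` are hypothetical and only for illustration. As noted above, `op_tags` are attached to the op backing the asset, so they may not cover tags that must be scoped to the run.

```python
from dagster import AssetSelection, Definitions, asset, define_asset_job


# Hypothetical asset and tag values, used only for illustration.
# op_tags are attached to the op that computes this asset.
@asset(group_name="my_group", op_tags={"team": "analytics"})
def my_asset():
    return 1


# Job-level tags apply to runs launched through my_job,
# not to runs launched through the hidden __ASSET_JOB.
my_job = define_asset_job(
    name="my_job",
    selection=AssetSelection.groups("my_group"),
    tags={"team": "analytics"},
)

defs = Definitions(assets=[my_asset], jobs=[my_job])
```

Setting the same tag in both places is a design choice for this sketch: the job-level tag covers runs launched from the job page, while the op-level tag travels with the asset regardless of which job executes it.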
👍 1
b

Brian Ren

06/28/2023, 3:25 PM
@sean, is there a way to replace the __ASSET_JOB with our defined job?
s

sean

06/28/2023, 3:33 PM
In the left sidebar of the UI, you should see any asset jobs you’ve defined. If you click one, it will show an asset graph view including only the assets defined in the job. If you launch from this view it will use the specified job instead of __ASSET_JOB.
m

Male

06/28/2023, 4:12 PM
@sean, we have another use case where our downstream assets are materialized with the auto materialize policy based on upstream assets. Basically, we have scheduled materialization of upstream assets on a specific job. We need a custom job name for downstream assets as well. Is there a way to achieve this with this design?
b

Brian Ren

06/28/2023, 4:13 PM
@sean Oh yeah, I’m aware of that. But then there’s still the chance that people will go to the asset view and materialize/backfill partitions from there.
s

sean

06/28/2023, 4:15 PM
@Male IIUC you are asking if it is possible to set the job used by the auto-materialize policy? @Brian Ren We don’t currently have a way to set a “preferred job” to be used when materializing an asset from the global graph-- if you think this would be useful, can you please open a GH issue: https://github.com/dagster-io/dagster/issues
m

Male

06/28/2023, 4:26 PM
@sean Yes. Even if we can't set the job name, being able to set some tags on these auto-materialized assets would be enough. As per the Dagster documentation, these auto-materialized assets are assigned a default tag, {"auto-materialize":true}. If we need a custom tag, we have to add it in the global.yaml file. But the problem there is that this yaml file is associated with a Dagster instance. We have only one Dagster instance but multiple code locations. The idea here is to keep a unique tag/job name for downstream assets created from each code location.
And I now believe that if assets are materialized directly from the graph, there is no way for a custom job to be run. It's quite intuitive for users to materialize assets directly from the graph, as opposed to first selecting a job and then materializing the asset.
s

sean

06/28/2023, 4:43 PM
@owen is there any way to control the job used to materialize an asset by the auto-materialize daemon? @Male looks like you have the same concern as Brian regarding use of a “preferred” job for materialization from the global asset graph. If either of you can open an issue this will help us track this potential feature.
o

owen

06/28/2023, 4:57 PM
Hi @Male! Unfortunately, there's not currently a way to customize the job that's executed when an asset is auto-materialized (I believe that sort of functionality would likely fall under a hypothetical "preferred job" concept, which I agree would be useful in such situations). But to get back to the point of
"The idea here is to keep a unique tag/job name for downstream assets created from each code location."
Is this just an organizational tool? If so, dagster automatically sets a .dagster/repository tag on runs (which would be the name of the code location that the job lives in). This would allow you to query for runs that were launched from a given code location. Would this be sufficient for your purposes?
m

Male

06/29/2023, 2:40 PM
Hi @owen, there are 3 distinct scenarios now:
1. Upstream assets getting materialized on a schedule: there is no problem here, as it's leveraging the custom job which we created for the schedule.
2. Downstream assets getting auto-materialized: here as well, Dagster adds a tag {“auto-materialized”:true}.
3. Ad-hoc materialization of assets from the asset graph.
In scenario 3, the only option to use a particular job is to first select the job in Dagit and then materialize the asset. But this is not intuitive. Is there at least a way to differentiate assets that were materialized from the asset graph (without first selecting the job) from those that were either auto-materialized or materialized on a schedule? This would solve our use case.
o

owen

06/29/2023, 8:45 PM
by "differentiate these assets", are you talking about differentiating the materializations? and at what layer? for a given materialization event, the DagsterEvent will have a job_name property which will tell you the name of the job the asset was materialized in (which will start with __ASSET_JOB if this was an ad-hoc / auto materialize run)
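The distinction described above could be sketched in plain Python as follows, assuming you already have a run's job name and tags in hand (for example, read from the event or run record). The function name `classify_run` and the category labels are hypothetical. The auto-materialize tag key is also an assumption, since the thread only describes it loosely as {"auto-materialize":true}; check the tags on an actual auto-materialized run in your deployment before relying on it.

```python
from typing import Mapping

# Prefix of the hidden job name, as described in the thread.
ASSET_JOB_PREFIX = "__ASSET_JOB"

# Assumption: tag key and value placed on auto-materialize runs.
# Verify against a real run in your deployment.
AUTO_MATERIALIZE_TAG = "dagster/auto_materialize"


def classify_run(job_name: str, tags: Mapping[str, str]) -> str:
    """Label a run by how it was launched (hypothetical helper)."""
    if not job_name.startswith(ASSET_JOB_PREFIX):
        # Launched through a user-defined job, e.g. by a schedule
        # or from that job's page in the UI.
        return "user_job"
    if tags.get(AUTO_MATERIALIZE_TAG) == "true":
        return "auto_materialize"
    # Hidden job without the auto-materialize tag: ad-hoc launch
    # from the global asset graph.
    return "ad_hoc"


print(classify_run("my_job", {}))
print(classify_run("__ASSET_JOB", {AUTO_MATERIALIZE_TAG: "true"}))
print(classify_run("__ASSET_JOB", {}))
```

Combining the job-name prefix with the tag check is what separates scenario 3 (ad-hoc from the asset graph) from scenario 2 (auto-materialized), since both run under the hidden job.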