# integration-dbt
j
Hey there! We're currently working with the Dagster dbt integration and have encountered some unexpected behavior related to asset materialization. When we manually trigger the materialization of our assets (dbt models), `context.dagster_run.asset_selection` correctly returns the corresponding asset keys. However, when the same assets are materialized by a scheduled job, `context.dagster_run.asset_selection` unexpectedly returns `None` instead of the asset keys. Our understanding is that `context.dagster_run.asset_selection` should behave consistently regardless of whether the assets are materialized manually or by a scheduled job. We have verified that there are no differences in code or environment configuration between the manual and scheduled runs that could account for this disparity. Is this difference in behavior expected? If not, could someone guide us towards potential solutions?
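For reference, this is roughly how we read it (a simplified sketch, assuming the `@dbt_assets`/`DbtCliResource` API; the manifest path is a placeholder):

```python
from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


@dbt_assets(manifest="path/to/target/manifest.json")  # placeholder manifest path
def our_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Manual materializations log the selected asset keys here;
    # runs launched by the schedule log None instead.
    context.log.info(f"asset_selection: {context.dagster_run.asset_selection}")
    yield from dbt.cli(["build"], context=context).stream()
```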
r
@owen this looks like an underlying Dagster framework issue — is there a workaround here?
o
Hi @Joseph Marcus -- The `asset_selection` is not a public property of the `DagsterRun` object; it's more of an internal implementation detail. There's no hard-set reason for this difference in behavior, but the basic idea is that `asset_selection` generally represents a subselection of the total set of assets in a job (so if you're materializing all assets in a job, there is no subselection; it's just all the assets). What's your use case for accessing this? There's likely another way of doing it that would be more consistent.
j
Hi @owen, thanks for your response. Our use case for accessing `asset_selection` is a bit unique. We have set up an alerting mechanism to notify our team on an internal Slack channel whenever a Dagster job fails. To provide more granular context around each failure, we aim to include the specific assets that failed to materialize in our alerts. We have been leveraging `asset_selection` to fetch this information. Given that `asset_selection` is not a public property and its behavior may not be consistent, could you suggest an alternate way to get this information? We still need a method to identify the specific assets involved in a failure during a scheduled job run. Thanks!
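Roughly, the alerting side looks like the sketch below (simplified; `post_to_slack` stands in for our internal Slack client, and the sensor-based setup is an approximation of our actual mechanism):

```python
from dagster import RunFailureSensorContext, run_failure_sensor


def post_to_slack(message: str) -> None:
    """Placeholder for our internal Slack client."""
    ...


@run_failure_sensor
def alert_on_job_failure(context: RunFailureSensorContext):
    # Manual materializations populate asset_selection, but runs launched by
    # the schedule leave it as None, so the alert loses the asset context.
    selection = context.dagster_run.asset_selection
    assets = sorted(key.to_user_string() for key in selection) if selection else None
    post_to_slack(
        f"Run {context.dagster_run.run_id} failed. "
        f"Assets: {assets if assets else 'unknown (asset_selection was None)'}"
    )
```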
d
Hi @owen! Thanks for the response. I work with @Joseph Marcus, here are some examples of what we’re doing. It’s pretty cool!!!
r
Could you do the same thing by accessing the dbt `run_results.json`? You should be able to retrieve dbt artifacts in the op after invoking the dbt command.
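Something along these lines, roughly (a sketch; the statuses follow the dbt `run_results.json` artifact schema, and the usage in the comments assumes the `DbtCliInvocation` API):

```python
import json
from pathlib import Path


def failed_dbt_nodes(target_dir: Path) -> list[str]:
    """Return the unique_ids of dbt nodes that errored or failed, per run_results.json."""
    run_results = json.loads((target_dir / "run_results.json").read_text())
    return [
        result["unique_id"]
        for result in run_results["results"]
        if result["status"] in ("error", "fail")
    ]


# Inside the @dbt_assets-decorated function, after invoking dbt, something like:
#     dbt_invocation = dbt.cli(["build"], context=context, raise_on_error=False)
#     yield from dbt_invocation.stream()
#     failed = failed_dbt_nodes(dbt_invocation.target_path)
#     if failed:
#         context.log.error(f"Failed dbt nodes: {failed}")
```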
d
thanks @rex! we’ll look into it 🙂
j
Thanks @rex!
d
Hi @rex! Update from our side: we are able to use the `run_results.json`, perfect suggestion! We have it implemented locally. In our cluster, though, our user code deployment spins up a new ephemeral pod for every run. How can we capture the `run_results.json` from this pod before it gets spun down?