# ask-community
p
👋 I'm using 1.3.1 with `AutoMaterializePolicy`, more or less like so:
```python
@asset(
    freshness_policy=FreshnessPolicy(max_lag_minutes=24 * 60, cron_schedule="2 15 * * *"),
    auto_materialize_policy=AutoMaterializePolicy.lazy(),
)
def upstream_a():
    pass

@asset(partitions_def=some_partitions)
def upstream_b():
    pass

@asset(partitions_def=some_partitions, auto_materialize_policy=AutoMaterializePolicy.eager())
def downstream(upstream_a, upstream_b):
    pass
```
I've seen `downstream` be launched at least once with `created_by:auto_materialize`, so this works at least sometimes. But today, I noticed a run for `upstream_a` being automatically triggered and a sensor updating `upstream_b` more or less concurrently. Both runs produced their corresponding `ASSET_MATERIALIZATION` events, but no run for `downstream` has been scheduled (this was more than 30 minutes ago). What are my options to debug this?
o
hi @Philippe Laflamme! a few bits of information that might help debug on our side:
• is `some_partitions` a TimeWindowPartitionsDefinition (e.g. DailyPartitionsDefinition)?
• when you say that `upstream_b` is updated by a sensor, what does that sensor look like?
• can you confirm that the partition you want kicked off for `downstream` is present in `upstream_b`?
the main reason a run of `downstream` would not be kicked off, even though it's eager and a parent has updated, is that the corresponding upstream partition is missing (so if a run were kicked off, it'd likely have incomplete data) -- that's the first thing I'd want to check
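The skip condition described above can be sketched in plain Python. This is illustrative logic only, not Dagster's actual implementation; the function and data names are made up:

```python
def should_materialize_eager(partition_key, parents):
    """Decide whether an eager auto-materialize run makes sense for one
    partition of a downstream asset, given the state of its parents.

    `parents` maps parent asset name -> set of materialized partition keys.
    If the partition is missing in any parent, a run would likely read
    incomplete data, so the evaluation skips instead of materializing.
    """
    missing = [name for name, done in parents.items() if partition_key not in done]
    if missing:
        return False, f"skipped: parents missing this partition: {missing}"
    return True, "materialize"

# Example: upstream_b has 2023-06-01 but not 2023-06-02,
# so only the first partition would be kicked off.
parents = {
    "upstream_a": {"2023-06-01", "2023-06-02"},
    "upstream_b": {"2023-06-01"},
}
```

The point of the sketch: the decision is per-partition, so a materialization of one parent does not help if the *other* parent lacks the matching partition.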
p
• is `some_partitions` a TimeWindowPartitionsDefinition (e.g. DailyPartitionsDefinition)?
Yes. It is a `DailyPartitionsDefinition` with a `start_date`, a `timezone`, and `end_offset=2`.
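For reference, `end_offset` shifts how far past the current day a daily partitions definition extends. A stdlib-only sketch of that arithmetic (this mimics the behavior discussed here; it is not Dagster code):

```python
from datetime import date, timedelta

def last_partition_key(today: date, end_offset: int) -> str:
    """For a daily partitions definition, end_offset=0 means the latest
    partition is yesterday's (the last complete day); each additional
    offset unit adds one day. So end_offset=1 includes "today" and
    end_offset=2 also includes "tomorrow", matching the thread."""
    return (today + timedelta(days=end_offset - 1)).isoformat()
```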
• when you say that `upstream_b` is updated by a sensor, what does that sensor look like?
It's a function annotated with `@sensor` which uses `some_job.run_request_for_partition(...)`, where `some_job` contains 2 assets, the "second" asset being `upstream_b` (i.e. `upstream_b` has its own upstream asset that runs in the same job).
• can you confirm if the partition that you want to get kicked off for `downstream` is present in `upstream_b`?
Yes, looking at the Dagster UI, I can see the partition, its materialization event, and the job that materialized it (i.e. `some_job`).
Another datapoint (which hopefully is relevant): `upstream_a` was materialized again 2h later for no apparent reason. It auto-materialized at 4:20pm (which I expected) and then again at 6:02pm, but no other job kicked off around that time.
o
gotcha -- and are these the only relevant assets in this part of the graph (i.e. nothing else is upstream of these, and there are no freshness policies downstream of these)?
ah just read the above (that upstream_b has its own upstream asset)
p
Right, `upstream_a` is the only one with a `FreshnessPolicy`; `downstream` is a leaf.
o
is upstream_b's parent also partitioned?
p
Yeah, it's in the same job that's being kicked off by the sensor, so it needs the same partitions def
o
ah technically an unpartitioned asset can run in the same job as a partitioned one (it's just that if they're both partitioned, they need the same def)
p
ah, yeah that makes sense... in this case, it's the same partitions_def and `upstream_b` consumes its "parent" as a dependency
o
gotcha, makes sense -- I'm going to see if I can replicate this, it's definitely weird behavior
p
great, thanks! Maybe the last datapoint that might be relevant: the sensor kicked off `some_job` at 4:19:45pm, and `upstream_b`'s materialization event happened at 4:20:57pm. `upstream_a` was kicked off as an ad-hoc materialization (no job) at 4:20:05pm and its materialization event happened at 4:20:18pm. So `upstream_a`'s event occurred before `upstream_b`'s according to this, but still relatively close to one another, if that matters.
I'm not sure why `upstream_a` was kicked off, though, to be honest. Its freshness was definitely out of date and it's `lazy`, so I figure that's why?
(it'd be great to know "why" an auto materialization occurred)
o
yep -- when it's lazy, materializations will just be kicked off to try to keep the asset (or any downstream assets) in line with their freshness policies. That last data point is interesting, thanks for sharing -- the fact that two "sources" of information that might trigger a refresh are arriving around the same time might contribute to this weird behavior. I'll try to replicate that specific setup
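The lazy behavior described here amounts to a staleness check against the policy's allowed lag. A stdlib-only sketch (illustrative only; hypothetical names, not Dagster internals):

```python
from datetime import datetime, timedelta

def needs_refresh(last_data_time: datetime, now: datetime, max_lag_minutes: int) -> bool:
    """A lazy policy only kicks off a run when the freshness policy would
    otherwise be violated: i.e. the data feeding the asset is older than
    the allowed lag (here, upstream_a's max_lag_minutes=24*60)."""
    return now - last_data_time > timedelta(minutes=max_lag_minutes)
```

This would explain the observation above: `upstream_a`'s freshness was out of date, so the lazy policy triggered a run even though no downstream run followed.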
and also totally agree on knowing why an automaterialization occurred! this is high priority for us, and we're actively working on the changes necessary to populate the backend / UI with this info
p
2 more datapoints:
• I forgot to mention that the partition in question was for "today" (`end_offset=1`). I don't think it should matter, but it's relevant for the next point.
• tomorrow's partition (`end_offset=2`) was correctly auto-materialized just now (~10:20pm). The sequence was as expected: the sensor detected the data and kicked off a `some_job` run which materialized `upstream_b` (and its parent), and then something kicked off a run for `downstream`. As expected, `upstream_a` was not materialized since it was within its freshness policy.
Any updates on this? I'm now on `1.3.6` and still seeing this problem on occasion. One thing to note is that my daemon doesn't run 24/7. When I leave things running, the problem doesn't seem to occur. It usually occurs when I start the daemon after it hasn't been running for several hours (say 12 to 16 hours). In that situation, when the daemon starts, my sensors start scheduling runs for various assets and the "auto-materialize for freshness" runs get scheduled, but the "eager auto-materialize" assets do not get kicked off after their parent assets get materialized (by the runs scheduled by the sensors).
o
sorry for the late reply on this! I think this notification got lost when I was scrolling through my other threads... As of 1.3.7, there is a new auto-materialize evaluation UI that you can access on the Asset Details page of each asset. You'll need to run `dagster instance migrate` before any data gets written, but I think this would help a ton in understanding what's going on here. Whenever a parent of an eager auto-materialize asset is materialized, that asset will be evaluated to see if it makes sense to materialize it as well. In most cases it will, but there are some exceptions, for example if any of its parents are missing, or if any of its parents have out-of-date data. Regardless of whether it's materialized or skipped, a reason will be recorded and visible in the UI.
hard to say exactly why it might be skipping, but the info of why it thinks it should skip materializing that asset will be very helpful to debug it on our side (as I still haven't been able to replicate this behavior on my end)
p
Thanks, I’ve enabled all of this and will be keeping an eye on it. FWIW: I might be hitting the problem that the partition that should be materialized is not the latest (it’s the one for “yesterday”). I don’t think it’s configurable right now; is there an intention to allow considering more than the latest partition?
o
ah yep, removing that limitation is definitely part of the plan