Hello! I am starting with Dagster and I have my first pipeline, defined by several assets. One of the building blocks uses an external API that works asynchronously and eventually places a file in GCS. How can I make Dagster "poll" for when the async job finishes? Should I just write that in the code, or can I use a sensor for that?
02/19/2023, 7:08 PM
I think it depends on how long you expect the async operation to take. If it’s relatively short, it’s probably easier to poll as part of your code so you keep all of your logic in one place. If it’s a long wait, a sensor may be the better option.
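For the short-wait case, here is a minimal sketch of what polling in your own code could look like. It is plain Python with no Dagster or GCS imports; `wait_until` and its parameters are hypothetical names made up for illustration, and the default timeout and interval are arbitrary. The `check` callable is assumed to be whatever tells you the file has landed, for example a call to `Blob.exists()` from the google-cloud-storage client.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until check() returns True, or raise TimeoutError.

    Hypothetical helper: `check` stands in for "does the output file
    exist yet?". `sleep` and `clock` are injectable only to make the
    helper easy to test.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return
        if clock() >= deadline:
            # Fail loudly so the asset run is marked as failed
            # instead of hanging forever.
            raise TimeoutError(f"condition not met within {timeout_s} seconds")
        sleep(interval_s)
```

You would call this inside the asset body right after submitting the async job, then read the file once it returns. The timeout matters here: the run holds its compute for the whole wait, which is the reason this approach only suits short operations.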
Another option to consider is using Cadence/Temporal for managing async operations. Can’t say much on how it would interact with Dagster because we only just started looking at these tools.
02/21/2023, 8:41 PM
+1 to Oren's comments above on polling within an asset if it's a relatively short wait, or using a sensor if it's a longer wait.
One thing to note: if you poll within a sensor instead of within the asset, downstream dependencies of the asset may not work as expected. The downstream asset will believe the upstream asset has already materialized, so it will begin execution before the file is actually there. So you may have to define the downstream assets in a separate job and kick off that job from the sensor.
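To make the sensor route a bit more concrete, here is a rough sketch of the decision a sensor would make on each tick, written as plain Python so it runs without Dagster installed. `TickResult` and `evaluate_tick` are hypothetical names, and the idea of comparing a list of expected file names against a bucket listing is an assumption about how you would detect that the async job finished.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TickResult:
    # One entry per file that is ready; each would become one downstream run.
    run_keys: List[str]
    # Set when nothing is ready, to explain why the tick launched nothing.
    skip_reason: Optional[str]


def evaluate_tick(
    present_files: Iterable[str], expected_files: Iterable[str]
) -> TickResult:
    """Decide what a single sensor tick should do (hypothetical helper).

    present_files: object names currently in the bucket.
    expected_files: object names the async jobs are supposed to produce.
    """
    present = set(present_files)
    ready = sorted(name for name in expected_files if name in present)
    if not ready:
        return TickResult([], "no expected output file has landed yet")
    return TickResult(ready, None)
```

In a real Dagster sensor, the function decorated with `@sensor` and attached to the separate downstream job would yield one `RunRequest(run_key=name)` per entry in `run_keys`, or a `SkipReason` when there is nothing to do. Using the file name as the `run_key` means later ticks that see the same file again do not launch a second run for it.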