geoHeil 12/08/2022, 12:10 PM
owen 12/08/2022, 6:22 PM
in today's release. Essentially:
    make_slack_on_freshness_policy_status_change_sensor(
        asset_selection=...,  # assets to alert on
        channel="#foo",
        slack_token=os.getenv("SLACK_API_TOKEN"),
    )
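For context, a fuller wiring of that call might look like the sketch below. This is a hedged illustration, not a definitive setup: the import paths assume a recent Dagster version (with `Definitions`) plus the `dagster-slack` library linked later in the thread, and `daily_file_asset` is an invented example asset.

```python
# Sketch: attach a FreshnessPolicy to an asset and register the Slack
# freshness sensor alongside it. Assumes dagster >= 1.1 and dagster-slack.
import os

from dagster import AssetSelection, Definitions, FreshnessPolicy, asset
from dagster_slack import make_slack_on_freshness_policy_status_change_sensor


@asset(
    # "by 9AM, incorporate upstream data from at most 60 minutes ago"
    freshness_policy=FreshnessPolicy(maximum_lag_minutes=60, cron_schedule="0 9 * * *"),
)
def daily_file_asset():
    ...


slack_sensor = make_slack_on_freshness_policy_status_change_sensor(
    asset_selection=AssetSelection.all(),  # assets to alert on
    channel="#foo",
    slack_token=os.getenv("SLACK_API_TOKEN"),
)

defs = Definitions(assets=[daily_file_asset], sensors=[slack_sensor])
```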
geoHeil 12/09/2022, 9:55 AM
Is the Slack one only a specialized sub-kind of the first one?
I cannot find it in code or in any Dagster documentation. Can you provide more documentation on how it should be used? In particular: 1) do the assets require a FreshnessPolicy? 2) When I have a sensor which checks hourly whether a new file has arrived (the file usually arrives only daily), how can I bridge/combine the notion of freshness (a schedule) with the dynamic sensor (a file has arrived)? Specifically, I want to use the freshness sensor as a workaround to detect issues (an operation taking longer than expected to run) and wonder how this could work.
owen 12/09/2022, 5:16 PM
It's essentially that, but it saves you from having to write the Slack-specific code yourself (with some extra options thrown in on top). I accidentally typo'd the name (it should have been make_slack_on_freshness_policy_status_change_sensor); edited above. It's in the dagster-slack library, and you can find the API docs here: https://docs.dagster.io/_apidocs/libraries/dagster-slack#dagster_slack.make_slack_on_freshness_policy_status_change_sensor
geoHeil 12/11/2022, 11:02 PM
owen 12/12/2022, 5:46 PM
Something like
    FreshnessPolicy(maximum_lag_minutes=???, cron_schedule="0 9 * * *")
would basically work for you. This says "by 9AM, there should be a materialization of this asset which incorporates all upstream data from at least maximum_lag_minutes minutes ago". The main friction with the current implementation (vs. the ideal version-based implementation) is selecting a good value for maximum_lag_minutes. Right now, this will depend on when you expect the file to come in. If you set maximum_lag_minutes too low (let's say 60 minutes), then if the file comes in before that time (let's say 7AM) and everything runs as expected, the current logic will interpret your downstream asset as "has all of the upstream data up until 7AM". When 9AM rolls around, the freshness policy will demand that the asset has all the data from up until 60 minutes before 9AM (so 8AM), meaning it will fail its freshness check, even though everything went as expected. The ideal version-based implementation, by contrast, would simply mean "by 9AM, all available upstream data should be incorporated into this asset".
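The 7AM/9AM arithmetic above can be sketched in plain Python. This is an illustrative helper with no Dagster dependency (`misses_freshness_deadline` is not a Dagster API); it just replays the scenario described: with a 60-minute lag window, data from 7AM fails a 9AM deadline even though the run itself succeeded.

```python
from datetime import datetime, timedelta


def misses_freshness_deadline(last_data_time, deadline, maximum_lag_minutes):
    """Return True if, at `deadline`, the asset's data is staler than allowed.

    Simplified version of the check described above: by the cron deadline,
    the asset must incorporate upstream data from at most
    `maximum_lag_minutes` minutes before that deadline.
    """
    oldest_allowed = deadline - timedelta(minutes=maximum_lag_minutes)
    return last_data_time < oldest_allowed


deadline = datetime(2022, 12, 12, 9, 0)   # the "0 9 * * *" deadline
file_arrived = datetime(2022, 12, 12, 7, 0)  # file came in at 7AM

# maximum_lag_minutes=60 -> oldest allowed data time is 8AM, so 7AM data
# fails the check even though everything ran as expected:
print(misses_freshness_deadline(file_arrived, deadline, 60))   # True (fails)

# A wider window (e.g. 180 minutes -> oldest allowed 6AM) passes:
print(misses_freshness_deadline(file_arrived, deadline, 180))  # False (passes)
```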
geoHeil 12/12/2022, 9:33 PM
owen 12/12/2022, 9:41 PM
geoHeil 12/13/2022, 2:35 PM