Simon Ammer
03/06/2024, 9:23 AM
Scout
03/06/2024, 9:23 AM
```python
from dagster import AssetSelection, define_asset_job, ScheduleDefinition

# Job that materializes every asset in the "some_asset_group" group
asset_job = define_asset_job("asset_job", AssetSelection.groups("some_asset_group"))

# Run the job every day at 09:00
basic_schedule = ScheduleDefinition(job=asset_job, cron_schedule="0 9 * * *")
```
Sensors, on the other hand, trigger jobs in response to an external change or event, such as a new file in an S3 bucket or a system status change. They check for these changes periodically and kick off a job when a specific condition is met.

For a use case where you need to poll an API every few minutes, you could technically use either a schedule or a sensor. A schedule is appropriate if you want the polling to happen at exact, regular intervals, regardless of whether there is new data. A sensor fits if you want to check for new data and trigger a job only when there is something to process.

If the polling simply needs to happen at regular intervals and does not depend on an external event, a schedule is likely more straightforward to implement. If you want logic that triggers the job only when there is new data, or the polling needs to adapt to other conditions, a sensor is likely the better fit.

The documentation provides examples of both schedules and sensors, but it does not explicitly state which to use for polling every x minutes. The choice depends on the specific requirements of your workflow and on whether you need the flexibility to trigger on conditions other than time.