I'm trying to use Dagster to control ETL flow for ...
# dagster-feedback
u
I'm trying to use Dagster to control ETL flow for data warehouse projects, but I've read through the docs and searched for an answer, and it seems the scheduler cannot support incremental refresh. My case:

1. I need the schedule to run once a day.
2. The schedule runs a job containing assets that are partitioned by month.
3. I need the schedule to refresh the most recent two months of data.

I've tried the `build_schedule_from_partitioned_job` function, but it automatically runs the job once a month because the assets are partitioned by month. I've also tried using `ScheduleDefinition`, but it did not work and returned an error. Is there any workaround?
c
Check out the `@schedule` decorator: you can define a once-a-day cron string, then separately define your asset that is partitioned by month. The schedule function can then send run requests for a time window covering the last two months of data.
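
A minimal sketch of that suggestion, not a definitive implementation. It assumes the job is a partitioned asset job whose monthly partition keys use the `YYYY-MM-01` format, and that the partitions definition already includes the current month (for example, `MonthlyPartitionsDefinition(..., end_offset=1)`); otherwise the current month's key will not exist yet. The names `recent_month_keys`, `build_daily_refresh_schedule`, and `refresh_recent_months` and the 02:00 cron time are hypothetical. It emits one run request per month rather than a single ranged run.

```python
from datetime import date


def recent_month_keys(today: date, months: int = 2) -> list[str]:
    """Return keys for the `months` most recent monthly partitions, oldest first."""
    keys = []
    year, month = today.year, today.month
    for _ in range(months):
        keys.append(f"{year:04d}-{month:02d}-01")
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return list(reversed(keys))


def build_daily_refresh_schedule(job):
    """Wrap a monthly-partitioned job (assumed) in a once-a-day schedule."""
    # Imported here so the helper above can be used without Dagster installed.
    from dagster import RunRequest, schedule

    @schedule(cron_schedule="0 2 * * *", job=job)  # every day at 02:00
    def refresh_recent_months(context):
        today = context.scheduled_execution_time.date()
        # One run per partition: the current month and the one before it.
        for key in recent_month_keys(today):
            yield RunRequest(
                run_key=f"{today.isoformat()}:{key}",
                partition_key=key,
            )

    return refresh_recent_months
```

Calling `build_daily_refresh_schedule(my_monthly_job)` and registering the result alongside the job should give a daily tick that re-materializes the two latest monthly partitions, where `my_monthly_job` stands in for your own job.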