# integration-bigquery

Dennis Hendricks

06/01/2023, 7:29 AM
We want to store newsletter data in BigQuery. Since the statistics change even after sending (open rate, click rate), we need to update the data retroactively. Currently, we have defined the data as a partitioned asset in Dagster, which we export to BigQuery via BigQueryPandasIOManager. Backfilling the data works fine through the UI or the CLI (via dagster job backfill). However, we have not managed to schedule the job: build_schedule_from_partitioned_job only fills the latest partition, and we got nowhere with Freshness Policies or Auto-Materialization either. Is there a way to schedule the backfill of a partitioned asset in BigQuery? Maybe by triggering a CLI action via a scheduled job?

jamie

06/01/2023, 3:30 PM
hey @Dennis Hendricks I think you will need to define a more custom schedule that yields RunRequests. Here’s an example from the docs: https://docs.dagster.io/concepts/partitions-schedules-sensors/schedules#schedules-that-provide-custom-run-config-and-tags. In your case, you’ll want to set the partition_key parameter of RunRequest to the partition you want to execute: https://docs.dagster.io/_apidocs/schedules-sensors#dagster.RunRequest
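The approach jamie describes can be sketched roughly as follows. This is not from the thread: it assumes daily date partitions, a trailing 30-day refresh window, and a hypothetical newsletter_job; the Dagster-specific pieces (schedule, RunRequest) are shown as comments, while the testable part is the plain-Python helper that computes which partition keys to re-run.

```python
from datetime import date, timedelta

def trailing_partition_keys(today: date, window_days: int = 30) -> list[str]:
    """Partition keys (YYYY-MM-DD) for the last `window_days` completed days.

    In a Dagster schedule along the lines jamie suggests, each key would get
    its own RunRequest (names below are illustrative, not from the thread):

        from dagster import RunRequest, schedule

        @schedule(cron_schedule="0 6 * * *", job=newsletter_job)
        def refresh_recent_stats(context):
            day = context.scheduled_execution_time.date()
            for key in trailing_partition_keys(day):
                # distinct run_key per partition so each run is deduplicated
                yield RunRequest(run_key=key, partition_key=key)
    """
    # The most recent completed partition is yesterday; emit oldest first.
    return [
        (today - timedelta(days=offset)).isoformat()
        for offset in range(window_days, 0, -1)
    ]

keys = trailing_partition_keys(date(2023, 6, 1), window_days=3)
print(keys)  # ['2023-05-29', '2023-05-30', '2023-05-31']
```

Yielding one RunRequest per partition key re-materializes each recent partition on every tick, which updates the already-exported BigQuery rows for those dates via the IO manager.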

Dennis Hendricks

06/02/2023, 11:56 AM
Thanks Jamie! However, a single run covering multiple partitions would not be possible that way, correct? partition_key only accepts a single datetime string, not an array.