
Pablo Beltran

05/03/2023, 11:24 PM
Hey there! I am using the BigQuery pandas IO manager to periodically sync small tables to BigQuery. However, I am noticing short periods of time where the table is empty. It seems the operation is not atomic, so while the table is being refreshed there is a window in which it holds no data. Is there a way to make this atomic?

jamie

05/04/2023, 4:54 PM
Hey Pablo. This is likely because the IO manager drops the table before loading the new data. I can look into making this happen in a single transaction, but it’ll involve changes to the base DB IO manager, so I’ll need to consider the effects on the other IO managers that use the DB IO manager. I’ll create a GH issue to track this: https://github.com/dagster-io/dagster/issues/14091
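
To illustrate the single-transaction idea in general terms, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in database. It is not the Dagster IO manager code and not BigQuery; the function name `replace_table_contents` and the table `small_table` are made up for the example. The point is only that when the clear-out and the reload share one transaction, a failed or in-flight refresh never commits an empty table.

```python
import sqlite3


def replace_table_contents(conn, table, rows):
    """Replace the contents of a two-column table inside one transaction.

    Hypothetical helper for illustration; `table` is assumed to be a trusted name.
    """
    # `with conn` commits on success and rolls back on any exception,
    # so the DELETE is never committed without the matching INSERTs.
    with conn:
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE small_table (id INTEGER, name TEXT)")
replace_table_contents(conn, "small_table", [(1, "a"), (2, "b")])

# A refresh that fails midway (the second row is malformed) is rolled back,
# leaving the previous data in place instead of an empty table.
try:
    replace_table_contents(conn, "small_table", [(3, "c"), (4,)])
except sqlite3.ProgrammingError:
    pass

rows_after_failure = conn.execute(
    "SELECT id, name FROM small_table ORDER BY id"
).fetchall()
```

Whether the same pattern carries over depends on the warehouse's transaction support for the statements involved, which is part of what the linked issue would need to sort out.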

Pablo Beltran

05/05/2023, 7:47 PM
Thanks! I went ahead and did it via a custom bq io manager for now so no rush on my end.

Dan Meyer

05/29/2023, 9:42 PM
@jamie is there a way you'd recommend to architect around this? Specifically, we're thinking of serving some data directly out of Snowflake, but if a query happens during a Dagster write operation, we could get funky results. Maybe a blue-green deployment of our terminal tables?
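
One way the blue-green idea could look, as a rough sketch rather than a recommendation from the thread: build the new data in a staging table, then exchange it with the live table using Snowflake's `ALTER TABLE ... SWAP WITH`, so readers see either the old table or the new one. The function `blue_green_refresh`, the `__staging` suffix, the `load_staging` callback, and the `RecordingCursor` test double are all hypothetical names for this example; a real setup would pass a Snowflake connector cursor and would need to validate the table names.

```python
def blue_green_refresh(cursor, live_table, load_staging):
    """Load into a staging table, then swap it with the live table.

    `cursor` is anything with an `execute(sql)` method (assumption: a
    Snowflake cursor in real use). `load_staging` is a caller-supplied
    function that fills the staging table.
    """
    staging = f"{live_table}__staging"
    # Create an empty staging table with the same column definitions.
    cursor.execute(f"CREATE OR REPLACE TABLE {staging} LIKE {live_table}")
    # Readers keep querying the untouched live table while this runs.
    load_staging(cursor, staging)
    # Exchange the two tables; afterwards the staging name holds the old data.
    cursor.execute(f"ALTER TABLE {live_table} SWAP WITH {staging}")


class RecordingCursor:
    """Test double that records SQL instead of running it."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


cur = RecordingCursor()
blue_green_refresh(
    cur,
    "analytics.terminal_table",
    lambda c, staging: c.execute(f"INSERT INTO {staging} SELECT * FROM analytics.source"),
)
```

After the swap, the staging name holds the previous version of the data, which can be kept around as a quick rollback or dropped.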