Hey all, beginner question here but is there a way...
# integration-bigquery
j
Hey all, beginner question here but is there a way to let dagster update the schema of output asset tables in BQ just based on code changes? I’m developing some models as assets and when I change the model in dagster code I currently have to delete the output table in BQ and then re-materialize the asset. Would be great if there’s a configuration I’m missing that can just overwrite the output table regardless.
j
Hey @Josh Kutsko are you using the bigquery resource or the bigquery io manager?
j
I’m using the bq io manager to save the asset, but using the bigquery resource occasionally for direct sql queries.
j
ok! the default behavior of the bq io manager is to drop/truncate the table corresponding to an asset and then upload the data for the re-materialized asset. If you’re using partitions we just delete the rows corresponding to the partition being re-materialized. Is that not the behavior you see? if you have some code that replicates the issue i can help debug
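For readers following along, the cleanup step described above can be sketched roughly as follows. This is only an illustration of the described logic, not the IO manager's actual code; the function name `cleanup_statement` and its parameters are hypothetical.

```python
from typing import Optional, Tuple


def cleanup_statement(
    table: str,
    partition_column: Optional[str] = None,
    partition_window: Optional[Tuple[str, str]] = None,
) -> str:
    """Return the SQL that clears old data before new rows are uploaded."""
    if partition_column is None or partition_window is None:
        # Unpartitioned asset: empty the whole table (rows go, schema stays).
        return f"TRUNCATE TABLE {table}"
    start, end = partition_window
    # Partitioned asset: remove only the rows belonging to this partition.
    return (
        f"DELETE FROM {table} "
        f"WHERE {partition_column} >= '{start}' AND {partition_column} < '{end}'"
    )


print(cleanup_statement("models.domain_map"))
print(cleanup_statement("models.domain_map", "day", ("2023-01-01", "2023-01-02")))
```

Note that in BigQuery a `TRUNCATE TABLE` removes the rows but leaves the table's schema in place.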
j
interesting, ok. Truncating the table is the behavior I want, but for some reason instead what I’m seeing is this error:
```
google.api_core.exceptions.BadRequest: 400 POST https://bigquery.googleapis.com/upload/bigquery/v2/projects/{{projectId}}/jobs?uploadType=multipart: Provided Schema does not match Table {{projectId}}:models.domain_map. Cannot add fields (field: TEST_NEW_FIELD)
```
My code is just returning the results of a sql query:
```python
import pandas
from dagster import asset
from dagster_gcp import BigQueryResource


@asset(
    io_manager_key="bq_io_manager",
    key_prefix="models",
)
def domain_map(
    bigquery: BigQueryResource,
) -> pandas.DataFrame:
    with bigquery.get_client() as client:
        return (
            client.query(
                """
                select
                    domain,
                    uuid,
                    'test' as test_new_field
                from raw_data.domains
                """
            )
            .to_dataframe()
            .drop_duplicates(subset=["domain"])
        )
```
For context, this is the pandas bq io manager, if that has different default behavior. i’ve looked through the code and can’t find a way to configure the table write method.
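The error says the uploaded data has a field that the existing table lacks. A small sketch of how one might spot such columns before materializing is shown here; the helper `added_columns` and the sample column lists are hypothetical and only illustrate the comparison.

```python
from typing import Iterable, List


def added_columns(existing: Iterable[str], incoming: Iterable[str]) -> List[str]:
    """Columns present in the new DataFrame but missing from the existing table."""
    # Compare case-insensitively, since BigQuery column names ignore case.
    known = {name.lower() for name in existing}
    return [name for name in incoming if name.lower() not in known]


# Hypothetical inputs: the table's current columns and the DataFrame's columns.
table_columns = ["domain", "uuid"]
frame_columns = ["domain", "uuid", "test_new_field"]
print(added_columns(table_columns, frame_columns))  # ['test_new_field']
```

In practice the existing names could presumably be read from `client.get_table(...).schema` and the incoming ones from `df.columns`, but that wiring is left out here.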
j
ah yeah, if you’ve added a new column to your asset then you need to delete the whole table before re-materializing. it’s annoying. fwiw this is an issue i’m actively thinking about, it’s just quite complex to solve so i haven’t come up with anything yet
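Until that changes, the manual delete can be scripted. Here is a minimal sketch, assuming `client` is a `google.cloud.bigquery.Client` (for example the one yielded by `bigquery.get_client()` in the asset above); the helper name `drop_table_if_exists` is hypothetical.

```python
def drop_table_if_exists(client, table_id: str) -> None:
    """Delete a BigQuery table so the next materialization recreates it."""
    # `client` is expected to be a google.cloud.bigquery.Client;
    # not_found_ok=True makes the call a no-op if the table is already gone.
    client.delete_table(table_id, not_found_ok=True)


# Sketch of use inside Dagster (assumes a BigQueryResource named `bigquery`):
#
#     with bigquery.get_client() as client:
#         drop_table_if_exists(client, "models.domain_map")
```

Running this before re-materializing means the IO manager finds no existing table, so the upload creates it with the new schema. It does discard all existing rows, so it only suits tables that are fully rebuilt on each run.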
j
Makes sense, I can see why it’s complex. I will do the delete and recreate myself for now then, but is there a place I can follow (maybe that issue) for updates if this behavior changes?
j
yep that issue will be a good spot! I also post any updates in this channel
j
great, thanks!