# integration-bigquery
a
Hello! I would like to ask if it's possible to use `yield` instead of `return` to push data into BigQuery with `BigQueryPandasIOManager`. I have a loop that pushes data into BigQuery on every iteration, so in theory I would need to use `yield` so that the asset doesn't terminate after the first batch. However, I get an error when using `yield`. Any help would be much appreciated!!
o
Hi @Akira Chang! Good question. This discussion covers the basics of what you're trying to accomplish here, but you'd need to write your own IOManager (which could be based on `BigQueryPandasIOManager` and would likely share most of its code) so that it can process the generator function's output.
a
Hello! Thank you for your reply! Could you elaborate on this, as I am kind of new to `BigQueryIOManager`?
o
Hi! The main work that `BigQueryIOManager` does happens here: https://sourcegraph.com/github.com/dagster-io/dagster/-/blob/python_modules/librarie[…]gster_gcp_pandas/bigquery/bigquery_pandas_type_handler.py?L49 (basically just a way to write a DataFrame into a BigQuery table), so you'd want to adapt that to your use case. The function signature there is a bit different from a regular IOManager's type signature, and rather than get into the specifics of why that is (it can get somewhat complicated), I'd recommend just writing your own IOManager from scratch: https://docs.dagster.io/_apidocs/io-managers
For that, you'll just need to implement two functions (`handle_output` and `load_input`).
(Sorry, had this in drafts and forgot to hit send.) The gist is that the `handle_output` function will look similar to what was posted in the original discussion: it accepts a generator as input, then consumes it, writing each chunk using the same method that `bigquery_pandas_type_handler` uses. For `load_input`, you'd want to do something similar to: https://sourcegraph.com/github.com/dagster-io/dagster/-/blob/python_modules/librarie[…]gster_gcp_pandas/bigquery/bigquery_pandas_type_handler.py?L78
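A minimal sketch of that shape, for illustration only. It mirrors the two-method IOManager interface without importing Dagster or the BigQuery client, so it runs on its own. The class name `ChunkedTableIOManager`, the `write_chunk` / `read_table` hooks, and the in-memory `fake_tables` store are all hypothetical stand-ins; in a real project the class would presumably subclass Dagster's `IOManager`, the methods would receive context objects rather than a table name, and the hooks would wrap whatever the type handler does to write and read DataFrames.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


class ChunkedTableIOManager:
    """Plain class mirroring the handle_output / load_input pair.

    Assumption: a real version would subclass dagster.IOManager and derive
    the table name from the output/input context instead of taking it directly.
    """

    def __init__(
        self,
        write_chunk: Callable[[str, Any], None],
        read_table: Callable[[str], Any],
    ) -> None:
        # Hypothetical hooks: with BigQuery these could wrap the client's
        # DataFrame load and query calls.
        self._write_chunk = write_chunk
        self._read_table = read_table

    def handle_output(self, table: str, obj: Iterable[Any]) -> None:
        # Consume the generator, writing each chunk as soon as it is produced.
        for chunk in obj:
            self._write_chunk(table, chunk)

    def load_input(self, table: str) -> Any:
        # Read back everything that was written to the table.
        return self._read_table(table)


def produce_chunks() -> Iterator[List[dict]]:
    """Stand-in for an asset body that yields one batch per loop iteration."""
    for i in range(3):
        yield [{"iteration": i, "value": i * 10}]


# In-memory stand-in for BigQuery so the example needs no credentials.
fake_tables: Dict[str, List[dict]] = {}


def fake_write(table: str, rows: List[dict]) -> None:
    fake_tables.setdefault(table, []).extend(rows)


def fake_read(table: str) -> List[dict]:
    return list(fake_tables.get(table, []))


manager = ChunkedTableIOManager(fake_write, fake_read)
manager.handle_output("my_dataset.my_table", produce_chunks())
rows = manager.load_input("my_dataset.my_table")
```

The point of the design is that the asset keeps using `yield` inside its loop, and the write happens once per yielded chunk inside `handle_output` rather than once at the end.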
a
Ok, I see! Thank you! Due to the complexity, I've decided to use the BigQuery API provided by Google instead. Thank you for the help again!! 😊