Hi Marc - sorry I haven't got back to you sooner. I appreciate the kind words and I am glad it could help you on your Dagster journey!
Both methods work perfectly fine, and it really comes down to your use case, but I can try to explain my thought process. When you load data into BigQuery, you can either overwrite the existing table or append to it.
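To make the two options concrete, here is a toy in-memory sketch. This is plain Python standing in for BigQuery, not the real client; I've only borrowed the names of the real write-disposition constants (`WRITE_APPEND` and `WRITE_TRUNCATE`):

```python
def load(table, rows, write_disposition):
    # Toy model of BigQuery's two write dispositions. The disposition names
    # match the real google-cloud-bigquery constants, but the "table" here
    # is just a Python list.
    if write_disposition == "WRITE_TRUNCATE":  # overwrite the existing table
        return list(rows)
    if write_disposition == "WRITE_APPEND":    # append to the existing table
        return table + list(rows)
    raise ValueError(f"unknown disposition: {write_disposition}")

t = load([], [{"gameweek": 1}], "WRITE_APPEND")
t = load(t, [{"gameweek": 2}], "WRITE_APPEND")    # table now holds both gameweeks
t = load(t, [{"gameweek": 2}], "WRITE_TRUNCATE")  # table now holds only gameweek 2
```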
If you want to load directly into BigQuery, you need to append the data to the table, as overwriting the table would require you to extract every previous gameweek's data each time you add a new gameweek. Once we are appending, we have to think about the consequences of rerunning any of the Dagster partitions at a later date, for example during a backfill. If we are not careful, the BQ table ends up with two versions of the data for the same gameweek. One way to overcome this is to create a second table that queries the first and selects the "correct" version of the data.
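Here is a rough sketch of that "second table" idea in plain Python, assuming each load tags its rows with a load timestamp (the column names are made up for illustration; in BigQuery the dedup step would be a view or scheduled query over the raw table):

```python
from datetime import datetime

# Toy append-only table: every load appends rows tagged with a load time.
raw_table = []

def append_gameweek(gameweek, rows, loaded_at):
    for row in rows:
        raw_table.append({"gameweek": gameweek, "loaded_at": loaded_at, **row})

# Gameweek 1 is loaded, then re-run later (e.g. a Dagster backfill),
# leaving two versions of the same gameweek in the table.
append_gameweek(1, [{"points": 10}], datetime(2024, 8, 1))
append_gameweek(1, [{"points": 12}], datetime(2024, 9, 1))  # the re-run

def latest_version(table):
    """Mimics the second table's query: keep only the newest load per gameweek."""
    newest = {}
    for row in table:
        gw = row["gameweek"]
        if gw not in newest or row["loaded_at"] > newest[gw]:
            newest[gw] = row["loaded_at"]
    return [r for r in table if r["loaded_at"] == newest[r["gameweek"]]]

print(latest_version(raw_table))  # only the 2024-09-01 version of gameweek 1 survives
```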
I decided against this and chose to overwrite the table each time instead. To avoid having to re-extract all of the previous gameweeks when running the pipeline for a new gameweek, I store the extracted data in GCS first. So for each gameweek, I extract that gameweek's data into GCS and then load all of the gameweeks stored in the bucket into a BigQuery table.
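A toy sketch of that GCS-staging flow, with a dict standing in for the bucket and a list standing in for the table (the names are illustrative, not real client calls):

```python
# Plain Python stand-ins for GCS and BigQuery.
bucket = {}    # one object per gameweek, e.g. "gameweek_1.json"
bq_table = []

def extract_gameweek(gameweek, rows):
    # Only the current gameweek is extracted from the source...
    bucket[f"gameweek_{gameweek}.json"] = rows

def load_all_gameweeks():
    # ...but the table is rebuilt (overwritten) from every object in the
    # bucket, so re-running a partition just replaces its object and the
    # pipeline stays idempotent.
    global bq_table
    bq_table = [row for obj in sorted(bucket) for row in bucket[obj]]

extract_gameweek(1, [{"gameweek": 1, "points": 10}])
load_all_gameweeks()
extract_gameweek(1, [{"gameweek": 1, "points": 12}])  # backfill: overwrites the object
extract_gameweek(2, [{"gameweek": 2, "points": 7}])
load_all_gameweeks()
print(bq_table)  # one row per gameweek, no duplicates
```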
A better approach than mine is actually to create a partitioned BigQuery table (these partitions are different from Dagster partitions), as then you can simply overwrite the associated partition in BigQuery rather than the whole table. I didn't use this method because my data is very small, so there is no point partitioning it in BigQuery. Note I am not saying my method is the best, but I found it the simplest for my use case.
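A minimal sketch of why partition-level overwrite is nicer, again with a dict standing in for the table (in real BigQuery this would be something like an integer-range partitioned table on the gameweek column, loaded with WRITE_TRUNCATE scoped to a single partition):

```python
# Toy model of a BigQuery table partitioned by gameweek: each key is a partition.
partitioned_table = {}

def overwrite_partition(gameweek, rows):
    # Only this gameweek's partition is replaced; every other partition
    # is left untouched, so no GCS staging step is needed.
    partitioned_table[gameweek] = rows

overwrite_partition(1, [{"points": 10}])
overwrite_partition(2, [{"points": 7}])
overwrite_partition(1, [{"points": 12}])  # re-run gameweek 1; gameweek 2 is untouched
```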
Hopefully this helps and I am happy to clarify or answer any other questions you may have.