Tobias Macey
01/18/2023, 3:43 PM
load_assets_from_airbyte_instance) which creates the raw data tables which my dbt project relies on. I also have it loading the dbt project using load_assets_from_dbt_project, which loads all of those assets. Unfortunately the staging dbt models are disconnected in the lineage view from the airbyte streams. If I follow things correctly it seems that I need to define an IO manager that will serve to map those two sets of assets together? Or is there another way to tell Dagster that the two groupings of assets are related without having to write a custom IO manager that communicates with AWS Glue?
Jonathan Neo
01/18/2023, 3:49 PM
sources.yml to define the airbyte sources? That might be what's missing.
Here's a toy project that I put together with airbyte and dbt: https://github.com/jonathanneo/data-aware-orchestration
Tobias Macey
01/18/2023, 3:52 PM
Jonathan Neo
01/18/2023, 3:54 PM
version: 2
sources:
  - name: trino
    database: trino
    schema: public
    tables:
      - name: airbyte_asset_name # this is what (1) dagster will use to create the global DAG, and (2) what dbt source() macro will use
        identifier: trino_table_name # this is what dbt will physically use to run the model
Tobias Macey
01/18/2023, 4:21 PM
key_prefix in the Airbyte asset loader so that it was scoped to the full asset key that dbt was looking at. Thanks!
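
For anyone reading along later, here is a rough sketch of why the key_prefix fix connects the two asset groups. It is a plain-Python model of the key matching, not Dagster's actual code. It assumes that a dbt source table is keyed as [source name, table name] and that an Airbyte stream is keyed as key_prefix followed by the table name. The helper names dbt_source_asset_key and airbyte_asset_key are hypothetical and exist only for this illustration; the source and table names are taken from the sources.yml above.

```python
from typing import List, Sequence


def dbt_source_asset_key(source_name: str, table_name: str) -> List[str]:
    # Assumption: a dbt source table is keyed as [source name, table name].
    return [source_name, table_name]


def airbyte_asset_key(stream_name: str, key_prefix: Sequence[str] = ()) -> List[str]:
    # Assumption: an Airbyte stream is keyed as key_prefix + [table name].
    return [*key_prefix, stream_name]


# Names from the sources.yml example in this thread.
source_key = dbt_source_asset_key("trino", "airbyte_asset_name")

# Without a prefix the keys differ, so the lineage graph stays disconnected.
unprefixed = airbyte_asset_key("airbyte_asset_name")

# With a prefix equal to the dbt source name, the keys line up.
prefixed = airbyte_asset_key("airbyte_asset_name", key_prefix=["trino"])

print("unprefixed matches:", source_key == unprefixed)
print("prefixed matches:", source_key == prefixed)
```

If those assumptions hold for your Dagster version, passing the dbt source name as key_prefix to the Airbyte asset loader should be enough to join the graphs, with no custom IO manager needed.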