Hello all,
I am looking at moving up a level from the software-defined asset (SDA) layer for dbt executions. I just don't see the SDA approach scaling for us: we have Python scripts that ingest hundreds of tables in a single execution, with multiple dbt projects downstream of that. Even if I went to the trouble of refactoring our dbt projects for key linking between them, I don't see how a single ingestion script could generate the requisite hundreds of source-table SDAs for the dbt assets to link to.

Ideally, I would like to define an op as a run/build/test execution of a dbt project through the dbt CLI. I believe I could then build a graph of these executions, link them to the source-data ingestion step (also an op), and execute the whole graph as a single job.

Does anyone have an example of a dbt CLI execution within an op? dbt 1.5 is now out with a Python entry point for invoking dbt programmatically, which could be called from inside an op, but there aren't many code examples to go on yet. Any input, guidance, or sanity check is appreciated.
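To make the question concrete, here is a rough sketch of what I mean by "a dbt CLI execution within an op". The helper names, project paths, and op wiring below are placeholders I made up for illustration, not working code from our deployment; the simplest version just shells out to the dbt CLI with subprocess:

```python
import subprocess


def dbt_cli_args(command, project_dir, profiles_dir=None):
    """Build the argv for a dbt CLI invocation (hypothetical helper)."""
    args = ["dbt", command, "--project-dir", project_dir]
    if profiles_dir:
        args += ["--profiles-dir", profiles_dir]
    return args


def run_dbt(command, project_dir, profiles_dir=None):
    # Shell out to the dbt CLI; check=True raises CalledProcessError on a
    # non-zero exit, so a failed dbt run would fail the enclosing op.
    subprocess.run(dbt_cli_args(command, project_dir, profiles_dir), check=True)


# In Dagster, that body would be wrapped in an op, roughly like:
#
#   from dagster import op, In, Nothing
#
#   @op(ins={"start": In(Nothing)})  # Nothing-dep sequences it after ingestion
#   def dbt_build_project_a():
#       run_dbt("build", "/path/to/project_a")
#
# With dbt >= 1.5, the in-process dbtRunner is an alternative to subprocess:
#
#   from dbt.cli.main import dbtRunner
#   res = dbtRunner().invoke(["build", "--project-dir", "/path/to/project_a"])
#   assert res.success
```

The ingestion script would be its own op, and each dbt project's build/test would hang off it via Nothing-typed dependencies, giving one job per end-to-end pipeline.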