#dagster-support

geoHeil

07/11/2022, 7:51 AM
When working with dbt, why does the IO manager try to perform write operations (a handle_output call) even though the steps take place only inside dbt? I do not want to write the asset a second time from Python.
yuhan

07/11/2022, 7:20 PM
I think the IO manager is persisting metadata that's passed between ops. cc @owen to confirm this ^. Also, could we make IO managers work better with the dbt integration, so that the side effects happening in dbt land become part of the IO manager (and would that be good practice)?
geoHeil

07/11/2022, 8:50 PM
No, the regular handle_output method is called.
owen

07/11/2022, 11:10 PM
Yeah, this is definitely a bit weird, and it's because Dagster doesn't have a concept to encapsulate "this output is stored inside the body of the op". For these cases (if you're using a custom IO manager), the best practice would just be to have `handle_output()` be a no-op. If you're using this custom IO manager to handle non-dbt outputs in addition to dbt outputs, you could potentially branch the `handle_output` behavior based on the type of the output object. The operation generated by `load_assets_from_dbt_project` just returns `None` as the `output_value` for each of the dbt models, so a pattern I've used is to check if the output is `None` and, if it is, do nothing inside of `handle_output`.
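A minimal sketch of the "skip `None`" pattern described above. The class name `NoneSkippingIOManager` and its in-memory `storage` dict are hypothetical stand-ins for a real storage backend, and the `SimpleNamespace` objects only imitate the `name` and `upstream_output` attributes of Dagster's context objects. The import fallback is there only so the sketch also runs without Dagster installed.

```python
from types import SimpleNamespace

try:
    from dagster import IOManager  # real base class when Dagster is installed
except ImportError:
    IOManager = object  # fallback so the sketch runs standalone


class NoneSkippingIOManager(IOManager):
    """Hypothetical IO manager that ignores outputs whose value is None."""

    def __init__(self):
        # In-memory dict as a stand-in for a real storage backend.
        self.storage = {}

    def handle_output(self, context, obj):
        # dbt models arrive here as None because dbt already wrote them.
        if obj is None:
            return
        self.storage[context.name] = obj

    def load_input(self, context):
        return self.storage[context.upstream_output.name]


# Usage with stand-in context objects (not real Dagster contexts).
manager = NoneSkippingIOManager()
manager.handle_output(SimpleNamespace(name="dbt_model"), None)
manager.handle_output(SimpleNamespace(name="python_asset"), [1, 2, 3])
loaded = manager.load_input(
    SimpleNamespace(upstream_output=SimpleNamespace(name="python_asset"))
)
```

With this branch in place, only the Python-produced value is stored; the dbt output is skipped.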
Another option that might be interesting to look into here would be https://docs.dagster.io/concepts/io-management/io-managers#asset-input-io-managers. This lets you decide how to load the dbt model as input independently of how it was stored as output. So you could create a `noop_io_manager` that you apply when you call `load_assets_from_dbt_project`, then supply an input manager key to any downstream Python assets that consume a dbt asset. This feature is mostly intended to help in situations where you might want a different loading behavior per dbt model, but I figured I'd mention it.
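A rough sketch of the two pieces this would involve, assuming the split described above: a no-op manager for the dbt outputs and a separate loader for downstream inputs. `NoopIOManager`, `TableLoader`, and the `read_table` callable are hypothetical names, and a dict stands in for the warehouse. Wiring the managers into Dagster (the resource keys and the input manager key on the downstream Python asset) is not shown here.

```python
from types import SimpleNamespace

try:
    from dagster import IOManager  # real base class when Dagster is installed
except ImportError:
    IOManager = object  # fallback so the sketch runs standalone


class NoopIOManager(IOManager):
    """Hypothetical manager for dbt assets: dbt already wrote the table."""

    def handle_output(self, context, obj):
        pass  # nothing to persist from Python

    def load_input(self, context):
        return None


class TableLoader(IOManager):
    """Hypothetical input-side manager that reads a dbt model by name."""

    def __init__(self, read_table):
        # read_table is an assumed callable: table name -> loaded data.
        self._read_table = read_table

    def handle_output(self, context, obj):
        raise NotImplementedError("TableLoader is only used for inputs")

    def load_input(self, context):
        # Assumes the upstream output name matches the dbt model name.
        return self._read_table(context.upstream_output.name)


# Usage with a fake warehouse and stand-in context objects.
fake_warehouse = {"customers": [{"id": 1}, {"id": 2}]}
loader = TableLoader(fake_warehouse.__getitem__)
rows = loader.load_input(
    SimpleNamespace(upstream_output=SimpleNamespace(name="customers"))
)
noop_result = NoopIOManager().handle_output(SimpleNamespace(name="customers"), None)
```

The point of the split is that the dbt side never writes from Python, while each consumer decides how to read the table.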
geoHeil

07/12/2022, 5:45 AM
Would you have a code example for this? So far I am specifying a pandas IO manager where handle_output defaults to a no-op and the input can then be read again. The link you shared annotates the assets with an `input_manager_key`, but the dbt assets are not part of the Python code (they are created implicitly from dbt), so I am not really sure where to put the input manager key.