Can you dynamically define asset dependencies / na...
# ask-community
m
Can you dynamically define asset dependencies / names based on data being processed? I have a db with primary keys for different experiments being run. Experiments have a raw-data gathering op, and a result-computing op. Most experiments only depend on their own raw-data output, but some read from other related experiments -- something I can figure out by reading the db. Right now, I just do all the raw-data steps, and then all the result-computing steps. Could I read some db state, and then dynamically generate assets for all the raw-data and result-computing steps, with appropriate dependencies, and then let Dagster figure out the right order to run the ops in? What I saw in the asset docs only showed asset names declared statically.
j
You might be able to do something like the asset factory pattern here. Probably the cleanest way to do it would be to read the db state outside of an asset. If you need to read the db from within an asset, you might be able to do something like this, but it requires a definitions reload: within an asset, read the db and append the required info to a YAML file, and in your Python file create assets based on the contents of that YAML file. When you materialize the first asset, the info needed to make the next assets is written to the YAML file. You would then reload the definitions, which would build all the assets from the YAML file and load them into the definitions.
👀 1
m
Interesting. I would want to generate the asset definitions while the Dagster job is running -- there is some other work around refreshing metadata that needs to happen first. So reading the db state when creating the Definitions probably wouldn't work. Is there "outside of an asset" that is still inside the Dagster job execution, or would that just be doing a definitions reload?
j
is there “outside of an asset” that is still inside the Dagster job execution,
Not really. The issue with creating the assets while inside a Dagster run is adding them to the definitions. I can't think of a way to do that that doesn't require a reload. Definitely play around with it with some small examples; there could be some Python tricks to make it work, I just don't know of any.
👍🏻 1