# ask-community
👋 Hello, team! I'm wondering how I can loop over the output of an OP in a job. The OP structure is as follows:

1. Fetch data from API --> List
2. Loop through that List to execute API calls

I get this error when I try to do that in a job:
```
Attempted to iterate over an InvokedSolidOutputHandle. This object represents the output "result" from the solid "extract_game_ids_to_list". Consider defining multiple Outs if you seek to pass different parts of this output to different solids.
```
https://docs.dagster.io/concepts/ops-jobs-graphs/dynamic-graphs

the `@job` function is evaluated at init time to determine the dependency structure of the graph, it isn’t used at “run” time
Thanks @alex, so I would want to create a third OP to loop through the output?
i think you want to convert `extract_game_ids_to_list` to use a “dynamic output” instead of a list, if the goal is to run downstream work separately for each id
`extract_game_id_to_list` outputs a game_id that is used to call an API and return a JSON object, basically
I would then load that JSON to S3 (using another OP, I guess?)
Maybe I haven't fully grasped the idea of a job just yet
basically, all actual code execution has to happen in `op`s. So you could either
• do the iteration inside `extract_game_data_to_json`, making it `List[id]` -> `List[json]`
• use dynamic outputs, which will effectively clone the `extract_game_data_to_json` op that goes `id` -> `json` for each `id` that is determined at runtime
Is it common to handle the entire extraction/load to s3 in one op?
Assuming yes from the docs:
I'm basically trying to do bullet # 4
ya, one way to look at it is checkpointing: if something fails, how much do you want to re-do?

You could just do everything all in one big op, but you have to start all the way over if anything fails. Similarly, operating on whole lists, you have to re-do the whole list if just one item fails.

Something like splitting extract and load comes down to how expensive the extract is: if the load fails, do you care if you have to re-extract?
the idea then is to encapsulate ops within jobs within graphs, right?
a dependency graph of ops is a `graph`
a `job` is an executable `graph`
I use DynamicOutput for a very similar operation: Retrieve list of IDs from API --> Fan out to individual API requests for each ID in the list --> Collect all the results and pass them on for downstream processing