
Remi Gabillet

09/03/2021, 2:30 PM
👋 everyone. I'm very new to Dagster as I'm trying to convert an existing script to a Dagster pipeline for the first time. This is not working:
• I have a solid returning a dict
• I have a downstream solid attempting to iterate through the dict
• =>
Error loading repository location glue.py:dagster.core.errors.DagsterInvariantViolationError: Attempted to iterate over an InvokedSolidOutputHandle. This object represents the output "result" from the solid "load_partitions". Consider yielding multiple Outputs if you seek to pass different parts of this output to different solids.
I'm not using any AssetMaterialization, simply returning a Python dict. Any help would be appreciated.
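Editor's sketch: the error message suggests that, inside a pipeline definition, calling a solid returns a placeholder handle rather than the dict itself, so the loop has to live inside the downstream solid. The toy model below illustrates that idea in plain Python. `OutputHandle`, `count_partitions`, and the sample data are hypothetical stand-ins, not Dagster's implementation.

```python
class OutputHandle:
    """Toy stand-in for the placeholder a solid call yields at composition time."""

    def __init__(self, solid_name):
        self.solid_name = solid_name

    def __iter__(self):
        # Mirrors the spirit of the error in the thread: no real value exists yet.
        raise TypeError(f"cannot iterate over the output handle of {self.solid_name!r}")


def load_partitions():
    # Upstream step: returns a plain dict when it is actually executed.
    return {"events": ["2021-09-01", "2021-09-02"], "users": ["2021-09-01"]}


def count_partitions(partitions):
    # Downstream step: iteration happens here, where the input is a real dict.
    return {table: len(parts) for table, parts in partitions.items()}


# Composition time: only a handle exists, so looping over it fails.
handle = OutputHandle("load_partitions")
try:
    for _ in handle:
        pass
    iteration_failed = False
except TypeError:
    iteration_failed = True

# Execution time: the upstream value is passed to the downstream step.
counts = count_partitions(load_partitions())
```

In Dagster terms, this likely means the pipeline body should only wire solids together, while any `.items()` loop moves into the body of the solid that receives the dict as an input.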

Darren Ng

09/03/2021, 2:33 PM
New to dagster too and still learning. By any chance, are any of your solids returning multiple outputs? If so, you'll have to define Output Definitions and return them by yielding the outputs.

chris

09/03/2021, 2:33 PM
Would you mind posting your code? Makes it easier to tell what's going on.

Remi Gabillet

09/03/2021, 2:38 PM
thanks for the reply, here is the code:
`select_tables` returns a dict. With this exact code, I get the following when running `dagit`:
/Users/remi/.pyenv/versions/3.8.9/lib/python3.8/site-packages/dagster/core/workspace/context.py:510: UserWarning: Error loading repository location glue.py:AttributeError: 'InvokedSolidOutputHandle' object has no attribute 'items'
I'm assuming that dagster's `mem_io_manager` is converting the dict output into something else. I'm very new to Dagster (5 minutes of experience lol)

chris

09/03/2021, 2:43 PM
So I think the main issue here is that pipelines aren't intended to take inputs like this. database_name, input_table_names, new_table_name should all probably be specified via config
Is select_tables also a solid?

Remi Gabillet

09/03/2021, 2:46 PM
Using the context.config makes sense for these. Yes, `select_tables` is a solid. It pulls a list of tables from a data source.
in this case it's from a Glue Data Catalog

chris

09/03/2021, 2:51 PM
Okay, gotcha. Yea so for example, you could parameterize select_tables like so:
from dagster import solid

@solid(config_schema={"database_name": str})
def select_tables(context):
    db_name = context.solid_config["database_name"]
    ....
And then in dagit, you can navigate to the playground and provide this config there when executing. If using `execute_pipeline`, you can provide it via the `run_config` argument. For more information on these aspects: Execution docs, Run config docs (although I just found an error in these docs, the `config_example_solid` should have a config_schema defined lol. Gonna fix that.)
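Editor's sketch: the run config for a solid like `select_tables` is a nested dict keyed by solid name. The example below shows the shape it would likely take for the legacy solid API discussed here. The database name `my_glue_database` and the pipeline name in the comment are hypothetical.

```python
# Run config for the select_tables solid, assuming its config_schema is
# {"database_name": str} as in the message above.
run_config = {
    "solids": {
        "select_tables": {
            # "my_glue_database" is a placeholder value, not from the thread.
            "config": {"database_name": "my_glue_database"},
        }
    }
}

# With Dagster installed, this dict would be passed along the lines of:
#   execute_pipeline(my_pipeline, run_config=run_config)
# where my_pipeline is a placeholder for the pipeline containing select_tables.
database_name = run_config["solids"]["select_tables"]["config"]["database_name"]
```

The same structure, written as YAML, is what the dagit playground expects.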

Remi Gabillet

09/03/2021, 3:02 PM
Got it @chris I'm trying to use context.config, how do I share config across solids?
I've gotten much further btw, dagit works and I'm able to run the pipeline, yay. Now I wonder how to set pipeline-level config 🤔 I'm repeating the same config values across solids:

prha

09/03/2021, 3:48 PM
@Remi Gabillet You may want to check out resources, which you can configure once and access from your solids https://docs.dagster.io/concepts/modes-resources
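Editor's sketch: the difference between repeating config per solid and configuring a resource once shows up in the shape of the run config. The comparison below assumes a hypothetical resource key `glue` and assumes that a second solid, `load_partitions`, needs the same database name. Neither detail is confirmed in the thread.

```python
DATABASE = "my_glue_database"  # placeholder value

# Per-solid config: the same value is repeated for every solid that needs it.
repeated_config = {
    "solids": {
        name: {"config": {"database_name": DATABASE}}
        for name in ("select_tables", "load_partitions")
    }
}

# Resource config: the value is stated once under a resource key.
# In the legacy API, solids would declare required_resource_keys={"glue"}
# and read the shared object from context.resources.glue.
shared_config = {
    "resources": {
        "glue": {"config": {"database_name": DATABASE}},
    }
}
```

With the resource approach, adding another solid that needs the database name does not change the run config at all.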

chris

09/03/2021, 4:30 PM
^ this is a super clean option as well. If you need to be able to change on a per-run basis, you can also use the `composite_solid` abstraction, which allows you to map config to internal solids: https://docs.dagster.io/concepts/solids-pipelines/composite-solids#configuration-mapping
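Editor's sketch: a configuration mapping is essentially a function that takes the outer config and returns config for each inner solid. The plain-Python function below shows that fan-out. The function name `map_shared_config` and the inner solid `load_partitions` needing `database_name` are assumptions for illustration.

```python
def map_shared_config(cfg):
    """Fan one outer config value out to each inner solid's config."""
    return {
        "select_tables": {"config": {"database_name": cfg["database_name"]}},
        "load_partitions": {"config": {"database_name": cfg["database_name"]}},
    }


# The outer config is supplied once per run; the mapping fills in the rest.
inner_config = map_shared_config({"database_name": "my_glue_database"})
```

In the legacy API, a function like this would likely be passed as `config_fn` to `@composite_solid`, together with `config_schema={"database_name": str}`; see the linked docs for the exact form.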

Remi Gabillet

09/04/2021, 8:45 AM
thanks for the help everyone!