:wave: everyone. I'm very new to Dagster as I'm tr...
# ask-community
r
👋 everyone. I'm very new to Dagster as I'm trying to convert an existing script to a Dagster pipeline for the first time. This is not working:
• I have a solid returning a dict
• I have a downstream solid attempting to iterate through the dict
• =>
Error loading repository location glue.py:dagster.core.errors.DagsterInvariantViolationError: Attempted to iterate over an InvokedSolidOutputHandle. This object represents the output "result" from the solid "load_partitions". Consider yielding multiple Outputs if you seek to pass different parts of this output to different solids.
I'm not using any AssetMaterialization, simply returning a Python dict. Any help would be appreciated.
d
New to dagster too and still learning. By any chance, are any of your solids returning multiple outputs? If so, you'll have to define output definitions and return them by yielding the outputs.
c
Would you mind posting your code? Makes it easier to tell what's going on.
r
thanks for the reply, here is the code:
`select_tables` returns a dict. With this exact code, I get the following when running `dagit`:
/Users/remi/.pyenv/versions/3.8.9/lib/python3.8/site-packages/dagster/core/workspace/context.py:510: UserWarning: Error loading repository location glue.py:AttributeError: 'InvokedSolidOutputHandle' object has no attribute 'items'
I'm assuming that dagster's `mem_io_manager` is converting the dict output into something else. I'm very new to Dagster (5 minutes of experience lol)
c
So I think the main issue here is that pipelines aren't intended to take inputs like this. database_name, input_table_names, new_table_name should all probably be specified via config
Is select_tables also a solid?
r
Using the context.config makes sense for these. Yes, `select_tables` is a solid. It pulls a list of tables from a data source.
in this case it's from a Glue Data Catalog
c
Okay, gotcha. Yea so for example, you could parameterize select_tables like so:
from dagster import solid

@solid(config_schema={"database_name": str})
def select_tables(context):
    db_name = context.solid_config["database_name"]
    ....
And then in dagit, you can navigate to the playground, and provide this config there when executing. If using `execute_pipeline`, you can provide it via the `run_config` argument. For more information on these aspects: Execution docs, Run config docs (although I just found an error in these docs, the `config_example_solid` should have a config_schema defined lol. Gonna fix that.)
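For reference, a minimal sketch of what the `run_config` for the `select_tables` example above might look like. In the solid/pipeline API, per-solid config is nested under a top-level `solids` key, then the solid name, then `config`. The helper `solid_run_config` and the value `"my_glue_db"` are illustrative assumptions, not something from this thread.

```python
# Sketch only: builds the nested dict expected for per-solid config.
# `solid_run_config` is a hypothetical helper; "my_glue_db" is a placeholder value.

def solid_run_config(solid_name, config):
    """Nest a solid's config under solids -> <solid name> -> config."""
    return {"solids": {solid_name: {"config": config}}}


run_config = solid_run_config("select_tables", {"database_name": "my_glue_db"})

# The resulting dict is what would be passed along, e.g.:
#   execute_pipeline(my_pipeline, run_config=run_config)
# where `my_pipeline` stands in for your own pipeline definition.
```

The same structure, written as YAML, is what the dagit playground expects.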
r
Got it @chris I'm trying to use context.config, how do I share config across solids?
I've gotten much further btw, dagit works and I'm able to run the pipeline, yay. Now I wonder how to set pipeline-level config 🤔 I'm repeating the same config values across solids:
p
@Remi Gabillet You may want to check out resources, which you can configure once and access from your solids https://docs.dagster.io/concepts/modes-resources
c
^ this is a super clean option as well. If you need to be able to change on a per-run basis, you can also use the `composite_solid` abstraction, which allows you to map config to internal solids: https://docs.dagster.io/concepts/solids-pipelines/composite-solids#configuration-mapping
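To make the resources suggestion concrete, here is a rough sketch of how the run config changes shape. It assumes a resource registered under the hypothetical key `glue` that takes a `database_name`, and it assumes `load_partitions` needed the same value as `select_tables`; neither detail is confirmed in the thread.

```python
# Sketch only: compares per-solid config with a single resource config entry.
# The resource key "glue" and the value "my_glue_db" are placeholders.

DATABASE_NAME = "my_glue_db"

# Before: the same value is repeated for every solid that needs it.
per_solid_config = {
    "solids": {
        "select_tables": {"config": {"database_name": DATABASE_NAME}},
        "load_partitions": {"config": {"database_name": DATABASE_NAME}},
    }
}

# After: the value is configured once on the resource. Solids that declare
# required_resource_keys={"glue"} would read it via context.resources.glue.
resource_config = {
    "resources": {
        "glue": {"config": {"database_name": DATABASE_NAME}},
    }
}
```

With the resource approach, adding another solid that talks to the same database does not add another config entry.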
r
thanks for the help everyone!