wave everyone I m very new to Dagster as I m trying to conv dagster #ask-community

Remi Gabillet

09/03/2021, 2:30 PM

👋 everyone. I'm very new to Dagster as I'm trying to convert an existing script to a Dagster pipeline for the first time. This is not working:
• I have a solid returning a dict
• I have a downstream solid attempting to iterate through the dict
• =>

Error loading repository location glue.py:dagster.core.errors.DagsterInvariantViolationError: Attempted to iterate over an InvokedSolidOutputHandle. This object represents the output "result" from the solid "load_partitions". Consider yielding multiple Outputs if you seek to pass different parts of this output to different solids.

Remi Gabillet

09/03/2021, 2:30 PM

I'm not using any AssetMaterialization, simply returning a Python dict. Any help would be appreciated.

Darren Ng

09/03/2021, 2:33 PM

New to dagster too and still learning. By any chance, are any of your solids returning multiple outputs? If so, you'll have to define Output Definitions and return them by yielding the outputs.

chris

09/03/2021, 2:33 PM

Would you mind posting your code? Makes it easier to tell what's going on.

Remi Gabillet

09/03/2021, 2:38 PM

thanks for the reply, here is the code:

Remi Gabillet

09/03/2021, 2:39 PM

Remi Gabillet

09/03/2021, 2:40 PM

`select_tables` returns a dict. With this exact code, I get the following when running `dagit`:

/Users/remi/.pyenv/versions/3.8.9/lib/python3.8/site-packages/dagster/core/workspace/context.py:510: UserWarning: Error loading repository location glue.py:AttributeError: 'InvokedSolidOutputHandle' object has no attribute 'items'

Remi Gabillet

09/03/2021, 2:41 PM

I'm assuming that dagster's `mem_io_manager` is converting the dict output into something else. I'm very new to Dagster (5 minutes of experience lol)

chris

09/03/2021, 2:43 PM

So I think the main issue here is that pipelines aren't intended to take inputs like this. database_name, input_table_names, new_table_name should all probably be specified via config
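The two errors above both mention an `InvokedSolidOutputHandle`, which suggests the dict is never touched by an IO manager at that point: the pipeline body runs once at load time to wire solids together, and calling a solid there returns a placeholder rather than its value. Below is a toy, pure-Python model of that idea. It is not Dagster's implementation; `OutputHandle`, `toy_solid`, `count_partitions`, and the table data are all made up for illustration.

```python
class OutputHandle:
    """Toy placeholder for a value that only exists once the pipeline runs."""

    def __init__(self, solid_name):
        self.solid_name = solid_name


def toy_solid(fn):
    """Toy decorator: calling the solid while composing returns a handle."""

    def compose(*upstream):
        # No user code runs here; we only record which solid was invoked.
        return OutputHandle(fn.__name__)

    compose.run = fn  # the real function, called only at "run time"
    return compose


@toy_solid
def select_tables():
    # Hypothetical data standing in for a catalog lookup.
    return {"events": ["2021-09-01", "2021-09-02"]}


@toy_solid
def count_partitions(tables):
    # Iterating is fine here: at run time `tables` is the real dict.
    return {name: len(parts) for name, parts in tables.items()}


# Composition time: only a handle exists, so dict methods are missing.
handle = select_tables()
try:
    handle.items()
except AttributeError as exc:
    composition_error = str(exc)

# Run time: real values flow from one solid into the next.
result = count_partitions.run(select_tables.run())
```

The practical takeaway is to keep any loop over the dict inside the body of the downstream solid, and to leave the pipeline function with nothing but solid-to-solid wiring.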

chris

09/03/2021, 2:43 PM

Is select_tables also a solid?

Remi Gabillet

09/03/2021, 2:46 PM

Using the context.config makes sense for these. Yes, `select_tables` is a solid. It pulls a list of tables from a data source.

Remi Gabillet

09/03/2021, 2:46 PM

in this case it's from a Glue Data Catalog

chris

09/03/2021, 2:51 PM

Okay, gotcha. Yea so for example, you could parameterize select_tables like so:

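from dagster import solid

# Declare the config this solid expects; the value is supplied at run time.
@solid(config_schema={"database_name": str})
def select_tables(context):
    db_name = context.solid_config["database_name"]
    ...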

And then in dagit, you can navigate to the playground and provide this config there when executing. If using `execute_pipeline`, you can provide it via the `run_config` argument. For more information on these aspects: Execution docs, Run config docs (although I just found an error in these docs, the `config_example_solid` should have a config_schema defined lol. Gonna fix that.)
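For reference, here is a sketch of what the run config for that solid might look like as a Python dict, using the per-solid layout of the solid/pipeline API discussed in this thread. The value `"my_database"` and the pipeline name in the comment are placeholders, not taken from the original code.

```python
# Per-solid run config: solids -> <solid name> -> config -> <schema fields>.
run_config = {
    "solids": {
        "select_tables": {
            "config": {"database_name": "my_database"},  # placeholder value
        }
    }
}

# With the solid/pipeline API this would be passed along the lines of:
#   execute_pipeline(my_pipeline, run_config=run_config)
# where `my_pipeline` stands in for the pipeline defined in glue.py.
```

The same structure, written as YAML, is what goes into the dagit playground editor.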

Remi Gabillet

09/03/2021, 3:02 PM

Got it @chris I'm trying to use context.config, how do I share config across solids?

Remi Gabillet

09/03/2021, 3:23 PM

I've gotten much further btw, dagit works and I'm able to run the pipeline, yay. Now I wonder how to set pipeline-level config 🤔 I'm repeating the same config values across solids:

prha

09/03/2021, 3:48 PM

@Remi Gabillet You may want to check out resources, which you can configure once and access from your solids https://docs.dagster.io/concepts/modes-resources
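As a rough illustration of why a resource removes the repetition: the shared value moves out of each solid's config and into a single resource entry in the run config. The resource key `glue` below is a made-up name, and the sketch assumes a resource defined with a `database_name` field in its config schema and attached to the pipeline's mode.

```python
# Run config with the shared setting supplied once, on a resource.
# "glue" is a hypothetical resource key; "my_database" is a placeholder.
run_config = {
    "resources": {
        "glue": {
            "config": {"database_name": "my_database"},
        }
    }
}

# Each solid that needs the value would declare the resource, e.g.
#   @solid(required_resource_keys={"glue"})
# and read it from context.resources.glue instead of context.solid_config.
```

No per-solid `database_name` entries are needed in this layout, so changing the database means editing one place.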

chris

09/03/2021, 4:30 PM

^ this is a super clean option as well. If you need to be able to change on a per-run basis, you can also use the `composite_solid` abstraction, which allows you to map config to internal solids: https://docs.dagster.io/concepts/solids-pipelines/composite-solids#configuration-mapping
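A config mapping is essentially a function that takes one outer config and fans it out to the inner solids. The sketch below shows only that function in plain Python; the name `map_shared_config` is made up, and the way it would be attached (a `config_fn` argument on `@composite_solid` alongside a `config_schema`) is my reading of the linked docs and should be checked there.

```python
def map_shared_config(outer_config):
    """Fan one outer config value out to each inner solid's config."""
    db_name = outer_config["database_name"]
    return {
        # Keys are inner solid names; both solids are named in this thread.
        "select_tables": {"config": {"database_name": db_name}},
        "load_partitions": {"config": {"database_name": db_name}},
    }


# Assumed usage, to be verified against the linked docs:
#   @composite_solid(config_schema={"database_name": str},
#                    config_fn=map_shared_config)
mapped = map_shared_config({"database_name": "my_database"})  # placeholder value
```

The caller then configures only the composite solid, and each inner solid still reads `context.solid_config["database_name"]` unchanged.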

Remi Gabillet

09/04/2021, 8:45 AM

thanks for the help everyone!
