Hi All, I'm just starting diving into dagster and ...
# announcements
e
Hi All, I'm just starting to dive into dagster and hoping to use it to refactor some of our (simple) data pipelines at my place of work. I'm trying to create a reusable solid that connects to our database and executes a SQL query, returning the data as a pandas DataFrame. However, I'm receiving a DagsterInvalidDefinitionError which states "Must pass the output from previous solid invocations or inputs to the composition function as inputs when invoking solids during composition.". Is it possible to pass just a regular data type (e.g. string, int, etc.) as a parameter to a solid instead of having it be the return value of another solid?
a
Hey! What you’ve hit is certainly a confusing part of the API for newcomers - we should improve the error message. The @pipeline decorated function is special - we just use it to build up the dependency graph of solids that defines the pipeline, not to resolve any “runtime” values or inputs. What you are looking to do is probably:
from dagster import execute_pipeline, pipeline

@pipeline
def hello_world():
    execute_db2_sql()  # don't fill out inputs here

# inputs are supplied at execution time through environment_dict
execute_pipeline(
    hello_world,
    environment_dict={'solids': {'execute_db2_sql': {'inputs': {'conn_str': '<CONN>', 'sql': 'SELECT dagster FROM good_stuff'}}}},
)
The point being you can fill out inputs to solids at execution time since they are parameterized. In the Python API this is the environment_dict argument, and in dagit you use the YAML editor.
This recently updated section of the tutorial is relevant: https://dagster.readthedocs.io/en/0.6.3/sections/learn/tutorial/inputs.html
and you may also want to use the config system to parameterize your solids
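A minimal sketch of building the environment_dict shape from the snippet above in plain Python, which may help when several solids need inputs. The helper name build_environment_dict is hypothetical and not part of dagster; it only assembles the nested dictionary. Depending on the dagster version, scalar inputs may need to be wrapped (for example as {'value': ...}), so check the linked tutorial for the exact shape.

```python
def build_environment_dict(solid_inputs):
    """Nest {solid_name: {input_name: value}} under 'solids' -> name -> 'inputs'.

    Hypothetical helper, not a dagster API: it only builds the dictionary
    that is later passed as environment_dict.
    """
    return {
        'solids': {
            solid_name: {'inputs': dict(inputs)}
            for solid_name, inputs in solid_inputs.items()
        }
    }


environment_dict = build_environment_dict(
    {
        'execute_db2_sql': {
            'conn_str': '<CONN>',
            'sql': 'SELECT dagster FROM good_stuff',
        }
    }
)
```

The resulting dictionary can be passed as the environment_dict argument of execute_pipeline, exactly like the literal in the snippet above.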
e
ahhh yes. ok. while reading through the docs I saw the config files were available and was guessing that would be the appropriate solution.
does that mean each pipeline will have its own config that defines inputs for its solids (and other items available in the config)?
a
yep each pipeline will have its own specific config schema
and so you’ll want to figure out which pieces of information you want to input that way and which you want to be static for that pipeline
e
perfect. I understand now. thanks for the help alex. this tutorial looks great, I'll give that a read as well.
a
for SQL specifically, you could look at this (quite complicated) example of a solid builder function: https://github.com/dagster-io/dagster/blob/master/examples/dagster_examples/airline_demo/solids.py#L44-L131
c
that's what I ended up using for our prototype as well
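The solid builder idea mentioned above can be sketched without dagster: a factory function takes the static pieces (a name and the SQL text) and returns a callable that only needs the runtime piece (the connection). This is a simplified illustration, not the code from the linked airline_demo example. The name make_sql_step is hypothetical, and sqlite3 stands in for the real database. In dagster, the inner function would additionally be wrapped as a solid, which is left out here so the sketch runs on its own.

```python
import sqlite3


def make_sql_step(name, sql):
    """Return a function that runs a fixed SQL query on a connection given at call time."""

    def _step(conn):
        cursor = conn.execute(sql)
        # Column names come from the cursor metadata
        columns = [col[0] for col in cursor.description]
        # One dict per row; a list like this can be handed to pandas.DataFrame
        return [dict(zip(columns, row)) for row in cursor.fetchall()]

    _step.__name__ = name
    return _step


# Usage: an in-memory SQLite database stands in for the real one
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE good_stuff (dagster TEXT)')
conn.execute("INSERT INTO good_stuff VALUES ('solid')")

select_good_stuff = make_sql_step('select_good_stuff', 'SELECT dagster FROM good_stuff')
rows = select_good_stuff(conn)
```

The split mirrors the advice earlier in the thread: the SQL is static for a given step, while the connection is supplied at execution time.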