Ben Fogelson

12/18/2019, 11:20 PM
Is there a way for the input to a solid to have a default value? e.g.
@solid
def some_solid(context, x, y=1.0):
    do_something(x, y)
Related question, is there a way to set the value for a solid in config so that aliases of the solid inherit that value?

abhi

12/19/2019, 1:08 AM
Yes and no. You can set default values for solid inputs, but not in the function signature like you did. You would need to do that via a config field. The reason we want to defer everything to the config in the solid decorator is that it gives you a lot of nifty features when you execute pipelines via dagit, and your pipelines are much more maintainable when there is a single source of truth for pipeline configuration. Imagine a world where you have a pipeline with 500 solids and this one solid has a default parameter that tweaks things in a subtle way. It would be nice to have all of your knobs and dials in one config so that you can grok it to figure out what is going on when something goes wrong or when you inevitably try to shim a new solid into that pipeline. Here is more info on our config system: https://dagster.readthedocs.io/en/latest/sections/learn/tutorial/config.html And to directly give you code to answer your question, here is an example of how I used defaults: https://github.com/dagster-io/dagster/blob/455f7aba03a6392e695c3f04cbbcac4f64f973f0/examples/dagster_examples/bay_bikes/solids.py#L60
RE the second question. I am afraid I don't quite understand. Could I get an example of what you mean by "setting the value for a solid in config"?

Ben Fogelson

12/19/2019, 1:30 AM
Thanks for the reply. You’re right, the second question was unclear. I was getting at the following: Say I have a solid that applies some transformation to my data:
@solid
def do_transformation(context, data, transformation_parameter):
    return transformation_parameter * data
In this case the solid just multiplies my data by a constant. I might need to apply the exact same transformation (with the same value of transformation_parameter) to more than one value of data. In dagster, that would be achieved with aliasing:
@pipeline
def some_pipeline():
    # upstream stuff
    result_a = do_transformation.alias('do_transformation_a')(data_a)
    result_b = do_transformation.alias('do_transformation_b')(data_b)
I can stub a value for the transformation_parameter input in a yaml file, but I have to separately stub for the two aliased solids:
solids:
  do_transformation_a:
    inputs:
      transformation_parameter:
        value: 5
# etc
What I’d like is a way to set a single value for transformation_parameter in my yaml that is applied to both do_transformation_a and do_transformation_b.

abhi

12/19/2019, 2:02 AM
That’s a great question. I think configs take care of this for you. If you plan on having a value stay the same across all of your aliased solids, you should just make it a default in the solid config; then you don’t need to specify it at all in your environment config.
However, if you plan on it changing, then sadly you will have to specify it as a config variable for each solid, because it’s not a default.
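If the value does change between runs, one possible workaround for the per-solid repetition is to generate the environment config in Python rather than writing each alias by hand in yaml. The following is only a minimal sketch in plain Python, not a dagster feature: build_environment is a hypothetical helper, and it assumes the environment config is assembled as a dict with the same solids / inputs / value layout as the yaml above before it is handed to pipeline execution.

```python
import copy


def build_environment(aliases, shared_inputs, base=None):
    """Return an environment dict with shared_inputs stubbed under every alias.

    Hypothetical helper: mirrors the solids -> <alias> -> inputs -> <name> -> value
    layout from the yaml in this thread.
    """
    # Copy the base config so the caller's dict is never mutated.
    env = copy.deepcopy(base) if base else {}
    solids = env.setdefault("solids", {})
    for alias in aliases:
        inputs = solids.setdefault(alias, {}).setdefault("inputs", {})
        for name, value in shared_inputs.items():
            # Keep any value that was already set explicitly for this alias.
            inputs.setdefault(name, {"value": value})
    return env


environment = build_environment(
    aliases=["do_transformation_a", "do_transformation_b"],
    shared_inputs={"transformation_parameter": 5},
)
```

The shared value then lives in exactly one place (the shared_inputs argument), and an alias that needs a different value can still be given one through base.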

alex

12/19/2019, 5:40 PM
Composite solids, our means of composition, have some tools that may allow you to solve the problem you are facing. You can use them to manipulate the config for their inner solids, such as binding fixed values or mapping a single config field to multiple places. docs: https://dagster.readthedocs.io/en/0.6.6/sections/api/apidocs/solids.html#composing-solids example: https://github.com/dagster-io/dagster/blob/master/examples/dagster_examples/gcp_data_platform/final_pipeline.py#L96-L119