Hi everyone I am trying Dagster for the first time and I hav dagster #ask-community

Louis Auneau

10/29/2021, 8:14 AM

Hi everyone! I am trying Dagster for the first time and I have a question that is more about “best practices”. We have a very simple “job” that calls a function get_model(name: str) twice and then does more complex stuff with the results. In our original project we defined the names of our models as constants and then called the function with those constants, as in get_model(MY_MODEL_NAME). However, when I turn my function into an op/solid, this no longer works, since my constants are not the output of another op/solid. What's the best practice for handling such cases? Should I create one op/solid per constant, each being just a function that returns the value? Thank you in advance and have an excellent day!

Kenneth Barrett

10/29/2021, 10:03 AM

Not sure if it's best practice but when we encountered this issue, rather than defining an op per model we got around it by creating a single factory function that takes in the config and then builds and returns an executed op yielding that model. Something like:

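from dagster import job, op

def get_model(my_model, op_name):
    # Build an op that returns the given model, then invoke it right away.
    @op(name=op_name)
    def _op(context):
        return my_model

    # Must be called from inside a @job body so the invocation joins the graph.
    return _op()

@job
def my_fancy_job():
    # `models` and `my_complex_op` are defined elsewhere in the project.
    my_complex_op(get_model(models.customers, 'customers'))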

👍 2

Louis Auneau

11/02/2021, 9:59 AM

I tried the factory pattern and it does work. I also tried using config_schema. However, the more I developed my functions, the more overhead the factory pattern added just to handle such a simple case, and config_schema made my functions less generic. That was until I came across a page of the Dagster documentation that seems to describe the “best practice” for such cases: https://docs.dagster.io/concepts/io-management/unconnected-inputs
