
Roel Hogervorst

01/07/2022, 12:40 PM
I have 2 ops that require inputs. How do I pass that through the config in the job? I tried
production_job = graphname.to_job(
    config={"ops": {"op1": {"inputs": {"arg1": "bla", "ar2": "bla"}},
                    "op2": {"inputs": {"arg1": "something"}}}}
)
But that doesn't work. It does work for 1 op, but I get an unexpected config entry error. I tried a list instead of 2 dictionaries, but that is also not allowed. What is the best way to pass the arguments to the ops from the job?
I searched through previous Slack messages but I can't find similar questions, only people trying to give the same input to several jobs.
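For reference, a minimal sketch of the nesting Dagster expects here, assuming the inputs really are named arg1 and arg2 (op and argument names are placeholders from the question):

```python
# Run config for to_job: one "ops" dict keyed by op name -- the two ops
# are siblings under the same "ops" key, not a list and not a second
# wrapping dict. (Note "arg2", not the "ar2" typo from the attempt above.)
config = {
    "ops": {
        "op1": {"inputs": {"arg1": "bla", "arg2": "bla"}},
        "op2": {"inputs": {"arg1": "something"}},
    }
}

print(sorted(config["ops"]))  # -> ['op1', 'op2']
```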

Alex Service

01/07/2022, 1:45 PM
are your inputs literally named arg1 and arg2? because if so, the example you gave has a typo of ar2 😛

Roel Hogervorst

01/07/2022, 1:46 PM
I changed the ops to take these configuration settings from the context.
config = {'ops': {'op1': {'config': {'arg1': 'arg'}}, 'op2': {'config': {'arg2': 'arg'}}}}
I am able to supply these things with a yaml config in dagit. So something is wrong.

Alex Service

01/07/2022, 1:47 PM
I use a yaml file as well and have multiple arguments without issue

Roel Hogervorst

01/07/2022, 1:47 PM
Can I use yaml files as configuration for schedules?

Alex Service

01/07/2022, 1:49 PM
Haven’t tried. I don’t know if dagster natively has a way of configuring a schedule via yaml, but if you’re comfortable using a yaml loader, it shouldn’t be too difficult
I only have one thing scheduled at the moment, which gets by with a ScheduleDefinition in my @repository function

Roel Hogervorst

01/07/2022, 1:50 PM
My use case is this:
• operator 1: read a specific file from S3, pass the file along to another operator
• operator 2: do something with that file, pass that file along to
• operator 3: write the resulting file to another S3 location.
I want to specify for operator 1 where the file lives on S3, and for operator 3 where to write the file.
So I want to provide that information (location to get the file, location to write the file) to the job (containing the 3 ops)
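The three-step shape described here, sketched as plain functions rather than Dagster ops (function names and paths are illustrative only; the real version would use @op/@graph and take the paths from op config):

```python
# Plain-function sketch of the read -> transform -> write pipeline.
def read_file(source_path):
    # operator 1: fetch the file from S3 (stubbed here)
    return f"contents of {source_path}"

def transform(contents):
    # operator 2: do something with the file
    return contents.upper()

def write_file(contents, dest_path):
    # operator 3: write the result to another S3 location (stubbed)
    return (dest_path, contents)

result = write_file(transform(read_file("dagster-test/file1.csv")),
                    "dagster-test/output/file2.csv")
```

Only the source path (operator 1) and destination path (operator 3) vary per run, which is exactly what op config is for.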

Alex Service

01/07/2022, 1:52 PM
I’m doing the same, except with Google Storage 🙂 looks something like this:
ops:
  load_data:
    inputs:
      file_paths: 
        - "source/path/example"
  save_to_lake:
    inputs:
      project: myproject
      table: example_table
(actually, a difference here is that I allow a list of source files)

Roel Hogervorst

01/07/2022, 1:53 PM
Yeah that works in dagit. But I can't put it in the job definition

Alex Service

01/07/2022, 1:56 PM
I load my configs using from dagster.utils.yaml_utils import load_yaml_from_path and then just point the job definition to the loaded yaml, seems to work reasonably enough
I actually load all my configs programmatically and use the file name as a key in a dict and do something like this:
my_job = graph_func.to_job(name='My Cool Job', 
        resource_defs={'a_resource': make_values_resource()}, 
        config=configs['my_job_config'])
where my_job_config has the example I gave above
Not sure if that’s helpful for your exact case
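The filename-keyed pattern Alex describes can be sketched like this (using json and a stub loader so the sketch stays dependency-free; his version parses yaml via load_yaml_from_path / yaml.safe_load instead, and the directory and file names here are hypothetical):

```python
import json
import tempfile
from pathlib import Path

def load_configs(config_dir, loader=json.load, suffix="*.json"):
    # Map each config file's stem ("my_job_config.json" -> "my_job_config")
    # to its parsed contents, so jobs can look up configs["my_job_config"].
    configs = {}
    for path in sorted(Path(config_dir).glob(suffix)):
        with open(path) as f:
            configs[path.stem] = loader(f)
    return configs

with tempfile.TemporaryDirectory() as d:
    Path(d, "my_job_config.json").write_text(
        '{"ops": {"load_data": {"inputs": {"file_paths": ["source/path/example"]}}}}'
    )
    configs = load_configs(d)
```

The loaded dict can then be handed straight to to_job(config=configs['my_job_config']).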

Roel Hogervorst

01/07/2022, 2:10 PM
Thanks Alex! I will try some of this out. Looks a lot like how I used to configure airflow jobs!

Alex Service

01/07/2022, 2:14 PM
no problem! Hope it’s helpful 🙂

Roel Hogervorst

01/07/2022, 2:17 PM
ah too bad, loading the yaml config with that tool gives me a python dictionary that fails

Alex Service

01/07/2022, 2:19 PM
it fails on load? might wanna double-check that yaml format then; it’s just using yaml.safe_load under the hood:
def load_yaml_from_path(path):
    check.str_param(path, "path")
    with open(path, "r") as ff:
        return yaml.safe_load(ff)

Roel Hogervorst

01/07/2022, 2:34 PM
So this yaml is valid when supplied in dagit and will also run a job:
ops:
  ops1:
    config:
      filename: file1.csv
      filepath: dagster-test/file1.csv
  ops2:
    config:
      filepath: dagster-test/output/file2.csv
However, that yaml transformed (with load_yaml_from_path) into a dictionary is invalid and dagit refuses to load it.
{'ops': {'ops1': {'config': {'filename': 'file1.csv', 'filepath': 'filepath'}}, 'ops2': {'config': {'filepath': 'filepath2'}}}}
oh no, I'm an idiot. I misspelled one of the ops names!
:partydagster: 1
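A quick way to catch that class of mistake before handing the dict to Dagster is set arithmetic on the op names. Suppose (hypothetically, matching the misspelling above) the graph's ops are actually named op1/op2 while the config says ops1/ops2:

```python
# Compare the op names in the config against the names the graph expects.
expected = {"op1", "op2"}      # what the graph defines (assumed here)
supplied = {"ops1", "ops2"}    # keys under "ops" in the config above

unknown = supplied - expected  # entries Dagster would reject
missing = expected - supplied  # ops left unconfigured

print(sorted(unknown))  # -> ['ops1', 'ops2']
```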

Alex Service

01/07/2022, 3:40 PM
not an idiot, we’ve all been there 🙂