# ask-community
Hi all, this is probably a simple question but how do I pass the 'base_path' to my io_manager? For example, my io_manager is called below and I'm trying to pass the base path when I run the job
hi @Barry Sun you can set config for io managers in the same way you set config for resources. So if your io manager key is
your run config would look like this
      base_dir: "new/path"
or the equivalent in python dictionary form if that's how you're specifying config
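For reference, the python dictionary form might look roughly like the sketch below. The resource key `io_manager` is an assumption (it is the default io manager key), so swap in whatever key your io manager is actually registered under.

```python
# Run config in python dictionary form.
# ASSUMPTION: the io manager is registered under the key "io_manager";
# replace it with the key used in your job's resource definitions.
run_config = {
    "resources": {
        "io_manager": {
            "config": {
                "base_dir": "new/path",
            }
        }
    }
}
```

A dictionary like this can then be passed as the `run_config` argument when launching the job from python, for example `my_job.execute_in_process(run_config=run_config)`, where `my_job` is a placeholder name.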
Hi Jamie, thanks for that! I've now gotten this error "TypeError: expected str, bytes or os.PathLike object, not NoneType". I'm getting my path in this way. I'm not too sure what's wrong here 😕
can you share more of the code for your IO manager? and which line of code is throwing this error if possible? based on context it seems like the
call is throwing the error, which would make me believe that
isn't being properly set in your
It might be useful to throw a couple print statements in your init to make sure the value from config is getting passed through and set as an instance variable as expected
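A minimal sketch of that kind of check is below. It uses a hypothetical plain-python class (`SketchIOManager`, not a real Dagster class) and a plain dict in place of the resource config, just to show one way a value can end up as `None` and then trip up a path join.

```python
import os


class SketchIOManager:
    """Hypothetical stand-in for an io manager; not a Dagster class."""

    def __init__(self, config):
        # Print what actually arrived from config before using it.
        print(f"config received: {config}")
        # Reading a key that is not present in the config yields None.
        self.base_path = config.get("base_path")
        print(f"base_path set to: {self.base_path!r}")

    def path_for(self, name):
        # os.path.join raises TypeError when its first argument is None.
        return os.path.join(self.base_path, name)


# The config supplies "base_dir", but __init__ reads "base_path".
manager = SketchIOManager({"base_dir": "new/path"})

try:
    manager.path_for("table.csv")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(f"error: {error_message}")
```

If the printed instance variable is `None`, the problem is in how the config value is read rather than in the path-handling call itself.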
I'm setting
like so. I'll give the print statements a go too!
Hi Jamie, thanks for the printing tip. It looks like I named the variables differently 😐 (base_dir and base_path). All fixed now! On a side note, I'm trying to wrap my head around supplying config to resources. There are some concepts that I haven't been able to find explained explicitly in the docs. For example, what is the difference between
? I've also seen config being supplied to resources via i)
df_table_io_manager.configured({'base_path': 'data/raw/'})
and ii) supplying the config as you did. I wonder when is the right time to use each one?
internally we have different python classes for different types of contexts (for example, the context object passed to a resource has different attributes and functions than that passed to an op). as far as naming goes, i think you could call the parameter
in all cases and be fine (i haven't confirmed this everywhere but i'm pretty sure it's true). it's likely a naming discrepancy in our documentation, or a naming choice to indicate that it's the context passed when the resource is initialized (feature or bug ¯\_(ツ)_/¯ ) as for the
API, it provides another way to provide config to a resource in code. so it may be that for a certain job you always want
to be
and the
API allows you to specify it in code once and never have to deal with it again. providing run config via the dagit UI allows you to modify config more "on the fly" and update values for every run of a job. you may have already found this documentation, but the opening sections do a pretty good job going over cases when you might want to use
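The trade-off between the two approaches can be sketched with a plain-python analogy. This is not Dagster code: `build_io_manager` and `run_job` are hypothetical names, and `functools.partial` stands in for the idea of binding config to a resource once in code.

```python
from functools import partial


def build_io_manager(config):
    # Hypothetical factory: returns a dict standing in for a resource.
    return {"base_path": config["base_path"]}


def run_job(io_manager_factory, run_config=None):
    # Hypothetical runner: passes run config through only if it was given.
    if run_config is None:
        return io_manager_factory()
    return io_manager_factory(run_config)


# Option i) bind the config once, in code. Every run gets the same value.
preconfigured = partial(build_io_manager, {"base_path": "data/raw/"})
always_raw = run_job(preconfigured)

# Option ii) leave the resource unconfigured and supply config per run.
per_run = run_job(build_io_manager, run_config={"base_path": "new/path"})

print(always_raw)
print(per_run)
```

Binding the value in code suits settings that never change for a job, while per-run config suits values you expect to vary between runs.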
Wow ok thanks for the quick and thorough response Jamie! That is super helpful and makes a lot more sense to me now 🙂