# announcements
d
Hello, I am now getting into dagstermill. I am going through the tutorial and everything runs fine. However, the rendered notebook is being saved where I execute the dagit command. I was expecting the notebook to be saved where my DAGSTER_HOME env variable points, but I wasn't sure, as I am having difficulty finding info in the docs on specifying the save location when using fs_io_manager with local_file_manager. I'm assuming there is a way to set the location where I want the rendered notebook to be saved. Thanks in advance!
y
Hi Daniel, which dagster version are you on?
d
0.10.9
y
After 0.10.7, fs_io_manager will use the local_artifact_storage configured in your dagster instance yaml as the base_dir by default
d
Gotcha, thank you!!!
y
you can configure base_dir as a config on the fs_io_manager. here are some examples. at definition time:
@pipeline(
    mode_defs=[ModeDefinition(resource_defs={"io_manager": fs_io_manager.configured({"base_dir": "path/to/dir"})})],
)
def pipe():
    ...
or via run_config / yaml:
resources:
  <io_manager>:
    config:
      base_dir: <path/to/dir>
base_dir is a config on a resource, where the resource is an io manager
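For reference, the YAML shape above maps to a nested dict when the run config is built in Python. A minimal sketch, assuming the resource key is "io_manager"; base_dir_run_config is a hypothetical helper for illustration, not part of dagster:

```python
def base_dir_run_config(resource_key, base_dir):
    """Build a nested dict mirroring the resources/<key>/config/base_dir YAML shape."""
    # Hypothetical helper: only assembles the dict, it does not call dagster.
    return {"resources": {resource_key: {"config": {"base_dir": base_dir}}}}


run_config = base_dir_run_config("io_manager", "path/to/dir")
```

The resulting dict has the same nesting as the yaml, so the resource key must match the key used in resource_defs.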
d
Hmmm, still not working for me. FWIW, I am on Windows. I used forward slashes in my base_dir string. I configured it in dagster.yaml and also via the dagit config editor, but it still didn't save to the correct location. No warning or error messages at all.
My path: "D:/gitprojects/dagster_dev/outputs"
y
my hunch is this is related to some regression on file_manager after we introduced io_manager. do you mind sharing your pipeline setup, esp. the mode def and the resource configs?
d
It is basically the same as in the official docs tutorial. Only difference is path to my notebook is different.
y
gotcha. let me try to repro it
d
You can find my solid and pipeline defs in dagster's github discussion forum in the "Q&A" if need be.
y
just to confirm,
script_relative_path(
        str(DS_RODEO_HOME / 'notebooks' / '2_Dimensionality_Reduction.ipynb')
    )
is the input notebook path, right? nvm
oh i think instead of configuring io_manager, you do:
"file_manager": local_file_manager.configured({"base_dir": "/tmp/aaa/"})
then you can find the notebooks under that base_dir.
without setting it, you should find all the output notebooks in the directory where you execute the dagit cmd,
bc local_file_manager defaults to the current dir, i.e. `.`
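To illustrate why the notebooks land next to wherever dagit was launched: a relative base_dir such as `.` is resolved against the process's current working directory. A rough sketch using only the standard library; resolve_base_dir is a hypothetical name, and this is an assumption about how relative paths behave in general, not dagster's own code:

```python
import os


def resolve_base_dir(base_dir="."):
    """Return the absolute directory that a base_dir value refers to.

    Relative values are resolved against the current working directory,
    i.e. wherever the launching command was run.
    """
    return os.path.abspath(base_dir)


default_location = resolve_base_dir()            # the current working directory
relative_location = resolve_base_dir("outputs")  # <cwd>/outputs
```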
d
Hi Yuhan, I've configured the local file manager as you've suggested in my pipeline definition, but now I am getting a permission error that shows up on dagit's log:
Error when attempting to materialize executed notebook using file manager (falling back to local): SerializableErrorInfo(message="PermissionError: [WinError 5] Access is denied: 'D:/'\n", stack=['  File "D:\\Python38\\envs\\ds_rodeo\\lib\\site-packages\\dagstermill\\solids.py", line 229, in _t_fn\n    executed_notebook_file_handle = compute_context.resources.file_manager.write(\n', '  File "D:\\Python38\\envs\\ds_rodeo\\lib\\site-packages\\dagster\\core\\storage\\file_manager.py", line 261, in write\n    self.ensure_base_dir_exists()\n', '  File "D:\\Python38\\envs\\ds_rodeo\\lib\\site-packages\\dagster\\core\\storage\\file_manager.py", line 227, in ensure_base_dir_exists\n    mkdir_p(self.base_dir)\n', '  File "D:\\Python38\\envs\\ds_rodeo\\lib\\site-packages\\dagster\\utils\\__init__.py", line 126, in mkdir_p\n    os.makedirs(path)\n', '  File "D:\\Python38\\lib\\os.py", line 223, in makedirs\n    mkdir(name, mode)\n'], cls_name='PermissionError', cause=None)
Here's my pipeline definition:
@pipeline(
    mode_defs=[
        ModeDefinition(
            resource_defs={
                "io_manager": fs_io_manager,
                "file_manager": local_file_manager.configured(
                    {"base_dir": "D:/"}
                )
            },
        )
    ]
)
def notebook_pipeline():
    dim_reduction()
I wonder if perhaps Windows paths with a drive letter are not supported or not working with base_dir. I will try something else using a relative path.
I changed to:
{"base_dir": "outputs"}
and it works relative to where I executed the dagit command. So it appears the base_dir only accepts relative path. Is there a way to set absolute path? I looked at dagster.utils module, but I see only methods for relative path.
y
Inserting a leading forward slash works on Unix, i.e. /output instead of output will be an absolute path
Not sure if Windows works in a similar way. Cc @max re: how to set an absolute path for file_manager on Windows
d
Setting an absolute path on Windows usually entails including the drive letter, which I know doesn't exist for POSIX systems.
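A quick way to see the difference is to evaluate the same strings under both path flavours with pathlib, which works on any host OS. This is only a sketch of the path rules themselves; it says nothing about how dagster handles base_dir internally:

```python
from pathlib import PurePosixPath, PureWindowsPath

candidates = ["outputs", "/output", "D:/", "D:/gitprojects/dagster_dev/outputs"]

# Evaluate each candidate under both path flavours, regardless of the host OS.
absolute_on_windows = {p: PureWindowsPath(p).is_absolute() for p in candidates}
absolute_on_posix = {p: PurePosixPath(p).is_absolute() for p in candidates}
```

Under Windows rules, /output has a root but no drive, so it does not count as absolute; a drive letter plus a root is needed. Under POSIX rules the leading slash alone is enough, and D:/ is just a relative name.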
m
i wouldn't be shocked if this were broken on windows
hm, though, this looks like a permissions error?
are you certain that your python process has permissions to access D:\?
you may want to try running as an administrator; i'm also curious what version of python you're running
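The traceback earlier in the thread ends in os.makedirs being called on the base_dir itself. One possible explanation (an assumption, not confirmed anywhere in this thread) is that trying to create a directory that already exists as a drive root surfaces as a permission error rather than the usual "already exists" error. A defensive sketch that skips creation when the directory is already there; ensure_dir is a hypothetical helper, not dagster's mkdir_p:

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and any missing parents) unless it already is a directory."""
    if os.path.isdir(path):
        # Already present: return early so no mkdir call is attempted.
        return path
    os.makedirs(path, exist_ok=True)
    return path


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    existing = ensure_dir(tmp)                         # already exists: no-op
    nested = ensure_dir(os.path.join(tmp, "a", "b"))   # created with parents
    nested_exists = os.path.isdir(nested)
```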
d
Hi @max I don't think it was a permission issue as I was able to use relative path which was still in my D:\ drive (I am using Python 3.8 btw). I have upgraded to 0.11.0 as I wanted to be able to set the base_dir setting with the dagster.yaml file instead of having to do it inside the pipeline decorator. After upgrading, looks like I am able to now use full/absolute path with drive letter. But apparently the base_dir config setting in the dagster.yaml file is still not being recognized. So I had to revert back to setting the base_dir config setting within the pipeline decorator again. So I created a new issue: https://github.com/dagster-io/dagster/issues/3898