Rafał Wojdyła
02/16/2021, 11:05 AM
`MemoizableIOManager` but I'm having a hard time, and right now I'm hitting 2 issues/questions. I would highly appreciate your help. I'm on dagster==0.10.5.
1. `has_output` appears to be getting called with `context` set to `None`, so I can't validate whether the external data exists. This is a simplified version of my current POC: https://gist.github.com/ravwojdyla/6a546e3fa65459b17413aac10c643f25
2. When I hard-code `has_output` to return `True` (just as a POC), I'm getting this error:
for step in self.get_steps_to_execute_by_level()[0]:
IndexError: list index out of range
I was expecting to get an error in that case, but that specific error made me think about how memoization works. Is it fair to say that for memoization to work, the "central" scheduler needs to have access to the previously run "steps" (to fetch config etc.)?
alex
02/16/2021, 3:27 PM
chris
02/16/2021, 3:30 PM
Rafał Wojdyła
02/16/2021, 3:33 PM
Rafał Wojdyła
02/16/2021, 3:38 PM
chris
02/16/2021, 3:42 PM
chris
02/16/2021, 3:44 PM
chris
02/16/2021, 3:45 PM
Rafał Wojdyła
02/16/2021, 3:47 PM
chris
02/16/2021, 3:48 PM
Rafał Wojdyła
02/16/2021, 3:54 PM
Rafał Wojdyła
02/16/2021, 3:55 PM
`context` is `None`, but that `context.log` is `None`:
File "pipelines/data_sources/efo/dagster_tasks.py", line 30, in has_output
    context.log.info(f"Trying to load from {context}")
AttributeError: 'NoneType' object has no attribute 'info'
Is that a bug as well?
chris
02/16/2021, 3:57 PM
`@solid(version="hello")` to get up and running
Rafał Wojdyła
02/16/2021, 3:59 PM
`context.log` being `None`, maybe some kind of init is not done before `has_output`?
Rafał Wojdyła
02/16/2021, 4:03 PM
chris
02/16/2021, 4:05 PM
Rafał Wojdyła
02/16/2021, 4:05 PM
chris
02/16/2021, 4:06 PM
Rafał Wojdyła
02/16/2021, 4:06 PM
I was expecting to get an error in that case, but that specific error made me think about how memoization works. Is it fair to say that for memoization to work, the "central" scheduler needs to have access to the previously run "steps" (to fetch config etc.)?
chris
02/16/2021, 4:07 PM
Rafał Wojdyła
02/16/2021, 4:09 PM
chris
02/16/2021, 4:11 PM
Rafał Wojdyła
02/16/2021, 4:12 PM
chris
02/16/2021, 4:15 PM
Rafał Wojdyła
02/16/2021, 4:19 PM
chris
02/16/2021, 4:22 PM
chris
02/16/2021, 4:23 PM
chris
02/16/2021, 4:32 PM
chris
02/16/2021, 4:33 PM
Rafał Wojdyła
02/16/2021, 4:43 PM
Rafał Wojdyła
02/16/2021, 4:44 PM
chris
02/16/2021, 4:54 PM
Rafał Wojdyła
02/16/2021, 5:27 PM
Rafał Wojdyła
02/16/2021, 5:28 PM
`version`, is that right?
Rafał Wojdyła
02/16/2021, 5:36 PM
Rafał Wojdyła
02/16/2021, 6:03 PM
`version` or the hashing of the files per solid? Is that kind of per-solid file hashing something you recommend doing in real pipelines (which would require a file per solid)?
chris
02/16/2021, 6:04 PM
Rafał Wojdyła
02/16/2021, 6:06 PM
chris
02/16/2021, 6:11 PM
chris
02/16/2021, 6:14 PM
chris
02/16/2021, 6:20 PM
Rafał Wojdyła
02/16/2021, 6:51 PM
chris
02/16/2021, 6:53 PM
chris
02/16/2021, 6:57 PM