b
I'm having trouble executing a pipeline using the multi-process executor. it looks like i need to wrap my pipeline in `reconstructable`, but as soon as I do that I can't include it in a repository. the trimmed backtrace is this, although i think there's a bug there, and i think the actual root is that the `repository` decorator doesn't accept `ReconstructablePipeline` objects:
File "/home/ben/repos/dataplatform-poc/pipelines/dataplatform/repository.py", line 6, in <module>
    @repository
  File "/home/ben/.pyenv/versions/3.7.5/envs/dataplatform-poc/lib/python3.7/site-packages/dagster/core/definitions/decorators/repository.py", line 225, in repository
    return _Repository()(name)
  File "/home/ben/.pyenv/versions/3.7.5/envs/dataplatform-poc/lib/python3.7/site-packages/dagster/core/definitions/decorators/repository.py", line 44, in __call__
    bad_definitions.append(i, type(definition))
TypeError: append() takes exactly one argument (2 given)
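For reference, a minimal sketch of the pattern in question, with made-up solid/pipeline names, assuming the Dagster APIs of this era: a repository should return plain pipeline definitions, and `reconstructable` belongs outside it.

```python
from dagster import pipeline, reconstructable, repository, solid

@solid
def do_something(_):
    return 1

@pipeline
def my_pipeline():
    do_something()

# Fine: the repository holds the plain PipelineDefinition.
@repository
def my_repo():
    return [my_pipeline]

# The variant that produces the traceback above -- @repository
# does not accept ReconstructablePipeline objects:
#
# @repository
# def broken_repo():
#     return [reconstructable(my_pipeline)]
```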
a
how are you executing your pipeline? you should only need to wrap it in `reconstructable` at the call site
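A sketch of what that call site looks like, reusing the hypothetical `my_pipeline` from above; the `storage` and `instance` bits are what multiprocess execution required on Dagster versions of this era:

```python
from dagster import DagsterInstance, execute_pipeline, reconstructable

result = execute_pipeline(
    reconstructable(my_pipeline),  # wrap at the call site only
    run_config={
        "execution": {"multiprocess": {}},
        # versions of this era need persistent intermediate storage
        # for multiprocess execution
        "storage": {"filesystem": {}},
    },
    instance=DagsterInstance.get(),  # and a persistent DagsterInstance
)
assert result.success
```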
b
i'm using dagster-databricks; everything works fine with the in-process executor, but that only allows a single step at a time
a
i mean, are you calling `execute_pipeline` in a python script, or using the cli, or using dagit?
b
ah right, using dagit
a
Interesting, i wouldn’t expect it to be an issue via dagit
b
ah i should clarify. dagit doesn't seem to have a problem running the pipelines; it's just when the steps get shipped to databricks for execution that the problem appears
a
aaahhh ok ok - this is likely just some mix-up from the architectural changes that were happening as that PR was being worked on
b
the databricks step launcher uses `run_step_from_ref(step_run_ref)` on a pickled `step_run_ref` file to run the pipeline, which seems to be the bit that fails 🙂
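Roughly, the remote half of that looks like the sketch below; the DBFS path is hypothetical, and the import location is an assumption based on where `run_step_from_ref` lived at the time:

```python
import pickle

from dagster.core.execution.plan.external_step import run_step_from_ref

# On the Databricks cluster: load the StepRunRef that the step launcher
# pickled and uploaded, then execute just that one step.
with open("/dbfs/tmp/step_run_ref.pkl", "rb") as handle:  # hypothetical path
    step_run_ref = pickle.load(handle)

# run_step_from_ref rebuilds the step context from the ref and runs the
# step, yielding its Dagster events.
events = list(run_step_from_ref(step_run_ref))
```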
a
we pull the definition out from the reconstructable pipeline we have and pass that down, but we should be passing `step_run_ref.recon_pipeline` in to `create_execution_plan`
a classic "migration plus flexible APIs make it easy to mess up" situation
cc @sandy
i'll send out a fix here
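A minimal sketch of the diagnosed fix (the helper name is hypothetical and this isn't the actual patch; field names assume a Dagster version of this era): hand `create_execution_plan` the `ReconstructablePipeline` itself instead of unwrapping it to a plain definition first.

```python
from dagster.core.execution.api import create_execution_plan

def execution_plan_for_step_run_ref(step_run_ref):
    # Before (buggy): the definition was pulled out of the reconstructable
    # pipeline, losing the information needed to rebuild the pipeline in a
    # remote process:
    #   create_execution_plan(step_run_ref.recon_pipeline.get_definition(), ...)
    # After (the fix): pass the reconstructable pipeline straight through.
    return create_execution_plan(
        step_run_ref.recon_pipeline,
        run_config=step_run_ref.run_config,
        mode=step_run_ref.pipeline_run.mode,
    )
```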
b
nice spot, thanks @alex!
@alex want me to create an issue on github btw?
a
for review
(it took longer since I tried to clean up all the `create_execution_plan` callsites, which turned out to be too much)
Out of curiosity, why are you trying to use the multiprocess executor? It will only be executing one step at a time, right? So it's not for parallelism.
b
hmm, it was for parallelism yeah; when using the in-process executor dagit only launches one step at a time, but with the multiprocess executor dagit can launch several (even though each databricks launcher invocation still runs just a single step)
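In run-config terms (expressed here as a python dict; the `max_concurrent` value is just an example), that parallelism knob looks like:

```python
# The multiprocess executor lets dagit have several steps in flight at
# once, each of which the databricks step launcher ships out individually.
run_config = {
    "execution": {
        "multiprocess": {
            "config": {"max_concurrent": 4},  # arbitrary example value
        }
    },
}
```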
a
oooh ok, I believe I understand the issue now