# ask-community

Dinis Rodrigues

01/22/2023, 9:39 PM
I'm having some difficulties using `execute_job` to test my jobs. I'm using the Databricks step launcher, and it requires a reconstructable pipeline, so I'm unable to use `job.execute_in_process`. Following the docs, I'm doing:
```python
instance = DagsterInstance.get()
execute_job(reconstructable(my_job), run_config=config, instance=instance)
```
But I get this error:
```
dagster._check.CheckError: Failure condition: Unexpected return value from child process <class 'collections.ChildProcessStartEvent'>

Stack Trace:
  File "/opt/conda/envs/dagster_env/lib/python3.9/site-packages/dagster/_core/execution/api.py", line 991, in pipeline_execution_iterator
    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
  File "/opt/conda/envs/dagster_env/lib/python3.9/site-packages/dagster/_core/executor/multiprocess.py", line 240, in execute
    event_or_none = next(step_iter)
  File "/opt/conda/envs/dagster_env/lib/python3.9/site-packages/dagster/_core/executor/multiprocess.py", line 364, in execute_step_out_of_process
    check.failed("Unexpected return value from child process {}".format(type(ret)))
  File "/opt/conda/envs/dagster_env/lib/python3.9/site-packages/dagster/_check/__init__.py", line 1687, in failed
    raise CheckError(f"Failure condition: {desc}")
```
Am I missing something?
I reported this in a GitHub issue here, and I also started a thread about it a few months ago, but I lost that thread due to the 90-day message retention. I only saved one message from that time, with @daniel. But as I stated, when using the Databricks step launcher I cannot use `execute_in_process`.

yuhan

01/24/2023, 1:04 AM
hey sorry for the late response! we're looking into this issue. if I'm recalling correctly, does the issue occur only when you're using it with `op_retry_policy`?

daniel

01/24/2023, 1:07 AM
Hi Dinis - in the short term, is it possible to move the code with side effects to a different file? I think your problem is coming from the fact that the code inside your `if __name__ == '__main__'` block is also executing within the step launcher, which is probably not what you want
Putting your definitions in one file and your testing code in another (one that imports the job from the definitions file) is one quick way to get around that

Dinis Rodrigues

01/24/2023, 4:09 PM
@yuhan If I remove the retry policy, the job proceeds, although it still gives the same error. @daniel I'm trying to implement your suggestion, but I'm getting the same result. I built an example test that matches what I'm doing, if you could take a look 🙏

daniel

01/24/2023, 4:13 PM
Does it work if you take out `main.py` altogether?

Dinis Rodrigues

01/24/2023, 4:30 PM
You mean running `myjob_test.py` directly? With that, I get a multiprocessing error, and then the same error again

daniel

01/24/2023, 4:34 PM
I meant removing the `main.py` file entirely - my suspicion is that something about the code running in an `if __name__ == '__main__'` block could be causing the problem

Dinis Rodrigues

01/24/2023, 4:40 PM
Okok, yes, by doing that I get the error shown in the image above.

daniel

01/24/2023, 4:52 PM
What if you run the job using the `dagster job launch` command? Does that work better?

Dinis Rodrigues

01/24/2023, 5:19 PM
Using the CLI, the error doesn't seem to happen.

But with the CLI it seems this would require major refactoring, right? From what I understand, I would need to create new graphs (since I have more than one job running in the test pipeline), and then I would need to specify the run config directly in the job decorator?

By doing this, I only get 2 logs. It doesn't seem to fully run; I would expect some logs from the Databricks launcher.

I've managed to get the logs by using `dagster job execute` instead of `launch`. But I'm still unable to find a workaround for inserting the run_config directly in the decorator, because my run config depends on environment variables. So when this goes to Databricks, it tries to read environment variables that don't yet exist on the cluster.

@daniel @yuhan Based on all this information, and for future reference, this is the solution I came up with for the test setup:
1. Read a pre-defined job config
2. Fill it based on environment variables
3. Dump it into a new "generated_job_config.yaml"
4. Launch the jobs using the Dagster CLI with the generated config

This is a crude sample of what we are going to develop, but it's the baseline of the implementation. This way, we entirely skip the Python execution API and only use the Dagster CLI. On a side note, the documentation about the CLI is pretty scattered; it took me a really long time to find out I could use the `--config` flag. Thank you 🤙