Ken
08/10/2020, 12:33 AM
KeyboardInterrupt
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/execute_plan.py", line 210, in _dagster_event_sequence_for_step
for step_event in check.generator(step_events):
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/execute_step.py", line 269, in core_dagster_event_sequence_for_step
_step_output_error_checked_user_event_sequence(step_context, user_event_sequence)
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/execute_step.py", line 53, in _step_output_error_checked_user_event_sequence
for user_event in user_event_sequence:
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/execute_step.py", line 399, in _user_event_sequence_for_step_compute_fn
for event in gen:
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/compute.py", line 102, in _execute_core_compute
for step_output in _yield_compute_results(compute_context, inputs, compute_fn):
File "/usr/local/lib/python3.6/dist-packages/dagster/core/execution/plan/compute.py", line 73, in _yield_compute_results
for event in user_event_sequence:
File "/usr/local/lib/python3.6/dist-packages/dagster/core/definitions/decorators/solid.py", line 220, in compute
result = fn(context, **kwargs)
File "/imagery_pipeline_source/solids/terravion_solids/solid_1_register_image/main.py", line 91, in register_image
local_result_path = registrate(context, local_src_path, local_ref_image_path);
File "/imagery_pipeline_source/solids/terravion_solids/solid_1_register_image/algo/__init__.py", line 191, in registrate
date_register_if_not_exists(ctx, input_src_path, input_ref_path, day_reg_path)
File "/imagery_pipeline_source/solids/terravion_solids/solid_1_register_image/algo/__init__.py", line 112, in date_register_if_not_exists
register_date_fiji(source_path=source)
File "/imagery_pipeline_source/solids/terravion_solids/solid_1_register_image/lib/__init__.py", line 63, in register_using_fiji
run_sub_cmd(cmd_template)
File "/imagery_pipeline_source/solids/terravion_solids/solid_1_register_image/lib/__init__.py", line 157, in run_sub_cmd
logging.info(subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT))
File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib/python3.6/subprocess.py", line 425, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/usr/lib/python3.6/subprocess.py", line 850, in communicate
stdout = self.stdout.read()
cat
08/10/2020, 5:11 PM
Ken
08/10/2020, 6:34 PM
cat
08/10/2020, 7:26 PM
Ken
08/10/2020, 10:55 PM
kubectl describe pod ..., logs, and status are the same as the successful ones 🤔
I will submit a support ticket to the Google Support GKE team and see if there is any configuration or quota limit that I missed.
cat
08/11/2020, 2:32 AM
Ken
08/11/2020, 4:18 PM
The annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" can keep the autoscaler from re-scheduling the pod [1].
[1] https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node
Is there something similar to K8S_RESOURCE_REQUIREMENTS_KEY, so that we can put something like K8S_POD_TEMPLATE_ANNOTATION in the pipeline tags and job.py will initialize the V1PodTemplateSpec with that annotations dictionary?
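Roughly what I have in mind, as a minimal sketch and not how dagster-k8s actually does it. The tag key string, the helper names, and the container values are all made up for illustration, and a plain dict stands in for V1PodTemplateSpec so it runs without the kubernetes client installed:

```python
import json

# Hypothetical tag key (an assumption, not an existing dagster-k8s constant).
K8S_POD_TEMPLATE_ANNOTATION = "dagster-k8s/pod-template-annotations"


def pod_template_annotations(pipeline_tags):
    """Parse the JSON-encoded annotations tag; return {} when the tag is absent."""
    raw = pipeline_tags.get(K8S_POD_TEMPLATE_ANNOTATION)
    if raw is None:
        return {}
    annotations = json.loads(raw)
    if not isinstance(annotations, dict):
        raise ValueError("annotations tag must be a JSON object")
    # Kubernetes annotation keys and values must be strings.
    for key, value in annotations.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise ValueError("annotation keys and values must be strings")
    return annotations


def build_pod_template(pipeline_tags, container):
    """Build a plain-dict pod template carrying the annotations from the tags."""
    return {
        "metadata": {"annotations": pod_template_annotations(pipeline_tags)},
        "spec": {"containers": [container], "restartPolicy": "Never"},
    }


# Pipeline tags are string-valued, so the annotations dict is JSON-encoded.
tags = {
    K8S_POD_TEMPLATE_ANNOTATION: json.dumps(
        {"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}
    )
}
# Container name and image are placeholders.
template = build_pod_template(
    tags, {"name": "register-image", "image": "example/image:latest"}
)
```

With the real client, the same dictionary would presumably go into the annotations field of the V1ObjectMeta passed as metadata to V1PodTemplateSpec.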
If not, I can attempt a PR to implement it.
cat
08/12/2020, 4:09 PM
Ken
08/12/2020, 5:05 PM
cat
08/13/2020, 4:07 PM