# deployment-kubernetes
s
Hey, I have a scheduled job, and it appears in k8s's cronjobs:
387b49f18c2bb8c47775c350295934e2b1a3b969   0 */6 * * *   False     0        39m             162m
I could find the job and the pod that it started:
387b49f18c2bb8c47775c350295934e2b1a3b969-1607018400-cbkqw   0/1     Completed   0          41m
but there are no logs, and the run doesn't appear in dagit either. (The same thing is happening with one of my weekly jobs; every other job is daily and runs fine.) Any advice on how I could debug this?
j
Does `kubectl describe` reveal anything?
cc @cat
s
Containers:
  93d49a59785c741920f8952099232ed0:
    Container ID:  docker://000bf252c97bd2ce0087565c9213373498026c3973d05ecb760f1bf7ed1f14e6
    Image:         391094253726.dkr.ecr.eu-west-1.amazonaws.com/dagster:latest
    Image ID:      docker-pullable://xxxxxxx.dkr.ecr.eu-west-1.amazonaws.com/dagster@sha256:0e33171ac552e13491f1940dd261229eb1d8d3fa8d023a2a59b5d32575e3fa30
    Port:          <none>
    Host Port:     <none>
    Command:
      dagster
    Args:
      api
      launch_scheduled_execution
      /tmp/launch_scheduled_execution_output
      --schedule_name
      madkudu_export_every_6h
      -f
      /opt/dagster/app/src/data_pipeline/repositories.py
      -a
      business_intelligence_repository
      -d
      /opt/dagster/app
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 03 Dec 2020 18:00:02 +0000
      Finished:     Thu, 03 Dec 2020 18:00:08 +0000
    Ready:          False
    Restart Count:  0
seems ok to me
c
thanks for reporting!
will look in a moment
My suspicion is that there is a version incompatibility (what version of dagster are 391094253726.dkr.ecr.eu-west-1.amazonaws.com/dagster:latest and the user code image running?). Also, is there an events section in the output of
kubectl describe pod 387b49f18c2bb8c47775c350295934e2b1a3b969-1607018400-cbkqw
?
If you're working off of master, the launch schedule APIs have changed a bit.
s
sorry for the late response. we are on 0.9.18
I can see the run launcher, but there were no jobs for solids
c
Hey @szalai1, just to check: only the daily jobs are running successfully, but the weekly jobs are not? Are these schedules within the same repository and using the same image, and have you changed the schedules recently? Would it be possible to manually launch a cronjob run via
kubectl create job --from=cronjob/387b49f18c2bb8c47775c350295934e2b1a3b969 manual-test
and inspect the job that is created for any errors?
also happy to jump on a call between 9am-6pm pst (not sure your timezone availability)
Also, are you using dagster's published Helm chart?
s
most of the jobs are fine, but there are 2 of them not running (both of them weekly). will try this >
kubectl create job --from=cronjob/387b49f18c2bb8c47775c350295934e2b1a3b969 manual-test
we are using a hand rolled dagster deployment (based on the helm chart)
happy to jump on a call between 9am-6pm
I'm on holiday, but I will try what you said, and if I cannot resolve it, I'm happy to jump on a call (I appreciate it a lot 🙂). Thanks a lot.
c
s
Thank you very much, scheduled 🙂
I think I've found the problem. There were no errors in the pod because the schedule command writes its output to a file under `/tmp/`. So I reran the schedule on the dagit instance (same command as the job) and found that the execution context's scheduled time is None. My schedule:
@schedule(
    name="hourly_madkudu_feature_usage_export",
    cron_schedule="0 */6 * * *",
    pipeline_name="madkudu_feature_usage_export",
    mode='prod',
)
def madkudu_export_every_6h(ctx: ScheduleExecutionContext):
    # run it for the LAST 6 hours
    return run_config_from_datetime(ctx.scheduled_execution_time - datetime.timedelta(hours=6))
Relevant error:
{
  "__class__": "ScheduledExecutionFailed",
  "errors": [
    {
      "__class__": "SerializableErrorInfo",
      "cause": {
        "__class__": "SerializableErrorInfo",
        "cause": null,
        "cls_name": "TypeError",
        "message": "TypeError: unsupported operand type(s) for -: 'NoneType' and 'datetime.timedelta'\n",
        "stack": [
          "  File \"/usr/local/lib/python3.8/site-packages/dagster/core/errors.py\", line 180, in user_code_error_boundary\n    yield\n",
          "  File \"/usr/local/lib/python3.8/site-packages/dagster/grpc/impl.py\", line 257, in get_external_schedule_execution\n    run_config = schedule_def.get_run_config(schedule_context)\n",
          "  File \"/usr/local/lib/python3.8/site-packages/dagster/core/definitions/schedule.py\", line 180, in get_run_config\n    return self._run_config_fn(context)\n",
          "  File \"/opt/dagster/app/src/data_pipeline/bi_pipelines/madkudu.py\", line 118, in madkudu_export_every_6h\n    return run_config_from_datetime(ctx.scheduled_execution_time - datetime.timedelta(hours=6))\n"
        ]
      },
      "cls_name": "ScheduleExecutionError",
      "message": "dagster.core.errors.ScheduleExecutionError: Error occurred during the execution of run_config_fn for schedule hourly_madkudu_feature_usage_export\n",
      "stack": [
        "  File \"/usr/local/lib/python3.8/site-packages/dagster/grpc/impl.py\", line 257, in get_external_schedule_execution\n    run_config = schedule_def.get_run_config(schedule_context)\n",
        "  File \"/usr/local/lib/python3.8/contextlib.py\", line 131, in __exit__\n    self.gen.throw(type, value, traceback)\n",
        "  File \"/usr/local/lib/python3.8/site-packages/dagster/core/errors.py\", line 190, in user_code_error_boundary\n    raise_from(\n",
        "  File \"/usr/local/lib/python3.8/site-packages/future/utils/__init__.py\", line 403, in raise_from\n    exec(execstr, myglobals, mylocals)\n",
        "  File \"<string>\", line 1, in <module>\n"
      ]
    }
  ],
  "run_id": null
}
I tried it with `hourly_schedule` and `should_execute`, but got the same error.
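The serialized error above nests the original exception under `cause`, so the useful message sits one level below the top-level `ScheduleExecutionError`. A minimal sketch for pulling out the innermost message, assuming the payload has already been loaded into a dict (for example with `json.load`); the helper name `innermost_error_messages` is hypothetical and not part of dagster:

```python
def innermost_error_messages(payload):
    """Return the deepest `cause` message for each entry in `errors`."""
    messages = []
    for error in payload.get("errors") or []:
        node = error
        # Follow the `cause` links until the original exception is reached.
        while node.get("cause"):
            node = node["cause"]
        messages.append(node.get("message", "").strip())
    return messages


# Trimmed copy of the payload above (stack traces omitted).
sample = {
    "__class__": "ScheduledExecutionFailed",
    "errors": [
        {
            "__class__": "SerializableErrorInfo",
            "cause": {
                "__class__": "SerializableErrorInfo",
                "cause": None,
                "cls_name": "TypeError",
                "message": "TypeError: unsupported operand type(s) for -: "
                           "'NoneType' and 'datetime.timedelta'\n",
            },
            "cls_name": "ScheduleExecutionError",
            "message": "dagster.core.errors.ScheduleExecutionError: Error "
                       "occurred during the execution of run_config_fn for "
                       "schedule hourly_madkudu_feature_usage_export\n",
        }
    ],
    "run_id": None,
}

for message in innermost_error_messages(sample):
    print(message)
```

For the sample, this prints only the `TypeError` line, which is the part that points at the schedule function itself.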
d
Hi @szalai1 - very sorry for the mixup here. the scheduled_execution_time property on the context is available when using the new scheduler that we're shipping in 0.10.0 but not the current schedulers, and I accidentally added it to the docs too early. We'll update the docs now to make it clearer that it's not available on all schedulers.
s
no worries, thank you very much for the help
Hi @daniel, posting it here for visibility:
So I tried the `@schedule` with a datetime as the argument and got this error:
"message": "TypeError: unsupported operand type(s) for -: 'ScheduleExecutionContext' and 'datetime.timedelta'\n",
From this I assumed it gets a `ScheduleExecutionContext`, but when I tried to get `scheduled_execution_time` from it, I got this error:
"message": "TypeError: unsupported operand type(s) for -: 'NoneType' and 'datetime.timedelta'\n",
It seems like `scheduled_execution_time` is None in the schedule context.
d
It does take in a ScheduleExecutionContext as its argument, yeah, but the scheduled_execution_time property on the context will be None in the scheduler that you're currently using (sorry again for the misleading docs). You could probably use the current time in its place for many purposes: in a cron scheduler, the current time generally shouldn't be too far off from the scheduled execution time.
s
Ahh, I see now, sorry I'm slow 😄. I used datetime.now(). Thanks for the clarification.
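The workaround discussed above (using the current time when `scheduled_execution_time` is None) could be sketched roughly as follows. This is only an illustration: `resolve_execution_time` and `export_window_start` are hypothetical helper names, the context is stood in for by a plain object so the snippet runs without dagster installed, and the `now` parameter exists only to make the example deterministic.

```python
import datetime
from types import SimpleNamespace


def resolve_execution_time(ctx, now=None):
    """Use the context's scheduled time if set, else fall back to the current time."""
    scheduled = getattr(ctx, "scheduled_execution_time", None)
    if scheduled is not None:
        return scheduled
    # Assumption: on schedulers that leave the property as None, the wall-clock
    # time is close enough to the cron tick for this purpose.
    return now if now is not None else datetime.datetime.now()


def export_window_start(ctx, hours=6, now=None):
    """Start of the export window: `hours` before the resolved execution time."""
    return resolve_execution_time(ctx, now) - datetime.timedelta(hours=hours)


# Stand-in for a context whose scheduled_execution_time is None.
ctx = SimpleNamespace(scheduled_execution_time=None)
fixed_now = datetime.datetime(2020, 12, 3, 18, 0, 5)
print(export_window_start(ctx, now=fixed_now))  # 2020-12-03 12:00:05
```

Inside the schedule function, the result of `export_window_start(ctx)` would then be passed to `run_config_from_datetime` in place of the failing subtraction. Because the fallback reads the clock when the pod actually starts, the window can drift by a few seconds from the exact cron boundary.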