# ask-community
y
Hi all. It seems like the retry policy doesn't work for asset-based jobs. This code doesn't work (dagster adds tags to the job, I can see them in dagit, but failed jobs do not restart):
```python
asset_job = define_asset_job(
    name="extract_load_prod_job",
    tags={
        "dagster/max_retries": 3,
        "dagster/retry_strategy": "FROM_FAILURE",
        "env": "prod",
    },
    selection=AssetSelection.groups("prod"),
    partitions_def=DailyPartitionsDefinition(start_date="2023-01-01"),
)
```
Dagster version 1.1.17, DockerRunLauncher.
j
Have you configured your dagster.yaml to enable run retries?
```yaml
run_retries:
  enabled: true
```
https://docs.dagster.io/deployment/run-retries#configuration
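For context, the run-retries docs linked above also describe an instance-wide default retry count in the same dagster.yaml section; the per-job `dagster/max_retries` tag overrides it. A sketch of the fuller config:

```yaml
run_retries:
  enabled: true
  # Instance-wide default; a job's dagster/max_retries tag takes precedence.
  max_retries: 3
```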
y
Yep, it is enabled
(screenshot attached: image.png)
j
Do you see any errors in the logs of your daemon process? Particularly with the string `EventLogConsumerDaemon`.
One possibility: when was your database initialized? If it was before 0.15.0, it may be missing a table required for run retries. If that's the case, you'll see logs like `sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "kvs" does not exist`. The guide to migrating is here: https://docs.dagster.io/deployment/guides/kubernetes/how-to-migrate-your-instance
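The missing table can also be checked directly against the run storage database. A sketch, assuming a Postgres backend reachable with psql; the connection details are placeholders, not values from this thread:

```shell
# Check whether the "kvs" table exists in the Dagster storage database.
# Host/dbname/user are placeholders -- substitute your own connection details.
psql "host=<db-host> dbname=dagster user=dagster" \
  -c "SELECT to_regclass('public.kvs');"
# An empty (NULL) result means the table is missing and an instance
# schema migration is needed.
```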
y
Cool, thanks! And what if we are deploying to a single node (using Docker)?
Yep, this is our case:
```
2023-03-09 15:15:54 +0000 - dagster.daemon.EventLogConsumerDaemon - ERROR - Caught error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "kvs" does not exist
```
👍 1
j
Where are you running your DB?
y
cloud sql
Maybe I can do it via `docker exec` on a running container?
j
The migration guide should cover you. It's never a bad idea to make a backup before doing a schema upgrade.
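For a single-node Docker deployment, the migration can indeed be run via `docker exec`. A sketch, assuming a container named `dagster_daemon` with `DAGSTER_HOME` pointing at the directory containing dagster.yaml; the container name and connection details are placeholders:

```shell
# Back up the storage database first (connection details are placeholders).
pg_dump "host=<db-host> dbname=dagster user=dagster" > dagster_backup.sql

# Run the schema migration inside the running Dagster container.
# "dagster_daemon" is a placeholder container name.
docker exec dagster_daemon dagster instance migrate
```

`dagster instance migrate` applies any pending schema migrations, including creating the `kvs` table that run retries depend on.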
❤️ 1
Would you mind filing an issue so we can surface this error better? At the very least it should be mentioned in the docs for retries.
y
sure!
❤️ 1
I'll do it ASAP! Thank you, @johann!