When starting a job I am getting `dagster._core.er...
# ask-community
d
When starting a job I am getting
dagster._core.errors.DagsterLaunchFailedError: Error during RPC setup for executing run: dagster_postgres.utils.DagsterPostgresException: too many retries for DB connection
but other jobs work just fine. Any ideas what could be happening?
This seems to only be happening in 1/3 code locations if that helps
This is still happening, persisting across dagit restarts (consistently the same location). There aren't that many active DB connections at any one time (~20 total). Anyone have any ideas?
d
are there any indications from your postgres logs about what query might be failing?
d
I don't see any errors at all in PG
p
Are you using the default run launcher? It’s odd because the main DB query that gets executed while launching the run is just fetching the run from run storage, and that shouldn’t be affected by user code at all
Can you try spinning up dagit with just the problematic code location?
d
Yes to the first, and tentatively we have a resolution:
• The error that dagit was actually getting was
SCRAM authentication requires libpq version 10 or above
which caused the authentication to fail
• psycopg2-binary is installed (a dependency of dagster-postgres)
• the new code location includes psycopg (version 3) in a venv, which doesn't play nice with psycopg2, but somehow it is interfering with the host dagit process???
I guess the really strange part to me is that something in a venv can somehow break the run launcher on the host process (these are multiple code locations on the same machine, loaded as python packages)
p
wow… thanks for the detailed report. I’ll try to replicate and figure out what’s going on. Do you mind filing a GH issue to track?
And can you confirm that dagit and dagster-daemon are using the same virtual environment? (Trying to make sure I’m replicating the same launch path)
d
They are, and the other code location is using a venv. GitHub ticket: https://github.com/dagster-io/dagster/issues/11353
Updated the github issue, but the short version is: the expected behavior is that dagit requires all code locations to be able to import from dagster_postgres iff it is using postgres for run storage. Adding an extra dependency so that works isn't the end of the world, and I confirmed it solves the issue
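A quick way to verify that fix per environment is an import check. This is a hypothetical helper (my own names, not a Dagster API): each code location's venv should be able to import `dagster_postgres` when the instance uses postgres-backed storage.

```python
import importlib.util


def can_import(module_name: str) -> bool:
    """Return True if `module_name` is importable in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# In each code location's venv, this should print True once the
# dagster-postgres dependency has been added:
# print(can_import("dagster_postgres"))
```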
d
that's certainly the case - although I'd have hoped the error message would be much clearer and be something like "could not import dagster-postgres"
d
The initial terrible error message comes from a psycopg2 bug (https://github.com/psycopg/psycopg2/issues/1360), combined with dagit/the code location swallowing the "real" error from the failure to connect
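A minimal sketch (hypothetical, not Dagster's actual code) of how a retry loop can mask the root cause, and how exception chaining would preserve it in the logs:

```python
def connect_with_retries(connect, retries=3):
    """Call `connect` up to `retries` times, re-raising the last failure as the cause."""
    last_exc = None
    for _ in range(retries):
        try:
            return connect()
        except Exception as exc:
            last_exc = exc
    # Without `from last_exc`, the caller only ever sees "too many retries"
    # and the underlying SCRAM/libpq failure is swallowed.
    raise RuntimeError("too many retries for DB connection") from last_exc
```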
Should I explicitly have my packages take a dependency on `dagster-postgres`?
d
That's what we would recommend, yeah
any packages referenced in your `dagster.yaml`
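For context, a postgres-backed storage block in `dagster.yaml` looks roughly like this (values here are placeholders); any code location launched by an instance with this config needs `dagster_postgres` importable:

```yaml
storage:
  postgres:
    postgres_db:
      username:
        env: DAGSTER_PG_USERNAME
      password:
        env: DAGSTER_PG_PASSWORD
      hostname:
        env: DAGSTER_PG_HOST
      db_name:
        env: DAGSTER_PG_DB
      port: 5432
```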
d
makes sense. Do I have to be concerned about the versions of those things? I run a `dagster migrate` when there is a reason to on the `dagit` processes, but updating all the packages usually happens sometime later
d
You can run dagit and the code packages on different dagster versions - within a single python environment the dagster versions all need to match, but there are pins enforcing that