# ask-community
m
Hi Team, I’m (apparently) running into the concurrency limit of the default SQLite database with a job/op that `DynamicOutput`s every row in a pandas DF (about 3000):
```
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: INSERT INTO event_logs (run_id, event, dagster_event_type, timestamp, step_key, asset_key, partition) VALUES (?, ?, ?, ?, ?, ?, ?)]
```
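For context, the pattern in question looks roughly like this, a minimal sketch of a per-row dynamic fan-out (op and job names and the DataFrame contents are made up for illustration):

```python
import pandas as pd
from dagster import DynamicOut, DynamicOutput, job, op


@op(out=DynamicOut())
def fan_out_rows():
    # Hypothetical stand-in for the real ~3000-row metadata DataFrame.
    df = pd.DataFrame({"image_id": range(3000)})
    for idx, row in df.iterrows():
        # One DynamicOutput per row -> one step (and its event log writes) per row.
        yield DynamicOutput(row.to_dict(), mapping_key=f"row_{idx}")


@op
def process_row(row: dict) -> dict:
    # Placeholder for the per-row work (formatting metadata, calling APIs, etc.).
    return row


@job
def metadata_job():
    fan_out_rows().map(process_row)
```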
My use case is rather simple and I don’t need to keep track of all the logs Dagster saves by default. I tried unsetting `DAGSTER_HOME`, but that just uses a temporary directory that hits the same problem. Is there a way to work around this without having to set up my own SQL database? Something like reducing the number of events saved, or preventing/limiting the `DynamicOutput` ops from running in parallel?
Is fiddling with `max_concurrent` the answer I’m looking for? https://docs.dagster.io/_apidocs/execution#dagster.multiprocess_executor
m
is running Postgres not an option for you? SQLite is fundamentally not designed for concurrent use cases -- we've tried to hack around that, but ultimately you will want a db that supports multiple concurrent writes
m
I mean, it’s a lot of overhead for a feature I don’t really use… My jobs (which are triggered manually via Dagit or the CLI) are built to feed a digital humanities project: formatting image metadata, handling image files on the cloud, and querying a few APIs. I don’t deploy to the cloud or use the daemons at all, and I’m not dealing with a complex business environment where this exhaustive logging might be needed. The Dagster features I make use of are more related to type checking/validation, code organization/visualization, and IO management.
Limiting `max_concurrent` to 4 seems to have fixed it, though. I’m on a 24-core machine, so that’s likely what caused the issue. I’ll experiment to see where exactly the limit is being reached…
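In case it helps anyone else, this is roughly how the cap can be applied on the job itself; a minimal sketch, where the op and job names are made up and only the `max_concurrent` value comes from this thread:

```python
from dagster import job, multiprocess_executor, op


@op
def noop():
    # Placeholder op; in practice this would be the dynamic fan-out graph above.
    pass


# Cap the multiprocess executor at 4 concurrent subprocesses so ~24 parallel
# steps don't all write to the SQLite event log at once.
@job(executor_def=multiprocess_executor.configured({"max_concurrent": 4}))
def limited_job():
    noop()
```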