Sandeep Aggarwal 01/18/2022, 1:06 PM
API to process a graph with ~15 ops. I am observing a significant performance drop when switching to a persistent Dagster instance with SQLite/Postgres-based run and event log storages. The execution time increases to
which was earlier taking around
. The executor is still the in-process one, so I guess it's the DB writes that are causing this overhead. Is that expected? Below are screenshots for execution times.
sandy 01/18/2022, 4:26 PM
alex 01/18/2022, 7:19 PM
to determine where exactly the slowdown is.
"SQLite/Postgres"
The details here will have a big impact. Is it SQLite or Postgres? If Postgres, where is the DB running?
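One way to narrow down where the slowdown is would be to wrap the run in Python's built-in `cProfile` and sort by cumulative time. This is only a minimal sketch: `run_graph` and `write_event` are hypothetical stand-ins for the graph execution and the storage write, not Dagster functions, and the sleep merely simulates a slow write.

```python
import cProfile
import io
import pstats
import time


def write_event(delay_s: float) -> None:
    # Hypothetical stand-in for persisting one event to storage.
    time.sleep(delay_s)


def run_graph(num_ops: int = 15, delay_s: float = 0.001) -> int:
    # Hypothetical stand-in for executing a graph: each op emits one event.
    for _ in range(num_ops):
        write_event(delay_s)
    return num_ops


def profile_call(func, *args, top: int = 10) -> str:
    """Run func(*args) under cProfile and return a cumulative-time report."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


report = profile_call(run_graph, 15, 0.001)
print(report)
```

Running the same profile once against the in-memory setup and once against the persistent one, then comparing the top entries, should show whether the extra time sits in the storage layer or somewhere else.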
Sandeep Aggarwal 01/19/2022, 10:20 AM
is taking significantly more time,
with in-memory compared to
with persistent storage. You might have more insights. I am attaching the files for your reference. Can you please take a look?
800ms - 1200ms
alex 01/19/2022, 3:27 PM
runtimes on the order of minutes or even hours, so these per-event overheads of hundreds of milliseconds have not yet been a focus.
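To get a rough feel for per-event write overhead on a given machine, a small standard-library experiment can time one committed insert per event against an in-memory and a file-backed SQLite database. This is an illustrative sketch only: the `event_log` table and `time_event_writes` helper are made up for the example and do not reflect Dagster's actual storage schema or code path, and the timings will vary with disk and OS.

```python
import os
import sqlite3
import tempfile
import time


def time_event_writes(db_path: str, num_events: int) -> tuple[float, int]:
    """Insert num_events rows, committing after each, and return (seconds, row count)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event_log (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.commit()

    start = time.perf_counter()
    for i in range(num_events):
        conn.execute("INSERT INTO event_log (body) VALUES (?)", (f"event {i}",))
        conn.commit()  # one commit per event, so each write is made durable separately
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
    conn.close()
    return elapsed, count


NUM_EVENTS = 15  # hypothetical: one event per op in a ~15-op graph

mem_elapsed, mem_count = time_event_writes(":memory:", NUM_EVENTS)
with tempfile.TemporaryDirectory() as tmp_dir:
    file_elapsed, file_count = time_event_writes(
        os.path.join(tmp_dir, "events.db"), NUM_EVENTS
    )

print(f"in-memory:   {1000 * mem_elapsed / NUM_EVENTS:.3f} ms per event")
print(f"file-backed: {1000 * file_elapsed / NUM_EVENTS:.3f} ms per event")
```

A real run emits several events per op and may talk to a remote Postgres, so the numbers here are at best a lower bound on what the persistent instance adds.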
Sandeep Aggarwal 01/19/2022, 6:13 PM
alex 01/19/2022, 6:22 PM
"Is it possible to use multiprocess_executor or dask_executor with an ephemeral Dagster instance?"
Not easily, and I expect you would hit further latency issues from the per-process overhead. I believe https://github.com/dagster-io/dagster/issues/4041 is what you would need for your use case.
Sandeep Aggarwal 01/19/2022, 6:32 PM
alex 01/19/2022, 6:38 PM