David Weber
05/08/2023, 1:56 PM
storage:
  postgres:
    postgres_db:
      username: ""
      password: ""
      hostname: ""
      db_name: ""
      port: 5432
From my understanding, what should happen now is that everything Dagster has written under "storage" in DAGSTER_HOME (before I specified Postgres, when it was still using SQLite) should land in my Postgres database instead of in my local file system.
But Dagster is still writing assets to the "storage" folder under DAGSTER_HOME locally. Is this expected?
Vinnie
05/08/2023, 2:07 PM
`dagster.yaml` only relates to what’s written to a backend database; that is (non-exhaustive), schedule information, heartbeats, run history, dynamic partitions, etc.
What you’re likely still seeing in your `storage` folder are the outputs of your IO Manager (if using the default IO Manager: https://docs.dagster.io/concepts/io-management/io-managers) and probably compute logs for each run (https://docs.dagster.io/deployment/dagster-instance#compute-log-storage).
David Weber
05/08/2023, 2:30 PM
[...] `adls2_resource` [...] `dagster-azure`. And exactly when executing a run that does one of these file copies, it creates a "file" (?) in the `storage` folder.
The only reason for this behaviour I can think of is that Dagster needs this in order to know that the `asset` has been materialized (i.e. the copy was successful, with a "placeholder" file kept locally in the `storage` folder).
Vinnie
05/08/2023, 2:35 PM
[...] `op` or `asset` gets passed to the IO Manager, which then takes care of saving it so downstream `op`s and `asset`s can load it and process it further. The way to stop it from creating that file would be to override the (either default or selected) IO Manager. As far as I can tell, there’s one in the `dagster-azure` library: https://docs.dagster.io/_apidocs/libraries/dagster-azure#dagster_azure.adls2.adls2_pickle_io_manager
As an example, if you’re handling all your IO logic within the op or asset and returning its remote path (as I’ve often seen in Airflow implementations), Dagster would pass this remote path to the IO Manager, which would then write it to a file, blob storage, a database, or whatever you tell it to.
sean
05/08/2023, 4:23 PM
David Weber
05/09/2023, 7:24 AM