Do you know how to setup DAGSTER_HOME with a dev a...
# ask-community
r
Do you know how to set up DAGSTER_HOME with a dev and a prod environment? I'm looking to save my run files in different locations depending on whether it's a dev or a prod run. Is it correct that you execute a dev run with
dagster dev
and a production run with
dagster-webserver
?
I am looking here, but I don't see how to set up a run on dev vs prod: https://docs.dagster.io/deployment/dagster-instance
Or is it possible to set up
resources
and save the log files there? That seems like a good way to split between prod and dev: https://docs.dagster.io/concepts/resources
d
As a very first step, you can simply create two different scripts (for example):
dagster_dev.bat
set DAGSTER_HOME=C:\dagster\dev_home
call .venv\scripts\activate
dagster dev ... -p 8080
dagster_prod.bat
set DAGSTER_HOME=C:\dagster\prod_home
call .venv\scripts\activate
dagster dev ... -p 8081
Note that if you want to run both at the same time, you need to use different ports with the -p argument. Now you can navigate to localhost:8080 for dev, and localhost:8081 for prod. It might also make sense to keep the venvs separate, and use a package manager like
poetry
to avoid dependency conflicts, but if that's too involved you can just leave everything in the same venv. Later on, when constant availability becomes important, you might want to look into setting up docker to run
dagster-webserver
,
dagster-daemon
, and a gRPC code server in separate, restartable containers. You may even want to include a Postgres db container for run logs if you have many assets with many materializations, because the default SQLite db can get quite slow. I cannot go the docker route in my current project due to how my external dependencies are set up, but there is a tutorial in the docs iirc. For now, the
dagster_prod.bat
approach works fine for my needs.
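The two batch scripts differ only in DAGSTER_HOME and the port, so the same idea can be sketched as one small Python launcher. This is only an illustration: the names `ENVIRONMENTS`, `build_launch` and `launch` are made up here, the paths and ports are copied from the scripts above, and it assumes the venv is already activated so that `dagster` is on the PATH. Whatever arguments stand behind the `...` in the scripts would be passed in as `extra_args`.

```python
import os
import subprocess
from pathlib import Path

# Hypothetical settings table; paths and ports mirror the two .bat scripts.
ENVIRONMENTS = {
    "dev": {"home": r"C:\dagster\dev_home", "port": 8080},
    "prod": {"home": r"C:\dagster\prod_home", "port": 8081},
}


def build_launch(env_name, extra_args=()):
    """Return (command, environment) for the named environment."""
    settings = ENVIRONMENTS[env_name]
    env = dict(os.environ)  # copy, so the parent process stays untouched
    env["DAGSTER_HOME"] = settings["home"]
    command = ["dagster", "dev", *extra_args, "-p", str(settings["port"])]
    return command, env


def launch(env_name, extra_args=()):
    """Create the home folder if needed and start `dagster dev` in it."""
    command, env = build_launch(env_name, extra_args)
    # Assumption: the DAGSTER_HOME folder should exist before startup.
    Path(env["DAGSTER_HOME"]).mkdir(parents=True, exist_ok=True)
    return subprocess.run(command, env=env, check=False)
```

Calling `launch("dev")` in one terminal and `launch("prod")` in another would then give the same two instances on ports 8080 and 8081.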
r
Thank you @DB ! 😄
But what about the
.env
file? I also see you have a \dev_home and a \prod_home. It seems a little brute-force to have two different dagster installations.
j
hey @Rene Czepluch i don't think DB has different dagster installations. I think it's just a folder named "dagster" with two subfolders "dev_home" and "prod_home"
d
@Rene Czepluch those aren't two different installations. Using different values for
DAGSTER_HOME
just tells dagster to make two separate run storages, one for dev and one for prod. I think this is exactly what you wanted?
r
Yeah, it is exactly what I would like 😄 so thanks for the suggestion, but I don't really understand it. Where are the bat files located? What about the assets? They often use dev data, which is not the same as production 😄 Also, what is in the .venv scripts?
d
The .bat scripts are located in your working folder. So maybe something like this:
dagster/
├─ dev_home/
│  ├─ dagster.yaml
├─ prod_home/
│  ├─ dagster.yaml
├─ src/
│  ├─ .venv/
│  ├─ my_dag.py
│  ├─ dagster_dev.bat
│  ├─ dagster_prod.bat
If you want to use different data sources in dev and prod, you should look into resources and how to configure them using `EnvVar`: https://docs.dagster.io/concepts/resources You can then set an additional environment variable in the batch scripts, e.g.
SET DAGSTER_DATA_DIR=C:\dagster_data\prod_data
and
SET DAGSTER_DATA_DIR=C:\dagster_data\dev_data
and configure your data source accordingly. So you end up with something like
# my_dag.py
# run with dagster dev -f my_dag.py

from dagster import (
    AssetExecutionContext,
    ConfigurableResource,
    Definitions,
    EnvVar,
    asset,
)

class MyDataSource(ConfigurableResource):
    data_dir: str

    def read_data(self):
        ...

@asset
def data_from_source(context: AssetExecutionContext, my_source: MyDataSource):
    context.log.info(f"Reading data from {my_source.data_dir}!")
    data = my_source.read_data()
    ...

defs = Definitions(
    assets=[data_from_source],
    resources={
        "my_source": MyDataSource(data_dir=EnvVar("DAGSTER_DATA_DIR")),
    },
)
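The body of `read_data` is left open above. As a rough sketch of what it could do, here is a plain-Python version that needs no dagster import, so it can be tried on its own. It assumes the data folder holds CSV files, which the thread does not say; `data_dir_from_env` is a hypothetical helper that only mimics the lookup of DAGSTER_DATA_DIR and is not part of Dagster.

```python
import csv
import os
from pathlib import Path


def data_dir_from_env(variable="DAGSTER_DATA_DIR"):
    """Return the data folder set by the dev or prod batch script."""
    value = os.environ.get(variable)
    if not value:
        # Fail loudly rather than silently reading from the wrong place.
        raise RuntimeError(f"{variable} is not set; start via the dev or prod script")
    return value


def read_data(data_dir):
    """Read every *.csv file in data_dir into one list of row dicts.

    Assumption: the data source is a folder of CSV files.
    """
    rows = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            rows.extend(csv.DictReader(handle))
    return rows
```

Inside the resource, `read_data` would use `self.data_dir` instead of taking the folder as an argument.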
.venv
simply contains the virtual environment where you installed dagster. If you're not already using virtual environments for different projects, I would suggest having a look here: https://docs.python.org/3/tutorial/venv.html
r
Thank you very much for the tip. I'm on vacation, but I am pretty sure this is what I need as well. Thanks @DB