dagster queued run mount duckDB mount warehouse location I a dagster #ask-community

geoHeil

04/22/2022, 4:03 PM

I am trying to get my Dagster pipeline to work inside Docker. For this I am following along with:
- https://github.com/dehume/big-data-madison-dagster
- https://github.com/dagster-io/dagster/tree/master/examples/deploy_docker

In particular, https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/dagster.yaml#L11 suggests using `DockerRunLauncher`. For both `dagit` and `dagster-daemon` I have enabled docker-in-docker by mounting the Docker socket (see https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/docker-compose.yml#L61 ):

/var/run/docker.sock:/var/run/docker.sock

But I only get:

DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))

 File "/opt/conda/lib/python3.9/site-packages/dagster/core/instance/__init__.py", line 1698, in launch_run
    self._run_launcher.launch_run(LaunchRunContext(pipeline_run=run, workspace=workspace))
  File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 152, in launch_run
    self._launch_container_with_command(run, docker_image, command)
  File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 97, in _launch_container_with_command
    client = self._get_client(container_context)
  File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 72, in _get_client
    client = docker.client.from_env()
  File "/opt/conda/lib/python3.9/site-packages/docker/client.py", line 96, in from_env
    return cls(
  File "/opt/conda/lib/python3.9/site-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/opt/conda/lib/python3.9/site-packages/docker/api/client.py", line 221, in _retrieve_server_version
    raise DockerException(

when executing:

docker compose --profile dagster up --build

I am running Docker for Mac. How can I get Dagster to work nicely in this setup?

daniel

04/22/2022, 4:05 PM

Hi Georg - do you get this same error if you try to build and run the deploy_docker example with no changes? I use Docker for Mac and it works for me, so if you do, there must be some difference between our Docker setups.

geoHeil

04/22/2022, 4:13 PM

When I docker-compose up for https://github.com/dagster-io/dagster/tree/master/examples/deploy_docker I can execute the DIND stuff from dagster - interestingly.

geoHeil

04/22/2022, 4:14 PM

But for me in my setup - it fails. Even though I have mapped the docker socket in the exact same way. Do you have any idea what is causing the problem here?

daniel

04/22/2022, 4:17 PM

That's very perplexing - the best recommendation I have is to triple check that the volume is actually mounted on both the dagit and daemon containers, because that should be all that you need for it to work

geoHeil

04/22/2022, 4:22 PM

I can see the docker.sock being mounted in dagit

daniel

04/22/2022, 4:22 PM

what about in the daemon?

geoHeil

04/22/2022, 4:23 PM

same /var/run/docker.sock is available

geoHeil

04/22/2022, 4:24 PM

Though:

USER dagster:dagster

geoHeil

04/22/2022, 4:24 PM

the user in the dockerfile is not root

geoHeil

04/22/2022, 4:24 PM

does the user inside the dockerfile need some special permissions?
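
One way to narrow down a "Permission denied" on the mounted socket is to look at who owns it inside the container and whether the current (non-root) user can read and write it. The following is only a diagnostic sketch using the Python standard library; `describe_socket_access` is a hypothetical helper name, not part of Dagster or the Docker SDK, and it assumes a Unix-like container.

```python
import grp
import os
import stat


def describe_socket_access(path="/var/run/docker.sock"):
    """Report who owns `path` and whether the current user may read and write it."""
    info = os.stat(path)
    try:
        group_name = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        # The GID has no entry in /etc/group inside this container.
        group_name = None
    return {
        "path": path,
        "is_socket": stat.S_ISSOCK(info.st_mode),
        "mode": stat.filemode(info.st_mode),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "group_name": group_name,
        "current_uid": os.getuid(),
        "current_groups": sorted(set(os.getgroups()) | {os.getgid()}),
        "can_read_write": os.access(path, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    # Only print a report when the socket is actually mounted.
    if os.path.exists("/var/run/docker.sock"):
        for key, value in describe_socket_access().items():
            print(f"{key}: {value}")
```

If `can_read_write` is false while `group_gid` is not among `current_groups`, the user set via `USER` in the Dockerfile is probably not a member of the group that owns the socket.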

geoHeil

04/22/2022, 4:36 PM

I have changed the user - but still get a permission denied

geoHeil

04/22/2022, 4:37 PM

Interestingly I also see failures like:

WARNING:root:Retrying failed database connection: (psycopg2.OperationalError) connection to server at "postgresql" (172.31.0.3), port 5432 failed: Connection refused

geoHeil

04/22/2022, 4:43 PM

one step further: ImageNotFound: 404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=other: Not Found ("pull access denied for other, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")

geoHeil

04/22/2022, 4:43 PM

but locally the client (on the mac side ) is logged in

geoHeil

04/22/2022, 4:48 PM

ok - the docker stuff is fixed now

geoHeil

04/22/2022, 4:48 PM

but I am stuck at: dagster.core.errors.DagsterInstanceSchemaOutdated: Raised an exception that may indicate that the Dagster database needs to be be migrated. Database is at revision None, head is b601eb913efa. To migrate, run

dagster instance migrate

geoHeil

04/22/2022, 4:48 PM

when spinning up an empty container - I would expect that the migration is run automatically

geoHeil

04/22/2022, 4:48 PM

and in fact, this works for the default example - so what is the difference in my code?

geoHeil

04/22/2022, 4:49 PM

and:

(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pg_class_relname_nsp_index"

DETAIL:  Key (relname, relnamespace)=(secondary_indexes_id_seq, 2200) already exists.

[SQL:

CREATE TABLE secondary_indexes (

id SERIAL NOT NULL,

name VARCHAR(512),

create_timestamp TIMESTAMP WITHOUT TIME ZONE DEFAULT CURRENT_TIMESTAMP,

migration_completed TIMESTAMP WITHOUT TIME ZONE,

PRIMARY KEY (id),

UNIQUE (name)

is found in the logs

geoHeil

04/22/2022, 5:03 PM

postgres is also showing me several: ERROR: duplicate key value violates unique constraint "instigators_selector_id_key" errors

geoHeil

04/22/2022, 5:04 PM

Strangely, the SFTP connection to the container name:

sftp

fails: unable to connect to port 2222 on 192.168.48.5

geoHeil

04/22/2022, 5:04 PM

(but works nicely for localhost)

geoHeil

04/22/2022, 5:08 PM

furthermore: NotFound: 404 Client Error for http+docker://localhost/v1.41/containers/ffd4684f2076cc41d455b8928ba1126cdf0e605d0f39c4ce38316bf91450fc7c/start: Not Found ("network docker_example_network not found") the docker error is now again quite irritating

prha

04/22/2022, 5:11 PM

This is from a brand new DB getting initialized?

prha

04/22/2022, 5:11 PM

Wondering if this is some initialization race condition between `dagit` and `dagster-daemon`

geoHeil

04/22/2022, 5:13 PM

yes

geoHeil

04/22/2022, 5:15 PM

I think I fixed a couple of mistakes. Nonetheless, I still cannot get the DIND queued Docker launcher to work. Meanwhile:
1) It no longer fails instantly but is stuck (without any further logs) when trying to spin up the container
2) The dagster daemon fails to instantiate the SFTP connection
3) Postgres is throwing duplicate-key constraint violation errors

daniel

04/22/2022, 5:16 PM

Can you try removing your postgres container so that it brings it back up again from scratch?

daniel

04/22/2022, 5:16 PM

docker rm <postgres container ID>

possibly with a -f

daniel

04/22/2022, 5:16 PM

and try re-deploying?

geoHeil

04/22/2022, 5:17 PM

tried that already a couple of times - also deleted the mapped volume directory

geoHeil

04/22/2022, 5:17 PM

to replicate:

git clone https://github.com/geoHeil/dagster-ssh-demo.git

cd dagster-ssh-demo

docker compose --profile dagster up --build

go to: http://localhost:3000/workspace/deploy_docker_repository@other/jobs/my_job/playground and try to launch

geoHeil

04/22/2022, 5:21 PM

(2) is fixed now

geoHeil

04/22/2022, 5:21 PM

but now all (manual dummy and sensor initiated runs) are stuck in the startup phase.

geoHeil

04/22/2022, 5:22 PM

10 runs are currently in progress. Maximum is 10, won't launch more.

geoHeil

04/22/2022, 5:22 PM

And postgres keeps showing logs like:

ERROR: duplicate key value violates unique constraint "instigators_selector_id_key"
postgresql | 2022-04-22 17:22:16.774 UTC [1140] DETAIL: Key (selector_id)=(40ae1d09616324124ae8fab93494603eee744f81) already exists.
postgresql | 2022-04-22 17:22:16.774 UTC [1140] STATEMENT: INSERT INTO instigators (selector_id, repository_selector_id, status, instigator_type, instigator_body) VALUES ('40ae1d09616324124ae8fab93494603eee744f81', '22ed74ad3bf3c735dd23193f2387c48c5a8cc556', 'AUTOMATICALLY_RUNNING', 'SENSOR', '{"__class__": "InstigatorState", "job_specific_data": {"__class__": "SensorInstigatorData", "cursor": null, "last_run_key": null, "last_tick_timestamp": 1650648132.730657, "min_interval": 30}, "job_type": {"__enum__": "InstigatorType.SENSOR"}, "origin": {"__class__": "ExternalJobOrigin", "external_repository_origin": {"__class__": "ExternalRepositoryOrigin", "repository_location_origin": {"__class__": "GrpcServerRepositoryLocationOrigin", "host": "ssh-demo", "location_name": "ssh-demo", "port": 4000, "socket": null}, "repository_name": "SSH_DEMO"}, "job_name": "foo_scd2_asset_sensor"}, "status": {"__enum__": "InstigatorStatus.AUTOMATICALLY_RUNNING"}}') RETURNING instigators.id

daniel

04/22/2022, 5:29 PM

I would hope this wouldn't matter, but is it possible that postgres:11 vs. postgres:14.2 could make a difference?

daniel

04/22/2022, 5:29 PM

the example that is working for you has the former

daniel

04/22/2022, 5:30 PM

i'm a little confused why they aren't waiting for the postgres container to spin up even though they have depends_on: set

daniel

04/22/2022, 5:34 PM

actually taking this line out is making postgres behave better for me:

-    volumes:
-      - ./postgres-dagster:/var/lib/postgresql/data

daniel

04/22/2022, 5:35 PM

the broad advice that I have is to start with the example that is working and then incrementally add things that are different until it stops working - i think it will be a lot easier to isolate problems to a specific change that way

prha

04/22/2022, 5:43 PM

I was also able to get the example working (taking out the line that @daniel mentioned as well as some `warehouse_location` volume mounts).

geoHeil

04/22/2022, 6:12 PM

Nonetheless: I just commented out the two volumes (postgresql and warehouse_location), but for me everything is still stuck in the queued/startup phase.

geoHeil

04/22/2022, 6:13 PM

also postgres 11 is showing the duplicate key warnings/errors

geoHeil

04/22/2022, 6:19 PM

Are you actually able to get runs to finish running (and not have them stuck in the getting-started phase)?

prha

04/22/2022, 6:26 PM

Ah, I guess the runs didn’t successfully launch due to a docker permissions issue in the run coordinator:

pull access denied for ssh_demo_other, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

prha

04/22/2022, 6:26 PM

but I am not running into the same DB consistency checks that you are running into as the daemon is running

geoHeil

04/22/2022, 6:27 PM

No, this is fixed now if you pull the latest version. It was due to `DAGSTER_CURRENT_IMAGE: "ssh-demo"` instead of `DAGSTER_CURRENT_IMAGE: "ssh_demo_ssh-demo"`

prha

04/22/2022, 6:49 PM

I’m hitting the same error as before (`pull access denied for ssh_demo_other`). Switching line 151 to `ssh_demo_ssh-demo` also generates the same error:

docker.errors.ImageNotFound: 404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=ssh_demo_ssh-demo: Not Found ("pull access denied for ssh_demo_ssh-demo, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")

geoHeil

04/23/2022, 5:51 AM

I cloned into a fresh folder. I guess the name must be identical to one of the names listed by `docker images`, because somehow in this new folder the name needed to change to `dagster-ssh-demo_ssh_demo`. Then I do not get the pull problem.
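
The "pull access denied" errors above come down to the configured image name not matching any locally built tag, so Docker tries to pull it from a registry. As a rough sketch of that comparison (the helper names and the sample tag list are hypothetical; in practice the tags would be copied from the output of `docker images`):

```python
def normalize_image_ref(ref):
    """Append the implicit ':latest' tag when the reference has no tag or digest."""
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" in last_segment or "@" in last_segment:
        return ref
    return f"{ref}:latest"


def image_is_local(configured, local_tags):
    """True if the configured image name matches one of the locally available tags."""
    wanted = normalize_image_ref(configured)
    return wanted in {normalize_image_ref(tag) for tag in local_tags}


# Hypothetical listing, based on the image names mentioned in this thread.
local_tags = ["dagster-ssh-demo_ssh_demo:latest", "postgres:11"]
print(image_is_local("dagster-ssh-demo_ssh_demo", local_tags))  # True
print(image_is_local("ssh_demo_ssh-demo", local_tags))  # False
```

Because the locally built name changed with the folder the repository was cloned into, a hard-coded `DAGSTER_CURRENT_IMAGE` value can stop matching after a fresh clone.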

geoHeil

04/23/2022, 5:52 AM

However, runs are still stuck launching: dagster.daemon.QueuedRunCoordinatorDaemon - INFO - Launched 2 runs. but the logs show that they clearly have been received

geoHeil

04/23/2022, 10:20 AM

When commenting out this block:

# run_launcher:
#   module: dagster_docker
#   class: DockerRunLauncher
#   config:
#     env_vars:
#       - DAGSTER_POSTGRES_USER
#       - DAGSTER_POSTGRES_PASSWORD
#       - DAGSTER_POSTGRES_DB
#     network: dagster_network
#     container_kwargs:
#       auto_remove: true

the tasks start to execute.

geoHeil

04/23/2022, 10:20 AM

(in dagster.yaml )

geoHeil

04/23/2022, 12:24 PM

This seems strange though:

dagster-daemon    | 2022-04-23 10:19:16 +0000 - dagster.daemon.QueuedRunCoordinatorDaemon - INFO - Retrieved 3 queued runs, checking limits.
dagster-daemon    | 2022-04-23 10:19:19 +0000 - dagster.daemon.QueuedRunCoordinatorDaemon - INFO - Launched 3 runs.
dagster-daemon    | INFO  [dagster.daemon.QueuedRunCoordinatorDaemon] Launched 3 runs.
dagster-daemon    | DEBUG [dagster.daemon.SchedulerDaemon] Not checking for any runs since no schedules have been started.
dagster-daemon    | DEBUG [dagster.daemon.QueuedRunCoordinatorDaemon] Poll returned no queued runs.

daniel

04/23/2022, 12:31 PM

Do the runs show up in dagit? Usually the event log will say what container it tried to spin up, and often there are clues for why it didn't start in the logs for that container

geoHeil

04/23/2022, 12:32 PM

yes - but are stuck in startup

geoHeil

04/23/2022, 12:33 PM

unfortunately, so far I did not find any clues yet

geoHeil

04/23/2022, 12:36 PM

I can only see: [DockerRunLauncher] Launching run in a new container 337b6b03bc2dc28bda38318700100de882ababb349388c587acb318b829c6cc3 with image dagster-ssh-demo_other

daniel

04/23/2022, 12:37 PM

Run ‘docker logs’ with that container ID, any clues there?

geoHeil

04/23/2022, 12:40 PM

dagster.core.errors.DagsterInvalidConfigError: Errors whilst loading configuration for {'postgres_url': Field(<dagster.config.source.StringSourceType object at 0x7fbe5fad2f10>, default=@, is_required=False), 'postgres_db': Field(<dagster.config.field_utils.Shape object at 0x7fbe5ae89be0>, default=@, is_required=False), 'should_autocreate_tables': Field(<dagster.config.config_type.Bool object at 0x7fbe60783340>, default=True, is_required=False)}. Error 1: Post processing at path root:postgres_db:hostname of original value {'env': 'DAGSTER_POSTGRES_HOSTNAME'} failed: dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_POSTGRES_HOSTNAME" which is not set. In order for this execution to succeed it must be set in this environment.

geoHeil

04/23/2022, 12:41 PM

though I set these here: x-app-vars: &default-app-vars DAGSTER_POSTGRES_HOSTNAME: "postgresql"

geoHeil

04/23/2022, 12:41 PM

(and pass it to all the containers)

daniel

04/23/2022, 12:42 PM

That looks like some additional config is needed on the run launcher to include some env vars, check the example for reference

daniel

04/23/2022, 12:44 PM

https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/dagster.yaml#L15
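
The error above suggests that the launched container reads `dagster.yaml` too, so every environment variable that file references has to be forwarded by the run launcher's `env_vars` list. A small sketch of that cross-check, operating on the instance config as an already-parsed dict; the helper names are hypothetical and the sample dict only mirrors the keys quoted in this thread, not the full file:

```python
def collect_env_refs(node):
    """Recursively collect names referenced as {'env': NAME} in a config tree."""
    refs = set()
    if isinstance(node, dict):
        if set(node) == {"env"} and isinstance(node["env"], str):
            refs.add(node["env"])
        else:
            for value in node.values():
                refs |= collect_env_refs(value)
    elif isinstance(node, list):
        for item in node:
            refs |= collect_env_refs(item)
    return refs


def missing_launcher_env_vars(instance_config):
    """Env vars the config reads but the run launcher does not forward."""
    launcher = instance_config.get("run_launcher", {})
    forwarded = set(launcher.get("config", {}).get("env_vars", []))
    return collect_env_refs(instance_config) - forwarded


# Assumed shape, reduced to the parts relevant here.
instance_config = {
    "run_launcher": {
        "module": "dagster_docker",
        "class": "DockerRunLauncher",
        "config": {
            "env_vars": [
                "DAGSTER_POSTGRES_USER",
                "DAGSTER_POSTGRES_PASSWORD",
                "DAGSTER_POSTGRES_DB",
            ],
        },
    },
    "run_storage": {
        "config": {
            "postgres_db": {
                "hostname": {"env": "DAGSTER_POSTGRES_HOSTNAME"},
                "username": {"env": "DAGSTER_POSTGRES_USER"},
            },
        },
    },
}
print(sorted(missing_launcher_env_vars(instance_config)))
```

With the sample above, the only variable reported is `DAGSTER_POSTGRES_HOSTNAME`, which matches the variable named in the error message.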

geoHeil

04/23/2022, 12:45 PM

Interestingly: https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/docker-compose.yml (the working example for the docker launcher) is not setting the DAGSTER_POSTGRES_HOSTNAME variable. But: https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/dagster.yaml#L24 -- let me double check the dagster.yaml file

geoHeil

04/23/2022, 12:46 PM

But why does the link you sent not include the hostname?

geoHeil

04/23/2022, 12:46 PM

(and is working)

daniel

04/23/2022, 12:47 PM

It may be using the default that postgres sets, I'm not positive

geoHeil

04/23/2022, 12:50 PM

https://github.com/geoHeil/dagster-ssh-demo/commit/1e782be5ebdad32b80ed5da8d66499713cea2c50

geoHeil

04/23/2022, 12:51 PM

this fixes the run launcher

geoHeil

04/23/2022, 12:51 PM

let me re-enable the other services successively and then work on mounting the volumes.

geoHeil

04/23/2022, 12:51 PM

But I am curious - you mentioned that dropping the volume mount makes your postgres behave better - how did this have any influence here at all?

daniel

04/23/2022, 12:52 PM

I'm not sure, I was just removing things that were different from the example until it started working again :)

daniel

04/23/2022, 12:53 PM

It seemed like having it in the volume was keeping the postgres container from starting up correctly

geoHeil

04/23/2022, 12:57 PM

Interestingly - this did not work for me. Anyways - let me check the next steps.

geoHeil

04/23/2022, 12:57 PM

I am on to the next step now: The pyspark resource (in local mode) is not coming up:

RuntimeError: Java gateway process exited before sending its port number
  File "/opt/conda/lib/python3.9/site-packages/dagster/core/errors.py", line 184, in user_code_error_boundary
    yield
  File "/opt/conda/lib/python3.9/site-packages/dagster/core/execution/resources_init.py", line 298, in single_resource_event_generator
    resource_def.resource_fn(context)
  File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 53, in pyspark_resource
    return PySparkResource(init_context.resource_config["spark_conf"])
  File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 20, in __init__
    self._spark_session = spark_session_from_config(spark_conf)
  File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 15, in spark_session_from_config
    return builder.getOrCreate()
  File "/opt/conda/lib/python3.9/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 392, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 339, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/opt/conda/lib/python3.9/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise RuntimeError("Java gateway process exited before sending its port number")

But so far I did not see anything suspicious in the logs of the container.

geoHeil

04/23/2022, 1:00 PM

this looks like a "JAVA_HOME is not set" error

geoHeil

04/23/2022, 1:00 PM

(but the logs do not show this directly)

geoHeil

04/23/2022, 1:01 PM

This looks like https://github.com/geoHeil/dagster-ssh-demo/blob/master/environment.yml#L12 (conda pyspark) is not bringing in the downstream JDK dependency and not setting java home
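
Since the PySpark traceback itself does not say why the gateway exited, a quick check inside the run container is whether a `java` binary can be found at all. This is a diagnostic sketch under the assumption that a missing JVM is the cause; `java_diagnostics` is a hypothetical helper built only on the standard library.

```python
import os
import shutil


def java_diagnostics(env=None, which=shutil.which):
    """Summarize whether a JVM looks reachable from this environment."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    java_on_path = which("java")
    java_in_home = None
    if java_home:
        # A usable JAVA_HOME should contain bin/java.
        candidate = os.path.join(java_home, "bin", "java")
        java_in_home = candidate if os.path.isfile(candidate) else None
    return {
        "JAVA_HOME": java_home,
        "java_on_path": java_on_path,
        "java_in_JAVA_HOME": java_in_home,
        "jvm_found": bool(java_on_path or java_in_home),
    }


if __name__ == "__main__":
    for key, value in java_diagnostics().items():
        print(f"{key}: {value}")
```

If `jvm_found` is false, installing a JDK in the image (and setting `JAVA_HOME` if needed) is the likely next step.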

geoHeil

04/23/2022, 1:21 PM

spark is starting now

geoHeil

04/23/2022, 1:21 PM

Is the postgres error/warning: ERROR: duplicate key value violates unique constraint "instigators_selector_id_key" anything I should worry about?

geoHeil

04/23/2022, 1:33 PM

I can confirm that a lot of the dummy example jobs are working now! However, the state handling for the mapped volumes is not yet working as expected

geoHeil

04/23/2022, 1:52 PM

Also strange: sometimes dagit and the dagster-daemon fail to start with a "too many retries for DB connection" error, although they are set to depend on / wait for the DB container

geoHeil

04/23/2022, 1:57 PM

@daniel is my suspicion correct that the ``DockerRunLauncher`` is not using the volume mappings defined in docker-compose? How can I set up the volume mappings for the run launcher so they are applied when starting the container?

daniel

04/23/2022, 1:58 PM

Volume mounting sample config is here: https://docs.dagster.io/deployment/guides/docker#mounting-volumes

geoHeil

04/23/2022, 2:02 PM

Thanks. Interesting: `repo.py:/opt/dagster/app/` from the link looks like a relative path. Though for me, when passing ./warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location I get an error message that relative paths are not allowed

geoHeil

04/23/2022, 2:14 PM

Even when mounting an absolute path:

/path/to/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location

no files are written to this directory

daniel

04/23/2022, 2:20 PM

I'm not at my keyboard currently but we’ll be able to take a closer look at this on Monday

geoHeil

04/25/2022, 11:37 AM

@daniel did you already find the time to take a look at the volume mount problem?

daniel

04/25/2022, 9:52 PM

I did not, but I have a bit of time now to investigate

daniel

04/25/2022, 9:52 PM

is the github repo that you posted earlier up to date with the code that you're using that repros this?

daniel

04/25/2022, 10:26 PM

What I think is happening with the postgres volume thing is that mounting that data as a volume on startup causes postgres to take a lot longer to start up, so other services that depend on it start spewing some errors while they are waiting for postgres to be ready (adding `depends_on` in your docker-compose file just makes the containers wait for the postgres container to start; it doesn't make them wait for it to be fully ready). For me the daemon and dagit eventually reach an OK place and are able to run correctly - there's just some spew at the beginning while they wait to be able to connect to postgres

daniel

04/25/2022, 10:26 PM

there are some tips in the docker docs for controlling service order more explicitly if you want to make the other services specifically wait for postgres to be ready: https://docs.docker.com/compose/startup-order/
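
One common pattern for making a service wait is a small script in its entrypoint that blocks until the database port accepts connections. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and an open TCP port is only an approximation of "postgres is ready to serve queries".

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or host not resolvable yet: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the service name and port from the logs earlier in this thread, a call would look like `wait_for_port("postgresql", 5432)`, run before starting dagit or the daemon.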

daniel

04/25/2022, 10:28 PM

The "duplicate key value violates unique constraint" errors are actually expected currently (it adds the row if it doesn't exist, then updates it if it does exist and that constraint fires) but we'll see what we can do to make that less spew-y
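
The insert-then-update pattern described here can be illustrated with a toy table. This is not Dagster's storage code: it uses an in-memory SQLite database instead of Postgres, a reduced `instigators` table, and a shortened selector id, all chosen purely for illustration. The point is that the constraint violation is part of the normal control flow, even though the database server still logs it as an error.

```python
import sqlite3


def upsert_instigator(conn, selector_id, status):
    """Insert the row; if the unique constraint fires, update the existing row instead."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO instigators (selector_id, status) VALUES (?, ?)",
                (selector_id, status),
            )
        return "inserted"
    except sqlite3.IntegrityError:
        # The database has already reported the constraint violation at this point;
        # the caller recovers by updating the existing row.
        with conn:
            conn.execute(
                "UPDATE instigators SET status = ? WHERE selector_id = ?",
                (status, selector_id),
            )
        return "updated"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instigators (id INTEGER PRIMARY KEY, selector_id TEXT UNIQUE, status TEXT)"
)
print(upsert_instigator(conn, "40ae1d09", "AUTOMATICALLY_RUNNING"))  # inserted
print(upsert_instigator(conn, "40ae1d09", "STOPPED"))  # updated
```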

daniel

04/25/2022, 10:30 PM

that leaves the issue you described about mounting volumes in the run launcher not working - can you share more details about what exactly I should do to reproduce that? Which job i should run, what the exact expected behavior is vs. what you're seeing, and what code is supposed to be writing to the volume?

geoHeil

04/26/2022, 3:42 AM

yes the code repository is up-to-date

geoHeil

04/26/2022, 3:46 AM

In particular, the IO managers write to this location: https://github.com/geoHeil/dagster-ssh-demo/blob/master/SSH_DEMO/resources/parquet_io_manager.py#L99 and the ingest assets ( https://github.com/geoHeil/dagster-ssh-demo/blob/master/SSH_DEMO/assets/ingest_assets.py ) read from the SFTP docker container and store the data in the warehouse. But they store it locally in the container, which gets deleted, as the volume mappings applied (a) from docker-compose and (b) from the launch configuration in dagster.yaml for the docker-based executor somehow seem to work in a different way

geoHeil

04/26/2022, 3:47 AM

You do not need to run any job.

geoHeil

04/26/2022, 3:47 AM

git clone https://github.com/geoHeil/dagster-ssh-demo.git
cd dagster-ssh-demo

make start 
# or alternatively without make
docker compose --profile dagster up --build

is all that is needed - the sensors automatically start to poll the SFTP resource for ingestable files

daniel

04/27/2022, 2:05 AM

I think when specifying volumes using container_kwargs, the key has to be an absolute path, not a relative one - that's actually a docker restriction: https://docker-py.readthedocs.io/en/stable/containers.html (and is different than docker-compose)

daniel

04/27/2022, 2:07 AM

i.e. if you changed it from

volumes:
-        - warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location

to something like (this is my absolute path, yours is probably different):

volumes:
         - /Users/dgibson/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location

I think it would be more likely to work. I tried that and am still getting an error in your job, but I think that may be logic in your modified IO manager now? I'd hope that the volume would work as expected now

daniel

04/27/2022, 2:09 AM

when I ran

docker inspect <container ID>

on a launched container, I saw

"Mounts": [
            {
                "Type": "bind",
                "Source": "/Users/dgibson/dagster-ssh-demo/warehouse_location_dagster",
                "Destination": "/opt/dagster/dagster_home/warehouse_location",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],

which matched the path on the gRPC server
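
To compare mounts across several launched containers without reading the full `docker inspect` output by eye, the JSON can be reduced to a destination-to-source mapping. A small sketch, assuming the output of `docker inspect <container ID>` has been captured as a string; `bind_mounts` is a hypothetical helper and the sample is trimmed to the `Mounts` section shown above.

```python
import json


def bind_mounts(inspect_output):
    """Map container destination -> host source for each bind mount in `docker inspect` JSON."""
    containers = json.loads(inspect_output)
    mounts = {}
    for container in containers:
        for mount in container.get("Mounts", []):
            if mount.get("Type") == "bind":
                mounts[mount["Destination"]] = mount["Source"]
    return mounts


# Trimmed sample: `docker inspect` prints a JSON array with one object per container.
sample = """[{"Mounts": [{"Type": "bind",
  "Source": "/Users/dgibson/dagster-ssh-demo/warehouse_location_dagster",
  "Destination": "/opt/dagster/dagster_home/warehouse_location",
  "Mode": "", "RW": true, "Propagation": "rprivate"}]}]"""
print(bind_mounts(sample))
```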

daniel

04/27/2022, 2:12 AM

I see how needing to specify the absolute path is annoying - it's tricky because dagster.yaml is getting loaded and interpreted inside the Docker container, where it doesn't have any way to know which base directory to use for a relative path that refers to the filesystem outside of Docker
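
One workaround is to resolve the path on the host, where the base directory is known, before writing it into `dagster.yaml`. A minimal sketch of that step; `to_absolute_volume` is a hypothetical helper, it treats every non-absolute host part as a path relative to the given base directory (which is an assumption about what is intended), and it uses POSIX path rules.

```python
import posixpath


def to_absolute_volume(spec, host_base_dir):
    """Rewrite 'host:container[:mode]' so the host part is an absolute path."""
    host, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError(f"expected 'host_path:container_path[:mode]', got {spec!r}")
    if not posixpath.isabs(host):
        # Resolve relative host paths against the directory on the Docker host.
        host = posixpath.normpath(posixpath.join(host_base_dir, host))
    return f"{host}:{rest}"


print(
    to_absolute_volume(
        "./warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location",
        "/Users/dgibson/dagster-ssh-demo",
    )
)
```

The base directory has to be supplied from outside the container, for example by a script that runs on the host before `docker compose up`.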

geoHeil

04/27/2022, 6:37 AM

I experimented locally with an absolute path and had the same problem

geoHeil

04/27/2022, 6:37 AM

let me re try/check with docker inspect

geoHeil

04/27/2022, 8:10 AM

I get:

geoHeil

04/27/2022, 8:10 AM

"Mounts": [ { "Type": "bind", "Source": "/Users/geoheil/Downloads/fooo/dagster-ssh-demo/warehouse_location_dagster", "Destination": "/opt/dagster/dagster_home/warehouse_location", "Mode": "", "RW": true, "Propagation": "rprivate" } ]

geoHeil

04/27/2022, 8:10 AM

for:

- /Users/geoheil/Downloads/fooo/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location

geoHeil

04/27/2022, 8:21 AM

But still get: Path does not exist: file:/opt/dagster/dagster_home/src/warehouse_location/foo_asset.

geoHeil

04/27/2022, 8:21 AM

I think I need to adapt the path mapping to:

/Users/geoheil/Downloads/fooo/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/src/warehouse_location

geoHeil

04/27/2022, 8:21 AM

(include the src)
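
The mismatch here is that the path the job writes to is not located under the mounted destination, so the files land in the container's own filesystem. A short sketch of that check, using the paths from the messages above; `covering_mount` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def covering_mount(path, destinations):
    """Return the mounted destination that contains `path`, or None if no mount covers it."""
    target = PurePosixPath(path).parts
    best = None
    for destination in destinations:
        parts = PurePosixPath(destination).parts
        if target[: len(parts)] == parts:
            # Prefer the most specific (longest) matching mount point.
            if best is None or len(parts) > len(PurePosixPath(best).parts):
                best = destination
    return best


written = "/opt/dagster/dagster_home/src/warehouse_location/foo_asset"
print(covering_mount(written, ["/opt/dagster/dagster_home/warehouse_location"]))  # None
print(covering_mount(written, ["/opt/dagster/dagster_home/src/warehouse_location"]))
```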

geoHeil

04/27/2022, 8:38 AM

Interestingly:

Error: No arguments given and workspace.yaml not found.

I get this error then from dagster daemon (did not have that one before this change)

geoHeil

04/27/2022, 8:58 AM

Cool - this error is indeed solved with the absolute path and including `src` in the mapping (see latest commit)

geoHeil

04/27/2022, 8:58 AM

Except for the combined_assset_sensor everything else works fine. This sensor however does not fire from within docker.
