geoHeil
04/22/2022, 4:03 PM
DockerRunLauncher. For both dagit and dagster-daemon I have enabled docker-in-docker by mounting /var/run/docker.sock:/var/run/docker.sock (as in https://github.com/dagster-io/dagster/blob/master/examples/deploy_docker/docker-compose.yml#L61).
But I only get:
DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
File "/opt/conda/lib/python3.9/site-packages/dagster/core/instance/__init__.py", line 1698, in launch_run
self._run_launcher.launch_run(LaunchRunContext(pipeline_run=run, workspace=workspace))
File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 152, in launch_run
self._launch_container_with_command(run, docker_image, command)
File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 97, in _launch_container_with_command
client = self._get_client(container_context)
File "/opt/conda/lib/python3.9/site-packages/dagster_docker/docker_run_launcher.py", line 72, in _get_client
client = docker.client.from_env()
File "/opt/conda/lib/python3.9/site-packages/docker/client.py", line 96, in from_env
return cls(
File "/opt/conda/lib/python3.9/site-packages/docker/client.py", line 45, in __init__
self.api = APIClient(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/docker/api/client.py", line 197, in __init__
self._version = self._retrieve_server_version()
File "/opt/conda/lib/python3.9/site-packages/docker/api/client.py", line 221, in _retrieve_server_version
raise DockerException(
when executing:
docker compose --profile dagster up --build
I am running Docker for Mac. How can I get dagster to work nicely in this setup?
daniel
04/22/2022, 4:05 PMgeoHeil
04/22/2022, 4:13 PMgeoHeil
04/22/2022, 4:14 PMdaniel
04/22/2022, 4:17 PMgeoHeil
04/22/2022, 4:22 PMdaniel
04/22/2022, 4:22 PMgeoHeil
04/22/2022, 4:23 PMgeoHeil
04/22/2022, 4:24 PM
USER dagster:dagster
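The PermissionError in the traceback, together with the USER dagster:dagster line, suggests the container user may not have access to the mounted socket. A minimal sketch for checking that from inside the container follows; the function name `describe_socket_access` is made up for illustration, and the default path is simply the mount target quoted earlier in the thread.

```python
import os
import stat


def describe_socket_access(path="/var/run/docker.sock"):
    """Report whether the current user can read and write the file at `path`."""
    if not os.path.exists(path):
        return {"exists": False, "is_socket": False, "readable": False, "writable": False}
    mode = os.stat(path).st_mode
    return {
        "exists": True,
        "is_socket": stat.S_ISSOCK(mode),       # True for a Unix domain socket
        "readable": os.access(path, os.R_OK),   # checked for the current uid/gid
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Print the effective identity next to the result so the two can be compared.
    print("uid:", os.getuid(), "gid:", os.getgid())
    print(describe_socket_access())
```

If `writable` comes back False for a non-root user, that would be consistent with the "Permission denied" seen when the Docker client tries to connect.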
geoHeil
04/22/2022, 4:24 PMgeoHeil
04/22/2022, 4:24 PMgeoHeil
04/22/2022, 4:36 PMgeoHeil
04/22/2022, 4:37 PM
WARNING:root:Retrying failed database connection: (psycopg2.OperationalError) connection to server at "postgresql" (172.31.0.3), port 5432 failed: Connection refused
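"Connection refused" on port 5432 usually means nothing is accepting connections there yet. As a rough sketch (not something dagster itself ships), a small helper can poll the port before the dependent process starts; `wait_for_port` and its parameters are hypothetical names, and the host "postgresql" is taken from the warning above.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once a TCP connection succeeds, False if `timeout` elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that the port is open.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    print(wait_for_port("postgresql", 5432, timeout=5.0))
```

An open port does not guarantee that the database is ready to serve queries, so this only narrows the window in which the retries appear.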
geoHeil
04/22/2022, 4:43 PMgeoHeil
04/22/2022, 4:43 PMgeoHeil
04/22/2022, 4:48 PMgeoHeil
04/22/2022, 4:48 PM
dagster instance migrate.
geoHeil
04/22/2022, 4:48 PMgeoHeil
04/22/2022, 4:48 PMgeoHeil
04/22/2022, 4:49 PM
(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pg_class_relname_nsp_index"
DETAIL: Key (relname, relnamespace)=(secondary_indexes_id_seq, 2200) already exists.
[SQL:
CREATE TABLE secondary_indexes (
id SERIAL NOT NULL,
name VARCHAR(512),
create_timestamp TIMESTAMP WITHOUT TIME ZONE DEFAULT CURRENT_TIMESTAMP,
migration_completed TIMESTAMP WITHOUT TIME ZONE,
PRIMARY KEY (id),
UNIQUE (name)
)
is found in the logs.
geoHeil
04/22/2022, 5:03 PMgeoHeil
04/22/2022, 5:04 PM
sftp fails: unable to connect to port 2222 on 192.168.48.5
geoHeil
04/22/2022, 5:04 PMgeoHeil
04/22/2022, 5:08 PMprha
04/22/2022, 5:11 PMprha
04/22/2022, 5:11 PM
dagit and dagster-daemon
geoHeil
04/22/2022, 5:13 PMgeoHeil
04/22/2022, 5:15 PMdaniel
04/22/2022, 5:16 PMdaniel
04/22/2022, 5:16 PM
docker rm <postgres container ID>
possibly with a -f
daniel
04/22/2022, 5:16 PMgeoHeil
04/22/2022, 5:17 PMgeoHeil
04/22/2022, 5:17 PM
git clone https://github.com/geoHeil/dagster-ssh-demo.git
cd dagster-ssh-demo
docker compose --profile dagster up --build
go to:
http://localhost:3000/workspace/deploy_docker_repository@other/jobs/my_job/playground
and try to launch
geoHeil
04/22/2022, 5:21 PMgeoHeil
04/22/2022, 5:21 PMgeoHeil
04/22/2022, 5:22 PMgeoHeil
04/22/2022, 5:22 PMdaniel
04/22/2022, 5:29 PMdaniel
04/22/2022, 5:29 PMdaniel
04/22/2022, 5:30 PMdaniel
04/22/2022, 5:34 PM
-    volumes:
-      - ./postgres-dagster:/var/lib/postgresql/data
daniel
04/22/2022, 5:35 PMprha
04/22/2022, 5:43 PM
warehouse_location volume mounts).
geoHeil
04/22/2022, 6:12 PMgeoHeil
04/22/2022, 6:13 PMgeoHeil
04/22/2022, 6:19 PMprha
04/22/2022, 6:26 PM
pull access denied for ssh_demo_other, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
prha
04/22/2022, 6:26 PMgeoHeil
04/22/2022, 6:27 PM
DAGSTER_CURRENT_IMAGE: "ssh-demo"
instead of DAGSTER_CURRENT_IMAGE: "ssh_demo_ssh-demo"
prha
04/22/2022, 6:49 PM
pull access denied for ssh_demo_other). Switching line 151 to ssh_demo_ssh-demo also generates the same error:
docker.errors.ImageNotFound: 404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=ssh_demo_ssh-demo: Not Found ("pull access denied for ssh_demo_ssh-demo, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")
geoHeil
04/23/2022, 5:51 AM
docker images
Somehow in this new folder the name needed to change to dagster-ssh-demo_ssh_demo;
then I do not get the pull problem.
geoHeil
04/23/2022, 5:52 AMgeoHeil
04/23/2022, 10:20 AM
# run_launcher:
#   module: dagster_docker
#   class: DockerRunLauncher
#   config:
#     env_vars:
#       - DAGSTER_POSTGRES_USER
#       - DAGSTER_POSTGRES_PASSWORD
#       - DAGSTER_POSTGRES_DB
#     network: dagster_network
#     container_kwargs:
#       auto_remove: true
the tasks start to execute.
geoHeil
04/23/2022, 10:20 AMgeoHeil
04/23/2022, 12:24 PM
dagster-daemon | 2022-04-23 10:19:16 +0000 - dagster.daemon.QueuedRunCoordinatorDaemon - INFO - Retrieved 3 queued runs, checking limits.
dagster-daemon | 2022-04-23 10:19:19 +0000 - dagster.daemon.QueuedRunCoordinatorDaemon - INFO - Launched 3 runs.
dagster-daemon | INFO [dagster.daemon.QueuedRunCoordinatorDaemon] Launched 3 runs.
dagster-daemon | DEBUG [dagster.daemon.SchedulerDaemon] Not checking for any runs since no schedules have been started.
dagster-daemon | DEBUG [dagster.daemon.QueuedRunCoordinatorDaemon] Poll returned no queued runs.
daniel
04/23/2022, 12:31 PMgeoHeil
04/23/2022, 12:32 PMgeoHeil
04/23/2022, 12:33 PMgeoHeil
04/23/2022, 12:36 PMdaniel
04/23/2022, 12:37 PMgeoHeil
04/23/2022, 12:40 PMgeoHeil
04/23/2022, 12:41 PMgeoHeil
04/23/2022, 12:41 PMdaniel
04/23/2022, 12:42 PMdaniel
04/23/2022, 12:44 PMgeoHeil
04/23/2022, 12:45 PMgeoHeil
04/23/2022, 12:46 PMgeoHeil
04/23/2022, 12:46 PMdaniel
04/23/2022, 12:47 PMgeoHeil
04/23/2022, 12:50 PMgeoHeil
04/23/2022, 12:51 PMgeoHeil
04/23/2022, 12:51 PMgeoHeil
04/23/2022, 12:51 PMdaniel
04/23/2022, 12:52 PMdaniel
04/23/2022, 12:53 PMgeoHeil
04/23/2022, 12:57 PMgeoHeil
04/23/2022, 12:57 PM
RuntimeError: Java gateway process exited before sending its port number
File "/opt/conda/lib/python3.9/site-packages/dagster/core/errors.py", line 184, in user_code_error_boundary
yield
File "/opt/conda/lib/python3.9/site-packages/dagster/core/execution/resources_init.py", line 298, in single_resource_event_generator
resource_def.resource_fn(context)
File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 53, in pyspark_resource
return PySparkResource(init_context.resource_config["spark_conf"])
File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 20, in __init__
self._spark_session = spark_session_from_config(spark_conf)
File "/opt/conda/lib/python3.9/site-packages/dagster_pyspark/resources.py", line 15, in spark_session_from_config
return builder.getOrCreate()
File "/opt/conda/lib/python3.9/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 392, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 144, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/opt/conda/lib/python3.9/site-packages/pyspark/context.py", line 339, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "/opt/conda/lib/python3.9/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
raise RuntimeError("Java gateway process exited before sending its port number")
But so far I did not see anything suspicious in the logs of the container.
geoHeil
04/23/2022, 1:00 PMgeoHeil
04/23/2022, 1:00 PMgeoHeil
04/23/2022, 1:01 PMgeoHeil
04/23/2022, 1:21 PMgeoHeil
04/23/2022, 1:21 PMgeoHeil
04/23/2022, 1:33 PMgeoHeil
04/23/2022, 1:52 PMgeoHeil
04/23/2022, 1:57 PMdaniel
04/23/2022, 1:58 PMgeoHeil
04/23/2022, 2:02 PM
repo.py:/opt/dagster/app/ from the link looks like a relative path. Though for me, when passing ./warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location, I get an error message that relative paths are not allowed.
geoHeil
04/23/2022, 2:14 PM
/path/to/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location
no files are written to this directory
daniel
04/23/2022, 2:20 PMgeoHeil
04/25/2022, 11:37 AMdaniel
04/25/2022, 9:52 PMdaniel
04/25/2022, 9:52 PMdaniel
04/25/2022, 10:26 PM
(depends_on in your docker-compose file just makes the containers wait for the container to start, it doesn't make them wait until that container is fully ready). For me the daemon and dagit eventually reach an OK place and are able to run correctly; there's just some spew at the beginning while they wait to be able to connect to postgres.
daniel
04/25/2022, 10:26 PMdaniel
04/25/2022, 10:28 PMdaniel
04/25/2022, 10:30 PMgeoHeil
04/26/2022, 3:42 AMgeoHeil
04/26/2022, 3:46 AMgeoHeil
04/26/2022, 3:47 AMgeoHeil
04/26/2022, 3:47 AM
git clone https://github.com/geoHeil/dagster-ssh-demo.git
cd dagster-ssh-demo
make start
# or alternatively without make
docker compose --profile dagster up --build
is all that is needed; the sensors automatically start to poll the SFTP resource for ingestable files.
daniel
04/27/2022, 2:05 AMdaniel
04/27/2022, 2:07 AM
volumes:
- - warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location
to something like (this is my absolute path, yours is probably different):
volumes:
- /Users/dgibson/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location
I think it would be more likely to work. I tried that and am still getting an error in your job, but I think that may be logic in your modified IO manager now? I'd hope that the volume would work as expected now.
daniel
04/27/2022, 2:09 AM
docker inspect <container ID>
on a launched container, I saw
"Mounts": [
    {
        "Type": "bind",
        "Source": "/Users/dgibson/dagster-ssh-demo/warehouse_location_dagster",
        "Destination": "/opt/dagster/dagster_home/warehouse_location",
        "Mode": "",
        "RW": true,
        "Propagation": "rprivate"
    }
],
which matched the path on the gRPC server
daniel
04/27/2022, 2:12 AMgeoHeil
04/27/2022, 6:37 AMgeoHeil
04/27/2022, 6:37 AMgeoHeil
04/27/2022, 8:10 AMgeoHeil
04/27/2022, 8:10 AMgeoHeil
04/27/2022, 8:10 AM
- /Users/geoheil/Downloads/fooo/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location
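Because relative host paths were rejected earlier in the thread and a bare name is treated by Docker as a named volume rather than a directory, it can help to build the absolute mapping programmatically. The following is only a sketch: `absolutize_bind_mount` is a hypothetical helper, and the base directory is whatever folder the repository was cloned into.

```python
import os


def absolutize_bind_mount(spec, base_dir):
    """Rewrite './host:/container[:mode]' so the host part is an absolute path.

    Specs whose host part is already absolute, or is a bare name (a named
    volume), are returned unchanged.
    """
    host, sep, rest = spec.partition(":")
    if not sep:
        return spec  # no host part at all, nothing to rewrite
    if host == "." or host.startswith("./") or host.startswith("../"):
        host = os.path.normpath(os.path.join(base_dir, host))
    return host + ":" + rest


if __name__ == "__main__":
    print(absolutize_bind_mount(
        "./warehouse_location_dagster:/opt/dagster/dagster_home/warehouse_location",
        os.getcwd(),
    ))
```

Run from the repository root, this prints a mapping of the same shape as the absolute one in the message above.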
geoHeil
04/27/2022, 8:21 AMgeoHeil
04/27/2022, 8:21 AM
/Users/geoheil/Downloads/fooo/dagster-ssh-demo/warehouse_location_dagster:/opt/dagster/dagster_home/src/warehouse_location
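To confirm which destination a launched container actually received, the "Mounts" section that docker inspect printed earlier in the thread can be reduced to source/destination pairs. This is a small sketch that assumes the full JSON output of docker inspect (a list of container objects) has been captured as a string; `bind_mounts` is a hypothetical name.

```python
import json


def bind_mounts(inspect_output):
    """Return (source, destination) pairs for bind mounts in `docker inspect` JSON."""
    pairs = []
    for container in json.loads(inspect_output):
        for mount in container.get("Mounts", []):
            if mount.get("Type") == "bind":
                pairs.append((mount["Source"], mount["Destination"]))
    return pairs


if __name__ == "__main__":
    import sys
    # Example: docker inspect <container ID> | python this_script.py
    for source, destination in bind_mounts(sys.stdin.read()):
        print(source, "->", destination)
```

Comparing the printed destination with the path the job writes to shows whether the extra src segment in the mapping matters.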
geoHeil
04/27/2022, 8:21 AMgeoHeil
04/27/2022, 8:38 AM
Error: No arguments given and workspace.yaml not found.
I get this error then from dagster daemon (did not have that one before this change)
geoHeil
04/27/2022, 8:58 AM
src in the mapping (see latest commit)
geoHeil
04/27/2022, 8:58 AM