f
Hi Dagster team! I'm trying to use docker-compose + the Docker run launcher to run a dummy pipeline locally. This is something that I got working last week, but now it has stopped working... I can see in dagit
[DockerRunLauncher] Launching run in container five_oracle_pipeline_dagster_oracle_pipelines with ID ceacbfd6707e3cfa09dce15898d1923b3e4edb14de818a9cf39bfc0390a4cfe1
And in the docker-compose output:
dagster_docker_daemon       | 2021-01-27 14:56:41 - SchedulerDaemon - INFO - Not checking for any runs since no schedules have been started.
dagster_docker_daemon       | 2021-01-27 14:56:41 - SensorDaemon - INFO - Not checking for any runs since no sensors have been started.
dagster_docker_daemon       | 2021-01-27 14:56:41 - QueuedRunCoordinatorDaemon - INFO - Retrieved 1 queued runs, checking limits.
dagster_docker_daemon       | 2021-01-27 14:56:43 - QueuedRunCoordinatorDaemon - INFO - Launched 1 runs.
dagster_docker_daemon       | 2021-01-27 14:56:47 - QueuedRunCoordinatorDaemon - INFO - Poll returned no queued runs.
dagster_docker_daemon       | 2021-01-27 14:56:47 - QueuedRunCoordinatorDaemon - INFO - Launched 0 runs.
But nothing actually runs... Any hints on what could be happening?
I just realised that one thing I did change is the base image, to use Python 3.8. Could that be the issue?
d
Hi Fran - is there anything useful in the logs for the container that gets launched? Something might be failing early enough in the container that it doesn't make its way back into dagit
f
That's the thing, I don't see any container being launched...
I've also tried restarting the docker server
Is there any way to increase the verbosity level of the logs?
I can only see output from the dagster_docker_daemon container
d
hmmmm, strange. At that point all that's left to do is call container.start() in the docker API - i'd expect that to raise some kind of exception if it failed to start, which should then appear in the logs. This launcher is fairly new though, so it's possible there's a failure scenario that isn't getting surfaced the way that it should
f
Mmm, there is something interesting: when I execute dagster instance info in the container I get an error
dagster.check.CheckError: Failure condition: Couldn't import module dagster_postgres.run_storage when attempting to load the configurable class dagster_postgres.run_storage.PostgresRunStorage
So it seems that dagster-postgres isn't installed
root@c5430b41cce3:/opt/dagster/app# pip freeze |grep dagster
dagster==0.10.0
dagster-docker==0.10.0
also dagster-graphql seems missing there... so my mistake when building the image...
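To make it concrete, here is a minimal sketch of the kind of dagster.yaml storage entry that makes the instance try to import dagster_postgres; every container that loads the instance config, including the per-run containers built from the pipeline image, needs that package installed for the import to succeed. The hostname and credentials below are placeholders, not from this setup.

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      hostname: docker_postgresql   # placeholder service name
      username: postgres_user       # placeholder credentials
      password: postgres_password
      db_name: postgres_db
# event_log_storage and schedule_storage typically point at the matching
# dagster_postgres classes, so the same import requirement applies to them.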
d
Got it - generally we could be better about handling errors when the resource that a run launcher spins up fails early enough that it can't even create the instance (e.g. a missing module error like that). I'm still surprised that there weren't any docker logs for the container though
f
I'd have expected it to fail when launching the gRPC server, since the configured RunLauncher has missing dependencies
I've tried with both 0.10.0 and 0.10.1, and with Python 3.7 and 3.8
Same behaviour
d
I would too. I'll see if I can reproduce by taking dagster-postgres out of the pipeline container in the example
f
It's working now 👍
I basically added both dagster-graphql and dagster-postgres and it is working now. I'm not sure yet what dagster-graphql is used for but it shows up in the docker-compose example in the repo so I added it...
d
Ah, so I think it doesn't actually need to load anything on the instance in order to display the pipelines in dagit / launch the gRPC server. So that's why it didn't fail there.
hmm, so for me, docker logs <container_id> does show the dagster_postgres import error though? Still not a great experience, particularly since the run is just hanging in dagit
f
Nothing for me... that's weird...
d
weird!
f
I checked using docker logs
Also the output from docker-compose was empty 🤷
d
and just to triple-check, it's 'docker logs ceacbfd6707e3cfa09dce15898d1923b3e4edb14de818a9cf39bfc0390a4cfe1' (from your example above)?
when you run that, it gives you a 'No such container' error?
f
No, actually the logs of that one show the error
I was only checking the logs of the gRPC server
Also I get
/usr/local/lib/python3.8/site-packages/dagster/cli/api.py:76: UserWarning: execute_run_with_structured_logs is deprecated. Use execute_run instead.
But that's unrelated 😄
d
ahhh, OK. It's a bit confusing - in that example the gRPC server is just used to serve the pipeline metadata. The DockerRunLauncher that the example uses doesn't actually hit the gRPC server, it launches a new container using the same image that was used to launch the gRPC server. (That way each run can use its own container)
Switching to the DefaultRunLauncher would make it launch the run from the gRPC server without spinning up a new container.
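As a reference point, a minimal sketch of how those two setups are configured in dagster.yaml, assuming the 0.10-era dagster_docker config schema; the env var names and network name here are illustrative, borrowed from the public docker-compose example rather than this specific setup.

run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    # Environment variables forwarded into each per-run container
    env_vars:
      - DAGSTER_POSTGRES_USER
      - DAGSTER_POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
    # Docker network the per-run container joins so it can reach postgres
    network: docker_example_network
# Leaving run_launcher out of dagster.yaml falls back to the DefaultRunLauncher,
# which executes the run inside the existing gRPC server container instead of
# starting a new one.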
f
So, dagit makes the request to dagster_daemon, and dagster_daemon interacts with the docker API to create the container, is that right?
d
that's right. Well technically dagit queues the run and then the daemon polls for queued runs, but the docker API part is exactly right.
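Expressed as config, that queueing step is the run coordinator section of dagster.yaml: dagit enqueues the run, and the dagster-daemon container dequeues it and hands it to the run launcher. A minimal sketch, assuming the 0.10-era module path used in the docker-compose example:

run_coordinator:
  module: dagster.core.run_coordinator
  class: QueuedRunCoordinator
  # Optional limits (e.g. max_concurrent_runs) can be set under a config block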
f
Makes sense, for some reason I thought it would be the gRPC server that creates the container
So, in a K8s context, does it work the same way but interfacing with the K8s API?
d
Yeah, exactly. What you thought was very reasonable since there's already a container with that image running. I'll tweak the message on the DockerRunLauncher to emphasize that it's a new container.
f
And then, if I configure the RunLauncher to run in the same process, does the queue command the gRPC server to run a pipeline?
d
Yeah, the DefaultRunLauncher (the one that is used if you don't change your instance) calls out to the gRPC server and tells it to execute the pipeline there
f
Gotcha. I think with this I have enough to keep experimenting 😃
Thanks Daniel!
d
awesome, thanks for the feedback and for trying it out
f
You guys are doing an awesome job with Dagster 👏