What's the simplest way to deploy Dagster in a AWS...
# ask-community
m
What's the simplest way to deploy Dagster in AWS (with the fewest external dependencies to configure)? I'm piloting Dagster and want to prove to our team that it's relatively simple to get up and running for real, in a way multiple folks can access and where execution can scale. I ran across
dagster-aws up
in the release announcement for 0.6 but when I
pip install dagster-aws
there's no executable. Did that functionality go away? I also found someone's Terraform module, which doesn't look like it's seen much adoption. I can follow the Deploying Dagster to AWS docs, but they leave setting up an RDS instance and an executor as exercises for the reader.
j
https://github.com/dagster-io/dagster/tree/master/examples/deploy_ecs will get dagster up and running on ECS without needing to provide any additional infrastructure (for example, it runs Postgres inside a container instead of using RDS). It’s useful for scaffolding an example ECS deployment and as inspiration for a more robust one (where you probably would want to switch to using your own external database and other dependencies)
m
Thanks, I will check that out!
m
Hi @jordan I'm trying to run deploy_ecs and I'm getting
UsercodeService EssentialContainerExited: Essential container in task exited
but it seems ok when I run user_code locally. I haven't changed anything at all in the example - is there anything I need to change to get this up and running? E.g. do the environment variables need to be anything specific, or can everything really just be left as it is to get this example up and running on ECS?
j
There shouldn’t be - I’ll take a look this morning. Development has been pretty active on ECS in the last week or two so I want to make sure one of the latest changes didn’t break the example user code.
m
Thank you! Just to confirm, I haven't changed anything at all in the files on the repo (e.g. the postgres env vars are the ones below). Just wanted to check I don't need to change any of these when using the postgres template/service. Everything runs ok and then the user_code, daemon and dagit services just exit.
DAGSTER_POSTGRES_DB: "postgres_db"
DAGSTER_POSTGRES_HOSTNAME: "postgresql"
DAGSTER_POSTGRES_PASSWORD: "postgres_password"
DAGSTER_POSTGRES_USER: "postgres_user"
Starting to think it's a permissions/VPN/VPC problem with my company's AWS setup... Not sure how to diagnose the problem
j
Docker compose actually sets up all the networking/permissions it needs. So if the cloudformation stack is succeeding, I suspect it’s not a permissions issue. I’m about to walk out the door to an appointment but I’ll look into this as soon as I get back in a few hours.
m
Thank you, much appreciated!
FYI, cloudformation succeeds until the services start and then everything starts to delete e.g.
j
Can I see the full CloudWatch timeline?
m
I'm not sure where to get the full CloudWatch timeline, is that in the logs sections of CloudWatch? The full CloudFormation events timeline is below
2021-12-01 15:58:49 UTC+0000	dagster	DELETE_COMPLETE	-
2021-12-01 15:58:48 UTC+0000	CloudMap	DELETE_COMPLETE	-
2021-12-01 15:58:05 UTC+0000	LogGroup	DELETE_COMPLETE	-
2021-12-01 15:58:05 UTC+0000	UsercodeTaskExecutionRole	DELETE_COMPLETE	-
2021-12-01 15:58:03 UTC+0000	UsercodeTaskExecutionRole	DELETE_IN_PROGRESS	-
2021-12-01 15:58:03 UTC+0000	LogGroup	DELETE_IN_PROGRESS	-
2021-12-01 15:58:02 UTC+0000	UsercodeTaskDefinition	DELETE_COMPLETE	-
2021-12-01 15:58:02 UTC+0000	CloudMap	DELETE_IN_PROGRESS	-
2021-12-01 15:58:01 UTC+0000	Cluster	DELETE_COMPLETE	-
2021-12-01 15:58:01 UTC+0000	UsercodeServiceDiscoveryEntry	DELETE_COMPLETE	-
2021-12-01 15:58:00 UTC+0000	DefaultNetwork	DELETE_COMPLETE	-
2021-12-01 15:57:59 UTC+0000	Cluster	DELETE_IN_PROGRESS	-
2021-12-01 15:57:59 UTC+0000	UsercodeServiceDiscoveryEntry	DELETE_IN_PROGRESS	-
2021-12-01 15:57:59 UTC+0000	DefaultNetwork	DELETE_IN_PROGRESS	-
2021-12-01 15:57:59 UTC+0000	UsercodeTaskDefinition	DELETE_IN_PROGRESS	-
2021-12-01 15:57:59 UTC+0000	UsercodeService	DELETE_COMPLETE	-
2021-12-01 15:57:50 UTC+0000	PostgresqlTaskExecutionRole	DELETE_COMPLETE	-
2021-12-01 15:57:48 UTC+0000	PostgresqlTaskExecutionRole	DELETE_IN_PROGRESS	-
2021-12-01 15:57:47 UTC+0000	PostgresqlTaskDefinition	DELETE_COMPLETE	-
2021-12-01 15:57:47 UTC+0000	PostgresqlServiceDiscoveryEntry	DELETE_COMPLETE	-
2021-12-01 15:57:46 UTC+0000	PostgresqlTaskDefinition	DELETE_IN_PROGRESS	-
2021-12-01 15:57:46 UTC+0000	PostgresqlServiceDiscoveryEntry	DELETE_IN_PROGRESS	-
2021-12-01 15:57:45 UTC+0000	PostgresqlService	DELETE_COMPLETE	-
2021-12-01 15:56:54 UTC+0000	DagitTaskRole	DELETE_COMPLETE	-
2021-12-01 15:56:54 UTC+0000	DagitTaskExecutionRole	DELETE_COMPLETE	-
2021-12-01 15:56:53 UTC+0000	DaemonTaskExecutionRole	DELETE_COMPLETE	-
2021-12-01 15:56:53 UTC+0000	DaemonTaskRole	DELETE_COMPLETE	-
2021-12-01 15:56:53 UTC+0000	DagitTaskRole	DELETE_IN_PROGRESS	-
2021-12-01 15:56:52 UTC+0000	DagitTaskExecutionRole	DELETE_IN_PROGRESS	-
2021-12-01 15:56:52 UTC+0000	DaemonTaskExecutionRole	DELETE_IN_PROGRESS	-
2021-12-01 15:56:52 UTC+0000	DaemonTaskRole	DELETE_IN_PROGRESS	-
2021-12-01 15:56:52 UTC+0000	DagitTaskDefinition	DELETE_COMPLETE	-
2021-12-01 15:56:51 UTC+0000	DagitServiceDiscoveryEntry	DELETE_COMPLETE	-
2021-12-01 15:56:51 UTC+0000	DaemonServiceDiscoveryEntry	DELETE_COMPLETE	-
2021-12-01 15:56:51 UTC+0000	DaemonTaskDefinition	DELETE_COMPLETE	-
2021-12-01 15:56:50 UTC+0000	DefaultNetworkIngress	DELETE_COMPLETE	-
2021-12-01 15:56:50 UTC+0000	Default3000Ingress	DELETE_COMPLETE	-
2021-12-01 15:56:50 UTC+0000	LoadBalancer	DELETE_COMPLETE	-
2021-12-01 15:56:50 UTC+0000	DagitTCP3000TargetGroup	DELETE_COMPLETE	-
2021-12-01 15:56:50 UTC+0000	UsercodeService	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	DagitTaskDefinition	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	PostgresqlService	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	DefaultNetworkIngress	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	DaemonTaskDefinition	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	DagitServiceDiscoveryEntry	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	LoadBalancer	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	DaemonServiceDiscoveryEntry	DELETE_IN_PROGRESS	-
2021-12-01 15:56:50 UTC+0000	Default3000Ingress	DELETE_IN_PROGRESS	-
2021-12-01 15:56:49 UTC+0000	DagitTCP3000TargetGroup	DELETE_IN_PROGRESS	-
2021-12-01 15:56:45 UTC+0000	PostgresqlService	CREATE_FAILED	Resource creation cancelled
2021-12-01 15:56:45 UTC+0000	UsercodeService	CREATE_FAILED	Resource creation cancelled
2021-12-01 15:56:44 UTC+0000	LoadBalancer	CREATE_FAILED	Resource creation cancelled
2021-12-01 15:56:44 UTC+0000	dagster	DELETE_IN_PROGRESS	User Initiated
2021-12-01 15:56:01 UTC+0000	UsercodeService	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:56:01 UTC+0000	PostgresqlService	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:59 UTC+0000	PostgresqlService	CREATE_IN_PROGRESS	-
2021-12-01 15:55:59 UTC+0000	UsercodeService	CREATE_IN_PROGRESS	-
2021-12-01 15:55:56 UTC+0000	PostgresqlServiceDiscoveryEntry	CREATE_COMPLETE	-
2021-12-01 15:55:56 UTC+0000	PostgresqlServiceDiscoveryEntry	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:56 UTC+0000	DagitServiceDiscoveryEntry	CREATE_COMPLETE	-
2021-12-01 15:55:56 UTC+0000	UsercodeServiceDiscoveryEntry	CREATE_COMPLETE	-
2021-12-01 15:55:56 UTC+0000	DaemonServiceDiscoveryEntry	CREATE_COMPLETE	-
2021-12-01 15:55:56 UTC+0000	DagitServiceDiscoveryEntry	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:55 UTC+0000	UsercodeServiceDiscoveryEntry	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:55 UTC+0000	DaemonServiceDiscoveryEntry	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:54 UTC+0000	DagitServiceDiscoveryEntry	CREATE_IN_PROGRESS	-
2021-12-01 15:55:54 UTC+0000	DaemonServiceDiscoveryEntry	CREATE_IN_PROGRESS	-
2021-12-01 15:55:54 UTC+0000	UsercodeServiceDiscoveryEntry	CREATE_IN_PROGRESS	-
2021-12-01 15:55:54 UTC+0000	PostgresqlServiceDiscoveryEntry	CREATE_IN_PROGRESS	-
2021-12-01 15:55:52 UTC+0000	CloudMap	CREATE_COMPLETE	-
2021-12-01 15:55:31 UTC+0000	DagitTaskDefinition	CREATE_COMPLETE	-
2021-12-01 15:55:31 UTC+0000	DagitTaskDefinition	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:30 UTC+0000	DaemonTaskDefinition	CREATE_COMPLETE	-
2021-12-01 15:55:30 UTC+0000	DaemonTaskDefinition	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:30 UTC+0000	UsercodeTaskDefinition	CREATE_COMPLETE	-
2021-12-01 15:55:29 UTC+0000	UsercodeTaskDefinition	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:29 UTC+0000	PostgresqlTaskDefinition	CREATE_COMPLETE	-
2021-12-01 15:55:29 UTC+0000	PostgresqlTaskDefinition	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:28 UTC+0000	DagitTaskDefinition	CREATE_IN_PROGRESS	-
2021-12-01 15:55:28 UTC+0000	DaemonTaskDefinition	CREATE_IN_PROGRESS	-
2021-12-01 15:55:27 UTC+0000	UsercodeTaskDefinition	CREATE_IN_PROGRESS	-
2021-12-01 15:55:27 UTC+0000	PostgresqlTaskDefinition	CREATE_IN_PROGRESS	-
2021-12-01 15:55:26 UTC+0000	DaemonTaskRole	CREATE_COMPLETE	-
2021-12-01 15:55:26 UTC+0000	DagitTaskRole	CREATE_COMPLETE	-
2021-12-01 15:55:25 UTC+0000	DaemonTaskExecutionRole	CREATE_COMPLETE	-
2021-12-01 15:55:25 UTC+0000	DagitTaskExecutionRole	CREATE_COMPLETE	-
2021-12-01 15:55:25 UTC+0000	UsercodeTaskExecutionRole	CREATE_COMPLETE	-
2021-12-01 15:55:25 UTC+0000	PostgresqlTaskExecutionRole	CREATE_COMPLETE	-
2021-12-01 15:55:14 UTC+0000	DefaultNetworkIngress	CREATE_COMPLETE	-
2021-12-01 15:55:13 UTC+0000	Default3000Ingress	CREATE_COMPLETE	-
2021-12-01 15:55:13 UTC+0000	DefaultNetworkIngress	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:13 UTC+0000	DefaultNetworkIngress	CREATE_IN_PROGRESS	-
2021-12-01 15:55:13 UTC+0000	Default3000Ingress	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:13 UTC+0000	Default3000Ingress	CREATE_IN_PROGRESS	-
2021-12-01 15:55:11 UTC+0000	Cluster	CREATE_COMPLETE	-
2021-12-01 15:55:11 UTC+0000	DefaultNetwork	CREATE_COMPLETE	-
2021-12-01 15:55:10 UTC+0000	DefaultNetwork	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:07 UTC+0000	LogGroup	CREATE_COMPLETE	-
2021-12-01 15:55:07 UTC+0000	Cluster	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:07 UTC+0000	CloudMap	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:07 UTC+0000	LogGroup	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:07 UTC+0000	DagitTCP3000TargetGroup	CREATE_COMPLETE	-
2021-12-01 15:55:06 UTC+0000	DaemonTaskRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	DagitTaskRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	DagitTaskExecutionRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	DagitTCP3000TargetGroup	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	LoadBalancer	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	UsercodeTaskExecutionRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	DaemonTaskExecutionRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:06 UTC+0000	DaemonTaskRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:06 UTC+0000	PostgresqlTaskExecutionRole	CREATE_IN_PROGRESS	Resource creation Initiated
2021-12-01 15:55:05 UTC+0000	DagitTaskExecutionRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	DagitTaskRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	UsercodeTaskExecutionRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	Cluster	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	DefaultNetwork	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	PostgresqlTaskExecutionRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	DagitTCP3000TargetGroup	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	CloudMap	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	LoadBalancer	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	DaemonTaskExecutionRole	CREATE_IN_PROGRESS	-
2021-12-01 15:55:05 UTC+0000	LogGroup	CREATE_IN_PROGRESS	-
2021-12-01 15:54:59 UTC+0000	dagster	CREATE_IN_PROGRESS	User Initiated
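A timeline like the one above is pasted newest-first, which makes it hard to see what started the teardown. The following is a minimal sketch, not part of the deploy_ecs example, that sorts such a paste chronologically and reports the earliest failure or deletion event. The names `StackEvent`, `parse_events`, and `first_trouble` are hypothetical, and the code assumes tab-separated columns in the order timestamp, resource, status, reason.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class StackEvent(NamedTuple):
    timestamp: datetime
    resource: str
    status: str
    reason: str


def parse_events(text: str) -> List[StackEvent]:
    """Parse tab-separated event lines and return them oldest first."""
    events = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("\t")]
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        stamp = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S UTC%z")
        events.append(StackEvent(stamp, parts[1], parts[2], parts[3]))
    # Assumption: the paste is newest-first. Reversing before the stable
    # sort keeps events that share a timestamp in chronological order.
    events.reverse()
    return sorted(events, key=lambda e: e.timestamp)


def first_trouble(events: List[StackEvent]) -> Optional[StackEvent]:
    """Return the earliest failure or deletion event, if any."""
    for event in events:
        if event.status.endswith("_FAILED") or event.status == "DELETE_IN_PROGRESS":
            return event
    return None


# Four lines taken from the timeline above, with explicit tab separators.
SAMPLE = (
    "2021-12-01 15:56:45 UTC+0000\tPostgresqlService\tCREATE_FAILED\tResource creation cancelled\n"
    "2021-12-01 15:56:44 UTC+0000\tLoadBalancer\tCREATE_FAILED\tResource creation cancelled\n"
    "2021-12-01 15:56:44 UTC+0000\tdagster\tDELETE_IN_PROGRESS\tUser Initiated\n"
    "2021-12-01 15:56:01 UTC+0000\tUsercodeService\tCREATE_IN_PROGRESS\tResource creation Initiated\n"
)

trouble = first_trouble(parse_events(SAMPLE))
if trouble is not None:
    print(trouble.resource, trouble.status, trouble.reason)
```

On the sample lines, the earliest such event is the stack-level deletion marked "User Initiated", with the "Resource creation cancelled" failures following it. That suggests the CloudFormation events alone may not show why the containers exited, so the container logs are probably the better place to look.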
j
@Marcus Tuke I haven’t been able to reproduce this - but I haven’t forgotten about you. I’ll continue looking into this tomorrow.
👍 1
m
Hey @jordan I finally got this working 🙂 It seemed to be 2 problems simultaneously for me: (a) Because I'm using Apple M1 chip, there seems to be something strange going on with the image reverting to 32bit. I noticed this when looking up why catboost package wasn't being found in pip (see 2nd and 3rd answers here: https://stackoverflow.com/questions/54598558/why-does-pip-install-not-work-for-catboost) so I changed
python:3.7-slim
to
amd64/python:3.7-slim
in the Dockerfile (b) the Postgres user/db/passwd needed to just be
postgres
e.g.
DAGSTER_POSTGRES_USER: "postgres"
DAGSTER_POSTGRES_PASSWORD: "postgres"
DAGSTER_POSTGRES_DB: "postgres"
Once these 2 changes were made (only works with both of them) then it worked perfectly for me 🎉
j
Because I’m using Apple M1 chip, there seems to be something strange going on with the image reverting to 32bit
well that at least explains in part why i wasn’t able to reproduce 😅 ok - this is good info - i’m going to dig a bit more into it now that you’ve diagnosed the problem and i’ll look into updating the example (unless you’d like to open a PR directly against the repo - your choice)
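For anyone hitting the same thing, here is a small sketch of the base-image change described above: when the build host reports an ARM machine type (as Apple M1 does), prefix the base image with `amd64/`. The helper name `image_for_host` and the `ARM_MACHINES` set are hypothetical, and whether an architecture mismatch is the actual root cause here is an assumption, not something confirmed in this thread.

```python
import platform

# Machine names commonly reported by ARM hosts such as Apple M1.
# Assumption: this set is illustrative, not exhaustive.
ARM_MACHINES = {"arm64", "aarch64"}


def image_for_host(base_image: str, machine: str) -> str:
    """Return an amd64-prefixed image name when the build host is ARM."""
    if machine.lower() in ARM_MACHINES and not base_image.startswith("amd64/"):
        return f"amd64/{base_image}"
    return base_image


# Show which base image this sketch would pick on the current machine.
print(image_for_host("python:3.7-slim", platform.machine()))
```

The function only computes a name; the Dockerfile's base image line still has to be edited by hand, as was done above.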
(b) the Postgres user/db/passwd needed to just be postgres e.g.
This really surprises me - unless it was also changed in the postgres service 🤔
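One way to check the question raised here is to compare the credentials the Dagster services use against the ones the Postgres container is started with. The sketch below assumes the Postgres service is configured through the official image's `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` variables; whether the deploy_ecs example wires them this way is not confirmed in this thread. `PAIRS` and `find_mismatches` are hypothetical names, and the Postgres-side values in the usage lines are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (Dagster-side variable, Postgres-side variable) pairs that should agree.
# Assumption: the Postgres container uses the official image's variables.
PAIRS = [
    ("DAGSTER_POSTGRES_USER", "POSTGRES_USER"),
    ("DAGSTER_POSTGRES_PASSWORD", "POSTGRES_PASSWORD"),
    ("DAGSTER_POSTGRES_DB", "POSTGRES_DB"),
]

Mismatch = Tuple[str, Optional[str], str, Optional[str]]


def find_mismatches(
    dagster_env: Dict[str, str], postgres_env: Dict[str, str]
) -> List[Mismatch]:
    """List every pair whose values differ or where one side is missing."""
    mismatches = []
    for dagster_key, postgres_key in PAIRS:
        dagster_value = dagster_env.get(dagster_key)
        postgres_value = postgres_env.get(postgres_key)
        if dagster_value != postgres_value:
            mismatches.append(
                (dagster_key, dagster_value, postgres_key, postgres_value)
            )
    return mismatches


# Dagster-side values from the example config quoted earlier in the thread.
dagster_env = {
    "DAGSTER_POSTGRES_USER": "postgres_user",
    "DAGSTER_POSTGRES_PASSWORD": "postgres_password",
    "DAGSTER_POSTGRES_DB": "postgres_db",
}
# Hypothetical Postgres-side values, chosen only to demonstrate a mismatch.
postgres_env = {
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "postgres",
    "POSTGRES_DB": "postgres",
}

for dagster_key, dagster_value, postgres_key, postgres_value in find_mismatches(
    dagster_env, postgres_env
):
    print(f"{dagster_key}={dagster_value!r} vs {postgres_key}={postgres_value!r}")
```

If both sides agree, `find_mismatches` returns an empty list, which would point away from credentials as the cause.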