Kevin
06/22/2020, 6:11 PM
max
06/22/2020, 7:11 PM
johann
06/22/2020, 7:46 PM
execution:
celery-k8s:
# ... configs here
in your pipeline config
Kevin
06/22/2020, 9:11 PM
johann
06/22/2020, 9:34 PM
Kevin
06/22/2020, 9:41 PM
johann
06/22/2020, 9:47 PM
execution:
celery-k8s:
isn't defined
execution:
celery-k8s:
config:
job_image: 'my_repo.com/image_name:latest'
job_namespace: 'some-namespace'
broker: 'pyamqp://guest@localhost//' # Optional[str]: The URL of the Celery broker
backend: 'rpc://' # Optional[str]: The URL of the Celery results backend
include: ['my_module'] # Optional[List[str]]: Modules every worker should import
config_source: # Dict[str, Any]: Any additional parameters to pass to the
#... # Celery workers. This dict will be passed as the `config_source`
#... # argument of celery.Celery().
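As an aside (not from this thread): anything under config_source is handed to the Celery workers, so standard Celery settings can be forwarded there. A minimal sketch, where task_acks_late is just an assumed example of a stock Celery option:

execution:
  celery-k8s:
    config:
      job_image: 'my_repo.com/image_name:latest'
      job_namespace: 'some-namespace'
      broker: 'pyamqp://guest@localhost//'
      config_source:
        task_acks_late: true  # forwarded to celery.Celery(config_source=...)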
Kevin
06/22/2020, 9:50 PM
johann
06/22/2020, 9:55 PM
Kevin
06/22/2020, 10:00 PM
execution:
celery-k8s:
config:
job_image: 'my_repo.com/image_name:latest'
job_namespace: 'some-namespace'
referring to a dagster-celery-worker image with the following Dockerfile:
FROM python:3.7.7
# ADD build_cache/ /
RUN apt-get update -yqq && \
apt-get install -yqq cron && \
pip install \
dagster \
dagster-graphql \
dagster-postgres \
dagster-cron \
dagster-celery[flower,redis,kubernetes] \
dagster-k8s
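Presumably this is the image the chart's Celery workers should run. In the values.yaml format that appears later in this thread, pointing the workers at it would look roughly like this (repository and tag are placeholders):

celery:
  enabled: true
  image:
    repository: "my_repo.com/dagster-celery-worker"  # placeholder for the image built from the Dockerfile above
    tag: "latest"
    pullPolicy: Always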
johann
06/22/2020, 10:46 PM
Kevin
06/22/2020, 11:09 PM
nate
06/22/2020, 11:12 PM
did you add the execution: YAML snippet to your pipeline run config in the playground?
dagster.yaml - we split up configuration between dagster.yaml for instance-wide defaults, and the pipeline run config (in the playground) for pipeline-specific configuration
Kevin
06/22/2020, 11:15 PM
nate
06/22/2020, 11:23 PM
Kevin
06/23/2020, 5:29 PM
Traceback (most recent call last):
File "/usr/local/bin/dagster-graphql", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 219, in main
cli(obj={}) # pylint:disable=E1120
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 211, in ui
execute_query_from_cli(workspace, query, variables, output)
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 96, in execute_query_from_cli
workspace, query, variables=seven.json.loads(variables) if variables else None
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 47, in execute_query
else DagsterInstance.get()
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 239, in get
return DagsterInstance.from_config(_dagster_home())
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 261, in from_config
return DagsterInstance.from_ref(instance_ref)
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 274, in from_ref
run_launcher=instance_ref.run_launcher,
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/ref.py", line 205, in run_launcher
return self.run_launcher_data.rehydrate() if self.run_launcher_data else None
File "/usr/local/lib/python3.7/site-packages/dagster/serdes/__init__.py", line 351, in rehydrate
config_dict,
dagster.core.errors.DagsterInvalidConfigError: Errors whilst loading configuration for {'instance_config_map': Field(<dagster.config.source.StringSourceType object at 0x7f4a46f5d1d0>, default=@, is_required=True), 'postgres_password_secret': Field(<dagster.config.source.StringSourceType object at 0x7f4a46f5d1d0>, default=@, is_required=True), 'dagster_home': Field(<dagster.config.source.StringSourceType object at 0x7f4a46f5d1d0>, default=/opt/dagster/dagster_home, is_required=False), 'load_incluster_config': Field(<dagster.config.config_type.Bool object at 0x7f4a48e31950>, default=True, is_required=False), 'kubeconfig_file': Field(<dagster.config.config_type.Noneable object at 0x7f4a2a2d2c50>, default=None, is_required=False), 'broker': Field(<dagster.config.config_type.Noneable object at 0x7f4a45101ad0>, default=@, is_required=False), 'backend': Field(<dagster.config.config_type.Noneable object at 0x7f4a45101b50>, default=rpc://, is_required=False), 'include': Field(<dagster.config.config_type.Array object at 0x7f4a450a3190>, default=@, is_required=False), 'config_source': Field(<dagster.config.config_type.Noneable object at 0x7f4a450a3950>, default=@, is_required=False), 'retries': Field(<dagster.config.field_utils.Selector object at 0x7f4a48dd58d0>, default={'enabled': {}}, is_required=False)}.
Error 1: Post processing at path root:instance_config_map of original value {'env': 'DAGSTER_K8S_INSTANCE_CONFIG_MAP'} failed:
(PostProcessingError) - dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_K8S_INSTANCE_CONFIG_MAP" which is not set. In order for this execution to succeed it must be set in this environment.
Stack Trace:
File "/usr/local/lib/python3.7/site-packages/dagster/config/post_process.py", line 72, in _post_process
new_value = context.config_type.post_process(config_value)
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 42, in post_process
return str(_ensure_env_variable(cfg))
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 23, in _ensure_env_variable
).format(var=var)
Error 2: Post processing at path root:postgres_password_secret of original value {'env': 'DAGSTER_K8S_PG_PASSWORD_SECRET'} failed:
(PostProcessingError) - dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_K8S_PG_PASSWORD_SECRET" which is not set. In order for this execution to succeed it must be set in this environment.
Stack Trace:
File "/usr/local/lib/python3.7/site-packages/dagster/config/post_process.py", line 72, in _post_process
new_value = context.config_type.post_process(config_value)
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 42, in post_process
return str(_ensure_env_variable(cfg))
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 23, in _ensure_env_variable
).format(var=var)
johann
06/23/2020, 5:53 PM
could you share your values.yaml helm file?
Kevin
06/23/2020, 6:04 PM
dagit:
replicaCount: 1
# REQUIRED: Dagit image repository and tag to deploy
image:
repository: ".../dagster-dagit"
tag: "v1"
pullPolicy: Always
pipeline_run:
# REQUIRED: The Dagster K8s run launchers will invoke job executions in containers from this image
image:
repository: ".../dagster-core"
tag: "v1"
# Change with caution! If you're using a fixed tag for pipeline run images, changing the image
# pull policy to anything other than "Always" will use a cached/stale image, which is almost
# certainly not what you want.
pullPolicy: Always
celery:
image:
repository: ".../dagster-core"
tag: "v1"
pullPolicy: Always
enabled: true
flower:
enabled: false
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/affinity: "cookie"
dagit:
host: dagit.testing.testing
path: "/"
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/affinity: "cookie"
flower:
host: flower.testing.testing
johann
06/23/2020, 6:06 PM
####################################################################################################
# PostgreSQL: Configuration values for postgresql
#
# <https://github.com/kubernetes/charts/blob/master/stable/postgresql/README.md>
#
# A PostgreSQL database is required to run Dagster on Kubernetes. If postgresql.enabled is marked as
# false, the PG credentials specified here will still be used, and should point to an external PG
# database that is accessible from this chart.
####################################################################################################
postgresql:
# Used by init container to check that db is running. (Even if enabled:false)
image:
repository: "postgres"
tag: "9.6.16"
pullPolicy: IfNotPresent
# set postgresql.enabled to be false to disable deploy of a PostgreSQL database and use an
# existing external PostgreSQL database
enabled: true
# set this PostgreSQL hostname when using an external PostgreSQL database
postgresqlHost: ""
postgresqlUsername: test
# Note when changing this password (e.g. in test) that credentials will
# persist as long as the PVCs do -- see:
# <https://github.com/helm/charts/issues/12836#issuecomment-524552358>
postgresqlPassword: test
postgresqlDatabase: test
service:
port: 5432
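For orientation, these Postgres values are rendered into the instance config map (dagster.yaml). Judging from the instance summary Kevin posts later in the thread, the run storage section comes out roughly as:

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      username: test
      password:
        env: DAGSTER_PG_PASSWORD
      hostname: dagster-postgresql
      db_name: test
      port: 5432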
Kevin
06/23/2020, 6:10 PM
johann
06/23/2020, 6:11 PM
Kevin
06/23/2020, 6:12 PM
johann
06/23/2020, 6:30 PM
Kevin
06/23/2020, 6:31 PM
johann
06/23/2020, 6:32 PM
Kevin
06/23/2020, 6:35 PM
$DAGSTER_DAGIT_PORT
$DAGSTER_DAGIT_PORT_80_TCP
$DAGSTER_DAGIT_PORT_80_TCP_ADDR
$DAGSTER_DAGIT_PORT_80_TCP_PORT
$DAGSTER_DAGIT_PORT_80_TCP_PROTO
$DAGSTER_DAGIT_SERVICE_HOST
$DAGSTER_DAGIT_SERVICE_PORT
$DAGSTER_DAGIT_SERVICE_PORT_HTTP
$DAGSTER_HOME
$DAGSTER_K8S_CELERY_BACKEND
$DAGSTER_K8S_CELERY_BROKER
$DAGSTER_K8S_INSTANCE_CONFIG_MAP
$DAGSTER_K8S_PG_PASSWORD_SECRET
$DAGSTER_K8S_PIPELINE_RUN_ENV_CONFIGMAP
$DAGSTER_K8S_PIPELINE_RUN_IMAGE
$DAGSTER_K8S_PIPELINE_RUN_IMAGE_PULL_POLICY
$DAGSTER_K8S_PIPELINE_RUN_NAMESPACE
$DAGSTER_PG_PASSWORD
$DAGSTER_POSTGRESQL_PORT
$DAGSTER_POSTGRESQL_PORT_5432_TCP
$DAGSTER_POSTGRESQL_PORT_5432_TCP_ADDR
$DAGSTER_POSTGRESQL_PORT_5432_TCP_PORT
$DAGSTER_POSTGRESQL_PORT_5432_TCP_PROTO
$DAGSTER_POSTGRESQL_SERVICE_HOST
$DAGSTER_POSTGRESQL_SERVICE_PORT
$DAGSTER_POSTGRESQL_SERVICE_PORT_TCP_POSTGRESQL
$DAGSTER_RABBITMQ_PORT
$DAGSTER_RABBITMQ_PORT_15672_TCP
$DAGSTER_RABBITMQ_PORT_15672_TCP_ADDR
$DAGSTER_RABBITMQ_PORT_15672_TCP_PORT
$DAGSTER_RABBITMQ_PORT_15672_TCP_PROTO
$DAGSTER_RABBITMQ_PORT_25672_TCP
$DAGSTER_RABBITMQ_PORT_25672_TCP_ADDR
$DAGSTER_RABBITMQ_PORT_25672_TCP_PORT
$DAGSTER_RABBITMQ_PORT_25672_TCP_PROTO
$DAGSTER_RABBITMQ_PORT_4369_TCP
$DAGSTER_RABBITMQ_PORT_4369_TCP_ADDR
$DAGSTER_RABBITMQ_PORT_4369_TCP_PORT
$DAGSTER_RABBITMQ_PORT_4369_TCP_PROTO
$DAGSTER_RABBITMQ_PORT_5672_TCP
$DAGSTER_RABBITMQ_PORT_5672_TCP_ADDR
$DAGSTER_RABBITMQ_PORT_5672_TCP_PORT
$DAGSTER_RABBITMQ_PORT_5672_TCP_PROTO
$DAGSTER_RABBITMQ_SERVICE_HOST
$DAGSTER_RABBITMQ_SERVICE_PORT
$DAGSTER_RABBITMQ_SERVICE_PORT_AMQP
$DAGSTER_RABBITMQ_SERVICE_PORT_DIST
$DAGSTER_RABBITMQ_SERVICE_PORT_EPMD
$DAGSTER_RABBITMQ_SERVICE_PORT_STATS
#!/bin/sh
export DAGSTER_HOME=/opt/dagster/dagster_home
# This block may be omitted if not packaging a repository with cron schedules:
####################################################################################################
# see: <https://unix.stackexchange.com/a/453053> - fixes inflated hard link count
touch /etc/crontab /etc/cron.*/*
service cron start
# Add all schedules defined by the user
dagster schedule up
####################################################################################################
# Launch Dagit as a service
DAGSTER_HOME=/opt/dagster/dagster_home dagit -h 0.0.0.0 -p 3000
johann
06/23/2020, 6:41 PM
Kevin
06/23/2020, 6:42 PM
johann
06/23/2020, 6:50 PM
Kevin
06/23/2020, 6:51 PM
johann
06/23/2020, 6:52 PM
Kevin
06/23/2020, 6:53 PM
echo $DAGSTER_K8S_INSTANCE_CONFIG_MAP
dagster-instance
seems rather sparse. Is this value configured by the helm charts? And if so, wouldn't it be populated in the images defined?
johann
06/23/2020, 6:57 PM
Kevin
06/23/2020, 6:59 PM
execution:
celery-k8s:
config:
job_image: 'my_repo.com/image_name:latest'
job_namespace: 'some-namespace'
johann
06/23/2020, 7:01 PM
dagit in terminal)
Kevin
06/23/2020, 7:02 PM
johann
06/23/2020, 7:10 PM
Kevin
06/23/2020, 7:19 PM
Dagster 0.8.4
DagsterInstance components:
Local Artifacts Storage:
module: dagster.core.storage.root
class: LocalArtifactStorage
config:
base_dir: /opt/dagster/dagster_home
Run Storage:
module: dagster_postgres.run_storage
class: PostgresRunStorage
config:
postgres_db:
db_name: test
hostname: dagster-postgresql
password:
env: DAGSTER_PG_PASSWORD
port: 5432
username: test
Event Log Storage:
module: dagster_postgres.event_log
class: PostgresEventLogStorage
config:
postgres_db:
db_name: test
hostname: dagster-postgresql
password:
env: DAGSTER_PG_PASSWORD
port: 5432
username: test
Compute Log Manager:
module: dagster.core.storage.local_compute_log_manager
class: LocalComputeLogManager
config:
base_dir: /opt/dagster/dagster_home/storage
Schedule Storage:
module: dagster_postgres.schedule_storage
class: PostgresScheduleStorage
config:
postgres_db:
db_name: test
hostname: dagster-postgresql
password:
env: DAGSTER_PG_PASSWORD
port: 5432
username: test
Scheduler:
module: dagster_cron.cron_scheduler
class: SystemCronScheduler
config:
{}
Run Launcher:
module: dagster_k8s.launcher
class: CeleryK8sRunLauncher
config:
backend: amqp
broker: pyamqp://test:test@dagster-rabbitmq:5672//
dagster_home:
env: DAGSTER_HOME
instance_config_map:
env: DAGSTER_K8S_INSTANCE_CONFIG_MAP
postgres_password_secret:
env: DAGSTER_K8S_PG_PASSWORD_SECRET
Dagit:
NoneType
Telemetry:
NoneType
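Note that the run launcher above resolves instance_config_map and postgres_password_secret from environment variables, so any container that loads this instance must have them set. A rough sketch of the relevant container env fragment, with values assumed from this thread (the secret name is a guess):

env:
  - name: DAGSTER_K8S_INSTANCE_CONFIG_MAP
    value: dagster-instance            # value Kevin echoed earlier in the thread
  - name: DAGSTER_K8S_PG_PASSWORD_SECRET
    value: dagster-postgresql-secret   # assumed name of the chart's Postgres password secret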
johann
06/23/2020, 7:51 PM
Kevin
06/23/2020, 8:20 PM
johann
06/23/2020, 8:23 PM
Kevin
06/23/2020, 8:26 PM
Traceback (most recent call last):
File "/usr/local/bin/dagster-graphql", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 219, in main
cli(obj={}) # pylint:disable=E1120
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 211, in ui
execute_query_from_cli(workspace, query, variables, output)
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 96, in execute_query_from_cli
workspace, query, variables=seven.json.loads(variables) if variables else None
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/cli.py", line 47, in execute_query
else DagsterInstance.get()
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 239, in get
return DagsterInstance.from_config(_dagster_home())
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 261, in from_config
return DagsterInstance.from_ref(instance_ref)
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 274, in from_ref
run_launcher=instance_ref.run_launcher,
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/ref.py", line 205, in run_launcher
return self.run_launcher_data.rehydrate() if self.run_launcher_data else None
File "/usr/local/lib/python3.7/site-packages/dagster/serdes/__init__.py", line 351, in rehydrate
config_dict,
dagster.core.errors.DagsterInvalidConfigError: Errors whilst loading configuration for {'instance_config_map': Field(<dagster.config.source.StringSourceType object at 0x7f242d4a6f50>, default=@, is_required=True), 'postgres_password_secret': Field(<dagster.config.source.StringSourceType object at 0x7f242d4a6f50>, default=@, is_required=True), 'dagster_home': Field(<dagster.config.source.StringSourceType object at 0x7f242d4a6f50>, default=/opt/dagster/dagster_home, is_required=False), 'load_incluster_config': Field(<dagster.config.config_type.Bool object at 0x7f242f383990>, default=True, is_required=False), 'kubeconfig_file': Field(<dagster.config.config_type.Noneable object at 0x7f2410824c90>, default=None, is_required=False), 'broker': Field(<dagster.config.config_type.Noneable object at 0x7f242b653ad0>, default=@, is_required=False), 'backend': Field(<dagster.config.config_type.Noneable object at 0x7f242b653b50>, default=rpc://, is_required=False), 'include': Field(<dagster.config.config_type.Array object at 0x7f242b5f5190>, default=@, is_required=False), 'config_source': Field(<dagster.config.config_type.Noneable object at 0x7f242b5f59d0>, default=@, is_required=False), 'retries': Field(<dagster.config.field_utils.Selector object at 0x7f242f327910>, default={'enabled': {}}, is_required=False)}.
Error 1: Post processing at path root:instance_config_map of original value {'env': 'DAGSTER_K8S_INSTANCE_CONFIG_MAP'} failed:
(PostProcessingError) - dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_K8S_INSTANCE_CONFIG_MAP" which is not set. In order for this execution to succeed it must be set in this environment.
Stack Trace:
File "/usr/local/lib/python3.7/site-packages/dagster/config/post_process.py", line 72, in _post_process
new_value = context.config_type.post_process(config_value)
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 42, in post_process
return str(_ensure_env_variable(cfg))
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 23, in _ensure_env_variable
).format(var=var)
Error 2: Post processing at path root:postgres_password_secret of original value {'env': 'DAGSTER_K8S_PG_PASSWORD_SECRET'} failed:
(PostProcessingError) - dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_K8S_PG_PASSWORD_SECRET" which is not set. In order for this execution to succeed it must be set in this environment.
Stack Trace:
File "/usr/local/lib/python3.7/site-packages/dagster/config/post_process.py", line 72, in _post_process
new_value = context.config_type.post_process(config_value)
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 42, in post_process
return str(_ensure_env_variable(cfg))
File "/usr/local/lib/python3.7/site-packages/dagster/config/source.py", line 23, in _ensure_env_variable
).format(var=var)
johann
06/23/2020, 9:32 PM
Kevin
06/23/2020, 10:46 PM
Starting periodic command scheduler: cron.
Usage: dagster schedule up [OPTIONS]
Error: There are no schedules defined for repository test_repository.
Telemetry:
As an open source project, we collect usage statistics to inform development priorities. For more
information, read <https://docs.dagster.io/docs/install/telemetry>.
We will not see or store solid definitions, pipeline definitions, modes, resources, context, or
any data that is processed within solids and pipelines.
To opt-out, add the following to $DAGSTER_HOME/dagster.yaml, creating that file if necessary:
telemetry:
enabled: false
Welcome to Dagster!
If you have any questions or would like to engage with the Dagster team, please join us on Slack
(<https://bit.ly/39dvSsF>).
johann
06/23/2020, 11:08 PM
Kevin
06/23/2020, 11:43 PM
johann
06/25/2020, 5:35 PM
Kevin
06/25/2020, 5:47 PM
johann
06/25/2020, 6:08 PM
Kevin
06/25/2020, 6:16 PM
johann
06/25/2020, 6:20 PM
Kevin
06/25/2020, 6:22 PM
make[1]: Leaving directory '/home/kevin/workspace/tdagster'
# Checking for prod installs - if any are listed below reinstall with 'pip -e'
! pip list --exclude-editable | grep -e dagster -e dagit
cd js_modules/dagit/; yarn install && yarn build-for-python
00h00m00s 0/0: : ERROR: [Errno 2] No such file or directory: 'install'
Makefile:96: recipe for target 'rebuild_dagit' failed
make: *** [rebuild_dagit] Error 1
johann
06/25/2020, 6:48 PM
execution:
celery-k8s:
config:
...
env_config_maps:
- "<DEPLOYMENT NAME>-dagster-pipeline-env"
kubectl get configmaps and grab whichever ends with pipeline-env
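Putting that together with the snippet from earlier in the thread, the playground config would look roughly like this (the deployment name is a placeholder):

execution:
  celery-k8s:
    config:
      job_image: 'my_repo.com/image_name:latest'
      job_namespace: 'some-namespace'
      env_config_maps:
        - 'mydeploy-dagster-pipeline-env'  # placeholder; use the name from kubectl get configmaps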
Kevin
06/25/2020, 10:15 PM
dagster.core.errors.DagsterUnmetExecutorRequirementsError: You have attempted to use an executor that uses multiple processes while using system storage in_memory which does not persist intermediates. This means there would be no way to move data between different processes. Please configure your pipeline in the storage config section to use persistent system storage such as the filesystem.
File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/context_creation_pipeline.py", line 164, in pipeline_initialization_event_generator
executor_config = create_executor_config(context_creation_data)
File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/context_creation_pipeline.py", line 276, in create_executor_config
instance=context_creation_data.instance,
File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/context_creation_pipeline.py", line 50, in construct_executor_config
return executor_init_context.executor_def.executor_creation_fn(executor_init_context)
File "/usr/local/lib/python3.7/site-packages/dagster_celery/executor_k8s.py", line 113, in celery_k8s_job_executor
check_cross_process_constraints(init_context)
File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/executor.py", line 209, in check_cross_process_constraints
_check_persistent_storage_requirement(init_context.system_storage_def)
File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/executor.py", line 234, in _check_persistent_storage_requirement
).format(storage_name=system_storage_def.name)
johann
06/25/2020, 10:33 PM
storage:
s3:
config:
s3_bucket: "dagster-scratch-80542c2"
s3_prefix: "dagster-k8s-test"
In the playground
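So the full playground run config assembled over this thread combines the execution and storage sections, roughly (same placeholders as above):

execution:
  celery-k8s:
    config:
      job_image: 'my_repo.com/image_name:latest'
      job_namespace: 'some-namespace'
      env_config_maps:
        - 'mydeploy-dagster-pipeline-env'
storage:
  s3:
    config:
      s3_bucket: "dagster-scratch-80542c2"
      s3_prefix: "dagster-k8s-test"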
Kevin
06/29/2020, 6:04 PM
Step add_one.compute finished without success or failure event, assuming failure.
logs from kubernetes show that it fails to find the pipeline 😕 any ideas?
johann
06/29/2020, 6:09 PM
Kevin
06/29/2020, 6:12 PM
2020-06-29 16:06:04 - dagster - DEBUG - math - 51c3700b-309b-43e5-b82f-800ec57c31a6 - PIPELINE_START - Started execution of pipeline "math".
dagster/solid_selection = "*"
2020-06-29 16:06:04 - dagster - DEBUG - math - 51c3700b-309b-43e5-b82f-800ec57c31a6 - ENGINE_EVENT - Submitting celery task for step "add_one.compute" to queue "dagster".
dagster/solid_selection = "*"
event_specific_data = {"error": null, "marker_end": null, "marker_start": "celery_queue_wait", "metadata_entries": []}
step_key = "add_one.compute"
2020-06-29 16:06:11 - dagster - ERROR - system - 51c3700b-309b-43e5-b82f-800ec57c31a6 - Step add_one.compute finished without success or failure event, assuming failure.
dagster/solid_selection = "*"
2020-06-29 16:06:11 - dagster - INFO - system - 51c3700b-309b-43e5-b82f-800ec57c31a6 - Dependencies for step mult_two.compute failed: ['add_one.compute']. Not executing.
dagster/solid_selection = "*"
solid = "mult_two"
solid_definition = "mult_two"
step_key = "mult_two.compute"
2020-06-29 16:06:11 - dagster - DEBUG - math - 51c3700b-309b-43e5-b82f-800ec57c31a6 - STEP_SKIPPED - Skipped execution of step "mult_two.compute".
dagster/solid_selection = "*"
solid = "mult_two"
solid_definition = "mult_two"
step_key = "mult_two.compute"
2020-06-29 16:06:11 - dagster - ERROR - math - 51c3700b-309b-43e5-b82f-800ec57c31a6 - PIPELINE_FAILURE - Execution of pipeline "math" failed.
dagster/solid_selection = "*"
{"data": {"executeRunInProcess": {"__typename": "PythonError", "message": "dagster.core.errors.DagsterSubprocessError: During celery execution errors occurred in workers:\n[add_one.compute]: (DagsterGraphQLClientError) - dagster_graphql.client.mutations.DagsterGraphQLClientError: Pipeline \"math\" not found: Could not find Pipeline <<in_process>>.test_repository.math:\n\nStack Trace: \n File \"/usr/local/lib/python3.7/site-packages/dagster_celery/engine.py\", line 91, in _core_celery_execution_loop\n step_events = result.get()\n File \"/usr/local/lib/python3.7/site-packages/celery/result.py\", line 217, in get\n self.maybe_throw(callback=callback)\n File \"/usr/local/lib/python3.7/site-packages/celery/result.py\", line 333, in maybe_throw\n self.throw(value, self._to_remote_traceback(tb))\n File \"/usr/local/lib/python3.7/site-packages/celery/result.py\", line 326, in throw\n self.on_ready.throw(*args, **kwargs)\n File \"/usr/local/lib/python3.7/site-packages/vine/promises.py\", line 244, in throw\n reraise(type(exc), exc, tb)\n File \"/usr/local/lib/python3.7/site-packages/vine/five.py\", line 195, in reraise\n raise value\n\n", "stack": [" File \"/usr/local/lib/python3.7/site-packages/dagster_graphql/implementation/utils.py\", line 14, in _fn\n return fn(*args, **kwargs)\n", " File \"/usr/local/lib/python3.7/site-packages/dagster_graphql/implementation/execution/execute_run_in_process.py\", line 28, in execute_run_in_graphql_process\n graphene_info, repository_location_name, repository_name, run_id\n", " File \"/usr/local/lib/python3.7/site-packages/dagster_graphql/implementation/execution/execute_run_in_process.py\", line 158, in _synchronously_execute_run_within_hosted_user_process\n execute_run(recon_pipeline, pipeline_run, graphene_info.context.instance)\n", " File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 149, in execute_run\n event_list = list(_execute_run_iterable)\n", " File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 640, in __iter__\n retries=self.retries,\n", " File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 568, in _pipeline_execution_iterator\n pipeline_context, execution_plan\n", " File \"/usr/local/lib/python3.7/site-packages/dagster_celery/engine.py\", line 166, in _core_celery_execution_loop\n subprocess_error_infos=list(step_errors.values()),\n"]}}}
from the dagster-stepjob:
{
"data": {
"executePlan": {
"__typename": "PipelineNotFoundError",
"message": "Could not find Pipeline <<in_process>>.test_repository.math",
"pipelineName": "math"
}
}
}
johann
06/29/2020, 6:56 PM
Kevin
06/29/2020, 7:34 PM
johann
06/29/2020, 7:36 PM
Kevin
06/29/2020, 7:39 PM
johann
06/29/2020, 9:09 PM
load_from:
- python_file: celery_pipeline.py
load_from:
- python_file: pipeline.py
and should be
load_from:
- python_file: celery_pipeline.py
- python_file: pipeline.py
Kevin
06/29/2020, 9:30 PM
johann
06/29/2020, 9:43 PM
execution:
celery-k8s:
config:
repo_location_name: whatever_name_is_chosen
alex
06/29/2020, 9:44 PM
Kevin
06/29/2020, 9:53 PM
alex
06/30/2020, 2:39 PM
1. […] dagster commands in the container. You can read more about /opt/ here: https://www.pathname.com/fhs/pub/fhs-2.3.html
2. attribute is the thing we will go look up in the file / module to find the @repository. When attribute is not specified we iterate over every attribute and grab everything that's an instance of @repository. location_name allows you to specify the repository_location_name, which is a property mostly hidden from users at this time (except for the workaround you encountered).
3. Theoretically sure - you need the code to end up in the container so it can execute it, but if you want to have the @repository function download stuff from somewhere, that is technically possible. It will likely be pretty complicated to pull this off. Things that make this hard include not changing versions of the code at the wrong times, ensuring you have the right dependencies, and latency / caching trade-offs.
4. Running the CLI via a deployed container should work pretty well. Running the CLI locally could work if you configure your instance (dagster.yaml) and kubernetes CLIs correctly and have access to all the deployed components from your local.
5. You are looking for PresetDefinition: https://docs.dagster.io/docs/apidocs/pipeline#dagster.PresetDefinition
6. ❤️
Kevin
06/30/2020, 4:03 PM
alex
06/30/2020, 4:07 PM
we may add a git based load_from target at some point in the future which may be close to what you are looking for.
Kevin
06/30/2020, 4:11 PM
nate
06/30/2020, 5:00 PM
alex
06/30/2020, 5:01 PM
Kevin
06/30/2020, 6:28 PM
pip install -e dagster-aws after replacing these files (attached) in the s3 directory under dagster-aws. The changes were just to pass the endpoint_url.
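For context, the effect of that patch on the run config would presumably be something like the following; endpoint_url here is part of Kevin's local modification (and the bucket/endpoint values are hypothetical), not the released dagster-aws of that time:

storage:
  s3:
    config:
      s3_bucket: "my-minio-bucket"                   # hypothetical bucket name
      s3_prefix: "dagster-k8s-test"
      endpoint_url: "http://minio.default.svc:9000"  # MinIO endpoint passed through by the patch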
Simon Späti
10/29/2020, 1:20 PM
is minio-s3 already supported for storage in core dagster, or would I need to apply above objects? 🤔
Kevin
10/29/2020, 3:19 PM
minio-s3 but that version still required some minor code changes, for how my work wanted to operate with minio-s3 within dagster-aws
Simon Späti
10/29/2020, 3:30 PM
ObjectStoreIntermediateStorage. I'm trying to get it working. Hope so, otherwise I cannot run any pipeline as they cannot communicate with each other 😅
0.9.13 (mainly renaming of files and classes)
Would be nice to bring this into the core dagster_aws. @nate? 🤔
sandy
10/30/2020, 1:25 AM