Ritasha Verma
05/10/2021, 4:59 AM
Tiri Georgiou
05/10/2021, 11:32 AM
volumes: # Make docker client accessible so we can terminate containers from dagit
  - /var/run/docker.sock:/var/run/docker.sock
fails due to not being able to mount to a host?
Ritasha Verma
05/10/2021, 12:54 PM
Jean-Pierre M
05/10/2021, 4:14 PM
Charles Lariviere
05/10/2021, 4:43 PM
multiprocess execution: our pipeline stalls when Dagster starts multiple threads, as if it's waiting for a signal that it never receives. This only happens in production, in our Kubernetes deployment with version 0.11.6; things work fine locally with the same config. For reference, we have the following run config:
execution:
  multiprocess:
    config:
      max_concurrent: 8
resources:
  io_manager:
    config:
      s3_bucket: <BUCKET>
      s3_prefix: <PREFIX>
  s3:
    config: {}
With the io_manager being s3_pickle_io_manager. I can confirm that the AWS credentials used by the cluster have access to the bucket, since the same credentials and bucket are used by other pipelines that work fine without multiprocess execution. Any ideas why that could be?
Steve Pletcher
05/10/2021, 6:38 PM
preconfigure_for_mode(resource, "dev"), but this doesn't seem to be doable for solids in the context of a pipeline, since pipelines aren't passed a context argument.
Max
05/10/2021, 8:04 PM
Martim Passos
05/10/2021, 8:33 PM
Jenny Webster
05/11/2021, 4:42 AM
Remy Dufrenoy
05/11/2021, 8:04 AM
@daily_schedule(
    start_date=datetime.datetime(2020, 1, 1),
    pipeline_name="my_pipeline",
    solid_selection=preset.solid_selection,
    mode=preset.mode,
    tags_fn_for_date=lambda _: preset.tags,
)
def my_modified_preset_schedule(date):
    modified_run_config = preset.run_config.copy()
    modified_run_config["date"] = date.strftime("%Y-%m-%d")
    return modified_run_config
When I am selecting the partition in dagit's playground, the config is updated and the right date is picked up. However, when I try to run a backfill, the config is not updated for each run. Instead, it seems to take only the config of the last partition selected in dagit.
Here is the exact code for the schedule I am trying to run. I created it in isolation to make sure nothing else in my code was causing the issue:
@solid(config_schema={"test_in": str})
def backfill_solid(context):
    test_in = context.solid_config["test_in"]
    context.log.info(f"test_in is {test_in}")

MODE_BUG = ModeDefinition(name="MODE_BACKFILL_BUG", resource_defs={"io_manager": fs_io_manager})

PRESET_BACKFILL_BUG = PresetDefinition.from_files(
    name="PRESET_BACKFILL_BUG",
    mode="MODE_BACKFILL_BUG",
    config_files=[
        file_relative_path(__file__, "./presets_backfill_bug.yaml")
    ])

@pipeline(mode_defs=[MODE_BUG],
          preset_defs=[PRESET_BACKFILL_BUG])
def backfill_pipeline():
    backfill_solid()

@daily_schedule(
    pipeline_name="backfill_pipeline",
    start_date=datetime.datetime(2021, 5, 1),
    execution_time=datetime.time(9, 0, 0),
    solid_selection=PRESET_BACKFILL_BUG.solid_selection,
    mode=PRESET_BACKFILL_BUG.mode,
)
def backfill_schedule(date):
    modified_run_config = PRESET_BACKFILL_BUG.run_config.copy()
    modified_run_config["solids"]["backfill_solid"]["config"]["test_in"] = date.strftime("%Y-%m-%d")
    return modified_run_config
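A possible explanation, offered as an assumption and not something confirmed in this thread: dict.copy() is a shallow copy, so backfill_schedule writes test_in into the nested dict that the preset itself still holds, and every config returned so far ends up showing the most recent date. The sketch below uses plain Python only; preset_run_config is a hypothetical stand-in for PRESET_BACKFILL_BUG.run_config, and the two helper functions are illustrative names.

```python
import copy

# Hypothetical stand-in for PRESET_BACKFILL_BUG.run_config (assumption).
preset_run_config = {
    "solids": {"backfill_solid": {"config": {"test_in": "placeholder"}}}
}


def config_shallow(date_str):
    # .copy() duplicates only the top-level dict; the nested dicts are shared.
    cfg = preset_run_config.copy()
    cfg["solids"]["backfill_solid"]["config"]["test_in"] = date_str
    return cfg


def config_deep(date_str):
    # deepcopy duplicates the nested dicts too, so each result is independent.
    cfg = copy.deepcopy(preset_run_config)
    cfg["solids"]["backfill_solid"]["config"]["test_in"] = date_str
    return cfg


dates = ("2021-05-01", "2021-05-02")
shallow = [config_shallow(d) for d in dates]
deep = [config_deep(d) for d in dates]

shallow_values = [c["solids"]["backfill_solid"]["config"]["test_in"] for c in shallow]
deep_values = [c["solids"]["backfill_solid"]["config"]["test_in"] for c in deep]
```

With the shallow copy, both entries in shallow_values carry the last date; with copy.deepcopy, each config keeps its own date. Whether this is what the backfill actually hits depends on how the configs are collected before launch, which is not shown here.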
I attached the run configuration I see in dagit for one of the runs.
I use Dagster 0.11.8 and my Python version is 3.8.5.
David
05/11/2021, 9:10 AM
pipeline that contains several composite solids, is there a way to manage the order of execution if there are no dependencies? (E.g., we want some composite_solid to be executed at the beginning, before the other composite solids are executed.)
Any ideas?
Yan
05/11/2021, 9:49 AM
Laura Moraes
05/11/2021, 3:40 PM
Jeff Hulbert
05/11/2021, 3:53 PM
shril
05/12/2021, 12:09 AM
hello_cereal.py tutorial and I encountered this error.
Can anyone help me with this?
I am running in a conda environment (macOS).
➜ dagster-tutorial dagit -f hello_cereal.py
Using temporary directory /var/folders/ml/cyrfcd_d5l1fn6nvkb4d3y14kdlv1s/T/tmp8gnm3ukt for storage. This will be removed when dagit exits.
To persist information across sessions, set the environment variable DAGSTER_HOME to a directory to use.
Loading repository...
Serving on http://127.0.0.1:3000 in process 95032
Traceback (most recent call last):
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/gevent/pywsgi.py", line 999, in handle_one_response
    self.run_application()
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/geventwebsocket/handler.py", line 75, in run_application
    self.run_websocket()
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/geventwebsocket/handler.py", line 52, in run_websocket
    list(self.application(self.environ, lambda s, h, e=None: []))
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/flask/app.py", line 2069, in __call__
    return self.wsgi_app(environ, start_response)
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/flask_sockets.py", line 40, in __call__
    handler, values = adapter.match()
  File "/opt/anaconda3/envs/dagster-papermill/lib/python3.8/site-packages/werkzeug/routing.py", line 2026, in match
    raise WebsocketMismatch()
werkzeug.routing.WebsocketMismatch: 400 Bad Request: The browser (or proxy) sent a request that this server could not understand.
2021-05-11T23:57:14Z {'REMOTE_ADDR': '127.0.0.1', 'REMOTE_PORT': '62035', 'HTTP_HOST': '127.0.0.1:3000', (hidden keys: 32)} failed with WebsocketMismatch
Doron Grinzaig
05/12/2021, 11:30 AM
Luis Rodríguez Escario
05/12/2021, 11:35 AM
@solid
def my_solid(
    context,
    var1: dagster.Optional[dagster.Any] = None
    ...
):
but when I try to set a value for the input in dagit config like:
solids:
  my_solid:
    inputs:
      var1: 42
or
solids:
  my_solid:
    inputs:
      var1: {"ke1":"value1"}
I get:
Expected: "{ json: { path: String } pickle: { path: String } value: Any }"
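The error text above lists the keys the input config accepts (json, pickle, value), so one reading, not confirmed in this thread, is that the literal has to be wrapped in a value key. A minimal sketch of that shape, written as Python run-config dicts; the variable names are hypothetical and the dicts are not validated against Dagster here.

```python
# Hypothetical run configs for my_solid, expressed as Python dicts.
# Assumption: wrapping the literal in "value" matches the selector shape
# shown in the error message ({ json | pickle | value }).
run_config_int = {
    "solids": {"my_solid": {"inputs": {"var1": {"value": 42}}}}
}

run_config_dict = {
    "solids": {"my_solid": {"inputs": {"var1": {"value": {"ke1": "value1"}}}}}
}
```

The equivalent YAML in the dagit config editor would nest the literal one level deeper, under var1 and then value.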
I've also tried to define a dagster.DagsterTypeLoader to accept any Python object, like:
@dagster_type_loader(Permissive())
def load_object(_context, value):
    return value
and then define the solid input as
@solid
def my_solid(
    context,
    var1: dagster.Optional[dagster.PythonObjectDagsterType(python_type=object, loader=load_object)] = None
    ...
):
but then dagit insists that the input should be a dict.
How can I accept a dict, an int, or None as an input that will only be defined via config?
Thanks!!!
Mark
05/12/2021, 4:20 PM
Martim Passos
05/12/2021, 5:36 PM
StringSource documentation’s example without success
Daniil
05/12/2021, 6:29 PM
Laura Moraes
05/12/2021, 7:27 PM
Martim Passos
05/12/2021, 10:35 PM
pipenv install dagit? I get
ERROR: Could not find a version that matches graphql-core<3,<4,>=2.0,>=2.1,>=2.3,>=2.3.2,>=3.1.2
but dagster installs fine
Noah K
05/12/2021, 11:37 PM
paul.q
05/13/2021, 6:55 AM
PartitionSet with a partition_fn that works out the dates based on something we can access at the time it is called, so it might need access to a Resource, or at least to an asset for which we've created an AssetMaterialization event. But I don't have access to a context with a run_id, so I'm not sure how best to achieve this.
Even if I can work out how to create the PartitionSet, how do I kick off the backfill? It seems I can only do it via the CLI or the Dagit UI. I would like to be able to do it via the Python or GraphQL API.
What I want to do seems like a reasonably commonplace use case, so I feel I must be missing something obvious about the way Dagster backfills work. Can someone help?
Thanks
Paul
mrdavidlaing
05/13/2021, 12:22 PM
$ dagit -p 3000 -w dagster_workspace.yaml
Using temporary directory /var/folders/cx/c_41bpzs1jlc465fx_k_9qfh0000gn/T/tmpe442ol4u for storage. This will be removed when dagit exits.
To persist information across sessions, set the environment variable DAGSTER_HOME to a directory to use.
Loading repository...
Serving on http://127.0.0.1:3000 in process 10619
mrdavidlaing
05/13/2021, 3:31 PM
/graphql queries seem to fail and cause Dagit to emit error logs like the below:
Traceback (most recent call last):
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/gevent/pywsgi.py", line 999, in handle_one_response
    self.run_application()
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/geventwebsocket/handler.py", line 75, in run_application
    self.run_websocket()
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/geventwebsocket/handler.py", line 52, in run_websocket
    list(self.application(self.environ, lambda s, h, e=None: []))
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/flask/app.py", line 2069, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/flask_sockets.py", line 40, in __call__
    handler, values = adapter.match()
  File "/Users/dbasner/.local/share/virtualenvs/tanzu-dm-Qg2J58No/lib/python3.7/site-packages/werkzeug/routing.py", line 2026, in match
    raise WebsocketMismatch()
werkzeug.routing.WebsocketMismatch: 400 Bad Request: The browser (or proxy) sent a request that this server could not understand.
2021-05-13T14:39:37Z {'REMOTE_ADDR': '127.0.0.1', 'REMOTE_PORT': '57343', 'HTTP_HOST': '127.0.0.1:3000', (hidden keys: 31)} failed with WebsocketMismatch
Deveshi
05/13/2021, 3:35 PM
dagit -w workspace.yaml
but I am getting the following error:
Error loading repository location repo.py:dagster.core.errors.DagsterUserCodeProcessError: dagster.core.errors.DagsterInvariantViolationError: File or glob pattern "config.yaml" for "config_files"produced no results.
This is my workspace.yaml:
load_from:
  - python_file:
      relative_path: test_project/scripts/repo.py
      working_directory: C:/Users/Deveshi/source/test_project/scripts
If I just move to the folder containing repo.py and run dagit -f repo.py, it works.
I am using a preset that loads config from config.yaml.
My Dagster version is 11.0.
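Since the same repo.py loads when dagit is started from its own folder, one hedged guess is that the bare name "config.yaml" is being resolved against the current working directory of whichever process loads the repository. A minimal stdlib sketch of building the path from the module's own location instead; path_next_to is a hypothetical helper name, and the assumption is that config.yaml sits in the same directory as repo.py.

```python
from pathlib import Path


def path_next_to(module_file: str, name: str) -> str:
    """Return an absolute path to `name` in the directory that holds `module_file`."""
    # Anchoring on the module's location makes the result independent of the
    # directory the process was started from.
    return str(Path(module_file).resolve().parent / name)


# Inside repo.py this would be called as path_next_to(__file__, "config.yaml")
# and the result passed in the preset's config_files list.
example = path_next_to("/tmp/test_project/scripts/repo.py", "config.yaml")
```

Dagster's own file_relative_path(__file__, ...) helper, used in an earlier message in this log, appears to serve the same purpose.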
Thanks!
Kirk Stennett
05/13/2021, 5:28 PM
Eduardo Santizo
05/13/2021, 6:27 PM
docker-compose.yaml?
Makoto
05/13/2021, 11:38 PM
Failure? I naively thought that would stop it, but it seems like I need to use conditional branching to achieve it? Is there an example somewhere? I am having a bit of a hard time grasping how I can apply it.