Frank Dekervel
01/14/2022, 12:54 PM
Alessandro Marrella
01/14/2022, 1:09 PM
Martim Passos
01/14/2022, 1:51 PM
Bryan Chavez
01/14/2022, 3:41 PM
Bryan Chavez
01/14/2022, 4:05 PM
Nick Dellosa
01/14/2022, 9:39 PM
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.DEADLINE_EXCEEDED
    details = "Deadline Exceeded"
    debug_error_string = "{"created":"@1642196167.763819910","description":"Error received from peer ipv4:10.0.128.47:4000","file":"src/core/lib/surface/call.cc","file_line":1063,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>
File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 494, in _process_api_request
api_result = self._handle_api_request(request, instance, user_code_launcher)
File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 371, in _handle_api_request
serialized_sensor_data_or_error = client.external_sensor_execution(
File "/dagster/dagster/grpc/client.py", line 293, in external_sensor_execution
chunks = list(
File "/dagster/dagster/grpc/client.py", line 118, in _streaming_query
yield from response_stream
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 426, in __next__
return self._next()
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _next
raise self
and
dagster.core.errors.DagsterUserCodeUnreachableError: Timed out waiting for response to request: DagsterCloudApiRequest(request_id='7d4c9656-2daa-4729-8778-0edc072fa688', request_api=<DagsterCloudApi.GET_EXTERNAL_SENSOR_EXECUTION_DATA: 'GET_EXTERNAL_SENSOR_EXECUTION_DATA'>, request_args=SensorExecutionArgs(repository_origin=ExternalRepositoryOrigin(repository_location_origin=RegisteredRepositoryLocationOrigin(location_name='human_resources'), repository_name='human_resources_repository'), instance_ref=None, sensor_name='lever_s3_sensor', last_completion_time=None, last_run_key=None, cursor=None), expire_at=1642196769.66473)
File "/dagster/dagster/daemon/sensor.py", line 236, in execute_sensor_iteration
yield from _evaluate_sensor(
File "/dagster/dagster/daemon/sensor.py", line 268, in _evaluate_sensor
sensor_runtime_data = repo_location.get_external_sensor_execution_data(
File "/ursula/ursula/user_code/workspace.py", line 404, in get_external_sensor_execution_data
result = self.api_call(
File "/ursula/ursula/user_code/workspace.py", line 205, in api_call
return dagster_cloud_api_call(
File "/ursula/ursula/user_code/workspace.py", line 65, in dagster_cloud_api_call
for result in gen_dagster_cloud_api_call(
File "/ursula/ursula/user_code/workspace.py", line 121, in gen_dagster_cloud_api_call
raise DagsterUserCodeUnreachableError(
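Since the request that timed out here is GET_EXTERNAL_SENSOR_EXECUTION_DATA, one common mitigation is to keep each sensor evaluation cheap, for example by using a cursor so every tick only looks at new S3 objects. A minimal sketch of that pattern; lever_s3_job and list_s3_keys_after are hypothetical stand-ins for the real job and S3 listing helper:

from dagster import RunRequest, sensor

@sensor(job=lever_s3_job, minimum_interval_seconds=60)  # lever_s3_job: hypothetical job
def lever_s3_sensor(context):
    # Resume from the last key handled so each tick scans only new objects.
    last_key = context.cursor or ""
    new_keys = list_s3_keys_after(bucket="my-bucket", after=last_key)  # hypothetical helper
    for key in new_keys:
        yield RunRequest(
            run_key=key,
            run_config={"ops": {"process_key": {"config": {"key": key}}}},
        )
    if new_keys:
        context.update_cursor(new_keys[-1])

Keeping the per-tick work bounded this way makes it much less likely that a single evaluation blows past the gRPC deadline.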
Bryan Chavez
01/15/2022, 2:28 AM
context.failure_event.message
Nitin Madhavan
01/15/2022, 5:30 AM
LaurentS
01/16/2022, 10:01 PM
load_from:
  - python_package: my_package
and this dagster.yaml bit:
run_launcher:
  module: dagster.core.launcher
  class: DefaultRunLauncher
but when I try to use the DockerRunLauncher I end up hitting one of 2 errors.
# dagster.yaml
run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_POSTGRES_USER
      - DAGSTER_POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
    network: dagsternet
and
# workspace.yaml
load_from:
  - grpc_server:
      host: dagster_user_code
      port: 4000
      location_name: "my_package"
With the above setup, I see details = "DNS resolution failed for service: dagster_user_code:4000", and if I change the network in dagster.yaml to mystack_dagsternet then I see "failed to connect to all addresses" in the dagit logs (the network name is the one shown by docker network ls, scoped with the stack name). I am running all of this with docker stack deploy -f mycomposefile.yml mystack, and I have a container for dagit, one for dagster_daemon, and one for dagster_user_code (and some other services). They are all on the same network so they can reach each other. I must be missing something, but the error message is a bit cryptic. This is dagster 0.13.14. Any suggestions would be welcome :)
Daniel Suissa
01/16/2022, 10:52 PM
pdpark
01/17/2022, 2:30 AM
dagster.yaml file in the project root folder with the following content:
compute_logs:
  module: dagster_aws.s3.compute_log_manager
  class: S3ComputeLogManager
  config:
    bucket: "<bucket>"
    prefix: "dagster-logs"
    use_ssl: False
    verify: False
    skip_empty_files: True
When I run dagster job launch --job <job_name> --workspace <workspace_file_name>, dagster appears to ignore these settings: there are no errors and logs are not saved to the specified s3 bucket/prefix. Thanks.
Mohammad Nazeeruddin
01/17/2022, 8:48 AM
'dagster-k8s/config': {
    'job_spec_config': {
        'ttl_seconds_after_finished': 60
    },
Is it possible to add annotations in job_spec_config?
{ "annotations": { "argocd.argoproj.io/sync-options": "Prune=false" } }
Bernardo Cortez
01/17/2022, 11:31 AM
Kevin Thackray
01/17/2022, 12:05 PM
Jonas De Beukelaer
01/17/2022, 12:29 PM
dagster-k8s/config tag to attach them)
c. also currently it means I would need to reload dagit each time I create a new job requiring new secrets, instead of just reloading the relevant dagster-user-deployment
3. Or, since my secrets currently live in AWS Secrets Manager - could it make sense to simply create an op to pull these as part of the graph? I think this might work well locally too. Hmm, writing this out may have helped me find a solution 😄 Still interested to hear the dagster team's take on this
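On option 3, a resource (rather than an op) is usually the more idiomatic place to pull secrets at run time, since any op or hook can then depend on it. A minimal sketch assuming boto3 and AWS Secrets Manager; the resource name and secret id are placeholders:

import boto3
from dagster import resource

@resource(config_schema={"secret_id": str})
def secretsmanager_secret(context):
    # Fetch the secret value when the run starts instead of baking it into the image.
    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=context.resource_config["secret_id"])["SecretString"]

# Example wiring (hypothetical graph name and secret id):
# my_graph.to_job(
#     resource_defs={"slack_token": secretsmanager_secret.configured({"secret_id": "prod/slack-token"})}
# )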
Mykola Palamarchuk
01/17/2022, 3:32 PM
Bernardo Cortez
01/17/2022, 5:07 PM
context.scheduled_execution_time ?
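If the question is how to use it: a schedule's evaluation context exposes scheduled_execution_time, which is typically used to build run config for the tick's date. A minimal sketch with placeholder job/op names:

from dagster import schedule

@schedule(cron_schedule="0 6 * * *", job=my_job)  # my_job is a placeholder
def daily_schedule(context):
    date_str = context.scheduled_execution_time.strftime("%Y-%m-%d")
    return {"ops": {"my_op": {"config": {"date": date_str}}}}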
VxD
01/18/2022, 2:09 AM
op_retry_policy on a graph, but still bypass it and immediately abort execution if a special exception is raised?
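One pattern that may cover this: keep the op_retry_policy on the job, and raise dagster.Failure with allow_retries=False for the fatal case. Note that allow_retries on Failure is an assumption here — it exists in recent Dagster releases but may not be available on 0.13.x. A rough sketch with hypothetical helper/exception names:

from dagster import Failure, RetryPolicy, graph, op

@op
def flaky_op():
    try:
        do_work()  # hypothetical helper
    except FatalConfigError as exc:  # hypothetical "special" exception
        # assumption: allow_retries=False bypasses the retry policy and fails immediately
        raise Failure(description=str(exc), allow_retries=False) from exc

@graph
def my_graph():
    flaky_op()

my_job = my_graph.to_job(op_retry_policy=RetryPolicy(max_retries=3, delay=10))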
Surya Kocherlakota
01/18/2022, 11:43 AM
Mohammad Nazeeruddin
01/18/2022, 12:48 PM
job_metadata {
    "annotations": { "argocd.argoproj.io/sync-options": "Prune=false" }
}
Sandeep Aggarwal
01/18/2022, 1:06 PM
Using the execute_in_process API to process a graph with ~15 ops, I am observing a significant performance drop when switching to a persistent dagster instance with SQLite/Postgres-based run & event log storages. The execution time increases to about 4 seconds, from around 250ms earlier. The executor is still the in-process one, so I guess it's the DB writes that are causing this overhead. Is that expected?
Below are screenshots for execution times.
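Some overhead from run and event log writes is expected with a persistent instance. If these executions don't need to show up in Dagit, one way to compare or avoid the cost is to pass the instance explicitly (a minimal sketch; my_job is a placeholder):

from dagster import DagsterInstance

# Ephemeral in-memory instance: nothing is written to SQLite/Postgres.
result = my_job.execute_in_process(instance=DagsterInstance.ephemeral())

# Persistent instance from DAGSTER_HOME: run + event log writes add overhead.
result = my_job.execute_in_process(instance=DagsterInstance.get())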
Martim Passos
01/18/2022, 4:48 PM
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: INSERT INTO event_logs (run_id, event, dagster_event_type, timestamp, step_key, asset_key, partition) VALUES (?, ?, ?, ?, ?, ?, ?)]
My use case is rather simple and I don’t need to keep track of all the logs Dagster saves by default. I tried unsetting DAGSTER_HOME, but that just uses a temporary directory that hits the same problem. Is there a way to work around this without having to set up my own SQL db? Something like reducing the amount of events saved or preventing/limiting the DynamicOutput ops from running in parallel?
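On the last point, limiting parallelism is straightforward with the job API: either switch to the in-process executor, or cap the multiprocess executor's concurrency. A minimal sketch; my_graph is a placeholder, and max_concurrent is the config key I believe the multiprocess executor accepts:

from dagster import in_process_executor, multiprocess_executor

# Option 1: run every op in a single process, one at a time.
serial_job = my_graph.to_job(executor_def=in_process_executor)

# Option 2: keep multiprocessing but cap how many ops run at once,
# which also reduces concurrent writes to the SQLite event log.
throttled_job = my_graph.to_job(
    executor_def=multiprocess_executor.configured({"max_concurrent": 2})
)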
Tim Roy
01/18/2022, 5:40 PM
volumes:
  dagster_mnt:
    external:
      name: dagster_mnt
where dagster_mnt was created with docker volume ...
Under each service in the docker-compose.yaml I have added:
volumes:
  - dagster_mnt:/mount
However, I cannot interact with the mounted volume from the temp containers that dagit orchestrates, and when I attempt to ssh in, I can see that the volume is not mounted. Has anyone run into this issue before, or know of a work-around?
Shikhar Mohan
01/18/2022, 5:51 PM
Tiri Georgiou
01/18/2022, 6:32 PM
An exception was thrown during execution that is likely a framework error, rather than an error in user code.
Error
dagster.core.errors.DagsterInvalidConfigError: Error in config for resource slack
    Error 1: Value at path root:config:token must not be None. Expected "(String | { env: String })"
Now when I exec into the pod running the user-code-deployment and run
$ echo $SLACK_TOKEN
I can retrieve it. My job looks like this (shortened for readability)…
tesco_ev_naming_prod_job = tesco_ev_naming.to_job(
    resource_defs={
        "slack": slack_resource.configured(
            {"token": os.getenv("SLACK_TOKEN")},
            description="Slack on failures to #data-alarms.",
        ),
        # other resource configs
    },
    hooks={
        slack_on_failure(
            channel="#data-alarms",
            message_fn=slack_failure_traceback,
            dagit_base_url="http://localhost:3000",
        )
    },
    # other configs
)
In addition, I’ve defined my envSecrets (passed them in as literals using kubectl) in values.yaml for my user-code-deployment:
envSecrets:
  - name: dagster-aws-access-key-id
  - name: dagster-aws-secret-access-key
  - name: dagster-slack
Any ideas? I would have thought that if I can access the env variable $SLACK_TOKEN in the pod, it should work?
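A likely culprit is that os.getenv("SLACK_TOKEN") is evaluated when the repository code is loaded, in whatever process does that loading, rather than in the run's environment. The error message itself points at the alternative: pass the { env: ... } form so Dagster resolves the variable where the run actually executes. A small sketch of that change, reusing the names from the snippet above:

from dagster_slack import slack_resource

# inside resource_defs for the job:
resource_defs = {
    "slack": slack_resource.configured(
        {"token": {"env": "SLACK_TOKEN"}},  # resolved from the environment at run time
        description="Slack on failures to #data-alarms.",
    ),
}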
Kevin Haynes
01/18/2022, 10:47 PM
from dagster import configured
from dagster_aws.redshift import redshift_resource

@configured(redshift_resource)
def prod_redshift(_):
    return {
        'host': {"env": "DAGSTER_RS_HOST"},
        'port': {"env": "DAGSTER_RS_PORT"},
        'user': {"env": "DAGSTER_RS_USERNAME"},
        'password': {"env": "DAGSTER_RS_PASSWORD"},
        'database': {"env": "DAGSTER_RS_DB"},
        'autocommit': True
    }
But I realized a problem with this approach: if I try to unit test my op and/or job, it's going to execute those SQL statements in the prod Redshift cluster. My assumption, based on the existence of the dagster_aws.redshift.fake_redshift_resource resource, is that I should use that for testing, because it will run everything except the SQL statements themselves, allowing me to test everything except the SQL syntax. But I can't wrap my head around how I would actually handle that configuration within my jobs and ops. I remember reading a while ago about the concept of "modes", but it appears that documentation is now in legacy status - is there a similar construct in the new solid-less op world?
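The rough equivalent of modes in the job API is building multiple jobs from the same graph with different resource_defs. A minimal sketch, assuming a graph named my_graph that requires a "redshift" resource and the prod_redshift definition above:

from dagster_aws.redshift import fake_redshift_resource

# Same graph, two jobs: prod wires the real cluster, tests wire the fake.
prod_job = my_graph.to_job(name="prod", resource_defs={"redshift": prod_redshift})
test_job = my_graph.to_job(name="test", resource_defs={"redshift": fake_redshift_resource})

# In a unit test, execute the test job in process:
# assert test_job.execute_in_process().success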
Mykola Palamarchuk
01/18/2022, 10:53 PM
VxD
01/19/2022, 12:59 AM
Sundara Moorthy
01/19/2022, 9:31 AM
Thomas
01/19/2022, 11:54 AM