Sterling Paramore
07/29/2022, 10:37 PM

Oren Lederman
07/29/2022, 11:12 PM
@asset(config_schema={"some_param": str})
def asset_with_config(context):
    some_param = context.op_config["some_param"]
    return some_param


@repository
def dagster_dags():
    return [
        asset_with_config,
    ]
Dagit starts just fine. As expected, I get an error when I try to materialize the asset:
__ASSET_JOB cannot be executed with the provided config. Please fix the following errors:
Missing required config entry "ops" at the root. Sample config for missing entry: {'ops': {'asset_with_config': {'config': {'some_param': '...'}}}}
What's the right way to provide configuration for the asset?
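(Editor's note: config of the exact shape shown in the error message can be pasted into dagit's launchpad when materializing the asset; a minimal sketch, where the value is a placeholder:)

ops:
  asset_with_config:
    config:
      some_param: "some_value"

Antonio Bertino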
07/30/2022, 2:21 PM

geoHeil
07/30/2022, 7:10 PM

Kushagra Rajput
07/31/2022, 11:01 AM

Kushagra Rajput
07/31/2022, 11:02 AM

Kushagra Rajput
07/31/2022, 11:03 AM

Huib Keemink
07/31/2022, 11:09 AM

Mark
07/31/2022, 11:37 AM
When I start dagit with multiple workspaces in workspace.yaml, it just hangs immediately and nothing happens. I am using multiple python_file workspaces, which worked very well for me before. Now I have the problem that dagit no longer starts if there is more than one workspace. I can start dagit with a single workspace and all others commented out. There are 4 workspaces, and dagit starts with each of them separately, so no individual workspace seems to be the problem. But dagit will only start if there is exactly one workspace in the file.
Also, if I start dagit with just one workspace in the file, then comment the others back in and reload the workspaces via the button in the already running dagit instance, that also works fine.
I already tried setting the log level to trace, but dagit logs nothing when it hangs on startup with multiple workspaces; it just gets stuck immediately. I am running dagit inside a Docker container with version 0.15.8, but I also tried 0.15.7 and 0.15.6.
I hope you have some idea what I can try to figure out what the problem is, because I am not aware of any changes I made that might have led to this.
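(Editor's note: for context, a workspace.yaml with multiple python_file locations of the kind described would look roughly like this; the file names are placeholders:)

load_from:
  - python_file: workspace_a.py
  - python_file: workspace_b.py
  - python_file: workspace_c.py
  - python_file: workspace_d.py

Edo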
07/31/2022, 3:30 PM

Matt Fysh
07/31/2022, 10:59 PM

Mohammad Nazeeruddin
08/01/2022, 5:34 AM
dagster.core.errors.DagsterLaunchFailedError: Tried to start a run on a server after telling it to shut down
  File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 1386, in submit_run
    SubmitRunContext(run, workspace=workspace)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/run_coordinator/default_run_coordinator.py", line 32, in submit_run
    self._instance.launch_run(pipeline_run.run_id, context.workspace)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 1450, in launch_run
    self._run_launcher.launch_run(LaunchRunContext(pipeline_run=run, workspace=workspace))
  File "/usr/local/lib/python3.7/site-packages/dagster/core/launcher/default_run_launcher.py", line 105, in launch_run
    res.message, serializable_error_info=res.serializable_error_info
Ashish Sharma
08/01/2022, 7:05 AM
We have been getting an Exception: timeout expired error for the past month, but when we rerun the job it completes successfully. I have been trying to find the root cause, but there is no information about it online. Can you please check this log and let me know why this issue happens and how to fix it?
dagster.core.errors.DagsterExecutionStepExecutionError: Error occurred while executing op "complete_dq_request":

  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_plan.py", line 230, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_step.py", line 353, in core_dagster_event_sequence_for_step
    for user_event in check.generator(
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_step.py", line 69, in _step_output_error_checked_user_event_sequence
    for user_event in user_event_sequence:
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute.py", line 174, in execute_core_compute
    for step_output in _yield_compute_results(step_context, inputs, compute_fn):
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute.py", line 142, in _yield_compute_results
    for event in iterate_with_context(
  File "/usr/local/lib/python3.8/dist-packages/dagster/utils/__init__.py", line 407, in iterate_with_context
    return
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/utils.py", line 73, in solid_execution_error_boundary
    raise error_cls(

The above exception was caused by the following exception:
Exception: timeout expired

  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.8/dist-packages/dagster/utils/__init__.py", line 405, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/opt/dagster/home/orchestration_manager/ops/data_quality_ops/op_complete_dq_request.py", line 741, in complete_dq_request
    raise Exception(l_error)
Sanidhya Singh
08/01/2022, 7:21 AM

Katrin Grunert
08/01/2022, 9:05 AM
I am using the k8s_job_executor and each step of the job is being spawned as its own pod. I am running dagster version 0.14.20.
In my job I set the job_spec_config to {'ttl_seconds_after_finished': 120}, and the docs say that jobs and their associated pods get deleted after the TTL has expired. After the TTL is exceeded, the k8s job gets deleted, but my dagster-step-<SOME_ID> pods are not.
Is there some issue with the k8s_job_executor that leaves these pods unassociated with the job, and is there a way I can achieve this?
08/01/2022, 9:38 AM
Operation name: SidebarAssetQuery
Message: 'compute_raw_features_2'
Path: ["assetNodeOrError","configField"]
Locations: [{"line":51,"column":3}]
It happens when I create two assets from the same graph "compute_raw_features" using the function AssetsDefinition.from_graph.
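(Editor's note: a minimal sketch of the setup described, with illustrative op, graph, and key names:)

from dagster import AssetKey, AssetsDefinition, graph, op

@op
def transform():
    return 1

@graph
def compute_raw_features():
    return transform()

# two assets built from the same graph definition
features_a = AssetsDefinition.from_graph(
    compute_raw_features, keys_by_output_name={"result": AssetKey("features_a")}
)
features_b = AssetsDefinition.from_graph(
    compute_raw_features, keys_by_output_name={"result": AssetKey("features_b")}
)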
Katrin Grunert
08/01/2022, 9:39 AM
I am using the k8s_job_executor and each step of the job is being spawned as its own pod. Each pod is named dagster-step-<SOME_ID>, and I was wondering if it is possible to replace this generated name with a custom name.

Sanidhya Singh
08/01/2022, 10:41 AM
I am trying to configure tick retention in dagster.yaml. The following config
retention:
  schedule:
    purge_after_days:
      env: DAGSTER_SCHEDULE_PURGE # sets retention policy for schedule ticks of all types
  sensor:
    purge_after_days:
      skipped:
        env: DAGSTER_SENSOR_SKIPPED_PURGE
      failure:
        env: DAGSTER_SENSOR_FAILURE_PURGE
      success:
        env: DAGSTER_SENSOR_SUCCESS_PURGE
throws
raise DagsterInvalidConfigError(
dagster.core.errors.DagsterInvalidConfigError: Errors whilst loading dagster instance config at dagster.yaml.
    Error 1: Invalid scalar at path root:retention:sensor:purge_after_days:failure. Value "{'env': 'DAGSTER_SENSOR_FAILURE_PURGE'}" of type "<class 'dict'>" is not valid for expected type "Int".
    Error 2: Invalid scalar at path root:retention:sensor:purge_after_days:skipped. Value "{'env': 'DAGSTER_SENSOR_SKIPPED_PURGE'}" of type "<class 'dict'>" is not valid for expected type "Int".
    Error 3: Invalid scalar at path root:retention:sensor:purge_after_days:success. Value "{'env': 'DAGSTER_SENSOR_SUCCESS_PURGE'}" of type "<class 'dict'>" is not valid for expected type "Int".
I'm on Dagster 0.15.8
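(Editor's note: judging from the errors alone, the per-type sensor fields expect plain integers rather than env wrappers in this version; a variant that should at least validate, with illustrative values:)

retention:
  schedule:
    purge_after_days:
      env: DAGSTER_SCHEDULE_PURGE
  sensor:
    purge_after_days:
      skipped: 7
      failure: 30
      success: 30

Jakub Zgrzebnicki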
08/01/2022, 12:24 PM

Remco Loof
08/01/2022, 1:26 PM

Lucas Gabriel
08/01/2022, 2:42 PM

Timo
08/01/2022, 3:18 PM
Is there an io_manager_key argument for a graph-backed asset, AssetsDefinition.from_graph(...)?
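(Editor's note: if from_graph does not expose such an argument, one workaround sketch, assuming a graph-backed asset's output is stored via its terminal op's output: set io_manager_key on that op's Out. Names here are illustrative:)

from dagster import AssetsDefinition, Out, graph, op

@op(out=Out(io_manager_key="my_io_manager"))  # "my_io_manager" is an assumed resource key
def final_step():
    return 1

@graph
def my_graph():
    return final_step()

my_graph_asset = AssetsDefinition.from_graph(my_graph)

Jack Yin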
08/01/2022, 4:30 PM
dagster-pandera seems to require dagster-0.15.8, which then conflicts with other stuff that requires dagster-0.15.7, and then everything breaks.

Jack Yin
08/01/2022, 4:31 PM
dagster-graphql requires dagster-0.15.7, while dagster-pandera requires dagster-0.15.8.
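(Editor's note: a standard way out of this kind of conflict is to pin every dagster package to the same release line, e.g.:)

pip install dagster==0.15.8 dagster-graphql==0.15.8 dagster-pandera==0.15.8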
Jack Yin
08/01/2022, 4:33 PM

Matt Fysh
08/01/2022, 5:49 PM

Marek
08/01/2022, 6:38 PM

Yang
08/01/2022, 8:51 PM

Oliver
08/01/2022, 11:29 PM
@asset(
    config_schema={
        "limit": int,
        "cohort_table": str,
        "database": str
    },
    required_resource_keys={'aws'},
    # ins={
    #     "cohort": AssetIn(
    #         input_manager_key="pandas_df_manager",
    #         key=AssetKey(('qld_health', 'qld_health_simple_cohort'))
    #     )
    # },
    non_argument_deps={
        AssetKey(('qld_health', 'qld_health_simple_cohort'))
    }
)
def cohort(context,
    # cohort
):
    pass
When setting non_argument_deps to the source asset, everything works as expected and the UI shows the correct lineage.
However, when trying to supply the asset as an argument using ins and the same AssetKey, I get: Input asset '["qld_health", "qld_health_simple_cohort"]' for asset '["cohort"]' is not produced by any of the provided asset ops and is not one of the provided sources
Any ideas what I'm missing here?
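(Editor's note: reading the error text, the upstream key appears to need to be registered as a source when consumed via ins; a sketch of that idea, assuming the upstream table is defined outside this repository, with only the key taken from the message:)

from dagster import AssetKey, SourceAsset

# hypothetical registration so that ins can resolve the upstream key;
# cohort_source would then be included alongside the assets in the repository
cohort_source = SourceAsset(key=AssetKey(("qld_health", "qld_health_simple_cohort")))

fahad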
08/01/2022, 11:46 PM
I'm using a PythonObjectDagsterType and its associated loader argument. The loader was created using the @dagster_type_loader decorator and requests a resource via required_resource_keys={"file_downloader"}. So put together I have something like this:
@dagster_type_loader(..., required_resource_keys={"file_downloader"})
def type_loader(context, config) -> MyType:
    s3_downloader: S3Downloader = context.resources.file_downloader
    return MyType(path=s3_downloader.download(...))

DagitConfigurableMyType = PythonObjectDagsterType(
    python_type=MyType, loader=type_loader
)
Now I can use this as an input to a graph just fine, and configure it via dagit as an input to a graph. The file_downloader resource is resolved and brought in as desired:
@graph(
    ...
    ins={"path": In(DagitConfigurableMyType)},
)
def intake_graph(in_path: MyType):
    ...
However, doing this almost works:
@graph(
    ...
    ins={"path": In(list[DagitConfigurableMyType])},
)
def intake_graphs(in_paths: list[MyType]):
    ...
Except that it complains about not being able to resolve the file_downloader resource. I'm assuming this is somehow related to how resources are resolved. If I manually add the resource to the first op in the graph, it will actually resolve the resource for me. However, since the dagster_type_loader does get respected even when nested within a list, it seems like the resource should get resolved as well.
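(Editor's note: the manual workaround described would look something like this; the op name is illustrative:)

from dagster import op

# declaring the resource on the first op of the graph forces it to be resolved
@op(required_resource_keys={"file_downloader"})
def first_op(context, in_paths):
    ...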
owen
08/02/2022, 4:16 PM

fahad
08/02/2022, 4:17 PM