#ask-ai

Darryl Ng

02/25/2024, 11:36 PM
@Scout I am getting OOM errors when I click "observe all" for source assets in the graph, with the steps running in subprocesses of a k8s pod. From the logs, all steps are created at the same time within the run. These observation functions are created using `observable_source_asset`. How can I configure the run or job to limit the number of steps that can run at any one time?

Scout

02/25/2024, 11:36 PM
To limit the number of steps that can run concurrently within a Dagster run, especially when using Kubernetes, you can utilize the `tag_concurrency_limits` configuration. This configuration allows you to set limits on the concurrency of steps based on tags. However, it's important to note that these limits are applied per run and not globally. Here's how you can configure `tag_concurrency_limits` according to the documentation:
1. Define a set of limits that are applied to steps with particular tags. If a value is set, the limit is applied only to that key-value pair. If no value is set, the limit is applied across all values of that key.
2. If the value is set to a dictionary with `applyLimitPerUniqueValue: true`, the limit will apply to the number of unique values for that key.
3. These limits are specified in the run configuration when you launch a run.

For example, if you want to limit the concurrency based on a specific tag, you would include something like this in your run configuration:
```yaml
execution:
  config:
    tag_concurrency_limits:
      - key: "my-concurrency-limit-tag"
        value: "limit-value"
        limit: 5
```
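If the run configuration is assembled in Python rather than typed as YAML, the same structure can be built as a plain dict. The sketch below is only an illustration: `tag_limit` and `execution_config` are hypothetical helper names, not Dagster APIs, and the `nest_under` parameter is an assumption for executors whose settings sit one level deeper in the config.

```python
from typing import Any, Dict, List, Optional


def tag_limit(key: str, limit: int, value: Optional[str] = None,
              per_unique_value: bool = False) -> Dict[str, Any]:
    """Build one `tag_concurrency_limits` entry (hypothetical helper)."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    entry: Dict[str, Any] = {"key": key, "limit": limit}
    if per_unique_value:
        # The limit is counted separately for each unique value of the tag key.
        entry["value"] = {"applyLimitPerUniqueValue": True}
    elif value is not None:
        # The limit applies only to this exact key-value pair.
        entry["value"] = value
    # With no value at all, the limit spans every value of the key.
    return entry


def execution_config(limits: List[Dict[str, Any]],
                     nest_under: Optional[str] = None) -> Dict[str, Any]:
    """Wrap the limits in the `execution.config` shape used in the YAML example."""
    executor_config: Dict[str, Any] = {"tag_concurrency_limits": list(limits)}
    if nest_under is not None:
        # Assumption: some executors expect their settings under an extra key.
        executor_config = {nest_under: executor_config}
    return {"execution": {"config": executor_config}}


# Same values as the YAML example.
run_config = execution_config(
    [tag_limit("my-concurrency-limit-tag", 5, value="limit-value")]
)
```

With Dagster's default executor, the multiprocess settings are nested one level deeper, so `execution_config(limits, nest_under="multiprocess")` may be the shape you need. Check the config schema of the executor your job actually uses before relying on either form.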
This YAML configuration limits the number of concurrently running steps that carry the tag `my-concurrency-limit-tag` set to `limit-value` to 5 within the scope of a single run.

However, it's important to note that `observable_source_asset` observations are a special case and currently cannot be run as part of a standard asset job that materializes assets. The `selection` argument to `define_asset_job` must target only observable source assets, and an error will be thrown if a mix of regular assets and observable source assets is selected.

If you are experiencing out-of-memory (OOM) issues, it's also worth considering the resources allocated to your Kubernetes pods. You may need to adjust the resource requests and limits for the pods to ensure they have enough memory to handle the workload. When each step runs in its own pod (for example, with the `k8s_job_executor`), this can be done by setting `step_k8s_config` in your run configuration to specify the resource requirements for each step.

If you continue to face issues, or if the `tag_concurrency_limits` configuration does not apply to your use case with `observable_source_asset`, it may be necessary to look into the specific implementation details or reach out for further assistance, as the documentation does not provide a direct solution for limiting concurrency specifically for `observable_source_asset` observations within a Kubernetes environment.
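As a rough illustration of the memory settings mentioned above, the sketch below builds a Kubernetes `container_config` block with memory requests and limits and shows two places it might go. The helper name `container_resources` and the sizes `1Gi` and `2Gi` are placeholders, not recommendations. The `step_k8s_config` placement assumes the `k8s_job_executor`; the `dagster-k8s/config` tag placement assumes steps share a single run pod, as in the question.

```python
from typing import Any, Dict


def container_resources(memory_request: str, memory_limit: str) -> Dict[str, Any]:
    """Build a `container_config` block with memory requests and limits (hypothetical helper)."""
    return {
        "container_config": {
            "resources": {
                "requests": {"memory": memory_request},
                "limits": {"memory": memory_limit},
            }
        }
    }


# Placeholder sizes; pick values that fit the actual workload.
k8s_config = container_resources("1Gi", "2Gi")

# Assumed shape when each step runs in its own pod (k8s_job_executor run config).
step_run_config = {"execution": {"config": {"step_k8s_config": k8s_config}}}

# Assumed shape when all steps run as subprocesses of one run pod:
# a tag placed on the job or run, so the whole pod gets the memory.
run_pod_tags = {"dagster-k8s/config": k8s_config}
```

When steps run as subprocesses in one pod, the memory limit covers all of them together, so raising it and lowering step concurrency address the same OOM from two sides.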