# ask-community
eric
Hello all, I am working on a project to convert dbt orchestration into Dagster open source. I have 5 dbt projects, which I have chained together using asset keys. I have a job with 15 assets, which is probably on the small side of a job definition. I am getting this error when attempting to materialize all the assets in the job (execute a full job run):

dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: RESOURCE_EXHAUSTED
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
    details = "Sent message larger than max (57680600 vs. 50000000)"
    debug_error_string = "UNKNOWN:Error received from peer {created_time:"2023-08-29T15:51:18.218220741+00:00", grpc_status:8, grpc_message:"Sent message larger than max (57680600 vs. 50000000)"}"
There is one other thread on this issue, but no resolution. These dbt projects are big: thousands of sources and models in each one. Dagster would need to handle jobs with very large DAGs and hundreds, if not thousands, of assets. Can Dagster do this? How can this error be resolved? I am on Dagster 1.4.2.
sandy
Dagster can handle thousands of assets - it's pretty common with dbt projects. Are you able to report the total number that you have? Also, are you able to try this with the latest version of Dagster? We fixed a bug at one point that is related to this.
daniel
Hey Eric (just saw your email as well) - if sandy's suggestions don't help after upgrading, you could try setting the DAGSTER_GRPC_MAX_RX_BYTES and DAGSTER_GRPC_MAX_SEND_BYTES environment variables in your deployment to increase the maximum amount of metadata that can be sent in a single RPC call. The default is 50000000 (50 MB), which it looks like you're just over - you could try setting it to 70000000 instead. As sandy says, though, it's unusual to run into this limit, even with large dbt projects. Is there anything unique about your project that might increase the amount of metadata being shown in the Dagster UI?
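In a shell-based deployment, for example, that would look something like this (70000000 is just a value comfortably above the 57680600 bytes in your error; pick whatever fits):

# Set on the processes on both sides of the gRPC channel
export DAGSTER_GRPC_MAX_RX_BYTES=70000000
export DAGSTER_GRPC_MAX_SEND_BYTES=70000000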
eric
Hello @sandy, sorry for missing this. I have been actively trying to resolve it. So far I have upgraded to dagster==1.4.10. Is there a way to get an asset count? I would estimate 1.5-2K assets per project. The projects are linked using asset keys through a custom DagsterDbtTranslator.
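For context, the linking is done with a translator override along these lines (a rough sketch - the "warehouse" prefix is illustrative, not my actual key):

from typing import Any, Mapping

from dagster import AssetKey
from dagster_dbt import DagsterDbtTranslator


class ProjectLinkingTranslator(DagsterDbtTranslator):
    @classmethod
    def get_asset_key(cls, dbt_resource_props: Mapping[str, Any]) -> AssetKey:
        # Put every project's assets under a shared prefix so that a source in
        # one dbt project resolves to the model asset produced by another.
        return super().get_asset_key(dbt_resource_props).with_prefix("warehouse")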
daniel
Sandy's actually out of office for the next couple of weeks - but the suggestion I gave above may work in the short term to unblock you (at which point the asset count should be shown in the UI)
eric
@daniel, thank you for your reply. I just saw it after I posted. I am not aware of anything unique. I will try the env variables as suggested and report back.
@daniel, the adjustments you suggested for the env variables seem to have worked! I guess the question I have is: why is my project generating this much metadata? And where is the count shown in the UI? (I have looked everywhere; no count to be seen.) I do use the meta tag in each dbt_project.yml file to group all the assets into Dagster groups. Beyond that, I have 5 dbt projects that are linked together using asset keys to build the DAG. Please suggest anywhere I can look to reduce the metadata and shrink this message size, and let me know if you need any additional information that might help troubleshoot.
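For reference, the grouping is configured roughly like this in each dbt_project.yml (project and group names here are placeholders, not my real ones):

models:
  my_project:
    +meta:
      dagster:
        group: my_group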
rex
Hey Eric, the message size might be due to us including the raw SQL of each model as part of the asset metadata. If you define a custom DagsterDbtTranslator that removes this, you should see your message size go down.
from typing import Any, Mapping

from dagster_dbt import DagsterDbtTranslator
from dagster_dbt.asset_utils import default_description_fn


class CustomDagsterDbtTranslator(DagsterDbtTranslator):
    @classmethod
    def get_description(cls, dbt_resource_props: Mapping[str, Any]) -> str:
        # Reuse the default description but omit the model's raw SQL, so it
        # is not attached to every asset as metadata.
        return default_description_fn(dbt_resource_props, display_raw_sql=False)
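To pick this up, pass the translator to your asset definitions - something like this sketch, assuming the @dbt_assets decorator and a manifest.json path from your own project:

from pathlib import Path

from dagster import OpExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


@dbt_assets(
    manifest=Path("target/manifest.json"),  # illustrative path
    dagster_dbt_translator=CustomDagsterDbtTranslator(),
)
def my_dbt_assets(context: OpExecutionContext, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()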
eric
Thank you @rex! This seems to have done the trick, and it also appears to have sped up my imports and executions. I had increased the limit to 90000000 and it was still failing on even simple jobs. After removing the raw SQL as suggested, the jobs I have configured so far are executing. All in, I have about 3K assets spread across 5 dbt projects. Any other suggestions to limit the metadata?