Ted Zhao
02/13/2024, 4:35 AM

Scout
02/13/2024, 4:35 AM
poll_timeout is a configuration option that can be set when initiating a run for a dbt Cloud job from Dagster. It defines the maximum time Dagster will wait before the operation is timed out. By default no timeout is set, meaning the poll will never time out. If a timeout is configured and the dbt asset job exceeds that duration, the operation fails on the timeout regardless of the eventual outcome of the dbt asset job itself. If your dbt asset jobs are running long and you suspect they are being terminated by a timeout, check that poll_timeout is set to a value that gives them enough time to complete successfully. If you have not set a poll_timeout at all, the job should not fail purely because of its duration, unless other factors are at play, such as resource constraints or other limits imposed by the environment where Dagster Cloud is running.
Ted Zhao
02/13/2024, 4:38 AM
Runtime Error in model int_normalised_metrics (models/gold_pipelines/intermediate/p6/00_normalised_metrics/int_normalised_metrics.py)
Runtime Error
Invalid SessionHandle: SessionHandle [319b3f9b-f635-45b8-8162-f2bbdeea42f3]
databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [319b3f9b-f635-45b8-8162-f2bbdeea42f3]
Finished running 35 table models, 1 test, 1 view model in 3 hours 33 minutes and 12.48 seconds (12792.48s).
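The "Attempted to close session that was already closed" failure above can be illustrated with a toy sketch. This is not the actual databricks-sql-connector implementation and the class names are invented; it only shows how a close() that raises on an already-closed handle produces exactly this class of error, while an idempotent close() does not.

```python
# Illustrative sketch only -- not the real databricks-sql-connector code.

class StrictSession:
    def __init__(self, handle):
        self.handle = handle
        self.open = True

    def close(self):
        if not self.open:
            # Strict behavior: re-closing raises, which is the failure
            # mode seen in the run log above.
            raise RuntimeError(
                f"Invalid SessionHandle: SessionHandle [{self.handle}]"
            )
        self.open = False


class IdempotentSession(StrictSession):
    def close(self):
        # Safe behavior: closing an already-closed session is a no-op.
        self.open = False
```

If the connector (or a wrapper around it) closes the same session twice, only the strict variant turns that benign double-close into a runtime error.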
Scout
02/13/2024, 4:38 AM
The Invalid SessionHandle error is related to the Databricks SQL connector, which dbt uses to interact with the Databricks environment. Here are a few steps you can take to troubleshoot this issue:
1. Check the dbt Logs: Look at the full dbt logs for additional error messages or warnings that precede this error; they may give more context on why the session was closed unexpectedly.
2. Session Management: Ensure that your dbt project's configuration and the Databricks environment are managing sessions correctly. A configuration issue could be closing sessions prematurely, or multiple operations could be trying to close the same session.
3. Databricks Environment: Check the Databricks environment for anything that might be closing sessions unexpectedly, including resource utilization, cluster configuration, and the relevant Databricks logs.
4. dbt Configuration: Review your dbt project configuration for any settings related to the Databricks SQL connector that might affect session management.
5. Concurrency and Resource Limits: If your dbt run executes multiple models in parallel, make sure you are not hitting concurrency or resource limits in Databricks that could cause sessions to be terminated.
6. Connector Version: Ensure that you are using a compatible and up-to-date version of databricks-sql-connector; bugs related to session management are sometimes fixed in newer versions.
7. Retry Logic: Consider implementing retry logic in your dbt project to handle transient session-management errors.
8. Databricks Support: If the issue persists, reach out to Databricks support, as they may have more insight into session management on their platform.
9. Dagster Integration: If you are running dbt as part of a Dagster pipeline, ensure that the Dagster integration with dbt and Databricks is configured correctly and that Dagster is managing the lifecycle of the dbt run properly.
Remember to apply changes incrementally and test after each change to isolate the cause of the issue. If the problem persists, engage the support channels for dbt, Databricks, or Dagster, depending on where the issue seems to originate.
Ted Zhao
02/13/2024, 4:40 AM
03:11:57 1 of 7 START python table model dbt_gold_pipelines.int_index_correlations ...... [RUN]
03:11:57 2 of 7 START python table model dbt_gold_pipelines.int_tradingitem_price_stage2 [RUN]
03:11:58 Databricks adapter: Job submission response={'id': 'c599d2b7d6c3408dab90250e5cf165bd'}
03:11:58 Databricks adapter: Job submission response={'id': '7ac7bb2e8e7a4449b1c81361de738b43'}
03:12:29 1 of 7 OK created python table model dbt_gold_pipelines.int_index_correlations . [OK in 32.07s]
03:49:51 2 of 7 OK created python table model dbt_gold_pipelines.int_tradingitem_price_stage2 [OK in 2273.49s]
03:49:51 3 of 7 START sql view model dbt_gold_pipelines.int_tradingitem_price_stage3 .... [RUN]
03:49:51 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [59be491e-1adb-4140-9a80-b9fb48024298]
03:50:03 3 of 7 OK created sql view model dbt_gold_pipelines.int_tradingitem_price_stage3 [OK in 12.44s]
03:50:03 4 of 7 START python table model dbt_gold_pipelines.int_tradingtem_price_metrics [RUN]
03:50:03 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [2f14f47e-6254-4c51-b910-ba74446c49c3]
03:50:04 Databricks adapter: Job submission response={'id': '23eb17d2ba0848398c0962cefa7f5536'}
04:07:30 4 of 7 OK created python table model dbt_gold_pipelines.int_tradingtem_price_metrics [OK in 1046.91s]
04:07:30 5 of 7 START python table model dbt_gold_pipelines.int_beta_metrics_input ...... [RUN]
04:07:30 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [32df1be7-98cc-4468-9317-069fbfe6b530]
04:07:40 Databricks adapter: Job submission response={'id': '57afc8cb71b94f55bbf6542b105f3286'}
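In the log above, session-handle warnings interleave with models that still build successfully. A toy helper (the function name and regexes are my own, not part of dbt) shows how the two kinds of lines can be told apart:

```python
import re

# Toy classifier for dbt log lines like those above (illustrative only).
# "OK created ... [OK in Xs]" lines mark successful models; the
# "Attempted to close session" lines are warnings, not model failures.
OK_RE = re.compile(r"OK created .* \[OK in [\d.]+s\]")
WARN_RE = re.compile(r"Attempted to close session that was already closed")

def summarize(log_lines):
    ok = sum(bool(OK_RE.search(line)) for line in log_lines)
    warnings = sum(bool(WARN_RE.search(line)) for line in log_lines)
    return {"models_ok": ok, "session_warnings": warnings}
```

Applied to the log above, every model line counts as a success while the session messages count only as warnings, which matches the run completing normally.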
Scout
02/13/2024, 4:41 AM
These databricks-sql-connector messages are not errors that should cause your dbt job to fail. They are informational messages indicating that there was an attempt to close a Databricks SQL session that was already closed, possibly due to how the connector manages its sessions or cleans up after a job completes. The key thing to note is that these messages are not marked as errors in the dbt run output, and the subsequent steps continue to execute and complete successfully. The [OK in X.XXs] messages indicate that the models are being created without issue. This behavior might be a benign race condition or a cleanup step in the connector that attempts to close a session already closed by another part of the process; it may also simply be known connector behavior that does not affect your dbt models. If these messages are not causing any actual failures, you can generally ignore them. However, if you want to ensure that everything is working as expected, you can:
1. Check the Connector Documentation: Review the databricks-sql-connector documentation to see whether this behavior is expected or documented.
2. Update the Connector: Make sure you are using the latest version of databricks-sql-connector, as unexpected behaviors are often addressed in newer releases.
3. Review dbt and Databricks Configurations: Ensure that your dbt project and Databricks configurations are set up correctly, with no conflicting settings that might cause session-management issues.
4. Monitor Runs: Keep an eye on your dbt runs to ensure these messages do not start correlating with actual failures.
5. Reach Out for Support: If you are concerned about these messages, or they start causing problems, contact Databricks support or the maintainers of databricks-sql-connector.
6. Community and Forums: Check community forums to see whether others have encountered these messages and whether a known cause or solution exists.
As long as your dbt jobs complete successfully and the data transformations are correct, these messages can be considered non-blocking.
Ted Zhao
02/13/2024, 4:46 AM
Runtime Error in model int_normalised_metrics (models/gold_pipelines/intermediate/p6/00_normalised_metrics/int_normalised_metrics.py)
Runtime Error
Invalid SessionHandle: SessionHandle [319b3f9b-f635-45b8-8162-f2bbdeea42f3]
databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [319b3f9b-f635-45b8-8162-f2bbdeea42f3]
Finished running 35 table models, 1 test, 1 view model in 3 hours 33 minutes and 12.48 seconds (12792.48s).
Log from a local trigger of the same job:
03:11:57 1 of 7 START python table model dbt_gold_pipelines.int_index_correlations ...... [RUN]
03:11:57 2 of 7 START python table model dbt_gold_pipelines.int_tradingitem_price_stage2 [RUN]
03:11:58 Databricks adapter: Job submission response={'id': 'c599d2b7d6c3408dab90250e5cf165bd'}
03:11:58 Databricks adapter: Job submission response={'id': '7ac7bb2e8e7a4449b1c81361de738b43'}
03:12:29 1 of 7 OK created python table model dbt_gold_pipelines.int_index_correlations . [OK in 32.07s]
03:49:51 2 of 7 OK created python table model dbt_gold_pipelines.int_tradingitem_price_stage2 [OK in 2273.49s]
03:49:51 3 of 7 START sql view model dbt_gold_pipelines.int_tradingitem_price_stage3 .... [RUN]
03:49:51 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [59be491e-1adb-4140-9a80-b9fb48024298]
03:50:03 3 of 7 OK created sql view model dbt_gold_pipelines.int_tradingitem_price_stage3 [OK in 12.44s]
03:50:03 4 of 7 START python table model dbt_gold_pipelines.int_tradingtem_price_metrics [RUN]
03:50:03 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [2f14f47e-6254-4c51-b910-ba74446c49c3]
03:50:04 Databricks adapter: Job submission response={'id': '23eb17d2ba0848398c0962cefa7f5536'}
04:07:30 4 of 7 OK created python table model dbt_gold_pipelines.int_tradingtem_price_metrics [OK in 1046.91s]
04:07:30 5 of 7 START python table model dbt_gold_pipelines.int_beta_metrics_input ...... [RUN]
04:07:30 databricks-sql-connector adapter: Attempted to close session that was already closed: Invalid SessionHandle: SessionHandle [32df1be7-98cc-4468-9317-069fbfe6b530]
04:07:40 Databricks adapter: Job submission response={'id': '57afc8cb71b94f55bbf6542b105f3286'}
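The local run above completes despite the session warnings. Whether an orchestrator marks the run as failed usually comes down to the exit code of the dbt subprocess, not to warning-looking text in the output. A minimal stdlib sketch of that decision (illustrative only, not Dagster's actual internals):

```python
import subprocess
import sys

# Sketch: an orchestrator typically runs dbt as a subprocess and treats a
# non-zero exit code as failure, regardless of what the output text says.
def run_and_check(argv):
    result = subprocess.run(argv, capture_output=True, text=True)
    return {"ok": result.returncode == 0, "code": result.returncode}
```

Under this model, a run whose output contains "Attempted to close session..." but exits 0 is a success, while any non-zero exit is a failure even if each model line looked fine.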
Scout
02/13/2024, 4:46 AM
The databricks-sql-connector error raised during the dbt asset run in Dagster Cloud suggests that the session handle is invalid or already closed. This runtime error appears to be specific to the interaction between dbt, the Databricks SQL connector, and the way Dagster Cloud executes the dbt job.
When dbt is run locally, the same message about the session being closed appears, but it does not cause the job to fail. This discrepancy could be due to differences in how the local environment and Dagster Cloud handle subprocesses and their exit codes.
In Dagster, when a dbt asset is triggered, it typically runs dbt as a subprocess. The success or failure of the dbt run is often determined by the exit code of the subprocess. If the subprocess exits with a non-zero exit code, it is usually considered a failure. However, if the subprocess exits with a code of zero, it is considered a success.
The error message you're seeing in Dagster Cloud might be causing the subprocess to exit with a non-zero exit code, which Dagster interprets as a failure. This could be due to stricter error handling in the Dagster Cloud environment or differences in how the Databricks SQL connector sessions are managed in that environment.
To investigate this issue further, you can:
1. Check the Exit Code: Determine the exit code of the dbt subprocess in Dagster Cloud. If the exit code is non-zero, it indicates that Dagster is correctly interpreting the run as failed.
2. Review Error Handling: Look into how errors are handled in your dbt project and the Databricks SQL connector. It's possible that the error handling logic might need to be adjusted for the Dagster Cloud environment.
3. Dagster Cloud Configuration: Review the configuration of your Dagster Cloud setup to ensure that it is compatible with the Databricks SQL connector and that there are no settings that might be causing stricter error handling.
4. Dagster and dbt Integration: Ensure that the integration between Dagster and dbt is correctly configured, and consider if any recent changes might have affected the error handling behavior.
5. Consult Documentation: Review the documentation for Dagster, dbt, and the Databricks SQL connector for any notes on error handling and subprocess management.
6. Seek Support: If you're unable to resolve the issue, consider reaching out to Dagster support for assistance, as they may have more context on how dbt assets are run within Dagster Cloud and how to handle such errors.
7. Error Reproduction: Try to reproduce the error in a controlled environment where you can debug and trace the execution more closely. This may help you understand why the error is being raised and how to handle it.
Scout
02/13/2024, 4:47 AM