#integration-dbt

geoHeil

06/10/2023, 2:28 PM
@rex even though the old API parsed the manifest files on demand, it was approx. 33 seconds faster. The only major difference is that I now also extract the number of affected rows, which was not available in the old API.
See the box marking the jump after deploying the experimental APIs.

rex

06/10/2023, 4:00 PM
I wonder if this is because dbt is not using partial parsing, since the target path is now dynamic. Do you have links to Dagster runs from before and after deploying the experimental APIs?
if there’s a log that says
13:49:59  Unable to do partial parsing because config vars, config profile, or config target have changed
then this is probably the reason why execution is slower than expected
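A quick way to check a saved run log for this message is sketched below. It is only an illustration: the function name `find_partial_parse_warnings` is made up for this example, and the matched text is taken from the log line quoted above, so other dbt versions may word it differently.

```python
from typing import List

# Text copied from the log line quoted above; treat it as an assumption,
# since the exact wording may differ between dbt versions.
PARTIAL_PARSE_SKIPPED = "Unable to do partial parsing"


def find_partial_parse_warnings(log_text: str) -> List[str]:
    """Return every log line reporting that partial parsing was skipped."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if PARTIAL_PARSE_SKIPPED in line
    ]
```

If this returns any lines for a slow run and none for a fast one, that would support the partial parsing explanation.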

geoHeil

06/10/2023, 5:39 PM
I saw these logs before. Let me search the logs and then share the links

rex

06/12/2023, 8:55 PM
Yeah, this looks related to project parsing. Because we’re dynamically setting the target path, I believe some optimizations are being skipped. I can see if there’s a way to keep taking advantage of them. We could probably do something like copy the `partial_parse.msgpack` file to the dynamic target path before invoking the CLI. This is the equivalent of "Reusing Objects" in the dbt programmatic invocation case. But since this API isn’t stable yet, there’s no first-class way to do this right now.
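A minimal sketch of that copy step, assuming the previous parse state lives in a known target directory. The helper name `seed_partial_parse` and both directory arguments are hypothetical; this is not a dagster-dbt API, just the file copy described above.

```python
import shutil
from pathlib import Path

PARTIAL_PARSE_FILE = "partial_parse.msgpack"


def seed_partial_parse(source_target_dir: str, dynamic_target_dir: str) -> bool:
    """Copy partial_parse.msgpack into the dynamic target path, if it exists.

    Returns True when the file was copied, False when there was nothing to copy.
    Both directories are assumptions supplied by the caller.
    """
    src = Path(source_target_dir) / PARTIAL_PARSE_FILE
    if not src.is_file():
        return False  # no earlier parse state, dbt will do a full parse
    dest_dir = Path(dynamic_target_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest_dir / PARTIAL_PARSE_FILE)
    return True
```

This would run right before the dbt CLI is invoked with the dynamic target path. Even with the file in place, dbt can still fall back to a full parse when config vars, the profile, or the target have changed.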

geoHeil

06/13/2023, 10:23 AM
so once this gets merged it should be sped up automatically?

rex

06/13/2023, 1:35 PM
You would get a speedup in cases where partial parsing is possible.

geoHeil

06/13/2023, 1:35 PM
dagster pride

rex

06/13/2023, 1:35 PM
To truly confirm the root cause of the performance degradation you’re seeing, we would need to see a py-spy profile.

geoHeil

06/13/2023, 3:27 PM
Is there any way to get this from the Dagit UI? I guess not, but this might be a feature request. If I connect directly to the Docker image and start a container, can I easily run a pipeline with the credentials from Dagster Cloud?

rex

06/13/2023, 3:58 PM
Let’s wait for the partial parsing change to land before going down this rabbit hole. The lack of partial parsing is probably the cause of your problems, since I don’t see how there could be any other slowdown from calling the dbt CLI.
🌈 1
For future reference, here are some instructions for getting a py-spy profile in Cloud: https://github.com/dagster-io/dagster/discussions/14771

geoHeil

06/18/2023, 4:11 PM
I am now seeing the log message "/partial_parse.msgpack` to take advantage of partial parsing." and an improved run time. I still have to run it for several more days to tell whether it is working well and whether it is now faster overall than before, or whether more needs to be done. But this is certainly a step in the right direction.