# ask-community
m
Hi team, I am trying to backfill a pipeline but the process keeps on failing with the following message. Any idea how to solve?
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "{"created":"@1651039595.624784023","description":"Error received from peer ipv6:[::1]:4266","file":"src/core/lib/surface/call.cc","file_line":1074,"grpc_message":"Deadline Exceeded","grpc_status":4}"
I guess I need to pass/set somewhere an increased gRPC timeout value (defaulting to DEFAULT_GRPC_TIMEOUT)?
d
Hi Marco - do you have a full stack trace for this? This indicates that a gRPC call is taking more than 60 seconds to run, which would typically indicate user code that takes a really long time to run, or a bug on our side making it take much longer than it should. Is it possible to share the code for the partition set that's taking a really long time to generate config for the backfill?
m
Hi Daniel, thanks. The pipeline runs a solid that takes a long time to execute - might this be the issue? Or is it really the config generation? If you can narrow it down I might be able to share specific bits.
d
The full stack trace of the deadline exceeded would be really useful - a slow op shouldn't make a difference here
since the actual runs happen in a different process - this is likely just generating config for the backfill
m
Does this help?
d
It does - what version of dagster is this, and is it possible to share the code of the partition set being backfilled?
m
0.11.11
I'll try to share the code but it's not going to be trivial as there are partials, configs etc
Do you have a feeling for what is most relevant?
d
if upgrading would ever be on the table, there have been substantial performance improvements and bugfixes since 0.11.11
the most likely cause for this is that the function on your partition set that generates the partitions is taking a very long time
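One way to check this is to time the partition-generating function on its own, outside of Dagster. The sketch below is only an illustration in plain Python: `generate_partitions` is a hypothetical stand-in for your own function, `time_call` and `exceeds_deadline` are made-up helper names, and the 60-second value is the timeout mentioned earlier in this thread rather than something read from Dagster.

```python
import time
from typing import Any, Callable, Tuple

# Assumption: 60 seconds, the gRPC timeout mentioned earlier in this thread.
GRPC_DEADLINE_SECONDS = 60.0


def time_call(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def exceeds_deadline(elapsed: float, deadline: float = GRPC_DEADLINE_SECONDS) -> bool:
    """Return True if the measured time is longer than the deadline."""
    return elapsed > deadline


def generate_partitions():
    # Hypothetical stand-in: replace the body with a call to your own
    # partition-generating function.
    return [f"2022-04-{day:02d}" for day in range(1, 8)]


if __name__ == "__main__":
    partitions, elapsed = time_call(generate_partitions)
    print(f"{len(partitions)} partitions generated in {elapsed:.3f}s")
    if exceeds_deadline(elapsed):
        print("Slower than the gRPC deadline; this call would likely time out.")
```

If the measured time is anywhere near the deadline, the partition function itself is the likely bottleneck; if it is fast, the slowness is probably elsewhere and the full stack trace becomes more important.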
m
The code that generates the partitions config is the same that I use elsewhere (normally without problems). That said, I did have similar issues before. I have reduced the number of concurrent runs, hopefully that will help. I’ll also look into upgrading; I am a bit concerned about backward compatibility - do you expect there might be issues in this respect?
d
I don't expect significant backwards-compatibility issues with your code - you'd need to migrate your storage though. The list of breaking changes between minor releases can be found here: https://github.com/dagster-io/dagster/blob/master/MIGRATION.md