# ask-community
Hello everyone, has anyone seen failures on large backfills (100+ partitions)? I'm still testing some configuration on a local/dev deployment (with `dagster dev` and local code locations) and have tried launching 2 large backfills, both of which failed with the following error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1675190530.356000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3261,"referenced_errors":[{"created":"@1675190530.356000000","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
Do you know what could be causing this issue?
I'm facing a similar issue. I have about 1400 partitions, and after I launch a backfill it runs until roughly 300 partitions have completed. Then the execution stops and I get the same error as you do. Did you manage to find a solution?
Hello Balázs, unfortunately I didn't find a solution. Since this was a one-shot operation (a full refresh), the "workaround" at the time was to split it into smaller backfills (each over a smaller set of partitions). After the full refresh, all of our assets were materialized through schedules or sensors, so we no longer needed to backfill a large number of partitions. I'm not sure whether newer versions of Dagster have solved this issue. If not, I would advise you to raise an issue on their GitHub.
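For reference, the splitting itself is just batching the partition key list. Here's a minimal sketch; `launch_backfill` is a hypothetical placeholder for however you actually submit each batch (UI selection, GraphQL, or the CLI), since the thread doesn't specify:

```python
def chunk_partitions(partition_keys, batch_size):
    """Yield successive batches of partition keys of at most batch_size."""
    for start in range(0, len(partition_keys), batch_size):
        yield partition_keys[start:start + batch_size]


def launch_backfill(batch):
    # Hypothetical submission step: replace with your actual mechanism
    # (e.g. selecting this partition range in the Dagster UI).
    print(f"backfill {batch[0]} .. {batch[-1]} ({len(batch)} partitions)")


if __name__ == "__main__":
    # e.g. ~1400 partitions, submitted 100 at a time instead of all at once
    keys = [f"partition-{i:04d}" for i in range(1400)]
    for batch in chunk_partitions(keys, 100):
        launch_backfill(batch)
```

Submitting 100 at a time (and waiting for each batch to finish) kept each backfill small enough to avoid the gRPC failure in our case.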
Hi Lucas, thanks for responding. Yeah, that's basically what I'm doing right now, selecting smaller subsets and running those, because I don't know what else to do. I'm using a fairly recent Dagster version, so I'll raise an issue.
I upgraded to version 1.2.7 (from 1.2.3) and this issue seems to be fixed.