Is there a way to support an additional custom ste...
# ask-community
g
Is there a way to support an additional custom step for backfills (e.g. truncate the SDA)? My pipelines are idempotent, but it's still a bit easier to track the process if I empty it all to start with instead of overwriting one partition at a time. Will accept design advice if this is silly of me.
c
do you mean like wiping the existing assets for each partition?
g
Yeah just truncate and load style.
c
We have the dagster asset wipe CLI (https://docs.dagster.io/_apidocs/cli#dagster-asset), which will remove the Dagster logs of the assets, but if you're talking about deleting the actual code artifacts for old runs, I think you're left with including additional steps at the beginning of runs to do so.
g
Well I'm most after a way to customise a backfill for an asset
c
I don't think there's a way to explicitly add custom additional functionality for backfills only, if I'm correct in that description. Will poll the team for advice on your use case though.
p
The workaround that I can think of is that we currently expose tags (including backfill tags) off of the run on the context. You might be able to add a truncation step to your job and check to see if the currently executing run is a backfill run.
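from dagster import op

@op
def truncate_something(context):
    # Runs launched by a backfill carry the "dagster/backfill" tag.
    if context.pipeline_run.tags.get("dagster/backfill"):
        # Backfill run: truncate the target here before loading.
        ...
    # Otherwise (a regular run): do nothing.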
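A minimal sketch of the same check pulled out into plain Python, so the branching can be tested without launching a Dagster run. Only the "dagster/backfill" tag key comes from the snippet above; the helper names `is_backfill_run` and `maybe_truncate` and the in-memory "table" are hypothetical and purely illustrative.

```python
from typing import Callable, Mapping

# Tag key taken from the snippet above.
BACKFILL_TAG = "dagster/backfill"


def is_backfill_run(tags: Mapping[str, str]) -> bool:
    """Return True when the run's tags include a non-empty backfill tag."""
    return bool(tags.get(BACKFILL_TAG))


def maybe_truncate(tags: Mapping[str, str], truncate: Callable[[], None]) -> bool:
    """Call `truncate` only for backfill runs; report whether it was called."""
    if is_backfill_run(tags):
        truncate()
        return True
    return False


# Hypothetical stand-in for a real table: a list we can clear.
table = ["old row 1", "old row 2"]

# Regular run: no backfill tag, so the table is left alone.
regular_truncated = maybe_truncate({}, table.clear)
rows_after_regular = len(table)

# Backfill run: the tag is present, so the table is emptied.
backfill_truncated = maybe_truncate({BACKFILL_TAG: "some-backfill-id"}, table.clear)
rows_after_backfill = len(table)
```

Inside the op you would presumably pass the run's tags from the context (as in the snippet above) along with whatever callable performs the real truncation.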
s
@George Pearse would you ideally do the entire backfill in a single step, instead of a step per partition?
g
Hey @sandy, that is exactly the sort of thing I'm thinking. For backfills I'm likely to want to use a different loading process that would be more optimised for the volume of data, and if the query to select the time window is slow, just not using it could probably save a fair chunk of time? These are all design considerations with trade-offs though.
@prha's workaround doesn't look horribly workaround-ish, but I'm not sure how it'd fit into my workflow with SDAs, or how to get the 'right' metadata output to the Dagit UI to properly represent what I've done.
s
@George Pearse - that makes sense. We've built the internals with an eye towards eventually enabling this - we pass around asset partition ranges instead of single partition keys. Here's an issue to track this: https://github.com/dagster-io/dagster/issues/8706.