# dagster-cloud

Seth Kimmel

06/14/2022, 4:10 PM
Not sure if this is a cloud-specific question, but thought I'd ask here. We ran a job, and a single op failed due to some underlying data issues. We corrected the issues, and I tried to re-run that op in isolation, but it creates a different run and fails to locate the upstream dependencies for that op, so it won't succeed. This behavior seems different from Dagit run locally, where I don't recall ever having dependency issues re-running failed ops in isolation.

daniel

06/14/2022, 4:36 PM
Do you have a link to the job? If the isolated op has inputs that are outputs from other ops, I think it would need to pull in those inputs using the IO manager, even if it's just a single op being run

Seth Kimmel

06/14/2022, 4:38 PM
It does have a "dummy" input. This is probably an anti-pattern in Dagster, but it was the only way I found a number of months ago to pass multiple upstream deps to an op, and it percolated through the rest of my code.

daniel

06/14/2022, 4:38 PM
What are you using as your io manager?

Seth Kimmel

06/14/2022, 4:38 PM
But when you re-run a failed op, why would it create a new run with no reference to the previous job that had failed?

daniel

06/14/2022, 4:39 PM
I would expect it to have a reference to the previous job - do you have a link to the job in cloud?

Seth Kimmel

06/14/2022, 4:39 PM
Sure - I'll share in our private channel

daniel

06/14/2022, 4:40 PM
it should show up in the run lineage on the right hand side of the runs page

Seth Kimmel

06/14/2022, 4:40 PM
sorry - do you want the url or the run ID?

daniel

06/14/2022, 4:40 PM
either or

Seth Kimmel

06/14/2022, 4:41 PM
Job with op failure: 85ee54f8. Attempted re-runs: 2ab44583, 6d631844

daniel

06/14/2022, 4:42 PM
if you click on the runs page for that re-run, do you see the run lineage like the one I posted on the right hand side linking it to the previous runs?

Seth Kimmel

06/14/2022, 4:44 PM
I don't actually
Happy to dig in with you if you'd like

daniel

06/14/2022, 4:45 PM
You don't see something like this if you click through to the timeline view for the run?

Seth Kimmel

06/14/2022, 4:46 PM
ah yes
It has the original failed job in the upstream
So not sure why it can't grab the ref

daniel

06/14/2022, 4:47 PM
You would need to be using an IO manager that persists the output somewhere a new run can find it - each run is in its own ECS task, so that would need to be something like S3, rather than the default filesystem IO manager
alternatively, if you don't actually need the output/input since it's a dummy, we can see if there are ways of removing it
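The point daniel is making can be sketched in plain Python (this is an illustration of the concept, not Dagster's actual `IOManager` API; all names below are made up): an upstream output is only visible to a re-run in a separate container if it was persisted to storage both runs can reach, rather than to the original task's local filesystem.

```python
import pickle
import tempfile
from pathlib import Path


class SharedStorageIOManager:
    """Persists op outputs to a shared location (standing in for S3).

    A re-run in a different container/ECS task can load them by key,
    which is impossible if outputs only lived on the first task's disk.
    """

    def __init__(self, root: Path):
        self.root = root

    def handle_output(self, step_key: str, value) -> None:
        # In a real deployment this would be, e.g., an S3 put_object call.
        (self.root / step_key).write_bytes(pickle.dumps(value))

    def load_input(self, step_key: str):
        # A fresh run only needs the shared root to find upstream outputs.
        return pickle.loads((self.root / step_key).read_bytes())


# Simulate the scenario from the thread:
shared = Path(tempfile.mkdtemp())

run_1 = SharedStorageIOManager(shared)
run_1.handle_output("upstream_op", {"rows": 42})  # original (partly failed) run

run_2 = SharedStorageIOManager(shared)            # isolated re-run, new "task"
recovered = run_2.load_input("upstream_op")       # succeeds: storage is shared
```

With the default filesystem IO manager, `run_2` would be looking on its own container's disk and the load would fail, which matches the behavior described above.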

Seth Kimmel

06/14/2022, 4:49 PM
Ah - gotcha
I think it's probably worth going with the more general approach of adding an S3 IO manager
but yeahhh maybe cleaning up that pattern in code would be good too

Yeachan Park

10/26/2022, 12:56 PM
I assume that since a new run gets created, it resets the status of the rest of the ops after re-running a specific op of a partition. Is there a way we can preserve the status from a previous run in the partition view? E.g. this is what it looks like after backfilling a specific op from a job that ran previously

daniel

10/26/2022, 1:28 PM
Hi Yeachan - in each row it should be using the op from the most recent run that executed that op. If you have a link to a partitions page that isn't behaving that way we would be happy to take a look

Yeachan Park

10/26/2022, 2:46 PM
Oh strange, then maybe I'm doing something wrong? So I initially successfully ran a partition (e6260714), then created a backfill (03cb2642) by using "step subset" to select a specific (failed) op to re-run for that partition. That resulted in the green op status circles turning grey (i.e. the image above) for all the other successful ops from e6260714. The only one that's green now is the op that ran via step subset. Functionally, what I want to do is just run a failed op again and see that all the ops ran successfully in the overview page, without having to run all the ops again.

daniel

10/26/2022, 2:47 PM
Do you have a link handy to the partitions page? I can use that to pull it up in our logs

Yeachan Park

10/26/2022, 2:49 PM
Ah sorry, I found this via search, didn't realise it was in the dagster-cloud channel. We're on open-source

daniel

10/26/2022, 2:49 PM
Ah got it - would you mind making a new post either here or in #dagster-support? I can surface it to our support oncall
er, in #dagster-support would be best actually if it's open-source