#ask-community

geoHeil

07/24/2022, 4:00 PM
I have a sensor with a date-based cursor. When I reset the cursor in Dagit's UI, e.g. from 2021-12-14 to 2021-12-07 to reprocess old data, the change is not picked up and the job launched by the sensor still refers to the old data. How can I actually reset the state of the cursor from Dagit's UI?
Strangely, the manual backfill also does not seem to work and is stuck in an incomplete state.
dagster sensor cursor flow_ingestion_sensor --set 2022-12-07 seems to work (partially), i.e. the cursor is reset successfully (strangely, not when setting the value in the UI).
However, the generated path now no longer resolves. When I delete Dagster's DB and run it again, the same path works just fine.
@daniel do you have an idea what is going on?

daniel

07/25/2022, 8:25 PM
what do you mean by 'this is not picked up' exactly? context.cursor is still the old value?
can you share your sensor code?

geoHeil

07/25/2022, 8:53 PM
sure - here you go: https://gist.github.com/geoHeil/a7cc0b70e31f3e946a9ffed14d84a3ed
1) Not picked up: when setting this value from the UI, the old value is indeed still being used. When using the CLI, a suitable one is used.
2) However, Dagster now complains about not being able to find the file path (but when printing the path, it is the same one which worked before).

daniel

07/25/2022, 8:55 PM
are you sure the sensor whose cursor you're editing is the one that's running, i.e. there aren't two with the same name or something? can you post or DM a screenshot of your Sensors page?
the CLI and the UI should be writing to the same place

geoHeil

07/25/2022, 9:00 PM
I am pretty certain that I am referring to only one sensor

daniel

07/25/2022, 9:00 PM
is there a reason it says the sensor daemon isn't running? depending on what's going on there, that could be relevant
if you set it while the sensor is stopped, do you have any different results?
(update it in the UI while the sensor is stopped, that is)

geoHeil

07/25/2022, 9:09 PM
let me check this
1. Stop dagit.
2. Delete dagit's DB.
3. Start dagit & daemon - a job is triggered for path upload/my_file_20211214.zip for a sensor cursor of 2021-12-14.
4. Set the sensor cursor from the UI to 2021-12-07: Skipping 1 run for sensor my_sensor already completed with run keys: ["upload/my_file_20211214.zip"]
How can I overwrite the sensor to allow a backfill for a run key? Should I disable the run key here:
return j.run_request_for_partition(next_date, run_key=path)
and only keep the partition?
5. CLI: dagster sensor cursor my_sensor --set 2022-12-07
Sensor my_sensor skipped: Did not find file upload/my_file_20221214.zip
I think the run key is preventing the interaction from the UI from properly materializing the job. However, why is the CLI delivering a different result, i.e. a) not a skipped run key and b) skipping because the path was not found? See (3): the same path triggered a job nicely before.
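For reference, the two outcomes in steps 4 and 5 can be reproduced with a minimal Python sketch of the path derivation. It assumes the sensor looks seven days past the cursor and formats the result as upload/my_file_YYYYMMDD.zip. Both assumptions are inferred from the paths quoted above, not taken from the gist, and next_path is a hypothetical helper name.

```python
from datetime import date, timedelta

# Assumption: the sensor looks one week past the cursor. This is inferred
# from the quoted paths, not confirmed by the sensor code.
STEP = timedelta(days=7)


def next_path(cursor: str) -> str:
    """Return the file path the sensor would look for after `cursor`."""
    next_date = date.fromisoformat(cursor) + STEP
    return f"upload/my_file_{next_date:%Y%m%d}.zip"


ui_path = next_path("2021-12-07")   # cursor value set in the UI (step 4)
cli_path = next_path("2022-12-07")  # cursor value set via the CLI (step 5)
print(ui_path)
print(cli_path)
```

Under that assumption, the UI value leads to the 20211214 path whose run key had already completed, while the CLI value uses the year 2022 and leads to a 20221214 path, so the two attempts may not have been looking for the same file.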

daniel

07/25/2022, 9:21 PM
oh i see - the reason it wasn't working is because it still had the run keys even though you changed the cursor?

geoHeil

07/25/2022, 9:21 PM
seems so (from the UI), but as written, the result for the CLI seems to be different.

daniel

07/25/2022, 9:21 PM
one unsatisfying way to 'backfill' is to change the format of the run key, but i don't think we currently have a way to delete run keys en masse
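A minimal sketch of the run-key behavior discussed here, written in plain Python rather than with Dagster's own code: a request whose run key has already been seen is skipped, so changing the key format lets the same file run again. The submit helper, the completed set, and the v2: prefix are hypothetical names chosen for illustration.

```python
completed = set()  # stand-in for the run keys Dagster has already recorded


def submit(run_key: str) -> str:
    """Launch a run unless this run key was already used."""
    if run_key in completed:
        return "skipped"
    completed.add(run_key)
    return "launched"


path = "upload/my_file_20211214.zip"
first = submit(path)          # first evaluation launches a run
second = submit(path)         # same key again is skipped, whatever the cursor says
third = submit(f"v2:{path}")  # a reformatted key is treated as a new request
print(first, second, third)
```

The trade-off is that the old keys stay recorded, so every later re-run of the same file needs yet another key format.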

geoHeil

07/25/2022, 9:23 PM
I think I could simply remove the run key
(and rely on partitions only)
The more problematic thing is (5).

daniel

07/25/2022, 9:25 PM
I can file an issue to investigate that more - don't quite have cycles at the moment to dig deeper
@Dagster Bot issue investigate sensor CLI incorrectly skipping run when you try to set the cursor

Dagster Bot

07/25/2022, 9:25 PM

geoHeil

07/25/2022, 9:30 PM
Understood. Is there one of your colleagues available who might have some more time for this specific issue?
However: TypeError: run_request_for_partition() missing 1 required positional argument: 'run_key'
It looks like manually running the backfill from the backfills page can work as a suitable workaround
But when jobs interact across different workspaces, it would be great if partitioned backfills actually worked
(also via a sensor)
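The TypeError above suggests that, in this Dagster version, run_key has no default and must be passed explicitly. The sketch below uses a stand-in function with the same parameter shape (it is not Dagster's implementation) to show the failing call and one that passes run_key=None. Whether None actually disables de-duplication in the real API is an assumption to verify against the docs for the installed version.

```python
def run_request_for_partition(partition_key, run_key):
    """Stand-in with the same required parameters; not Dagster's code."""
    return {"partition_key": partition_key, "run_key": run_key}


try:
    run_request_for_partition("2021-12-14")  # run_key omitted entirely
    error = None
except TypeError as exc:
    error = str(exc)

# Passing the argument explicitly, even as None, satisfies the signature.
request = run_request_for_partition("2021-12-14", run_key=None)
print(error)
print(request)
```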

prha

08/23/2022, 10:57 PM
@geoHeil did you ever get to the bottom of this skipped call? My immediate thought is there might be something going on with the sftp resource which changes depending on when it’s invoked…

geoHeil

08/24/2022, 6:29 AM
unfortunately, no