Rui Zhang
02/06/2024, 6:42 PM
Scout
02/06/2024, 6:42 PM
```python
import os

from dagster import RunRequest, sensor

# MY_DIRECTORY and log_file_job are assumed to be defined elsewhere in your project.


@sensor(job=log_file_job)
def my_directory_sensor_cursor(context):
    # The cursor stores the newest mtime seen so far; it is empty on the first evaluation.
    last_mtime = float(context.cursor) if context.cursor else 0

    max_mtime = last_mtime
    for filename in os.listdir(MY_DIRECTORY):
        filepath = os.path.join(MY_DIRECTORY, filename)
        if os.path.isfile(filepath):
            fstats = os.stat(filepath)
            file_mtime = fstats.st_mtime
            if file_mtime <= last_mtime:
                continue

            # the run key should include mtime if we want to kick off new runs based on file modifications
            run_key = f"{filename}:{file_mtime}"
            run_config = {"ops": {"process_file": {"config": {"filename": filename}}}}
            yield RunRequest(run_key=run_key, run_config=run_config)
            max_mtime = max(max_mtime, file_mtime)

    # Persist the newest mtime once, after every file has been checked.
    context.update_cursor(str(max_mtime))
```
In this example, `update_cursor` is called once at the end of the evaluation, after every `RunRequest` has been yielded. It is passed `max_mtime`, the most recent modification time encountered during that evaluation. On the next evaluation, the sensor reads this value back from `context.cursor` and skips any file whose modification time is not later than it, so only files modified since the previous evaluation trigger new runs.
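To see the cursor logic in isolation, here is a minimal sketch that reproduces the same filtering in plain Python, without Dagster. The names `scan_directory` and `demo` are hypothetical helpers written for this illustration and are not part of the Dagster API; the returned string plays the role of the value you would pass to `context.update_cursor`.

```python
import os
import tempfile


def scan_directory(directory, cursor):
    """Hypothetical helper: return (run_keys, new_cursor) for files newer than the cursor."""
    last_mtime = float(cursor) if cursor else 0.0
    max_mtime = last_mtime
    run_keys = []
    for filename in sorted(os.listdir(directory)):
        filepath = os.path.join(directory, filename)
        if not os.path.isfile(filepath):
            continue
        file_mtime = os.stat(filepath).st_mtime
        if file_mtime <= last_mtime:
            continue  # not modified since the previous evaluation
        run_keys.append(f"{filename}:{file_mtime}")
        max_mtime = max(max_mtime, file_mtime)
    # The new cursor is the newest mtime seen, stored as a string.
    return run_keys, str(max_mtime)


def demo():
    """Simulate three sensor evaluations against a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create two files with fixed, known modification times.
        for name, mtime in [("a.log", 1000), ("b.log", 2000)]:
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write("x")
            os.utime(path, (mtime, mtime))

        first, cursor = scan_directory(tmp, None)     # first evaluation, empty cursor
        second, cursor = scan_directory(tmp, cursor)  # nothing has changed

        # Touch one file so its mtime moves past the stored cursor.
        os.utime(os.path.join(tmp, "a.log"), (3000, 3000))
        third, cursor = scan_directory(tmp, cursor)
        return first, second, third, cursor


if __name__ == "__main__":
    for result in demo():
        print(result)
```

In this sketch, the first scan picks up both files, the second picks up nothing because the cursor already covers them, and the third picks up only `a.log` after it is modified. Because the run key includes the mtime, the modified file gets a new key rather than being deduplicated against its earlier run.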