Shane Keller
03/05/2024, 11:44 PM
Scout
03/05/2024, 11:44 PM
```python
from dagster import job, op, graph


@op
def fetch_data():
    # Imagine this function fetches data from a source and returns it
    return {"data": "some data"}


@op
def process_data(data):
    # This function processes the data in some way and returns a new form
    return f"processed {data}"


@op
def store_data(processed_data):
    # This function stores the processed data somewhere
    pass


@graph
def process_and_store(data):
    processed_data = process_data(data)
    store_data(processed_data)


@job
def my_job():
    raw_data = fetch_data()
    process_and_store(raw_data)
```
In this example:
- `fetch_data` is an op that fetches data and returns it.
- `process_data` is an op that takes data as input and processes it.
- `store_data` is an op that takes processed data and stores it.
- `process_and_store` is a graph that takes data, processes it with `process_data`, and then stores it with `store_data`.
- `my_job` is a job that invokes the `fetch_data` op and then feeds its output into the `process_and_store` graph.

When `my_job` is executed, Dagster will run the ops in the order defined by their dependencies: `fetch_data` will run first, followed by `process_data`, and finally `store_data`. The output of each op is passed to the next op as defined by the connections in the graph.
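To make the ordering concrete, here is a minimal sketch of how an execution order falls out of the dependencies alone. It uses Python's standard `graphlib` module rather than Dagster itself, so it illustrates the idea and is not Dagster's actual scheduler. The `dependencies` mapping and the `run_order` helper are hypothetical names introduced only for this illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping for illustration: each op name points to the set of
# ops whose output it consumes (its upstream dependencies).
dependencies = {
    "fetch_data": set(),
    "process_data": {"fetch_data"},
    "store_data": {"process_data"},
}


def run_order(deps):
    """Return an order in which every op comes after its upstream ops."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)  # ['fetch_data', 'process_data', 'store_data']
```

Because the three ops form a single chain, only one order is valid. To run the real job locally, `my_job.execute_in_process()` executes it in the current Python process.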