Quy
06/18/2023, 4:44 PM
AttributeError: 'list' object has no attribute 'itertuples'
@op(
    required_resource_keys={"analytic_db"},
    out=Out(pd.DataFrame),
)
def query_foo_data(context) -> pd.DataFrame:
    engine = create_engine(context.resources.analytic_db)
    query = """select * from foo limit 1;"""
    df = pd.read_sql_query(query, con=engine.connect())
    return df

@graph
def main_graph():
    result = query_foo_data()
    for o in result.itertuples():
        pass
Justin Taylor
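06/19/2023, 12:31 PM
import pandas as pd
from dagster import Out, graph, op
from sqlalchemy import create_engine

@op(
    required_resource_keys={"analytic_db"},
    out=Out(pd.DataFrame),
)
def query_foo_data(context) -> pd.DataFrame:
    engine = create_engine(context.resources.analytic_db)
    query = """select * from foo limit 1;"""
    df = pd.read_sql_query(query, con=engine.connect())
    return df

@op
def downstream_processing(data):
    # Inside an op, `data` is the actual DataFrame returned upstream.
    for o in data.itertuples():
        pass

@graph
def main_graph():
    # The graph body only wires ops together; it does no computation.
    downstream_processing(query_foo_data())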
In general, it doesn't seem possible to access the attributes of an op's output from within a graph, even if you've annotated the op with the output type. As I understand it, the graph is strictly for wiring the underlying ops, not for doing computation on the output of the ops. I'm speaking from my own experience and previous answers from Owen, so maybe someone else has a better answer. IMO it would be super helpful if the graph documentation was more explicit about this.