Hi Dagsters, Assume I have an op that queries dat...
# ask-community
Hi Dagsters, assume I have an op that queries data and returns a pandas data frame. I checked that the op works as expected on its own, but my main graph treats the op's result as a list. Is there a way to make Dagster recognize it as a data frame? I tried dagster_pandas but got the same result. I use the default IO manager. My error:
AttributeError: 'list' object has no attribute 'itertuples'
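import pandas as pd
from dagster import Out, graph, op
from sqlalchemy import create_engine

@op(required_resource_keys={"analytic_db"},
    out=Out(pd.DataFrame),
)
def query_foo_data(context) -> pd.DataFrame:
    # the analytic_db resource is passed straight to create_engine
    engine = create_engine(context.resources.analytic_db)
    query = """select * from foo limit 1;"""
    df = pd.read_sql_query(query, con=engine.connect())
    return df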

@graph
def main_graph():
    result = query_foo_data()
    for o in result.itertuples():
        pass
Hi Quy, does your code work if you create a new op to do the iteration you want to do after the first step?
@op(required_resource_keys={"analytic_db"},
    out=Out(pd.DataFrame),
)
def query_foo_data(context) -> pd.DataFrame:
    engine = create_engine(context.resources.analytic_db)
    query = """select * from foo limit 1;"""
    df = pd.read_sql_query(query, con=engine.connect())
    return df

@op
def downstream_processing(data):
    for o in data.itertuples():
        pass
@graph
def main_graph():
    downstream_processing(query_foo_data())
In general, it doesn't seem possible to access the attributes of an op's output from within a graph, even if you've annotated the op with the output type. As I understand it, the graph is strictly for wiring the underlying ops, not for doing computation on the output of the ops. I'm speaking from my own experience and previous answers from Owen, so maybe someone else has a better answer. IMO it would be super helpful if the graph documentation was more explicit about this.
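To make the "wiring only" idea concrete, here is a toy model in plain Python. This is NOT Dagster's implementation, and every name in it (`toy_op`, `OutputHandle`, `run`) is invented for illustration. It only mimics the general pattern: calling an op inside the graph function records a dependency and returns a placeholder, and the real values exist only later, when the recorded steps are executed.

```python
recorded_steps = []  # (function, upstream op names), in call order


class OutputHandle:
    """Placeholder for an op output that does not exist yet."""

    def __init__(self, op_name):
        self.op_name = op_name


def toy_op(fn):
    """Hypothetical decorator: calling the op records a step, it does not run fn."""

    def compose(*upstream):
        recorded_steps.append((fn, [h.op_name for h in upstream]))
        return OutputHandle(fn.__name__)

    return compose


@toy_op
def query_rows():
    return [("a", 1), ("b", 2)]


@toy_op
def count_rows(rows):
    return len(rows)


def main_graph():
    handle = query_rows()  # a placeholder, not the list of rows
    count_rows(handle)     # records "count_rows depends on query_rows"
    return handle


def run(graph_fn):
    """Build the wiring first, then execute each recorded step with real values."""
    recorded_steps.clear()
    graph_fn()
    results = {}
    for fn, upstream_names in recorded_steps:
        args = [results[name] for name in upstream_names]
        results[fn.__name__] = fn(*args)
    return results


handle = main_graph()
placeholder_has_len = hasattr(handle, "__len__")
results = run(main_graph)
```

In this sketch `handle` is an `OutputHandle`, so anything that needs the actual rows has to live inside an op, which is the same conclusion as above.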