# ask-community
Hello, I've inherited a dagster pipeline and am still getting to grips with it. We're doing some refactoring. The original pipeline has some pandas dataframe objects which go through some ML model training. The minor change I'm trying to implement is some filtering on the dataframe, e.g. `df = df[df['my_column'] == 0]`. But this runs into type issues, e.g.
AttributeError: 'InputMappingNode' object has no attribute 'copy'
Obviously this object should be a dataframe, but some interaction seems to change its type. Does anyone know how to work with dataframe objects in the 'normal' way? I'm struggling to find the right information in the dagster docs or on Google.
From that error I'd assume you're trying to do said filtering inside a job/pipeline definition.
Job/pipeline definitions only really define dependencies between ops/solids. Your filtering should be done inside one of these ops/solids.
it’s inside a @graph at runtime, and the graph is inside a job. So I need to make an op to do this filtering, and place the op inside the graph?
yep, a graph is also only defining dependencies between ops, forgot to include it, sorry
great, thank you, will try that
yeah, or doing it inside one of the existing ops if it makes sense there