
CJ

10/06/2022, 2:01 PM
Noob question: I'm configuring Dagster for the first time. The goal is to pull data from a Redshift database, shove it through an ML pipeline, and save the results to DynamoDB. I have the assets portion up and running and I've written my op, but I'm having trouble accessing the assets within the op. How does one access assets in an op?

sandy

10/06/2022, 9:49 PM
Hey CJ - typically assets are accessed from other assets. Do you mind including a little more context about what you're trying to do with the op?

CJ

10/06/2022, 10:04 PM
Thank you for the response, Sandy. I had originally planned on creating assets and then picking them up in ops once all the files were created. The hope was to store the assets so that if the pipeline failed, it could pick up where it left off rather than having to download all the data again. I ended up rewriting it all using ops and a graph. I'm not entirely sure the same concept will work if a piece of the pipeline past the data downloads fails. I think I can use a retry policy, although I'm not sure if I'd have to define the downloaded file. 🤔 I'm now trying to figure out how to add MLflow tracking. I used a context manager before, but I'm working through how that works with a graph. Open to any pointers or docs beyond https://docs.dagster.io/_apidocs/libraries/dagster-mlflow

sandy

10/06/2022, 10:29 PM
Got it - glad you're unblocked on that. I think the doc that you linked is the main resource we have on MLflow.