Separate question, was wondering how GPU jobs were orchestrated | Dagster #ask-community

Abhinav Ayalur

12/05/2022, 1:37 AM

Separate question: I was wondering how GPU jobs are orchestrated. We have some large GPU tasks that we'd like to write in, and were wondering if there is a way to have an init function that instantiates a model, so we don't have to instantiate it for each call of a task. Could we also have "warm tasks" running that pick up these jobs faster, without having to initialize the model again?

Oliver

12/05/2022, 3:01 AM

I don't think there is a built-in way here, but you could have an asset that creates a serving endpoint and then use that in your inference job.
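A rough sketch of the serving-endpoint idea, using only the Python standard library rather than any Dagster API. `load_model`, `make_server`, and `predict` are hypothetical names, and the "model" is a stand-in function. The point is that the model is loaded once when the server starts, and each inference call is just a request to the warm process.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_model():
    # Hypothetical stand-in for an expensive GPU model load.
    return lambda xs: [2 * x for x in xs]


def make_server(port=0):
    # The model is loaded once here and shared by every request.
    model = load_model()

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            inputs = json.loads(self.rfile.read(length))
            body = json.dumps(model(inputs)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    # port=0 asks the OS for any free port.
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)


def predict(host, port, inputs):
    # What an inference task would call instead of loading the model itself.
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(
            "POST",
            "/",
            body=json.dumps(inputs).encode(),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

In this sketch the server would be started once (for example with `threading.Thread(target=server.serve_forever, daemon=True).start()` or as its own long-running service), and each task would only call `predict`.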

Zach

12/05/2022, 4:26 PM

Yeah, what Oliver said seems like it would work. You could also model it as a resource that your ops depend on, since resources get initialized before each op. You'd just need to make the resource idempotent, as it'll get initialized for every op / resource that depends on it.

Abhinav Ayalur

12/05/2022, 5:46 PM

I see, then my follow-up is: how does "spinning up workers" work in that case? If I had 500 requests, would a new resource be initialized every time?

Zach

12/05/2022, 5:57 PM

Yes, it would. If serializing / deserializing the model is possible and doesn't take too long, you could make a resource that takes a path to the model on disk: if it exists, load it from there; if not, create it and save it to disk. Either way, return the model to the op as an input.
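A minimal sketch of that load-or-create logic in plain Python, assuming the model can be pickled. `build_model` and `load_or_create_model` are hypothetical names, and the dict is a stand-in for a real model. A resource's init function could call something like this with a configured path.

```python
import pickle
from pathlib import Path


def build_model():
    # Hypothetical stand-in for the expensive model construction.
    return {"weights": [0.1, 0.2, 0.3]}


def load_or_create_model(path, build=build_model):
    path = Path(path)
    if path.exists():
        # Fast path: the model was already saved by an earlier call.
        with path.open("rb") as f:
            return pickle.load(f)

    # Slow path: build the model, then save it for later calls.
    model = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file and rename, so a concurrent reader
    # never sees a half-written model file.
    tmp = path.with_name(path.name + ".tmp")
    with tmp.open("wb") as f:
        pickle.dump(model, f)
    tmp.replace(path)
    return model
```

This only saves time if loading from disk is cheaper than building the model, and every call still pays the deserialization cost, so it is not the same as a truly warm worker.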
