Document processing.
Hello, I have a few questions about the following use case.
I have a bunch of documents I want to process. These documents use different templates. For example, I want to extract some metadata from both types. I will concentrate on the first document type to begin with, then I might change my pipeline to handle both types.
I want my pipeline to do the minimum amount of work: documents that have already been extracted should be skipped, and only the others should be extracted.
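To make the behaviour I'm after concrete, here is a minimal sketch in plain Python, outside of any pipeline framework. The function names (`extract_metadata`, `process_pending`), the one-JSON-file-per-document layout, and the placeholder metadata are all made up for illustration; my real extraction step would replace the placeholder.

```python
import json
from pathlib import Path


def extract_metadata(doc_path: Path) -> dict:
    # Hypothetical placeholder for the real, template-specific extraction.
    return {"name": doc_path.name, "size_bytes": doc_path.stat().st_size}


def process_pending(input_dir: Path, output_dir: Path) -> list[str]:
    """Extract only documents that have no result file yet; return their names."""
    output_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for doc_path in sorted(input_dir.iterdir()):
        if not doc_path.is_file():
            continue
        # Assumed convention: one JSON result per document, named after it.
        result_path = output_dir / f"{doc_path.name}.json"
        if result_path.exists():
            continue  # already extracted: do nothing
        result_path.write_text(json.dumps(extract_metadata(doc_path)))
        processed.append(doc_path.name)
    return processed
```

Running `process_pending` twice on the same folders should extract everything on the first call and nothing on the second, and a newly added document should be the only one picked up on the next call.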
How can I do that? How can I check whether an asset already exists? I would also welcome pointers on any related questions I might not have thought of.