As data scientists, we’re comfortable with tabular data… We may handle words, JSON, XML feeds, and pictures of cats. But what about a cardboard box stuffed with things like this?

The data on this receipt wants so badly to be in a tabular database somewhere. Wouldn’t it be great if we could scan all these, run them through an LLM, and save the results to a table?
Lucky for us, we live in the era of Document AI. Document AI combines OCR with LLMs, letting us build a bridge between the paper world and the digital database world.
All the major cloud vendors have some version of this…
Here I’ll share my thoughts on Snowflake’s Document AI. Aside from using Snowflake at work, I have no affiliation with Snowflake. They didn’t commission me to write this piece, and I’m not part of any ambassador program. All of that’s to say I can write an unbiased review of Snowflake’s Document AI.
What’s Document AI?
Document AI allows users to quickly extract information from digital documents. When we say “documents” we mean pictures with words. Don’t confuse this with niche NoSQL document stores.
The product combines OCR and LLM models so that a user can create a set of prompts and execute those prompts against a large collection of documents all at once.

LLMs and OCR each have room for error. Snowflake solved this by (1) banging their heads against OCR until it’s sharp (I see you, Snowflake developer) and (2) letting me fine-tune my LLM.
Fine-tuning the Snowflake LLM feels a lot more like glamping than a rugged outdoor adventure. I review 20+ documents, hit the “train model” button, then rinse and repeat until performance is satisfactory. Am I even a data scientist anymore?
Once the model is trained, I can run my prompts on 1,000 documents at a time. I like to save the results to a table, but you can do whatever you want with the results in real time.
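To make that concrete, here’s roughly what the invocation looks like in SQL. The model name flu_shot_model and the stage @flu_docs are placeholders for illustration, not our real names:

```sql
-- Run the trained Document AI model over every file on a stage and
-- keep the raw JSON output. Names below are illustrative.
CREATE OR REPLACE TABLE flu_shot_raw AS
SELECT
    relative_path,
    flu_shot_model!PREDICT(
        GET_PRESIGNED_URL(@flu_docs, relative_path),
        1  -- model build version
    ) AS extracted
FROM DIRECTORY(@flu_docs);
```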
Why does it matter?
This product is cool for several reasons.
- You can build a bridge between the paper and digital worlds. I never thought the big box of paper invoices under my desk would make it into my cloud data warehouse, but now it can. Scan the paper invoice, upload it to Snowflake, run my Document AI model, and wham! I have my desired information parsed into a tidy table (see the sketch after this list).
- It’s frighteningly convenient to invoke a machine-learning model via SQL. Why didn’t we think of this sooner? In the old days, this was a few hundred lines of code to load the raw data (SQL >> Python/Spark/etc.), clean it, engineer features, train/test split, train a model, make predictions, and then often write the predictions back into SQL.
- Building this in-house would be a serious undertaking. Yes, OCR has been around a long time but can still be finicky. Fine-tuning an LLM obviously hasn’t been around as long, but it’s getting easier by the week. Piecing these together in a way that achieves high accuracy for a wide variety of documents could take a long time to hack out on your own. Months and months of polish.
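And the “tidy table” from the first bullet is just a bit of JSON-path SQL on top of that output. This sketch assumes the usual Document AI output shape, where each question’s answer arrives as a list of value/score pairs keyed by the prompt’s name (patient_name and shot_date are hypothetical prompt names):

```sql
-- Flatten the model's JSON answers into ordinary columns.
CREATE OR REPLACE TABLE flu_shot_parsed AS
SELECT
    relative_path,
    extracted:patient_name[0]:value::STRING AS patient_name,
    extracted:patient_name[0]:score::FLOAT  AS patient_name_score,  -- model confidence
    extracted:shot_date[0]:value::STRING    AS shot_date,
    extracted:shot_date[0]:score::FLOAT     AS shot_date_score
FROM flu_shot_raw;
```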
Of course, some elements are still built in-house. Once I extract information from the document, I have to figure out what to do with it. That’s relatively quick work, though.
Our Use Case — Bring on Flu Season:
I work at a company called IntelyCare. We operate in the healthcare staffing space, which means we help hospitals, nursing homes, and rehab centers find quality clinicians for individual shifts, extended contracts, or full-time/part-time engagements.
Many of our facilities require clinicians to have an up-to-date flu shot. Last year, our clinicians submitted over 10,000 flu shots along with hundreds of thousands of other documents. We reviewed all of these manually to ensure validity. Part of the joy of working in the healthcare staffing world!
Spoiler alert: Using Document AI, we were able to reduce the number of flu-shot documents needing manual review by ~50%, all in just a few weeks.
To pull this off, we did the following:
- Uploaded a pile of flu-shot documents to Snowflake.
- Massaged the prompts, trained the model, massaged the prompts some more, retrained the model some more…
- Built out the logic to compare the model output against the clinician’s profile (e.g., do the names match?). Definitely some trial and error here with formatting names, dates, etc.
- Built out the “decision logic” to either approve the document or send it back to the humans (see the sketch after this list).
- Tested the full pipeline on a bigger pile of manually reviewed documents. Took a close look at any false positives.
- Repeated until our confusion matrix was satisfactory.
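Here’s a sketch of what the matching and decision steps might look like, assuming each document row carries the submitting clinician’s ID and that “up to date” means administered within the last year. The matching rules and the 0.9 confidence floor are illustrative, not our production logic:

```sql
-- Compare extracted fields against the clinician's profile, then decide.
SELECT
    p.relative_path,
    CASE
        WHEN UPPER(TRIM(p.patient_name)) = UPPER(TRIM(c.full_name))  -- names match?
         AND TRY_TO_DATE(p.shot_date) IS NOT NULL                    -- date parses?
         AND TRY_TO_DATE(p.shot_date) >= DATEADD(year, -1, CURRENT_DATE())
         AND p.patient_name_score >= 0.9                             -- confidence floor
        THEN 'APPROVE'
        ELSE 'HUMAN_REVIEW'  -- anything unclear goes back to the humans
    END AS decision
FROM flu_shot_parsed p
JOIN clinicians c ON c.clinician_id = p.clinician_id;
```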
For this project, false positives pose a business risk. We don’t want to approve a document that’s expired or missing key information. We kept iterating until the false-positive rate hit zero. We’ll see some false positives eventually, but fewer than we have today with an all-human review process.
False negatives, however, are harmless. If our pipeline doesn’t like a flu shot, it simply routes the document to the human team for review. If they go on to approve the document, it’s business as usual.
The model does well with the clean/easy documents, which account for ~50% of all flu shots. Anything messy or confusing goes back to the humans as before.
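Scoring the pipeline against the human-reviewed pile is a single aggregation. Here, pipeline_test_results and human_decision are hypothetical names for the joined test table and the manual reviewers’ verdicts:

```sql
-- Confusion-matrix counts: pipeline decision vs. human ground truth.
SELECT
    COUNT_IF(decision = 'APPROVE'      AND human_decision = 'APPROVE') AS true_positives,
    COUNT_IF(decision = 'APPROVE'      AND human_decision = 'REJECT')  AS false_positives,
    COUNT_IF(decision = 'HUMAN_REVIEW' AND human_decision = 'APPROVE') AS false_negatives,
    COUNT_IF(decision = 'HUMAN_REVIEW' AND human_decision = 'REJECT')  AS true_negatives
FROM pipeline_test_results;
```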
Things we learned along the way
- Don’t ask the LLM for judgment calls
Initially, our prompts attempted to determine the validity of the document.
Bad: something like “Is this a valid flu shot?”
We found it simpler to limit our prompts to questions that can be answered directly from the document. The LLM doesn’t interpret anything. It just grabs the relevant data points off the page.
Good: questions like “What is the patient’s name?” or “What date was the shot administered?”
Save the results and do the math downstream.
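In other words, the model hands back whatever date is printed on the page, and SQL decides what it means. A minimal sketch, reusing the hypothetical parsed table from earlier:

```sql
-- The LLM extracts the raw date string; the "is it still current?" math
-- happens downstream in SQL.
SELECT
    relative_path,
    shot_date,  -- exactly as printed on the document
    TRY_TO_DATE(shot_date) >= DATEADD(year, -1, CURRENT_DATE()) AS shot_is_current
FROM flu_shot_parsed;
```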
- You still have to be thoughtful about training data
We had several duplicate flu shots from one clinician in our training data. Call this clinician Ben. One of our prompts was, “What is the patient’s name?” Because “Ben” appeared in the training data multiple times, any remotely unclear document would come back with “Ben” as the patient name.
So overfitting is still a thing. Over/under-sampling is still a thing. We tried again with a more thoughtful selection of training documents and things went much better.
Document AI is pretty magical, but it isn’t magic. Fundamentals still matter.
- The model can be fooled by writing on a napkin
To my knowledge, Snowflake doesn’t have a way to render the document image as an embedding. You can create an embedding from the extracted text, but that won’t tell you whether the text was written by hand. As long as the text is valid, the model and downstream logic will give it a green light.
You could fix this pretty easily by comparing image embeddings of submitted documents to the embeddings of accepted documents. Any document with an embedding way out in left field gets sent back for human review. This is simple work, but you’ll have to do it outside Snowflake for now.
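A sketch of that check, assuming the image embeddings are generated outside Snowflake with some off-the-shelf vision model and loaded into tables using Snowflake’s VECTOR type. The table names, the 512 dimension, and the 0.8 threshold are all made up:

```sql
-- Flag submitted documents whose image embedding isn't close to any
-- accepted document's embedding (emb columns are VECTOR(FLOAT, 512)).
SELECT
    s.doc_id,
    MAX(VECTOR_COSINE_SIMILARITY(s.emb, a.emb)) AS best_match
FROM submitted_docs s
CROSS JOIN accepted_docs a  -- fine at this scale; use ANN search for millions of docs
GROUP BY s.doc_id
HAVING MAX(VECTOR_COSINE_SIMILARITY(s.emb, a.emb)) < 0.8;  -- way out in left field
```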
- Not as expensive as I was expecting
Snowflake has a reputation for being spendy, and for HIPAA compliance reasons we run a higher-tier Snowflake account for this project. So I tend to worry about running up a Snowflake tab.
In the end, we had to try extra hard to spend more than $100/week while training the model. We ran thousands of documents through the model every few days to measure its accuracy while iterating, but never managed to break the budget.
Better still, we’re saving money on the manual review process. The cost of the AI reviewing 1,000 documents (approving ~500 of them) is ~20% of what we spend on humans reviewing the remaining 500. In other words, the human workload is cut in half while the AI pass adds back only a tenth of the old cost, so all in, that’s a 40% reduction in the cost of reviewing flu shots.
Summing up
I’ve been impressed with how quickly we could complete a project of this scope using Document AI. We’ve gone from months to days. I give it 4 stars out of 5, and am open to giving it a fifth star if Snowflake ever gives us access to image embeddings.
Since flu shots, we’ve deployed similar models for other documents with similar or better results. And with all this prep work, instead of dreading the upcoming flu season, we’re ready to bring it on.