The 80/20 problem of generative AI — a UX research insight

When an LLM solves a task 80% accurately, that usually amounts to only 20% of the user value.

The Pareto principle says that when you solve a problem 20% of the way, you get 80% of the value. The opposite appears to be true for generative AI.

About the author: Zsombor Varnagy-Toth is a Sr UX Researcher at SAP with a background in machine learning and cognitive science. He works with qualitative and quantitative data for product development.

I first realized this as I studied professionals writing marketing copy using LLMs. I observed that when these professionals start using LLMs, their enthusiasm quickly fades away, and most return to their old way of manually writing content.

This was a very surprising research finding because these professionals acknowledged that the AI-generated content was not bad. In fact, they found it unexpectedly good, say 80% good. But if that’s so, why do they still fall back on creating the content manually? Why not take the 80% good AI-generated content and just add that last 20% manually?

Here is the intuitive explanation:

If you have a mediocre poem, you can’t just turn it into a great poem by replacing a few words here and there.

Say you have a house that’s 80% well built. It’s kind of OK, but the walls are not straight and the foundations are weak. You can’t fix that with some additional work. You have to tear it down and start building it from the ground up.

We investigated this phenomenon further and identified its root. For these marketing professionals, if a piece of copy is only 80% good, there is no individual piece in the text they could swap out to make it 100%. For that, the entire copy must be reworked, paragraph by paragraph, sentence by sentence. Thus, going from AI’s 80% to 100% takes almost as much effort as going from 0% to 100% manually.

Now, this has an interesting implication. For such tasks, the value of LLMs is “all or nothing.” The LLM either does a superb job or it’s useless. There’s nothing in between.

We looked at several different kinds of user tasks and found that this reverse Pareto principle affects a particular class of tasks:

  • Not easily decomposable and
  • Large task size and
  • 100% quality is expected

If any one of these conditions is not met, the reverse Pareto effect doesn’t apply.

Writing code, for instance, is more decomposable than writing prose. Code has its individual parts: commands and functions that can be singled out and fixed independently. If AI takes the code to 80%, it really only takes about 20% extra effort to get to the 100% result.

As for task size, LLMs have great utility in writing short copy, such as social posts. The LLM-generated short content is still “all or nothing”: it’s either good or worthless. However, because these pieces of copy are so brief, one can generate ten at a time and spot the best one in seconds. In other words, users don’t have to tackle the 80% to 100% problem; they just pick the variant that came out 100% in the first place.
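The “generate ten and pick the best” workflow is essentially best-of-N selection. Here is a minimal sketch, assuming a hypothetical `generate` callable that stands in for an LLM call and a hypothetical `score` function that stands in for the user’s quick judgment. Neither is a real API; both are placeholders for illustration.

```python
import random
from typing import Callable, List


def best_of_n(generate: Callable[[], str],
              score: Callable[[str], float],
              n: int = 10) -> str:
    """Generate n candidates and return the highest-scoring one."""
    candidates: List[str] = [generate() for _ in range(n)]
    return max(candidates, key=score)


# Stand-ins for illustration only: a fake "LLM" and a fake "human judgment".
def fake_generate() -> str:
    return random.choice(["Buy now!", "Fresh coffee, delivered daily.", "Coffee."])


def fake_score(text: str) -> float:
    return float(len(text))  # placeholder rule: the longest variant "wins"


best = best_of_n(fake_generate, fake_score, n=10)
```

The point of the sketch is that no candidate is ever edited. Selection replaces the expensive 80% to 100% rework, which only works when each candidate is cheap to produce and quick to judge.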

As for quality, there are use cases where professional-grade quality is not a requirement. For instance, a content factory may be satisfied with 80% quality articles.
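The three conditions and the counterexamples above can be read as a simple conjunction. Here is a minimal sketch in Python. The `Task` fields and the function name are hypothetical, introduced only for illustration, and the boolean labels are judgment calls rather than measured quantities.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All three fields are illustrative assumptions, not measurements.
    easily_decomposable: bool  # can parts be singled out and fixed independently?
    large: bool                # is the output long (an article rather than a social post)?
    needs_full_quality: bool   # is 100% quality expected?


def reverse_pareto_applies(task: Task) -> bool:
    """Return True only when all three conditions hold at once."""
    return (not task.easily_decomposable) and task.large and task.needs_full_quality


long_marketing_copy = Task(easily_decomposable=False, large=True, needs_full_quality=True)
code_change = Task(easily_decomposable=True, large=True, needs_full_quality=True)
social_post = Task(easily_decomposable=False, large=False, needs_full_quality=True)
content_factory_article = Task(easily_decomposable=False, large=True, needs_full_quality=False)
```

Under these labels, only `long_marketing_copy` satisfies all three conditions. Each of the other examples fails exactly one of them.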

If you are building an LLM-powered product that deals with large tasks that are hard to decompose, but the user is expected to produce 100% quality, you must build something around the LLM that turns its 80% performance into 100%. It could be a sophisticated prompting approach on the backend, an additional fine-tuned layer, or a cognitive architecture of various tools and agents that work together to iron out the output. Whatever this wrapper does, that’s what gives 80% of the customer value. That’s where the treasure is buried; the LLM only contributes 20%.
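One possible shape for such a wrapper is a critique-and-revise loop around the model’s draft. The sketch below is only an illustration of that idea, not a method described in the research. `critique` and `revise` are hypothetical callables that would be backed by LLM calls or other tools in a real product; here they are toy stand-ins.

```python
from typing import Callable


def refine(draft: str,
           critique: Callable[[str], str],
           revise: Callable[[str, str], str],
           max_rounds: int = 3) -> str:
    """Critique and revise a draft until the critic has no complaints
    or the round budget is used up."""
    for _ in range(max_rounds):
        feedback = critique(draft)
        if not feedback:  # empty feedback means "good enough"
            break
        draft = revise(draft, feedback)
    return draft


# Toy stand-ins for illustration only.
def toy_critique(text: str) -> str:
    return "remove filler" if "very " in text else ""


def toy_revise(text: str, feedback: str) -> str:
    return text.replace("very ", "", 1)  # drop one filler word per round


polished = refine("a very very good offer", toy_critique, toy_revise)
```

The `max_rounds` cap is a design choice made for this sketch: it bounds cost when the critic never becomes satisfied.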

This conclusion is in line with the assertion by Sequoia Capital’s Sonya Huang and Pat Grady that the next wave of value in the AI space will be created by these “last-mile application providers”: the wrapper companies that figure out how to cover that last mile, which creates 80% of the value.
