Making sense of this mess

Steven Liu

When I joined Hugging Face nearly 3 years ago, the Transformers documentation was very different from its current form today. It focused on text models and how to train or use them for inference on natural language tasks (text classification, summarization, language modeling, etc.).

The main version of the Transformers documentation today compared with version 4.10.0 from nearly 3 years ago.

As transformer models increasingly became the default approach to AI, the documentation expanded significantly to cover new models and new usage patterns. But new content was added incrementally without really considering how the audience and the Transformers library have evolved.

I think this is why the documentation experience (DocX) feels disjointed, difficult to navigate, and outdated. Basically, a mess.

That is why a Transformers documentation redesign is needed to make sense of this mess. The goal is to:

  1. Write for developers interested in building products with AI.
  2. Allow an organic documentation structure that grows and scales naturally, instead of rigidly adhering to a predefined structure.
  3. Create a more unified documentation experience by integrating content rather than appending it to the existing documentation.



A new audience

The Transformers documentation was initially written for machine learning engineers and researchers, the model tinkerers.

Now that AI is more mainstream and mature, and not just a fad, developers are increasingly interested in learning how to build AI into products. This means recognizing that developers interact with documentation differently than machine learning engineers and researchers do.

Two key distinctions are:

  • Developers typically start with code examples and search for a solution to something they’re trying to solve.
  • Developers who aren’t familiar with AI can be overwhelmed by Transformers. The value of code examples is reduced, or worse, useless, if you don’t understand the context in which they’re used.

With the redesign, the Transformers documentation will be more code-first and solution-oriented. Code and explanations of beginner machine learning concepts will be tightly coupled to provide a more complete and beginner-friendly onboarding experience.
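
To make this concrete, here is a rough, hypothetical sketch of what a code-first, solution-oriented snippet could look like: it solves a task up front with the Transformers pipeline API (using its default text-classification model), with the surrounding concepts explained around the code rather than before it. This is an illustration, not an excerpt from the new docs.

```python
# A minimal, hypothetical example of a code-first snippet: solve the task first,
# then explain the concepts around it. Uses the Transformers pipeline API with
# its default text-classification model.
from transformers import pipeline

classifier = pipeline("text-classification")
result = classifier("I finally found the page I was looking for!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```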

Once developers have a basic understanding, they can progressively level up their Transformers knowledge.



Toward a more organic structure

One of my first projects at Hugging Face was to align the Transformers documentation with Diátaxis, a documentation approach based on user needs (learning, solving, understanding, reference).

But somewhere along the way, I started using Diátaxis as a plan instead of a guide. I tried to force content to fit neatly into one of the four prescribed categories.

This rigidity prevented naturally occurring content structures from emerging and kept the documentation from adapting and scaling. Documentation about one topic soon spanned several sections, because that was what the structure dictated, not because it made sense.

It’s okay if the structure is complex, but it’s not okay if it’s complex and hard to find your way around.

The redesign will replace rigidity with flexibility to enable the documentation to grow and evolve.



Integration versus amendment

Tree rings provide a climatological record of the past (drought, flood, wildfire, etc.). In a way, the Transformers documentation also has its own tree rings or eras that capture its evolution:

  1. Not just text era: Transformer models are used across other modalities like computer vision, audio, and multimodal, and not just text.
  2. Large language model (LLM) era: Transformer models are scaled to billions of parameters, leading to new ways of interacting with them, such as prompting and chat. You start to see a lot more documentation about how to efficiently train LLMs, like using parameter-efficient finetuning (PEFT) methods, distributed training, and data parallelism.
  3. Optimization era: Running LLMs for inference or training can be a challenge unless you’re GPU Rich, so now there’s a ton of interest in how to democratize LLMs for the GPU Poor. There is more documentation about methods like quantization, FlashAttention, optimizing the key-value cache, Low-Rank Adaptation (LoRA), and more (see the sketch right after this list).
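
As a quick, hypothetical sketch of the kind of optimization-era content described above: loading a causal language model with 4-bit bitsandbytes quantization so it fits on a smaller GPU. The model name is only an example, and the snippet assumes bitsandbytes, accelerate, and a CUDA GPU are available.

```python
# Hypothetical sketch of "optimization era" documentation content:
# loading a large model in 4-bit so it fits on a smaller GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-v0.1"  # example model; any causal LM works
quant_config = BitsAndBytesConfig(load_in_4bit=True)  # quantize weights to 4-bit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers across available devices
)
```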

Each era incrementally added new content to the documentation, unbalancing and obscuring its previous parts. Content sprawls over a greater surface, and navigation is more complex.

In the tree ring model, new content is layered progressively over the previous content, whereas in the integrated model, content coexists as part of the overall documentation.

A redesign will help rebalance the overall documentation experience. Content will feel native and integrated rather than added on.



Next steps

This post explored the reasoning and motivation behind our quest to redesign the Transformers documentation.

Stay tuned for the next post, which identifies the mess in more detail and answers important questions such as: who are the intended users and stakeholders, what is the current state of the content, and how is it being interpreted?


Shout out to [@evilpingwin](https://x.com/evilpingwin) for the feedback and motivation to revamp the docs.




