TL;DR
Hugging Face AI Sheets is a new, open-source tool for building, enriching, and transforming datasets using AI models with no code. The tool can be deployed locally or on the Hub. It lets you use thousands of open models from the Hugging Face Hub via Inference Providers or local models, including gpt-oss from OpenAI!
Useful links
Try the tool for free (no installation required): https://huggingface.co/spaces/aisheets/sheets
Install and run locally: https://github.com/huggingface/sheets
What is AI Sheets
AI Sheets is a no-code tool for building, transforming, and enriching datasets using (open) AI models. It’s tightly integrated with the Hub and the open-source AI ecosystem.
AI Sheets uses an easy-to-learn user interface, similar to a spreadsheet. The tool is built around quick experimentation, starting with small datasets before running long or costly data generation pipelines.
In AI Sheets, new columns are created by writing prompts. You can iterate as many times as you need, and edit or validate cells to teach the model what you want. But more on this later!
What can I use it for
You can use AI Sheets to:
Compare and vibe test models. Imagine you want to test the latest models on your data. You can import a dataset with prompts/questions and create several columns (one per model) with a prompt like this: Answer the following: {{prompt}}, where prompt is a column in your dataset. You can validate the results manually or create a new column with an LLM-as-a-judge prompt like this: Evaluate the responses to the following question: {{prompt}}. Response 1: {{model1}}. Response 2: {{model2}}, where model1 and model2 are columns in your dataset with different model responses.
Improve prompts for your data and specific models. Imagine you want to build an application to process customer requests and give automatic answers. You can load a sample dataset with customer requests and start iterating with different prompts and models to generate responses. One cool feature of AI Sheets is that you can provide feedback by editing or validating cells. These example cells will be added to your prompts automatically. You can think of it as a tool to fine-tune prompts and add few-shot examples to them very efficiently, by looking at your data in real time!
Transform a dataset. Imagine you want to clean up a column of your dataset. You can add a new column with a prompt like Remove extra punctuation marks from the following text: {{text}}, where text is a column in your dataset containing the texts you want to clean up.
Classify a dataset. Imagine you want to classify some content in your dataset. You can add a new column with a prompt like Categorize the following text: {{text}}, where text is a column in your dataset containing the texts you want to categorize.
Analyze a dataset. Imagine you want to extract the main ideas in your dataset. You can add a new column with a prompt like this: Extract the most important ideas from the following: {{text}}, where text is a column in your dataset containing the texts you want to analyze.
Enrich a dataset. Imagine you have a dataset with addresses that are missing zip codes. You can add a new column with a prompt like this: Find the zip code of the following address: {{address}} (in this case, you must enable the “Search the web” option to ensure accurate results).
Generate a synthetic dataset. Imagine you need a dataset with realistic emails, but the data is not available for data privacy reasons. You can create a dataset with a prompt like this: Write a short description of a professional in the field of pharma companies and name the column person_bio. Then you can create another column with a prompt like this: Write a realistic professional email as it was written by the following person: {{person_bio}}.
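To make the {{column}} placeholders used in the prompts above concrete, here is a minimal Python sketch of how a prompt template could be filled in for each row. AI Sheets’ own templating code is not shown in this post, so `render_prompt` and `PLACEHOLDER` are hypothetical names used only for illustration.

```python
import re

# Hypothetical pattern: matches {{column}} placeholders, allowing names
# such as "person_bio" or "gpt-oss".
PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with that column's value in `row`."""

    def substitute(match):
        column = match.group(1)
        if column not in row:
            raise KeyError(f"Row has no column named {column!r}")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


# Two example rows with a "text" column (made-up data).
rows = [
    {"text": "Hello!!! How are you???"},
    {"text": "Fine... thanks!!"},
]
template = "Remove extra punctuation marks from the following text: {{text}}"

# One prompt per row, which is what a new AI column would send to the model.
prompts = [render_prompt(template, row) for row in rows]
for prompt in prompts:
    print(prompt)
```

Raising an error for an unknown column is a design choice in this sketch: a silent empty substitution would produce prompts that look valid but ask the model about nothing.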
Now let’s dive into how to use it!
How to use it
AI Sheets gives you two ways to start: import existing data or generate a dataset from scratch. Once your data is loaded, you can refine it by adding columns, editing cells, and regenerating content.
Getting started
To get started, you need to either create a dataset from scratch by describing it in natural language or import an existing dataset.
Generate Dataset from Scratch
Best for: Familiarizing yourself with AI Sheets, brainstorming, rapid experiments, and creating test datasets.
Think of this as an auto-dataset or prompt-to-dataset feature: you describe what you want, and AI Sheets creates the whole dataset structure and content for you.
When to use this:
- You are exploring AI Sheets for the first time
- You need synthetic data for testing or prototyping
- Data accuracy and variety are not critical (e.g., brainstorming use cases, quick research, generating test datasets)
- You want to experiment with ideas quickly
How it works:
- Describe the dataset you want in the prompt area
- Example: “A list of fictional startups with name, industry, and slogan”
- AI Sheets generates the schema and creates 5 sample rows
- Extend up to 1,000 rows or modify the prompt to change the structure
Example
If you type this prompt: cities of the world, alongside countries they belong to and a landmark image for each, generated in Ghibli style:
AI Sheets will automatically generate a dataset with three columns, as shown below:

This dataset contains only five rows, but you can add more cells by dragging down on each column, including the image one! You can also write items in any of the cells and complete the others by dragging.
The next sections will show you how to iterate on and expand the dataset.
Import your dataset (recommended)
Best for: Most use cases where you want to transform, classify, enrich, and analyze real-world data.
This is recommended for most use cases, as importing real data gives you more control and flexibility than starting from scratch.
When to use this:
- You have existing data to transform or enrich using AI models
- You want to generate synthetic data, and accuracy and variety are important
How it works:
- Upload your data in XLS, TSV, CSV, or Parquet format
- Ensure your file includes at least one column name and one row of data
- Upload up to 1,000 rows (unlimited columns)
- Your data appears in a familiar spreadsheet format
Pro tip: If your file contains minimal data, you can manually add more entries by typing directly into the spreadsheet.
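Before uploading, the requirements listed above (at least one column name, at least one row of data, at most 1,000 rows) can be checked locally. The following is a rough Python sketch for CSV files only. The helper `check_csv_for_import` is hypothetical and is not part of AI Sheets, whose own validation logic may differ.

```python
import csv
import io

MAX_ROWS = 1000  # upload limit stated in the list above


def check_csv_for_import(csv_text: str) -> list:
    """Return a list of problems; an empty list means the stated requirements are met."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))

    # The first line is treated as the header row.
    header = next(reader, None)
    if header is None or not any(name.strip() for name in header):
        problems.append("no column name found")

    # Count only rows that contain at least one non-empty value.
    data_rows = sum(1 for row in reader if any(cell.strip() for cell in row))
    if data_rows == 0:
        problems.append("no data rows found")
    elif data_rows > MAX_ROWS:
        problems.append(f"{data_rows} rows exceeds the {MAX_ROWS}-row limit")
    return problems


# Made-up sample file with one header line and one data row.
sample = "address,city\n221B Baker Street,London\n"
print(check_csv_for_import(sample))
```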
Working with your dataset
Once your data is loaded (regardless of how you started), you will see it in an editable spreadsheet interface. Here’s what you need to know:
Understanding AI Sheets
- Imported cells: Manually editable but cannot be modified by AI prompts
- AI-generated cells: Can be regenerated and refined using prompts and your feedback (edits + thumbs-up)
- New columns: Always AI-powered and fully customizable
Getting started with AI columns
- Click the “+” button to add a new column
- Pick from recommended actions:
- Extract specific information
- Summarize long text
- Translate content
- Or write custom prompts with “Do something with {{column}}”
Refining and expanding the dataset
Now that you have AI columns, you can improve their results and expand your data. You can improve results by providing feedback through manual edits and likes or by adjusting the column configuration. Both require regeneration to take effect.
1. How to add more cells
- Drag down: From the last cell in a column to generate additional rows immediately
- No regeneration needed – new cells are created immediately
- You can use this to regenerate errored cells too
2. Manual editing and feedback
- Edit cells: Click any cell to edit content directly – this gives the model examples of your preferred output
- Like results: Use thumbs-up to mark examples of good output
- Regenerate to apply feedback to other cells in the column.
Under the hood, these manually edited and liked cells are used as few-shot examples whenever you regenerate or add more cells in the column!
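As an illustration of the few-shot idea, here is a small Python sketch that assembles a prompt from validated cells. The section layout loosely follows the `prompt` field in the exported config shown later in this post. The function `build_few_shot_prompt` and the sample data are assumptions made for this example, not AI Sheets’ actual implementation.

```python
def build_few_shot_prompt(instruction: str, examples: list, column: str, value: str) -> str:
    """Place validated (edited or liked) cells before the new input as few-shot examples."""
    parts = ["# Examples"]
    for example in examples:
        parts += [
            "## Example",
            "### Input",
            f"{column}: {example['input']}",
            "### Output",
            example["output"],
        ]
    # Fill the {{column}} placeholder of the instruction with the new cell's input.
    filled = instruction.replace("{{" + column + "}}", value)
    parts += ["# User instruction", filled, "# Your response"]
    return "\n".join(parts)


# Cells the user edited or liked (made-up data for illustration).
validated = [
    {
        "input": "Find the base of a parallelogram with area 420 and height 35.",
        "output": "Mathematics, Geometry",
    },
]

prompt = build_few_shot_prompt(
    "Categorize the main topics of the following query:\n{{query}}",
    validated,
    column="query",
    value="How many primes are below 100?",
)
print(prompt)
```

Each new edit or like would add one more `## Example` block, which is why feedback only shows up in other cells after you regenerate.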
3. Adjust column configuration
Change the prompt, switch models or providers, or modify settings, then regenerate to get better results.
Rewrite the prompt
- Each column has its own generation prompt
- Edit it anytime to change or improve the output
- The column regenerates with new results
Switch models/providers
- Try different models to compare their performance.
- Some are more accurate, creative, or structured than others for specific tasks.
- Some providers have faster inference and different context lengths; test different providers for the chosen model.
Toggle Search
- Enable: Model pulls up-to-date information from the web
- Disable: Offline, model-only generation
Exporting your final dataset to the Hub
Once you’re happy with your new dataset, export it to the Hub! This has the additional benefit of generating a config file you can reuse for (1) generating more data with HF Jobs using this script, and (2) reusing the prompts for downstream applications, including the few-shot examples from your edited and liked cells.
Here’s an example dataset created with AI Sheets, which produces this config.
Running data generation scripts using HF Jobs
If you want to generate a larger dataset, you can use the above-mentioned config and script, like this:
hf jobs uv run \
  -s HF_TOKEN=$HF_TOKEN \
  https://huggingface.co/datasets/aisheets/uv-scripts/raw/main/extend_dataset/script.py \
  --config https://huggingface.co/datasets/dvilasuero/nemotron-personas-kimi-questions/raw/main/config.yml \
  --num-rows 100 \
  nvidia/Nemotron-Personas dvilasuero/nemotron-kimi-qa-distilled
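If you prefer to launch the same job from Python, for example to loop over several configs, one option is to build the argument list and pass it to `subprocess`. The sketch below only assembles and prints the command. The function name `build_extend_command` and the placeholder token are assumptions for illustration, while the flags and URLs are copied from the command above.

```python
import shlex

# Script URL copied from the command above.
SCRIPT_URL = "https://huggingface.co/datasets/aisheets/uv-scripts/raw/main/extend_dataset/script.py"


def build_extend_command(config_url: str, source: str, target: str, num_rows: int, token: str) -> list:
    """Assemble the argument list for the `hf jobs uv run` invocation."""
    return [
        "hf", "jobs", "uv", "run",
        "-s", f"HF_TOKEN={token}",
        SCRIPT_URL,
        "--config", config_url,
        "--num-rows", str(num_rows),
        source, target,
    ]


command = build_extend_command(
    config_url="https://huggingface.co/datasets/dvilasuero/nemotron-personas-kimi-questions/raw/main/config.yml",
    source="nvidia/Nemotron-Personas",
    target="dvilasuero/nemotron-kimi-qa-distilled",
    num_rows=100,
    token="hf_xxx",  # placeholder; read the real token from your environment
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
# To actually launch the job: subprocess.run(command, check=True)
```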
Examples
This section provides examples of datasets you can build with AI Sheets to inspire your next project.
Vibe testing and comparing models
AI Sheets is the perfect companion if you want to test the latest models on the prompts and data you care about.
You just need to import a dataset (or create one from scratch) and then add a column for each model you want to test.
Then, you can either inspect the results manually or add a column that uses LLMs to judge the quality of each model.
Below is an example comparing open frontier models on mini web apps. AI Sheets lets you see the interactive results and play with each app. Additionally, the dataset includes several columns that use LLMs to judge and compare the quality of the apps.
Example dataset exported from a session like the one we just described: https://huggingface.co/datasets/dvilasuero/jsvibes-qwen-gpt-oss-judged
Config:
columns:
  gpt-oss:
    modelName: openai/gpt-oss-120b
    modelProvider: groq
    userPrompt: Create a complete, runnable HTML+JS file implementing {{description}}
    searchEnabled: false
    columnsReferences:
      - description
  eval-qwen-coder:
    modelName: Qwen/Qwen3-Coder-480B-A35B-Instruct
    modelProvider: cerebras
    userPrompt: "Please compare the two apps and tell me which one is better and why:\n\nApp description:\n\n{{description}}\n\nmodel 1:\n\n{{qwen3-coder}}\n\nmodel 2:\n\n{{gpt-oss}}\n\nKeep it very short and focus on whether they work well for the purpose, make sure they work and are not incomplete, and the code quality, not on visual appeal and unrequested features. Assume the models might provide non working solutions, so be careful to assess that\n\nRespond with:\n\nchosen: {model 1, model 2}\n\nreason: ..."
    searchEnabled: false
    columnsReferences:
      - gpt-oss
      - description
      - qwen3-coder
  eval-gpt-oss:
    modelName: openai/gpt-oss-120b
    modelProvider: groq
    userPrompt: "Please compare the two apps and tell me which one is better and why:\n\nApp description:\n\n{{description}}\n\nmodel 1:\n\n{{qwen3-coder}}\n\nmodel 2:\n\n{{gpt-oss}}\n\nKeep it very short and focus on whether they work well for the purpose, make sure they work and are not incomplete, and the code quality, not on visual appeal and unrequested features. Assume the models might provide non working solutions, so be careful to assess that\n\nRespond with:\n\nchosen: {model 1, model 2}\n\nreason: ..."
    searchEnabled: false
    columnsReferences:
      - gpt-oss
      - description
      - qwen3-coder
  eval-kimi:
    modelName: moonshotai/Kimi-K2-Instruct
    modelProvider: groq
    userPrompt: "Please compare the two apps and tell me which one is better and why:\n\nApp description:\n\n{{description}}\n\nmodel 1:\n\n{{qwen3-coder}}\n\nmodel 2:\n\n{{gpt-oss}}\n\nKeep it very short and focus on whether they work well for the purpose, make sure they work and are not incomplete, and the code quality, not on visual appeal and unrequested features. Assume the models might provide non working solutions, so be careful to assess that\n\nRespond with:\n\nchosen: {model 1, model 2}\n\nreason: ..."
    searchEnabled: false
    columnsReferences:
      - gpt-oss
      - description
      - qwen3-coder
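The judge prompts above ask for a response containing `chosen:` and `reason:` lines, which makes the verdicts easy to tally once the dataset is exported. Here is a minimal Python sketch of such a tally. The helper `parse_choice` and the sample judge outputs are made up for illustration, and real judge responses may deviate from the requested format.

```python
import re
from collections import Counter

# Matches "chosen: model 1" or "chosen: model 2", ignoring case and spacing.
VERDICT = re.compile(r"chosen:\s*model\s*([12])", re.IGNORECASE)


def parse_choice(judge_output: str):
    """Return 'model 1' or 'model 2', or None when the judge ignored the format."""
    match = VERDICT.search(judge_output)
    if match is None:
        return None
    return f"model {match.group(1)}"


# Made-up judge responses; the last one does not follow the requested format.
judge_outputs = [
    "chosen: model 1\n\nreason: model 2 is incomplete",
    "Chosen: Model 2\n\nreason: cleaner code",
    "Both apps fail to load.",
]

# Count verdicts; responses without a parseable verdict are counted under None.
tally = Counter(parse_choice(text) for text in judge_outputs)
print(tally)
```

Keeping unparseable responses under `None` instead of dropping them gives a quick signal of how often a judge model fails to follow the output format.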
Add categories to a Hub dataset
AI Sheets can also augment existing datasets and help you with quick data analysis and data science projects that involve analyzing text datasets.
Here’s an example of adding categories to an existing Hub dataset.
A cool feature is that you can validate or manually edit the initial categorization outputs and regenerate the full column to improve the results, as seen below:
Config:
columns:
  category:
    modelName: moonshotai/Kimi-K2-Instruct
    modelProvider: groq
    userPrompt: |-
      Categorize the main topics of the following query:
      {{query}}
    prompt: "
      You are a rigorous, intelligent data-processing engine. Generate only the
      requested response format, with no explanations following the user
      instruction. You might be provided with positive, accurate examples of how
      the user instruction must be completed.
      # Examples
      The following are correct, accurate example outputs with respect to the
      user instruction:
      ## Example
      ### Input
      query: Given the area of a parallelogram is 420 square centimeters and
      its height is 35 cm, find the corresponding base. Show all work and label
      your answer.
      ### Output
      Mathematics – Geometry
      ## Example
      ### Input
      query: What is the minimum number of red squares required to ensure
      that each of $n$ green axis-parallel squares intersects 4 red squares,
      assuming the green squares can be scaled and translated arbitrarily
      without intersecting each other?
      ### Output
      Geometry, Combinatorics
      # User instruction
      Categorize the main topics of the following query:
      {{query}}
      # Your response
      "
    searchEnabled: false
    columnsReferences:
      - query
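When reusing or hand-editing an exported config, it can help to check that every placeholder in `userPrompt` is also listed under `columnsReferences`. The following Python sketch does this on a plain dictionary that mirrors one column entry from the config above. The helper `missing_references` is hypothetical, and whether AI Sheets itself requires this consistency is an assumption.

```python
import re


def missing_references(column_config: dict) -> set:
    """Return placeholders used in userPrompt that are absent from columnsReferences."""
    used = set(re.findall(r"\{\{\s*([\w-]+)\s*\}\}", column_config["userPrompt"]))
    return used - set(column_config.get("columnsReferences", []))


# One column entry, written as a dict that mirrors the config structure.
category = {
    "modelName": "moonshotai/Kimi-K2-Instruct",
    "modelProvider": "groq",
    "userPrompt": "Categorize the main topics of the following query:\n{{query}}",
    "columnsReferences": ["query"],
}

# An empty set means every placeholder has a matching reference.
print(missing_references(category))
```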
Evaluate models with LLMs-as-Judge
Another use case is evaluating the outputs of models using an LLM-as-a-judge approach. This can be useful for comparing models or assessing the quality of an existing dataset, for example, when you plan to fine-tune a model on a dataset from the Hugging Face Hub.
In the first example, we combined vibe testing with a judge LLM column. Here’s the judge prompt:
Example dataset: https://huggingface.co/datasets/dvilasuero/jsvibes-qwen-gpt-oss-judged
Config:
columns:
  object_name:
    modelName: meta-llama/Llama-3.3-70B-Instruct
    modelProvider: groq
    userPrompt: Generate the name of a common day to day object
    searchEnabled: false
    columnsReferences: []
  object_description:
    modelName: meta-llama/Llama-3.3-70B-Instruct
    modelProvider: groq
    userPrompt: Describe a {{object_name}} with adjectives and short word groups separated by commas. No more than 10 words
    searchEnabled: false
    columnsReferences:
      - object_name
  object_image_with_desc:
    modelName: multimodalart/isometric-skeumorphic-3d-bnb
    modelProvider: fal-ai
    userPrompt: RBNBICN, icon, white background, isometric perspective, {{object_name}} , {{object_description}}
    searchEnabled: false
    columnsReferences:
      - object_description
      - object_name
  object_image_without_desc:
    modelName: multimodalart/isometric-skeumorphic-3d-bnb
    modelProvider: fal-ai
    userPrompt: "RBNBICN, icon, white background, isometric perspective, {{object_name}} "
    searchEnabled: false
    columnsReferences:
      - object_name
  glowing_colors:
    modelName: multimodalart/isometric-skeumorphic-3d-bnb
    modelProvider: fal-ai
    userPrompt: "RBNBICN, icon, white background, isometric perspective, {{object_name}}, glowing colours "
    searchEnabled: false
    columnsReferences:
      - object_name
  flux:
    modelName: black-forest-labs/FLUX.1-dev
    modelProvider: fal-ai
    userPrompt: Create an isometric icon for the object {{object_name}} based on {{object_description}}
    searchEnabled: false
    columnsReferences:
      - object_description
      - object_name
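The `columnsReferences` entries in this config describe which columns each column reads from, so they also imply an order in which columns can be generated. The Python sketch below derives one valid order with the standard library’s `graphlib`. How AI Sheets actually schedules generation is not described in this post, so treat this as one possible approach and not the tool’s implementation.

```python
from graphlib import TopologicalSorter

# columnsReferences copied from the config above:
# each column maps to the columns it reads from.
references = {
    "object_name": [],
    "object_description": ["object_name"],
    "object_image_with_desc": ["object_description", "object_name"],
    "object_image_without_desc": ["object_name"],
    "glowing_colors": ["object_name"],
    "flux": ["object_description", "object_name"],
}

# static_order() yields every column after all the columns it depends on.
order = list(TopologicalSorter(references).static_order())
print(order)
```

Several valid orders exist here, since the image columns do not depend on each other; the only fixed constraints are that `object_name` comes first and `object_description` precedes the columns that reference it.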
Next steps
You can try AI Sheets without installing anything or download and deploy it locally from the GitHub repo. To run it locally and get the most out of it, we recommend subscribing to PRO to get 20x monthly inference usage.
If you have questions or suggestions, let us know in the Community tab or by opening an issue on GitHub.









