Introducing SyGra Studio




SyGra 2.0.0 introduces Studio, an interactive environment that turns synthetic data generation into a transparent, visual craft. Instead of juggling YAML files and terminals, you compose flows directly on the canvas, preview datasets before committing, tune prompts with inline variable hints, and watch executions stream live, all from a single pane. Under the hood it is the same platform, so everything you do visually generates the corresponding SyGra-compatible graph config and task executor scripts.



What Studio lets you do

  1. Configure and validate models with guided forms (OpenAI, Azure OpenAI, Ollama, Vertex, Bedrock, vLLM, custom endpoints).
  2. Connect Hugging Face, file-system, or ServiceNow data sources and preview rows before execution.
  3. Configure nodes by choosing models, writing prompts (with auto-suggested variables), and defining outputs or structured schemas.
  4. Design downstream outputs using shared state variables and Pydantic-powered mappings.
  5. Execute flows end-to-end and review generated results immediately with node-level progress.
  6. Debug with inline logs, breakpoints, Monaco-backed code editors, and auto-saved drafts.
  7. Monitor per-run token cost, latency, and guardrail outcomes with execution history stored in .executions/.

Let’s walk through this experience step-by-step.




Step 1: Configure the data source

Open Studio, click Create Flow, and Start/End nodes appear automatically. Before adding anything else:

  • Select a connector (Hugging Face, disk, or ServiceNow).
  • Enter parameters like repo_id, split, or file path, then click Preview to fetch sample rows.
  • Column names immediately become state variables (e.g., {prompt}, {genre}), so you know exactly what can be referenced inside prompts and processors.

Once validated, Studio keeps the configuration in sync and pipes those variables throughout the flow—no manual wiring or guesswork.
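For orientation, the data-source section of the generated config might look roughly like the sketch below. The field names are illustrative rather than a guaranteed match for the exact SyGra schema, and the dataset name is hypothetical; the Code Panel shows the real artifact for your flow:

```yaml
data_config:
  source:
    type: hf                         # Hugging Face connector
    repo_id: username/story-prompts  # hypothetical dataset
    split: train
```

Every column in the previewed rows (here, say, prompt and genre) then becomes addressable downstream as {prompt} or {genre}.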




Step 2: Build the flow visually

Drag the blocks you need from the palette. For a story-generation pipeline:

  1. Drop an LLM node named “Story Generator,” select a configured model (say, gpt-4o-mini), write the prompt, and store the result in story_body.
  2. Add a second LLM node named “Story Summarizer,” reference {story_body} inside the prompt, and output to story_summary.
  3. Toggle structured outputs, attach tools, or add Lambda/Subgraph nodes if you need reusable logic or branching behavior.

Studio’s detail panel keeps everything in context: model parameters, prompt editor, tool configuration, pre/post-process code, and even multi-LLM settings if you want parallel generations. Typing { inside a prompt surfaces every available state variable immediately.
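To make the generated artifact concrete, a two-node graph like the one above might serialize to something in this shape. This is a sketch only; the key names are illustrative, not the exact SyGra schema, so check the Code Panel for what Studio actually emits:

```yaml
graph_config:
  nodes:
    story_generator:
      node_type: llm
      model:
        name: gpt-4o-mini
      prompt:
        - user: "Write a short story based on: {prompt}"
      output_keys: story_body
    story_summarizer:
      node_type: llm
      model:
        name: gpt-4o-mini
      prompt:
        - user: "Summarize this story in two sentences: {story_body}"
      output_keys: story_summary
  edges:
    - from: START
      to: story_generator
    - from: story_generator
      to: story_summarizer
    - from: story_summarizer
      to: END
```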




Step 3: Review and run

Open the Code Panel to examine the exact YAML/JSON Studio is generating. This is the same artifact written to tasks/examples/, so what you see is what gets committed.

Once you’re ready to execute:

  • Click Run Workflow.
  • Select record counts, batch sizes, retry behavior, and so on.
  • Hit Run and watch the Execution panel stream node status, token usage, latency, and cost in real time. Detailed logs provide observability and make debugging effortless. All executions are written to .executions/runs/*.json.

After the run, download outputs, compare against prior executions, and review latency and token-usage metadata.
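Because each run is persisted as JSON under .executions/runs/, run records are easy to post-process with a few lines of Python. The field names below ("nodes", "tokens", "latency_s") are hypothetical stand-ins, so inspect an actual run file and adapt the keys to the real schema:

```python
import json
from pathlib import Path

def summarize_run(path: Path) -> dict:
    """Aggregate per-node token and latency stats from one run record.

    Assumes a hypothetical schema with a top-level "nodes" list; adjust
    the key names to match your actual .executions/runs/*.json files.
    """
    record = json.loads(path.read_text())
    nodes = record.get("nodes", [])
    return {
        "run_id": record.get("run_id"),
        "total_tokens": sum(n.get("tokens", 0) for n in nodes),
        "total_latency_s": round(sum(n.get("latency_s", 0.0) for n in nodes), 3),
        "node_count": len(nodes),
    }

# Synthetic record standing in for a real run file.
sample = {
    "run_id": "demo-001",
    "nodes": [
        {"name": "generate_answer", "tokens": 512, "latency_s": 1.8},
        {"name": "critique_answer", "tokens": 304, "latency_s": 1.2},
    ],
}
path = Path("demo_run.json")
path.write_text(json.dumps(sample))
print(summarize_run(path))
```

The same pattern extends to comparing two runs side by side, e.g. diffing their total token counts before promoting a prompt change.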



See it in action!




Running Existing Workflows



Run the Glaive Code Assistant workflow

SyGra Studio can also execute existing workflows under tasks/. For instance, the tasks/examples/glaive_code_assistant/ workflow ingests the glaiveai/glaive-code-assistant-v2 dataset, drafts answers, critiques them, and loops until the critique returns “NO MORE FEEDBACK.”

Inside Studio you’ll notice:

  1. Canvas layout – two LLM nodes (generate_answer and critique_answer) linked by a conditional edge that either routes back for more revisions or exits to END when the critique is satisfied.
  2. Tunable inputs – the Run modal lets you switch dataset splits, adjust batch sizes, cap records, or tweak temperatures without touching YAML.
  3. Observable execution – watch both nodes light up in sequence, inspect intermediate critiques, and monitor status in real time.
  4. Generated outputs – synthetic data ready for model training, evaluation pipelines, or annotation tools.
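The loop-until-satisfied behavior hinges on that conditional edge. The routing decision can be sketched as a plain Python predicate; the function name and the "critique" state key here are illustrative, not the exact identifiers the SyGra task uses:

```python
def route_after_critique(state: dict) -> str:
    """Route back to the answer generator until the critique is satisfied.

    Returns the name of the next node; "END" exits the revision loop.
    The "critique" state key is a hypothetical stand-in for the real one.
    """
    critique = state.get("critique", "")
    if "NO MORE FEEDBACK" in critique.upper():
        return "END"
    return "generate_answer"

print(route_after_critique({"critique": "The answer misses edge cases."}))
print(route_after_critique({"critique": "No more feedback."}))
```

In Studio this logic shows up as the branch on the canvas: one path loops back to generate_answer, the other exits to END.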



Get started

git clone https://github.com/ServiceNow/SyGra.git
cd SyGra && make studio

SyGra Studio turns synthetic data workflows into a visual, user-friendly experience. Configure once, build with confidence, run with full observability, and generate data without ever leaving the canvas.


