Introducing SyGra Studio




SyGra 2.0.0 introduces Studio, an interactive environment that turns synthetic data generation into a transparent, visual craft. Instead of juggling YAML files and terminals, you compose flows directly on the canvas, preview datasets before committing, tune prompts with inline variable hints, and watch executions stream live, all from a single pane. Under the hood it is the same platform, so everything you do visually generates the corresponding SyGra-compatible graph config and task executor scripts.



What Studio lets you do

  1. Configure and validate models with guided forms (OpenAI, Azure OpenAI, Ollama, Vertex, Bedrock, vLLM, custom endpoints).
  2. Connect Hugging Face, file-system, or ServiceNow data sources and preview rows before execution.
  3. Configure nodes by choosing models, writing prompts (with auto-suggested variables), and defining outputs or structured schemas.
  4. Design downstream outputs using shared state variables and Pydantic-powered mappings.
  5. Execute flows end-to-end and review generated results immediately with node-level progress.
  6. Debug with inline logs, breakpoints, Monaco-backed code editors, and auto-saved drafts.
  7. Monitor per-run token cost, latency, and guardrail outcomes with execution history stored in .executions/.

Let’s walk through this experience step-by-step.




Step 1: Configure the data source

Open Studio, click Create Flow, and Start/End nodes appear automatically. Before adding anything else:

  • Select a connector (Hugging Face, disk, or ServiceNow).
  • Enter parameters like repo_id, split, or file path, then click Preview to fetch sample rows.
  • Column names immediately become state variables (e.g., {prompt}, {genre}), so you know exactly what can be referenced inside prompts and processors.

Once validated, Studio keeps the configuration in sync and pipes those variables throughout the flow—no manual wiring or guesswork.
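For orientation, the data-source section of the generated config might look roughly like the sketch below. The field names are illustrative rather than a guaranteed match for the exact SyGra schema, and the dataset name is hypothetical; the Code Panel shows the real artifact for your flow:

```yaml
data_config:
  source:
    type: hf                         # Hugging Face connector
    repo_id: username/story-prompts  # hypothetical dataset
    split: train
```

Every column in the previewed rows (here, say, prompt and genre) then becomes addressable downstream as {prompt} or {genre}.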




Step 2: Build the flow visually

Drag the blocks you need from the palette. For a story-generation pipeline:

  1. Drop an LLM node named “Story Generator,” select a configured model (say, gpt-4o-mini), write the prompt, and store the result in story_body.
  2. Add a second LLM node named “Story Summarizer,” reference {story_body} inside the prompt, and output to story_summary.
  3. Toggle structured outputs, attach tools, or add Lambda/Subgraph nodes if you need reusable logic or branching behavior.

Studio’s detail panel keeps everything in context: model parameters, prompt editor, tool configuration, pre/post-process code, and even multi-LLM settings if you want parallel generations. Typing { inside a prompt surfaces every available state variable immediately.
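To make the generated artifact concrete, a two-node graph like the one above might serialize to something in this shape. This is a sketch only; the key names are illustrative, not the exact SyGra schema, so check the Code Panel for what Studio actually emits:

```yaml
graph_config:
  nodes:
    story_generator:
      node_type: llm
      model:
        name: gpt-4o-mini
      prompt:
        - user: "Write a short story based on: {prompt}"
      output_keys: story_body
    story_summarizer:
      node_type: llm
      model:
        name: gpt-4o-mini
      prompt:
        - user: "Summarize this story in two sentences: {story_body}"
      output_keys: story_summary
  edges:
    - from: START
      to: story_generator
    - from: story_generator
      to: story_summarizer
    - from: story_summarizer
      to: END
```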




Step 3: Review and run

Open the Code Panel to examine the exact YAML/JSON Studio is generating. This is the same artifact written to tasks/examples/, so what you see is what gets committed.

Once you’re ready to execute:

  • Click Run Workflow.
  • Select record counts, batch sizes, retry behavior, and so on.
  • Hit Run and watch the Execution panel stream node status, token usage, latency, and cost in real time. Detailed logs provide observability and make debugging effortless. All executions are written to .executions/runs/*.json.

After the run, download outputs, compare against prior executions, and review latency and token-usage metadata.
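Because each run is persisted as JSON under .executions/runs/, run records are easy to post-process with a few lines of Python. The field names below ("nodes", "tokens", "latency_s") are hypothetical stand-ins, so inspect an actual run file and adapt the keys to the real schema:

```python
import json
from pathlib import Path

def summarize_run(path: Path) -> dict:
    """Aggregate per-node token and latency stats from one run record.

    Assumes a hypothetical schema with a top-level "nodes" list; adjust
    the key names to match your actual .executions/runs/*.json files.
    """
    record = json.loads(path.read_text())
    nodes = record.get("nodes", [])
    return {
        "run_id": record.get("run_id"),
        "total_tokens": sum(n.get("tokens", 0) for n in nodes),
        "total_latency_s": round(sum(n.get("latency_s", 0.0) for n in nodes), 3),
        "node_count": len(nodes),
    }

# Synthetic record standing in for a real run file.
sample = {
    "run_id": "demo-001",
    "nodes": [
        {"name": "generate_answer", "tokens": 512, "latency_s": 1.8},
        {"name": "critique_answer", "tokens": 304, "latency_s": 1.2},
    ],
}
path = Path("demo_run.json")
path.write_text(json.dumps(sample))
print(summarize_run(path))
```

The same pattern extends to comparing two runs side by side, e.g. diffing their total token counts before promoting a prompt change.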



See it in action!




Running Existing Workflows



Run the Glaive Code Assistant workflow

SyGra Studio can also execute existing workflows under tasks/. For instance, the tasks/examples/glaive_code_assistant/ workflow ingests the glaiveai/glaive-code-assistant-v2 dataset, drafts answers, critiques them, and loops until the critique returns “NO MORE FEEDBACK.”

Inside Studio you’ll notice:

  1. Canvas layout – two LLM nodes (generate_answer and critique_answer) linked by a conditional edge that either routes back for more revisions or exits to END when the critique is satisfied.
  2. Tunable inputs – the Run modal lets you switch dataset splits, adjust batch sizes, cap records, or tweak temperatures without touching YAML.
  3. Observable execution – watch both nodes light up in sequence, inspect intermediate critiques, and monitor status in real time.
  4. Generated outputs – synthetic data ready for model training, evaluation pipelines, or annotation tools.
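The loop-until-satisfied behavior hinges on that conditional edge. The routing decision can be sketched as a plain Python predicate; the function name and the "critique" state key here are illustrative, not the exact identifiers the SyGra task uses:

```python
def route_after_critique(state: dict) -> str:
    """Route back to the answer generator until the critique is satisfied.

    Returns the name of the next node; "END" exits the revision loop.
    The "critique" state key is a hypothetical stand-in for the real one.
    """
    critique = state.get("critique", "")
    if "NO MORE FEEDBACK" in critique.upper():
        return "END"
    return "generate_answer"

print(route_after_critique({"critique": "The answer misses edge cases."}))
print(route_after_critique({"critique": "No more feedback."}))
```

In Studio this logic shows up as the branch on the canvas: one path loops back to generate_answer, the other exits to END.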



Get started

git clone https://github.com/ServiceNow/SyGra.git
cd SyGra && make studio

SyGra Studio turns synthetic data workflows into a visual, user-friendly experience. Configure once, build with confidence, run with full observability, and generate data without ever leaving the canvas.


