
Graph Theory to Harmonize Model Integration


Optimising multi-model collaboration with graph-based orchestration

Orchestra — photo by Arindam Mahanta on Unsplash

Integrating the capabilities of various AI models unlocks a symphony of potential, from automating complex tasks that require multiple abilities like vision, speech, writing, and synthesis to enhancing decision-making processes. Yet, orchestrating these collaborations presents a significant challenge in managing the internal relations and dependencies. Traditional linear approaches often fall short, struggling to manage the intricacies of diverse models and dynamic dependencies.

By translating your machine learning workflow into a graph, you gain a visualisation of how each model interacts and contributes to an overall outcome that combines natural language processing, computer vision, and speech models. With the graph approach, nodes represent models or tasks, and edges define the dependencies between them. This graph-based mapping offers several benefits, such as identifying which models depend on the output of others and leveraging parallel processing for independent tasks. Moreover, we can execute the tasks using existing graph navigation strategies, like breadth-first or depth-first search, according to the task priorities.

The road to harmonious AI model collaboration is not without hurdles. Imagine conducting an orchestra where the players speak different languages and the instruments operate independently. This challenge mirrors the communication gaps when integrating diverse AI models, requiring a framework to manage the relations and determine which models can receive each input format.

The graph-based orchestration approach opens doors to exciting possibilities across various domains:

Collaborative tasks for drug discovery

Diagram of three models collaborating as part of a data analysis task — image by author

Researchers can speed up the drug discovery process with a sequence of AI-powered assistants, each designed for a specific task; consider, for instance, a three-step discovery mission. The first step involves a language model that scans vast scientific data to highlight potential protein targets strongly linked to specific diseases. Next, a vision model clarifies complex diagrams or images, providing detailed insights into the structures of the identified proteins; this visual step is crucial for understanding how potential drugs might interact with the protein. Finally, a third model integrates the input from the language and vision models to predict how chemical compounds might affect the targeted proteins, offering the researchers invaluable insights to steer the process efficiently.

Several challenges emerge during the model integration to deliver the complete pipeline. Extracting relevant images from the scanned content and feeding them to the vision model isn't as simple as it seems; an intermediate processor is required between the text scan and vision tasks to filter the relevant images. Secondly, the analysis task itself has to merge multiple inputs: the data scan output, the vision model's explanation, and the user-specified instructions. This requires a template to combine the information for the language model to process. The following sections describe how to utilise a Python framework to handle these complex relations, and a minimal sketch of this pipeline appears below.
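To make this concrete, here is a minimal sketch of the three-step pipeline using the intelli components introduced later in this article. The agent missions, task names, and the image-filtering helper are illustrative assumptions, and the exact vision wiring will depend on your provider and model choice.

import os
from intelli.flow.agents import Agent
from intelli.flow.tasks import Task
from intelli.flow.input import TextTaskInput
from intelli.flow.flow import Flow

KEY = os.getenv("OPENAI_API_KEY")

# One agent per discovery step; missions and model choices are illustrative
scan_agent = Agent(
    agent_type="text", provider="openai",
    mission="highlight protein targets linked to the disease",
    model_params={"key": KEY, "model": "gpt-4"},
)
vision_agent = Agent(
    agent_type="vision", provider="openai",
    mission="explain the identified protein structure diagrams",
    model_params={"key": KEY, "model": "gpt-4"},
)
predict_agent = Agent(
    agent_type="text", provider="openai",
    mission="predict how chemical compounds affect the targets",
    model_params={"key": KEY, "model": "gpt-4"},
)

# Hypothetical intermediate processor: reduce the scan output to the
# content relevant for the vision step (replace with real filtering)
def filter_relevant_images(scan_output):
    return scan_output

scan_task = Task(
    TextTaskInput("Scan the literature for protein targets"),
    scan_agent, log=True,
)
vision_task = Task(
    TextTaskInput("Explain the structures of the identified proteins"),
    vision_agent, pre_process=filter_relevant_images, log=True,
)
predict_task = Task(
    TextTaskInput("Combine the scan and the visual explanation"),
    predict_agent, log=True,
)

# The prediction step depends on both the scan and the vision outputs
flow = Flow(
    tasks={"scan": scan_task, "vision": vision_task, "predict": predict_task},
    map_paths={"scan": ["vision", "predict"], "vision": ["predict"]},
)
# output = await flow.start()  # run inside an async context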

Creative Content Generation

Diagram of four tasks to generate an animation — image by author

Model collaboration can facilitate interactive content creation by integrating elements such as music composition, animation, and design models to generate animated scenes. For instance, in a graph-based collaboration approach, the first task can plan a scene like a director and pass its output to the music and image generation tasks. Finally, an animation model uses the output of the art and music models to generate a short video.

To optimise this process, we aim for parallel execution of the music and graphics generation, since they are independent tasks; there is no need for the music to wait for the graphics to complete. Moreover, we need to handle the varied input formats expected by the animation task. While some models, like Stable Video Diffusion, work with images only, the music can be combined using a post-processor. The dependency map sketched below captures this layout.
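Under the graph approach, this layout reduces to a dependency map. A sketch using intelli's Flow, with the four tasks assumed to be defined as in the sections below (the names here are illustrative), makes the parallelism explicit: the music and image tasks share no edge, so the flow can run them concurrently.

from intelli.flow.flow import Flow

# scene_task, music_task, image_task, and animation_task are assumed
# to be defined with their agents, as shown in the sections below
flow = Flow(
    tasks={
        "scene_task": scene_task,
        "music_task": music_task,
        "image_task": image_task,
        "animation_task": animation_task,
    },
    map_paths={
        # the scene plan feeds both generators, which are independent
        "scene_task": ["music_task", "image_task"],
        # the animation waits for both outputs
        "music_task": ["animation_task"],
        "image_task": ["animation_task"],
    },
)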

These examples provide only a glimpse of the potential of graph theory in model integration. The graph-based approach allows you to tailor multiple tasks to your specific needs and unlock innovative solutions.

Tasks represented with a graph — image by author

Intelli is an open-source Python module to orchestrate AI workflows by leveraging graph principles through three key components:

  1. Agents act as representatives of your AI models. You define each agent by specifying its type (text, image, vision, or speech), its provider (openai, gemini, stability, mistral, etc.), and its mission.
  2. Tasks are individual units inside your AI workflow. Each task leverages an agent to perform a specific action and applies the custom pre-processing and post-processing provided by the user.
  3. Flow binds everything together, orchestrating the execution of your tasks while adhering to the dependencies you have established through the graph structure. Flow management ensures tasks are executed efficiently and in the correct order, enabling both sequential and parallel processing where possible.

Using the flow component to manage the task relations as a graph provides several advantages when connecting multiple models; however, for the case of a single task this may be overkill, and a direct call to the model will be sufficient.

Scaling: As your project grows in complexity, adding more models and tasks requires repetitive code updates to account for data format mismatches and complicated dependencies. The graph approach simplifies this to defining a new node representing the task, and the framework automatically resolves input/output differences to orchestrate the data flow.

Dynamic Adaptation: With traditional approaches, changes to complex tasks impact the entire workflow, requiring adjustments. When using the flow, it handles adding, removing, or modifying connections automatically.

Explainability: The graph empowers a deeper understanding of your AI workflow by visualising how the models interact and by optimising the task path navigation.

Note: the author participated in designing and developing the intelli framework. It is an open-source project with an Apache licence.

Getting Started

First, ensure you have Python 3.7+, as intelli leverages the latest Python asyncio features, and install the module:

pip install intelli

Agents: The Task Executors

Agents in Intelli are designed to interface with a specific AI model. Each agent features a unified input layer to access any model type and accepts a dictionary of custom parameters for the model, such as the maximum output size, the temperature, and the model version.

from intelli.flow.agents import Agent

# Define an agent for a text generation task
text_agent = Agent(
    agent_type="text",
    provider="openai",
    mission="write social media posts",
    model_params={"key": OPENAI_API_KEY, "model": "gpt-4"}
)

Tasks: The Building Blocks

Tasks represent individual units of work or operations to be performed by agents and include the logic to handle the output of the previous task. Each task can be a simple operation, like generating text, or a more complex process, like analysing the sentiment of user feedback.

from intelli.flow.tasks import Task
from intelli.flow.input import TextTaskInput

# Define a task for text generation
task1 = Task(
    TextTaskInput("Create a post about AI technologies"),
    text_agent,
    log=True
)

Processors: Tuned I/O

Processors add an extra layer of control by defining a custom pre-process for the task input and a post-process for the output. The example below demonstrates a function that shortens the text output of the previous step before calling the image model.

class TextProcessor:
    @staticmethod
    def text_head(text, size=800):
        return text[:size]

task2 = Task(
    TextTaskInput("Generate an image about the content"),
    image_agent,
    pre_process=TextProcessor.text_head,
    log=True,
)

Flow: Specifying the Dependencies

Flow translates your AI workflow into a Directed Acyclic Graph (DAG) and leverages graph theory for dependency management. This allows you to easily visualise the task relations and to optimise the execution order of your tasks.

from intelli.flow.flow import Flow

flow = Flow(
    tasks={
        "title_task": title_task,
        "content_task": content_task,
        "keyword_task": keyword_task,
        "theme_task": description_theme_task,
        "image_task": image_task,
    },
    map_paths={
        "title_task": ["keyword_task", "content_task"],
        "content_task": ["theme_task"],
        "theme_task": ["image_task"],
    },
)

output = await flow.start()

The map_paths dictionary dictates the task dependencies, guiding Flow to orchestrate the execution order and ensuring each task receives the necessary output from its predecessors.

Here’s how Flow navigates the nodes:

  1. Mapping the Workflow: Flow constructs a DAG using tasks as nodes and dependencies as edges. This representation clarifies the task execution sequence and the data flow.
  2. Topological Sorting: The flow analyses the graph to determine the optimal execution order. Tasks without incoming dependencies are prioritised, ensuring each task receives the necessary inputs from its predecessors before execution; a simplified sketch of this step follows the list.
  3. Task Execution: The framework iterates through the sorted tasks, executing each with its corresponding input. Based on the dependency map, the inputs might come from previous task outputs and user-defined values.
  4. Input Preparation: Before execution, the task applies any pre-processing functions defined for it, modifying the input data as needed, and then calls the assigned agent.
  5. Output Management: The agent returns an output, which is stored in a dictionary with the task name as its key and returned to the user.
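The topological-sorting step can be illustrated in a few lines of standalone Python. The function below is a generic sketch of Kahn's algorithm over a map_paths-style dictionary, not intelli's internal implementation:

from collections import deque

def topological_order(map_paths):
    # Collect every node, including those that only appear as dependents
    nodes = set(map_paths) | {c for deps in map_paths.values() for c in deps}
    # Count incoming edges per node
    indegree = {n: 0 for n in nodes}
    for deps in map_paths.values():
        for child in deps:
            indegree[child] += 1
    # Start from the tasks with no incoming dependencies
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in map_paths.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(nodes):
        raise ValueError("Cycle detected: the workflow is not a DAG")
    return order

# One valid order for the example flow above:
# ['title_task', 'keyword_task', 'content_task', 'theme_task', 'image_task']
print(topological_order({
    "title_task": ["keyword_task", "content_task"],
    "content_task": ["theme_task"],
    "theme_task": ["image_task"],
}))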

To visualise your flow as a graph:

flow.generate_graph_img()

The visual of the tasks and assigned agents — image by the intelli graph function

Using graph theory transforms the traditional linear approaches to orchestrating AI models, providing a symphony of collaboration between diverse models.

Frameworks like Intelli translate your workflow into a visual representation, where tasks become nodes and dependencies are mapped as edges, creating an overview of your entire process to automate complex tasks.

This approach extends to diverse fields requiring collaborative AI models, including scientific research, business decision automation, and interactive content creation. However, scaling it effectively requires further refinement in managing the data exchange between the models.
