Good morning. It’s Wednesday, October 2nd.
Did you know? On this day in 1991, the AIM alliance between Apple, IBM, and Motorola was formed.
- OpenAI’s DevDay Announcements
- o1 Model Can Handle “5-hour Tasks”
- Liquid Foundation Models
- Copilot AI Event
- 5 Latest AI Tools
- Latest AI Research Papers
You read. We listen. Tell us what you think by replying to this email.
The fastest way to build AI apps
We’re excited to introduce Writer AI Studio, the fastest way to build AI apps, products, and features. Writer’s unique full-stack design makes it easy to prototype, deploy, and test AI apps – letting developers build with APIs, a drag-and-drop open-source Python framework, or a no-code builder, so you have the flexibility to build the way you want.
Writer comes with a suite of top-ranking LLMs and has built-in RAG for easy integration with your data. Check it out if you’re looking to streamline how you build and integrate AI apps.

Today’s trending AI news stories
OpenAI’s DevDay 2024: 4 Major Updates
At DevDay 2024, OpenAI favored substance over spectacle, rolling out four updates to make AI more accessible and affordable for developers.
Realtime API: The newly introduced Realtime API gives developers access to six AI voices designed for seamless integration into applications. Distinct from those in ChatGPT, these voices enable lifelike conversations across various contexts, including travel planning and phone-based ordering systems, at roughly $18/hr. The API supports real-time responses, enhancing user experiences in diverse applications, though developers are responsible for disclosing the use of AI-generated voices.
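To make the interaction model concrete, here is a minimal sketch of the JSON events a client exchanges with the Realtime API over a WebSocket: a `session.update` to pick a voice and modalities, then a `response.create` to ask the model to speak. The voice name, instructions, and model are illustrative examples, not an exhaustive list, and the sketch only builds the payloads rather than opening a connection.

```python
import json

# Endpoint the WebSocket client would connect to (with an Authorization header).
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

# Configure the session: choose one of the six voices and enable audio output.
session_update = {
    "type": "session.update",
    "session": {
        "voice": "alloy",
        "modalities": ["text", "audio"],
        "instructions": "You are a phone-based ordering assistant.",
    },
}

# Ask the model to start generating a (spoken) response.
response_create = {"type": "response.create"}

# Each event is sent as a JSON text frame over the socket.
payload = json.dumps(session_update)
```

In practice these frames would be sent with any WebSocket library after authenticating with an API key; audio arrives back as streamed events.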
Vision fine-tuning API: The Vision Fine-Tuning API lets developers bolster GPT-4o by incorporating image data alongside text, significantly improving the model’s visual comprehension. This feature supports advanced visual search, object detection for autonomous vehicles, and precise medical image analysis, all achievable with as few as 100 images. OpenAI maintains transparency by giving developers full control over data ownership and usage, complemented by automated safety evaluations to ensure compliance.
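Vision fine-tuning training data is uploaded as JSONL, where each line is a chat example whose user content mixes text parts and image parts. The sketch below builds one such line; the question, answer, and image URL are invented for illustration.

```python
import json

# One hypothetical training example in the chat format vision fine-tuning
# expects: a user turn containing text plus an image, and the target
# assistant answer. All content here is made up for illustration.
example = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What traffic sign is shown?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sign.jpg"},
                },
            ],
        },
        {"role": "assistant", "content": "A yield sign."},
    ]
}

# Each line of the uploaded .jsonl training file is one serialized example.
line = json.dumps(example)
```

A dataset of as few as 100 such lines is enough to start a vision fine-tuning job, per the announcement above.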
Prompt Caching in the API: The Prompt Caching feature lets developers cut costs and reduce latency by reusing input tokens from prior prompts. This functionality is particularly helpful for code editing and multi-turn conversations, offering savings of up to 50% in processing times. The feature automatically applies to the latest GPT-4o and GPT-4o mini versions, activating for prompts longer than 1,024 tokens while upholding OpenAI’s privacy commitments.
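Because caching keys on the exact token prefix of a request, the practical rule for developers is simple: put stable content (system prompt, few-shot examples) first and per-user content last, so repeated calls share a byte-identical prefix. A minimal sketch, with illustrative prompt text:

```python
# Keep the stable part of the prompt first so repeated requests share an
# identical prefix; caching activates once that prefix exceeds 1,024 tokens.
STATIC_PREFIX = [
    {"role": "system", "content": "You are a code-editing assistant. " * 50},
]

def build_messages(user_turn: str) -> list[dict]:
    # Only the final user message varies between requests.
    return STATIC_PREFIX + [{"role": "user", "content": user_turn}]

a = build_messages("Rename variable x to count.")
b = build_messages("Extract this loop into a function.")
assert a[0] == b[0]  # the shared prefix is identical across calls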
Model Distillation in the API: OpenAI’s Model Distillation lets developers refine cost-effective models using outputs from advanced models like GPT-4o and o1-preview. This integrated process simplifies the creation of high-performance models, such as GPT-4o mini, without the need for multiple tools. Key features include Stored Completions for automatic dataset generation and Evals for performance assessments. Model Distillation is available now, offering 2 million free training tokens daily for GPT-4o mini and 1 million for GPT-4o until October 31, after which standard fine-tuning pricing applies.
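The first step of that distillation loop is capturing teacher outputs as Stored Completions, done by passing `store=True` (plus optional `metadata` for later filtering) on a normal chat completion request. A sketch of the request shape, with illustrative values; the actual call would go through the official `openai` client:

```python
# Build the kwargs for a chat completion that is saved as a Stored
# Completion, so it can later feed a distillation dataset. Values are
# illustrative; no network call is made here.
def teacher_request(question: str) -> dict:
    return {
        "model": "gpt-4o",                        # the teacher model
        "store": True,                            # save output for dataset building
        "metadata": {"task": "distillation-demo"},  # tag for filtering later
        "messages": [{"role": "user", "content": question}],
    }

req = teacher_request("Summarize prompt caching in one sentence.")
# In practice: client.chat.completions.create(**req)
```

The stored completions can then be filtered in the dashboard, turned into a training file, and used to fine-tune a smaller student such as GPT-4o mini.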
OpenAI’s prompt for generating system prompts in the Playground has leaked; it is aimed at improving clarity and effectiveness.
Leaked prompt for generating system prompts on the playground:
Understand the Task: Grasp the main objective, goals, requirements, constraints, and expected output.
– Minimal Changes: If an existing prompt is provided, improve it only if it’s simple. For complex prompts, enhance… x.com/i/web/status/1…— AmebaGPT (@amebagpt)
6:06 PM • Oct 1, 2024
OpenAI also announced that access to the o1-preview model has been extended to developers on usage tier 3, with increased rate limits matching those of GPT-4o.
OpenAI’s Marketing Chief Says o1 Can Handle “5-hour tasks”
At HubSpot’s Inbound event, Dane Vahey, OpenAI’s Head of Strategic Marketing, underscored AI’s growing role in marketing’s shifting landscape. He detailed a toolkit of AI-driven essentials, ranging from data analysis and automation to AI-supported research and content generation, urging marketers to harness these skills. AI, he noted, is not just a tool but a “thinking partner,” sharpening ideas through collaboration.
Vahey also spotlighted OpenAI’s latest creation, the o1 model, capable of handling tasks up to five hours long, such as crafting intricate strategies. A leap from GPT-3’s quick-hit answers and GPT-4’s ability to tackle five-minute tasks, o1 excels in planning, though it’s still prone to a few hiccups. Yet its potential reaffirms AI’s growing indispensability for marketers grappling with increasingly complex demands. Read more.
Liquid AI debuts new LFM-based models that appear to outperform most traditional large language models
Liquid AI, an MIT spinoff based in Boston, has rolled out its Liquid Foundation Models (LFMs), a sleek new class of AI systems built for both efficiency and power. Unlike their more neuron-hungry large language model (LLM) cousins, LFMs rely on liquid neural networks that use fewer neurons and smart mathematical techniques to do more with less.
The lineup includes LFM-1B, a 1.3 billion-parameter model for tight-resource setups, LFM-3B for edge devices like drones, and the heavyweight LFM-40B with 40.3 billion parameters, built for cloud-heavy applications. Early results show these models surpassing Microsoft’s Phi-3.5 and Meta’s Llama in key benchmarks.
Available through Liquid Playground and Lambda, they’re also being fine-tuned for hardware from Nvidia, AMD, and Apple. Liquid AI invites the community to push these models to their limits, welcoming red-team tests to gauge their true potential. Read more.
Microsoft Copilot and Windows AI Event
At its New York City event, Microsoft introduced a revamped Copilot experience, featuring a card-based interface for mobile, web, and Windows.
Key updates include Copilot Vision, which visually interprets user environments, and “Copilot Voice,” offering four distinct voice options for interactive engagement. “Discover Cards” provide personalized content recommendations, while “Copilot Daily” delivers news and weather updates read aloud, supported by partnerships with major news outlets.
Integrating Copilot into Microsoft Edge lets users summarize web pages and translate text without compromising personal data. Experimental features in Copilot Labs, including “Think Deeper,” leverage the new OpenAI model, o1, and will be available across platforms.
Creative tools like Paint and Photos will gain advanced features, including Generative Fill and Generative Erase, allowing users to add or remove objects with precision, inspired by Adobe Photoshop’s capabilities. The Photos app will also introduce a Super-Resolution feature for on-device image upscaling, achieving up to eight times the original resolution.
This comprehensive overhaul positions Microsoft’s Copilot and Windows ecosystem as more responsive and user-centric, aiming to establish itself as a true AI companion. To mark the occasion, Microsoft AI CEO Mustafa Suleyman authored a memo discussing what he calls a “technological paradigm shift” toward AI models capable of understanding human visual and auditory experiences. Read more.
Quick Hits
Nvidia just dropped a bombshell: Its new AI model is open, massive, and able to rival GPT-4: NVLM-D-72B excels in interpreting complex visual inputs, analyzing memes, and solving mathematical problems, improving its text-only task performance by an average of 4.3 points after multimodal training. The AI community has welcomed this release, recognizing its potential to accelerate research and development. However, it raises concerns about potential misuse and poses challenges to existing AI business models. Read more.
Meta Introduces Digital Twin Catalog from Reality Labs Research: Meta has launched the Digital Twin Catalog (DTC), featuring over 2,400 realistic 3D models with sub-millimeter accuracy. This dataset is designed to democratize access to digital twins—3D representations of physical objects—especially for common items like kitchen utensils. With advanced scanning tech, the DTC captures intricate details, leaving no pixel unturned. Read more.


5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.



Your feedback is valuable. Reply to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email!