Advanced Voice Mode is Here!

Good morning. It’s Wednesday, September 25th.

Did you know: On this day in 2007, Halo 3 was released in North America?

Advanced Voice Mode
Altman’s Superintelligence Blog Post
Meta’s “Imagine Yourself”
Turn Docs into Podcasts
Figma’s AI App Generator
4 Latest AI Tools
Latest AI Research Papers

You read. We listen. Tell us what you think by replying to this email.

In partnership with PROMPTHERO

Ready to take your AI experiments to the next level?

Master scene lighting, subject positioning, poses and posture, generate realistic hands, and create your own digital photo studio. But let’s not stop there. Take all that and turn it into captivating videos. Delve into cutting-edge techniques like ControlNet, Multi-ControlNet, Openpose and Deforum.

Today’s trending AI news stories

OpenAI rolls out Advanced Voice Mode with more voices and a brand new look

OpenAI has expanded ChatGPT’s Advanced Voice Mode to more paying users, rolling it out to those on the Plus and Teams tiers. The update brings a sleeker design, highlighted by a blue animated sphere, and introduces five new voices (Arbor, Maple, Sol, Spruce, and Vale) to enhance the experience.

Note: If you are a ChatGPT Plus user and don’t have access yet, try uninstalling the app and re-installing it.

Missing from the release, however, are the video and screen-sharing features seen in earlier demos. On the plus side, it now handles accents more easily and works seamlessly with ChatGPT’s Custom Instructions and Memory, offering a more tailored experience. Read more.

Sam Altman anticipates Superintelligence soon, defends AI in rare personal blog post

In a rare blog post, OpenAI CEO Sam Altman articulated his vision of an impending “Intelligence Age,” asserting that deep learning’s capabilities will enable the resolution of complex global challenges, such as climate change and space colonization. He predicts the arrival of superintelligence within “a few thousand days,” significantly sooner than most experts anticipate.

Altman asserts that AI’s advancements will depend on increased computational power and data availability, paving the way for personal AI teams and virtual tutors for everyone. While acknowledging potential job displacement and resource disparities, he believes the overall impact of AI will yield profound benefits.

Altman’s post, positioned as a personal viewpoint rather than an official OpenAI statement, coincides with the company’s fundraising efforts, which aim for a valuation of $150 billion. He cautions that, without adequate infrastructure, AI could become a resource mainly accessible to the wealthy.

While some predictions, such as the potential for virtual tutors, are plausible, many assertions, such as AI creating a utopian future, are met with doubt. Critics argue that the enthusiasm surrounding AI may mask its limitations and the socio-economic upheaval it might cause. Read more.

Meta’s new AI creates custom images from a single photo without extra training

Meta has introduced “Imagine Yourself,” an AI model capable of generating a variety of personalized images from a single reference photo without requiring additional training. This model can create multiple images of a person in different poses, styles, and settings by processing the reference image together with accompanying text instructions.

Unlike conventional models that necessitate retraining for every individual, “Imagine Yourself” uses synthetic training pairs to boost learning, supported by a sophisticated architecture featuring three parallel text processing modules alongside a trainable image processing module.

While the model demonstrates superior performance in executing complex instructions, it still faces challenges in preserving identity in comparison with some competing models. Read more.

Open-source PDF2Audio tool turns documents into podcasts and audio summaries

MIT researchers, led by Markus J. Buehler, have launched PDF2Audio, an open-source tool that converts complex documents into podcasts, lectures, and audio summaries. This tool serves as a versatile alternative to Google’s “Audio Overviews” feature in NotebookLM, supporting various models, including OpenAI’s GPT-4 and other open-source options.

We’re excited to share #PDF2Audio, an open-source alternative to the #podcast feature of #NotebookLM with flexibility & tailored outputs that you can precisely control in the app: You can make a podcast, lecture, discussions, short/long form summaries & more, including the use… x.com/i/web/status/1…

— Markus J. Buehler (@ProfBuehlerMIT)
11:49 AM • Sep 23, 2024

Users can upload multiple PDFs, select prompt templates, and customize audio models and voices, generating content in languages like French, German, and Chinese. PDF2Audio also offers advanced editing features, enabling users to annotate transcripts and adjust tone.

Figma’s AI-powered app generator is back after it was pulled for copying Apple

Figma has relaunched its AI-powered app generator, now called First Draft, after initially withdrawing it due to copyright concerns. The tool is designed to assist designers in creating layouts for apps and websites, and the relaunch addresses feedback from early users who noted similarities to Apple’s weather app.

First Draft is now available in a limited beta, featuring several enhancements. Users can choose from four specialized design libraries that cater to varied project requirements, from wireframing tools for low-fidelity designs to high-fidelity libraries for detailed visual exploration. The tool uses off-the-shelf AI models, including OpenAI’s GPT-4 and Amazon Titan, to generate designs based on user-defined prompts. Figma insists that First Draft doesn’t train on customer data, ensuring user privacy and the originality of generated designs. Read more.