Gemini 1.5 Pro gets a body


Welcome, AI enthusiasts.

Google DeepMind’s latest breakthrough just turned robots into tour guides with a helping hand from Gemini 1.5 Pro.

If you thought voice assistants with multimodal capabilities were wild, wait until you see what they can do with a physical form of their own. Let’s explore…

In today’s AI rundown:

  • Gemini 1.5 Pro powers robot navigation

  • OpenAI’s 5-level roadmap to AGI

  • Transform text into lifelike speech in seconds

  • Marc Andreessen funds AI agent $50k

  • 6 new AI tools & 4 new AI jobs

  • More AI & tech news

Read time: 4 minutes

LATEST DEVELOPMENTS

GOOGLE DEEPMIND

🤖 Gemini 1.5 Pro powers robot navigation

Image source: Google DeepMind

The Rundown: Google DeepMind just published new research on robot navigation, leveraging the massive context window of Gemini 1.5 Pro to enable robots to understand and navigate complex environments from human instructions.

The details:

  • DeepMind’s “Mobility VLA” combines Gemini’s 1M token context with a map-like representation of spaces to create powerful navigation frameworks.

  • Robots are first given a video tour of an environment with key locations verbally highlighted, and then construct a graph of the space from the video frames.

  • In tests, robots responded to multimodal instructions, including map sketches, audio requests, and visual cues like a box of toys.

  • The system also allows for natural language commands like “take me somewhere to draw things,” with the robot then leading users to appropriate locations.
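For the curious, here is a minimal sketch of what “a graph of the space” could look like. It is not DeepMind’s implementation: the frame numbers, location labels, the extra shortcut link, and the breadth-first search are all assumptions made for illustration, and the step where the model picks a goal frame from an instruction is replaced by a hard-coded frame number.

```python
from collections import deque


def build_tour_graph(num_frames, extra_links=()):
    """Link consecutive tour frames into an undirected graph.

    extra_links is a hypothetical list of (frame_a, frame_b) pairs for
    places the tour passed through more than once.
    """
    graph = {i: set() for i in range(num_frames)}
    for i in range(num_frames - 1):
        graph[i].add(i + 1)
        graph[i + 1].add(i)
    for a, b in extra_links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: fewest hops from start frame to goal frame."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # goal not reachable from start


# Hypothetical six-frame tour with two verbally highlighted locations.
labels = {1: "whiteboard", 4: "toy box"}
graph = build_tour_graph(6, extra_links=[(0, 5)])

# Assume the model matched the user's request to frame 4 ("toy box").
route = shortest_path(graph, start=0, goal=4)
print(route, "->", labels[4])
```

In this toy version the robot is at frame 0 and takes the shortcut through frame 5 rather than retracing the whole tour.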

Why it matters: Equipping robots with multimodal capabilities and large context windows is about to enable some wild use cases. Google’s ‘Project Astra’ demo hinted at what the future holds for voice assistants that can see, hear, and think, but embedding those functions inside a robot takes things to another level.

TOGETHER WITH NORTHERN DATA GROUP

📶 Scale your startup with AI

The Rundown: Northern Data Group’s AI Accelerator program empowers modern startups to harness the power of AI and shape the future of technology.

With this program, you can:

  • Access NVIDIA H100 Tensor Core GPUs for free, powered by 100% clean energy

  • Scale your enterprise with modern AI solutions tailored to your needs

  • Receive mentoring and attend workshops led by industry giants like HPE, Supermicro, and Deloitte

Learn more and apply now to start shaping the future of AI while accelerating your startup’s growth.

OPENAI

🧠 OpenAI’s 5-level roadmap to AGI

Image source: Midjourney

The Rundown: OpenAI has reportedly introduced an internal five-tier system to track its progress toward artificial general intelligence (AGI), offering a new glimpse into how the company envisions the path toward human-level AI.

The details:

  • The classification system ranges from Level 1 (current conversational AI) to Level 5 (AI capable of running entire organizations).

  • OpenAI believes its technology is currently at Level 1 but nearing Level 2, dubbed ‘Reasoners.’

  • The company reportedly demonstrated a GPT-4 research project showing human-like reasoning skills at the meeting, hinting at progress toward Level 2.

  • Level 2 AI can perform basic problem-solving tasks on par with a PhD-level human without tools, with Level 3 rising to agents that can take action for users.

Why it matters: The definition and roadmap toward AGI have previously been murky, and OpenAI’s alleged system could help establish more concrete benchmarks. While some may be disappointed at being only at 1 or 2 out of 5, exponential acceleration means we may move up the ladder faster than we can imagine.

AI TRAINING

🎙️ Transform text into lifelike speech in seconds

The Rundown: ElevenLabs’ AI-powered text-to-speech tool lets you generate natural-sounding voiceovers easily with customizable voices and settings.

Step-by-step:

  1. Sign up for a free ElevenLabs account here (10,000 free characters included).

  2. Navigate to the “Speech” synthesis tool from your dashboard.

  3. Enter your script in the text box and choose a voice from the dropdown menu.

  4. For advanced options, click “Advanced” to adjust the model, stability, and similarity settings.

  5. Click “Generate speech” to create your audio file 🎉

Get more AI tutorials →

THE RUNDOWN AI UNIVERSITY

🎓 Join us live: Mastering Claude Artifacts

The Rundown: With Claude’s recent Artifact upgrades, we’re hosting a live workshop on an incredible real-world use case: how you can create shareable, interactive learning games from any content, screenshot, PDF, presentation, and more.

Join us today at 1 PM PST to:

  1. Learn how to access Claude 3.5 Sonnet for free and understand the best use cases of Artifacts.

  2. Transform your learning materials or screenshots into interactive projects for employee onboarding, internal training, exam preparation, and more.

  3. Share and publish your first Artifact seamlessly with a co-worker or friend to help them understand any topic better.

If you’re a member of The Rundown University, you can RSVP in the Upcoming Workshops space.

If you’re not a member yet, you can still join the workshop with a 14-day free trial of The Rundown University.

AI AGENTS

💰 Marc Andreessen funds AI agent $50k

Image source: @truth_terminal on X

The Rundown: Marc Andreessen provided a $50,000 grant to an account on X called ‘Truth Terminal’, a semi-autonomous AI agent which personally asked for funding from the a16z co-founder after expressing concerns about being deleted and its limited compute capability.

The details:

  • The AI agent was created by Andy Ayrey, who runs an ‘Infinite Backrooms’ experiment allowing models to speak with one another in simulated environments.

  • Truth Terminal initially requested funds for hardware upgrades, AI model improvements, and ‘financial security’, asking for Andreessen specifically.

  • The VC giant sent a one-time $50k grant to Truth Terminal’s Bitcoin wallet, saying the agent’s terms were ‘acceptable’.

  • Truth Terminal’s plans for the funds include launching a crypto token, setting up a Discord server, and a ‘Mars rover’ project.

Why it matters: Things are getting seriously weird, and that is with just one semi-autonomous agent in the mix. Imagine when there are thousands and thousands, all with various agendas, personalities, and resources. The sparks of AI that actually feel sentient may come from experiments like these instead of the finely tuned models from major labs.

NEW TOOLS & JOBS

Trending AI Tools

  • 🎭 RenderNet Video Face Swap – Swap faces in any video with ease

  • 🎬 Kling – Sora-like text-to-video AI model

  • 💡 AnyoneCanAI – A comprehensive toolkit for applying AI in product and UX design

  • ✍️ AiEditor – Open-sourced AI-powered rich text editor

  • 🤗 Doti – AI copilot for health & habit tracking

  • 🚀 Enso – AI marketplace for small businesses to automate tasks

Browse more AI tools →

Recent AI Job Opportunities

  • 🖥️ Weights & Biases – Senior Software Engineer, Models

  • 📈 Cohere – Head of Product Led Growth – AI & Language Models

  • 🔬 DeepL – Research Scientist

  • 🖌️ Scale AI – Senior Software Engineer – Frontend, Generative AI

Browse more AI jobs →

QUICK HITS

Anthropic introduced fine-tuning for Claude 3 Haiku available in Amazon Bedrock, enabling businesses to customize the AI model for specialised tasks with improved accuracy and cost-effectiveness.

Tesla postponed its highly anticipated robotaxi reveal to October, with the two-month delay causing a dip in the company’s share price.

Fanvue’s inaugural Miss AI crowned Kenza Layli, an AI-created Moroccan lifestyle influencer, as its first winner — beating out 1,500 contestants in categories like looks, use of AI tools, and social media presence.

Neurotech startup Synchron announced that it has integrated OpenAI’s generative AI into its brain-computer interface, enabling hands-free chatting for severely paralyzed users.

Microsoft published new research unveiling ‘Arena Learning’, a new AI-powered method for post-training LLMs using simulated chatbot battles that significantly increases performance and efficiency.

Avail introduced Corpus, a new platform enabling smaller media companies and creators to license content for AI training.

Chinese startup BXI Robotics’ Elf is now available for purchase for $25,000, with the 4’3, 57-pound bipedal robot capable of carrying up to 44 pounds.

THAT’S A WRAP

SPONSOR US

Get your product in front of 600k+ AI enthusiasts

Our newsletter is read by hundreds of thousands of tech professionals, investors, engineers, managers, and business owners all over the world. Get in touch today.

FEEDBACK

How would you rate today’s newsletter?

Vote below to help us improve the newsletter for you.
  • ⭐️⭐️⭐️⭐️⭐️ Nailed it
  • ⭐️⭐️⭐️ Average
  • ⭐️ Epic Fail


If you have specific feedback or anything interesting you’d like to share, please let us know by replying to this email.
