
Tyler Weitzman, Co-Founder & Head of AI at Speechify – Interview Series


Tyler Weitzman is the Co-Founder, Head of Artificial Intelligence & President at Speechify, the #1 text-to-speech app on the planet, with more than 100,000 5-star reviews. Weitzman is a graduate of Stanford University, where he received a BS in mathematics and an MS in Computer Science within the Artificial Intelligence track. He has been chosen by Inc. Magazine as a Top 50 Entrepreneur, and he has been featured in Business Insider, TechCrunch, LifeHacker, and CBS, among other publications. Weitzman’s master’s degree research focused on artificial intelligence and text-to-speech, and his final paper was titled “CloneBot: Personalized Dialogue-Response Predictions.”

You began coding when you were only 9 years old. What initially attracted you to computer science?

As a child I was pretty obsessed with Dragon Ball Z, and I wanted to learn to animate it myself. I learned Adobe Flash and Photoshop and put my own animations of Goku on a fan website I built. Soon after, I started learning about systems and algorithms, and once I learned I could actually program for a living, that was pretty exciting. I had thought it was only a hobby, like playing games.

You then began building iPhone apps when you were only 12 years old. What were some of these apps?

One app, called Black SMS, lets people send encrypted text messages to one another. Another app, called Frontback, lets users take selfies and photos of what’s in front of them at the same time.

Could you discuss your research at Stanford University and how it was centered around natural language processing and speech synthesis?

My research spanned multiple uses for transformer networks, including language generation models for chat, part-of-speech tagging, punctuation prediction, and text-to-speech. Optimizing neural network inference for mobile CPUs was a primary focus, and that directly translated to the offline voices available on Speechify, which work even in airplane mode.

Could you share the genesis story behind Speechify?

I’m blind in one eye and my brother Cliff is dyslexic. We’ve used audiobooks and text-to-speech technology for as long as we can remember to get through school and, when we were young, for reading books like Harry Potter. As we got older and began to use more technology products, we began to realize there was an opportunity to build better text-to-speech apps on web and mobile, with better voices thanks to advancements in AI and a better user experience. So we decided to go for it with Speechify.

What are some of the different machine learning technologies that are used at Speechify?

We’ve adopted cutting-edge techniques for advanced generative architectures: transformers/conformers, large-scale pretraining, distributed training, gradient accumulation, auto-encoded latent spaces, diffusion, adversarial networks, and language modeling. We also employ supporting techniques for feature processing around phonemization, pitch, and emotion, to better model speech specifically.

What are some of the challenges behind building a text-to-speech app?

One key challenge is building high-quality voices that sound like real humans rather than robots. Our goal is for people not to be able to tell the difference between how our voices sound and how humans sound, so that our users are comfortable listening to content on Speechify for long periods of time. A second challenge is distributing our AI models to hundreds of thousands of users. It’s one thing to build high-quality AI voices and another to make sure that hundreds of thousands of users around the world actually find out about them and use them.

Speechify is the #1 app in its category in the App Store. What do you attribute this success to?

We believe we’ve built one of the best products on the market for people who want to listen to the reading they need to consume – whether it’s students with homework, professionals who are reading for work, or leisure readers who just want to be entertained. We have one of the best selections of voices, including celebrities like Snoop Dogg, and one of the best user interfaces for people to easily upload and access the content that they want to consume. And our user experience is seamless across the Speechify ecosystem – you can start listening to an article on your computer and then easily zap it to keep listening on your phone.

What are some of the biggest use cases for this app?

Speechify’s generative AI solves real problems for students who need to get through a lot of homework faster, people with dyslexia and ADHD who have trouble reading, seniors with low vision, professionals who want to read more and be more productive, writers who want to listen to their work, auditory learners, and countless others.

What’s your vision for the future of AI?

We want AI – and specifically AI text-to-speech voices – to eliminate barriers to learning regardless of your income level, learning differences, geography, or language. We see AI as a tool for social good that can raise the quality of life people can achieve through improving their education.

