Amr Nour-Eldin, Vice President of Technology at LXT – Interview Series

Amr Nour-Eldin is the Vice President of Technology at LXT. Amr is a Ph.D. research scientist with over 16 years of professional experience in the fields of speech/audio processing and machine learning in the context of Automatic Speech Recognition (ASR), with a particular focus and hands-on experience in recent years on deep learning techniques for streaming end-to-end speech recognition.

LXT is an emerging leader in AI training data to power intelligent technology for global organizations. In partnership with a global network of contributors, LXT collects and annotates data across multiple modalities with the speed, scale and agility required by the enterprise. Their global expertise spans more than 145 countries and over 1,000 language locales.

You pursued a PhD in Signal Processing from McGill University. What initially interested you in this field?

I always wanted to study engineering, and really liked natural sciences in general, but was drawn more specifically to math and physics. I found myself always trying to figure out how nature works and how to apply that understanding to create technology. After high school, I had the opportunity to enter medicine and other professions, but specifically chose engineering as it represented, in my opinion, the perfect combination of theory and application in the two fields closest to my heart: math and physics. And then once I had chosen it, there were many potential paths – mechanical, civil, and so on. But I specifically chose electrical engineering because it is the closest, and in my opinion the hardest, to the kind of math and physics problems which I always found challenging and hence enjoyed more, as well as being the foundation of modern technology, which has always driven me.

Within electrical engineering, there are many specializations to choose from, which generally fall under two umbrellas: telecommunications and signal processing, and that of power and electrical engineering. When the time came to choose between those two, I chose telecom and signal processing because it’s closer to how we describe nature through physics and equations. You are talking about signals, whether it’s audio, images or video; understanding how we communicate and what our senses perceive, and how to mathematically represent that information in a way that allows us to leverage that knowledge to create and improve technology.

Could you discuss your research at McGill University on the information-theoretic aspect of artificial bandwidth extension (BWE)?

After I finished my bachelor’s degree, I wanted to keep pursuing the signal processing field academically. After one year of studying photonics as part of a Master’s degree in Physics, I decided to switch back to engineering to pursue my master’s in audio and speech signal processing, focusing on speech recognition. When it came time to do my PhD, I wanted to broaden my field a little bit into general audio and speech processing, as well as the closely related fields of machine learning and information theory, rather than just focusing on the speech recognition application.

The vehicle for my PhD was the bandwidth extension of narrowband speech. Narrowband speech refers to standard telephony speech. The frequency content of speech extends to around 20 kilohertz, but the majority of the information content is concentrated up to only 4 kilohertz. Bandwidth extension refers to artificially extending speech content from 3.4 kilohertz, which is the upper frequency bound in conventional telephony, to above that, up to eight kilohertz or more. To better reconstruct that missing higher-frequency content given only the available narrowband content, one has to first quantify the mutual information between speech content in the two frequency bands, then use that information to train a model that learns that shared information; a model that, once trained, can then be used to generate highband content given only narrowband speech and what the model learned about the relationship between that available narrowband speech and the missing highband content. Quantifying and representing that shared “mutual information” is where information theory comes in. Information theory is the study of quantifying and representing information in any signal. So my research was about incorporating information theory to improve the artificial bandwidth extension of speech. As such, my PhD was more of an interdisciplinary research activity where I combined signal processing with information theory and machine learning.
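
To make the idea concrete, here is a minimal, hypothetical Python sketch: split a wideband signal into its narrowband and highband components, extract simple frame-level features from each, and estimate how much information the narrowband features carry about the missing highband ones. The band edges, the choice of frame log-energies as features, and the use of scikit-learn’s mutual information estimator are illustrative assumptions, not the actual methodology of the thesis.

```python
# Minimal, hypothetical sketch: quantify how much information narrowband
# frame energies carry about highband frame energies of the same signal.
import numpy as np
from scipy.signal import butter, sosfilt
from sklearn.feature_selection import mutual_info_regression

def frame_log_energy(x, fs, lo_hz, hi_hz, frame=512, hop=256):
    """Frame-wise log energy of x band-limited to [lo_hz, hi_hz] Hz."""
    sos = butter(8, [lo_hz, hi_hz], btype="bandpass", fs=fs, output="sos")
    y = sosfilt(sos, x)
    n_frames = 1 + (len(y) - frame) // hop
    frames = np.stack([y[i * hop:i * hop + frame] for i in range(n_frames)])
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

fs = 16000
x = np.random.randn(5 * fs)                # placeholder for 5 s of wideband speech

nb = frame_log_energy(x, fs, 300, 3400)    # conventional telephony band
hb = frame_log_energy(x, fs, 3400, 7000)   # the "missing" highband

# Estimated mutual information (in nats) between the two bands' frame energies.
mi = mutual_info_regression(nb.reshape(-1, 1), hb)
print(f"I(narrowband; highband) ≈ {mi[0]:.3f} nats")
```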

You were a Principal Speech Scientist at Nuance Communications, now a part of Microsoft, for over 16 years. What were some of your key takeaways from this experience?

From my perspective, the most important benefit was that I was always working on state-of-the-art, cutting-edge techniques in signal processing and machine learning and applying that technology to real-world applications. I got the chance to apply those techniques to conversational AI products across multiple domains. These domains ranged from enterprise to healthcare, automotive, and mobility, among others. Some of the specific applications included virtual assistants, interactive voice response, voicemail to text, and others where proper representation and transcription is critical, such as in healthcare with doctor/patient interactions. Throughout those 16 years, I was fortunate to witness firsthand and be part of the evolution of conversational AI, from the days of statistical modeling using Hidden Markov Models, through the gradual takeover of deep learning, to now, where deep learning proliferates and dominates virtually all aspects of AI, including generative AI as well as traditional predictive or discriminative AI. Another key takeaway from that experience is the crucial role that data plays, through quantity and quality, as a key driver of AI model capabilities and performance.

You’ve published a dozen papers, including in such acclaimed publications as IEEE. In your opinion, what is the most groundbreaking paper that you published, and why was it important?

The most impactful one, by number of citations according to Google Scholar, would be a 2008 paper titled “Mel-Frequency Cepstral Coefficient-Based Bandwidth Extension of Narrowband Speech”. At a high level, the focus of this paper is how to reconstruct speech content using a feature representation that is widely used in the field of automatic speech recognition (ASR): mel-frequency cepstral coefficients.
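
For readers unfamiliar with that feature representation, the short sketch below shows how mel-frequency cepstral coefficients are typically extracted from a narrowband speech signal using librosa; the file name and analysis parameters are illustrative assumptions, not the configuration used in the paper.

```python
# Hypothetical example: extract 13 MFCCs from an 8 kHz (narrowband) recording.
import librosa

# "narrowband_speech.wav" is a placeholder file name.
y, sr = librosa.load("narrowband_speech.wav", sr=8000)
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=256, hop_length=80)
print(mfccs.shape)  # (13 coefficients, number of 10 ms frames)
```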

However, the paper I consider more novel is the one with the second-most citations, a 2011 paper titled “Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech”. In that work, I proposed a new statistical modeling technique that incorporates temporal information in speech. The benefit of that technique is that it allows modeling long-term information in speech with minimal additional complexity, and in a manner that still allows the generation of wideband speech in a streaming or real-time fashion.
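
As a rough illustration of the baseline GMM framework that the paper extends (not the memory-based approximation it proposes), the sketch below fits a joint Gaussian mixture over paired narrowband/highband feature vectors and predicts highband features from narrowband ones via the mixture’s conditional mean. The toy data, dimensions, and component count are assumptions for demonstration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
D_NB, D_HB = 13, 8                                  # assumed feature dimensions
nb = rng.normal(size=(2000, D_NB))                  # toy narrowband features
hb = nb @ rng.normal(size=(D_NB, D_HB)) + 0.1 * rng.normal(size=(2000, D_HB))

# Fit a GMM over the joint (narrowband, highband) feature space.
gmm = GaussianMixture(n_components=8, covariance_type="full", random_state=0)
gmm.fit(np.hstack([nb, hb]))

def predict_highband(x_nb):
    """Conditional-mean estimate E[highband | narrowband] under the joint GMM."""
    resp = np.zeros(gmm.n_components)
    cond = np.zeros((gmm.n_components, D_HB))
    for k in range(gmm.n_components):
        mu_n, mu_h = gmm.means_[k, :D_NB], gmm.means_[k, D_NB:]
        cov = gmm.covariances_[k]
        cov_nn, cov_hn = cov[:D_NB, :D_NB], cov[D_NB:, :D_NB]
        diff = x_nb - mu_n
        # Unnormalized responsibility of component k given the narrowband input.
        resp[k] = gmm.weights_[k] * np.exp(
            -0.5 * diff @ np.linalg.solve(cov_nn, diff)
        ) / np.sqrt(np.linalg.det(2 * np.pi * cov_nn))
        # Per-component conditional mean of the highband features.
        cond[k] = mu_h + cov_hn @ np.linalg.solve(cov_nn, diff)
    resp /= resp.sum()
    return resp @ cond

print(predict_highband(nb[0]))   # estimated 8-dimensional highband feature vector
```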

In June 2023, you were recruited as Vice President of Technology at LXT. What attracted you to this position?

Throughout my academic and professional experience prior to LXT, I have always worked directly with data. In fact, as I noted earlier, one key takeaway for me from my work with speech science and machine learning was the crucial role data played in the AI model life cycle. Having enough quality data in the right format was, and continues to be, vital to the success of state-of-the-art deep-learning-based AI. As such, when I happened to be at a stage of my career where I was seeking a startup-like environment where I could learn, broaden my skills, and leverage my speech and AI experience to have the most impact, I was fortunate to have the opportunity to join LXT. It was the perfect fit. Not only is LXT an AI data provider that is growing at an impressive and consistent pace, but I also saw it as being at the perfect stage in terms of growth in AI know-how as well as in client size and diversity, and hence in AI and AI data types. I relished the opportunity to join and help in its growth journey; to have a big impact by bringing the perspective of a data end user after having been an AI data scientist user for all those years.

What does your average day at LXT look like?

My average day starts with looking into the latest research on one topic or another, which has currently centered around generative AI, and how we can apply that to our customers’ needs. Luckily, I have an excellent team that is very adept at creating and tailoring solutions to our clients’ often-specialized AI data needs. So, I work closely with them to set that agenda.

There is also, of course, strategic annual and quarterly planning: breaking down strategic objectives into individual team goals and keeping up to speed with developments along those plans. As for the feature development we are doing, we generally have two technology tracks. One is to make sure we have the right pieces in place to deliver the best outcomes on our current and new incoming projects. The other track is improving and expanding our technology capabilities, with a focus on incorporating machine learning into them.

Could you discuss the types of machine learning algorithms that you work on at LXT?

Artificial intelligence solutions are transforming businesses across all industries, and we at LXT are honored to provide the high-quality data to train the machine learning algorithms that power them. Our customers are working on a wide range of applications, including augmented and virtual reality, computer vision, conversational AI, generative AI, search relevance, and speech and natural language processing (NLP), among others. We are dedicated to powering the machine learning algorithms and technologies of the future through data generation and enhancement across every language, culture and modality.

Internally, we are also incorporating machine learning to improve and optimize our internal processes, ranging from automating our data quality validation to enabling a human-in-the-loop labeling model across all the data modalities we work on.

Speech and audio processing is rapidly approaching near perfection when it comes to English and, specifically, white men. How long do you anticipate it will be until it is an even playing field across all languages, genders, and ethnicities?

That is a complicated question, and it depends on a number of factors, including the economic, political, social and technological, among others. But what is clear is that the prevalence of the English language is what drove AI to where we are now. So getting to a place where it is a level playing field really depends on the speed at which the representation of data from different ethnicities and populations grows online, and the pace at which it grows is what will determine when we get there.

However, LXT and similar companies can have a big hand in driving us toward a more level playing field. As long as the data for less well-represented languages, genders and ethnicities is hard to access or simply not available, that change will come more slowly. But we are trying to do our part. With coverage for over 1,000 language locales and experience in 145 countries, LXT helps to make access to more language data possible.

What is your vision for how LXT can accelerate AI efforts for different clients?

Our goal at LXT is to provide the data solutions that enable efficient, accurate, and faster AI development. Through our 12 years of experience in the AI data space, not only have we accrued extensive know-how about clients’ needs in terms of all aspects related to data, but we have also continuously fine-tuned our processes to deliver the highest quality data at the fastest pace and best price points. Consequently, as a result of our steadfast commitment to providing our clients the optimal combination of AI data quality, efficiency, and pricing, we have become a trusted AI data partner, as evidenced by our repeat clients who keep coming back to LXT for their ever-growing and evolving AI data needs. My vision is to cement, improve and expand that LXT “MO” to all the modalities of data we work on, as well as to all types of AI development we now serve, including generative AI. Achieving this goal revolves around strategically expanding our own machine learning and data science capabilities, both in terms of technology and resources.
