Jay Dawani is Co-founder & CEO of Lemurian Labs – Interview Series


Jay Dawani is Co-founder & CEO of Lemurian Labs. Lemurian Labs is on a mission to deliver inexpensive, accessible, and efficient AI computers, driven by the assumption that AI shouldn’t be a luxury but a tool accessible to everyone. The founding team at Lemurian Labs combines expertise in AI, compilers, numerical algorithms, and computer architecture, united by a single purpose: to reimagine accelerated computing.

Can you walk us through your background and what got you into AI in the first place?

Absolutely. I’d been programming since I was 12, building my own games and such, but I really got into AI when I was 15 thanks to a friend of my father’s who was into computers. He fed my curiosity and gave me books to read such as Von Neumann’s ‘The Computer and The Brain’, Minsky’s ‘Perceptrons’, and Russell and Norvig’s ‘Artificial Intelligence: A Modern Approach’. These books influenced my thinking a lot, and it felt almost obvious then that AI was going to be transformative and that I just had to be a part of this field.

When it came time for university I really wanted to study AI, but I couldn’t find any universities offering that, so I decided to major in applied mathematics instead. A little while after I got to university I heard about AlexNet’s results on ImageNet, which was really exciting. In that moment I had a now-or-never realization and went full bore into reading every paper and book I could get my hands on related to neural networks, and sought out all the leaders in the field to learn from them, because how often do you get to be there at the birth of a new industry and learn from its pioneers?

Very quickly I realized I don’t enjoy research, but I do enjoy solving problems and building AI-enabled products. That led me to working on autonomous cars and robots, AI for material discovery, generative models for multi-physics simulations, AI-based simulators for training professional racecar drivers and helping with car setups, space robots, algorithmic trading, and much more.

Now, having done all that, I’m trying to rein in the cost of AI training and deployment, because that will be the greatest hurdle we face on our path to enabling a world where every person and company can have access to and benefit from AI in the most economical way possible.

Many companies working in accelerated computing have founders who have built careers in semiconductors and infrastructure. How do you think your past experience in AI and mathematics impacts your ability to understand the market and compete effectively?

I actually think not coming from the industry gives me the benefit of the outsider advantage. I have found very often that not knowing the industry norms or conventional wisdom gives one the freedom to explore more freely and go deeper than most others would, because you’re unencumbered by biases.

I have the freedom to ask ‘dumber’ questions and test assumptions in a way that most others wouldn’t, because so many things are accepted truths. In the past two years I’ve had several conversations with folks within the industry where they’re very dogmatic about something but can’t tell me the provenance of the idea, which I find very puzzling. I like to understand why certain choices were made, what assumptions or conditions held at the time, and whether they still hold.

Coming from an AI background, I tend to take a software view: start with where the workloads are today and all the possible ways they might change over time, then model the entire ML pipeline for training and inference to understand the bottlenecks, which tells me where the opportunities to deliver value are. And because I come from a mathematical background, I like to model things to get as close to truth as I can, and have that guide me. For example, we have built models to calculate system performance for total cost of ownership, so we can measure the benefit we can bring to customers with software and/or hardware, and better understand our constraints and the different knobs available to us, along with dozens of other models for various things. We’re very data driven, and we use the insights from these models to guide our efforts and tradeoffs.
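To make the TCO point concrete, here is a minimal sketch of the kind of model he is describing; every parameter and number below is an illustrative assumption, not Lemurian’s actual model.

```python
# Illustrative sketch of a simple total-cost-of-ownership (TCO) model for an
# AI accelerator. All parameters are hypothetical assumptions.

def tco_per_million_tokens(
    chip_price_usd: float,        # upfront cost per accelerator
    lifetime_years: float,        # amortization window
    power_watts: float,           # average board power
    energy_cost_per_kwh: float,   # datacenter electricity price
    tokens_per_second: float,     # sustained throughput per accelerator
    utilization: float = 0.6,     # fraction of time doing useful work
) -> float:
    seconds_per_year = 365 * 24 * 3600
    useful_seconds = lifetime_years * seconds_per_year * utilization

    # Energy cost assumes the board draws average power for its whole
    # lifetime (a simplification); capital cost is the chip price.
    hours = lifetime_years * seconds_per_year / 3600
    opex = (power_watts / 1000) * hours * energy_cost_per_kwh

    tokens_served = tokens_per_second * useful_seconds
    return (chip_price_usd + opex) / tokens_served * 1e6

# Example: compare two hypothetical accelerators on cost per million tokens.
print(tco_per_million_tokens(30_000, 4, 700, 0.10, 5_000))
print(tco_per_million_tokens(15_000, 4, 350, 0.10, 4_000))
```

A model like this makes the tradeoffs explicit: you can see directly how much a change in power draw, throughput, or utilization is worth to a customer.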

It seems like progress in AI has primarily come from scaling, which requires exponentially more compute and energy. It seems we’re in an arms race, with every company trying to build the biggest model, and there appears to be no end in sight. Do you think there is a way out of this?

There are always ways. Scaling has proven extremely useful, and I don’t think we’ve seen the end of it yet. We will very soon see models being trained at a cost of at least a billion dollars. If you want to be a leader in generative AI and create bleeding-edge foundation models, you’ll need to be spending at least a few billion a year on compute. Now, there are natural limits to scaling, such as being able to construct a large enough dataset for a model of that size, getting access to people with the right know-how, and getting access to enough compute.

Continued scaling of model size is inevitable, but we also can’t turn the entire earth’s surface into a planet-sized supercomputer to train and serve LLMs, for obvious reasons. To get this under control we have several knobs we can play with: better datasets, new model architectures, new training methods, better compilers, algorithmic improvements and exploitations, better computer architectures, and so on. If we do all that, there’s roughly three orders of magnitude of improvement to be found. That’s the best way out.
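These knobs compound multiplicatively rather than additively, which is why roughly three orders of magnitude is plausible without any single breakthrough. A toy illustration, with made-up per-knob factors:

```python
# Hypothetical, illustrative speedup factors per knob; the point is only that
# independent gains multiply, so ~1000x does not require any single miracle.
knobs = {
    "better datasets / data efficiency": 3.0,
    "new model architectures":           5.0,
    "new training methods":              2.0,
    "better compilers / kernels":        4.0,
    "algorithmic improvements":          3.5,
    "better computer architectures":     5.0,
}

total = 1.0
for name, factor in knobs.items():
    total *= factor

print(f"combined improvement: ~{total:,.0f}x")  # ~2,100x, i.e. three orders of magnitude
```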

You are a believer in first-principles thinking. How does this shape your mindset for how you are running Lemurian Labs?

We definitely employ a lot of first-principles thinking at Lemurian. I have always found conventional wisdom misleading, because that knowledge was formed at a certain point in time when certain assumptions held, but things always change and you need to retest assumptions often, especially when living in such a fast-paced world.

I often find myself asking questions like “this seems like a good idea, but why might it not work”, or “what needs to be true in order for this to work”, or “what do we know that are absolute truths, and what are the assumptions we’re making and why?”, or “why do we believe this particular approach is the best way to solve this problem”. The goal is to invalidate and kill off ideas as quickly and cheaply as possible. We want to try to maximize the number of things we’re trying out at any given point in time. It’s about being obsessed with the problem that needs to be solved, and not being overly opinionated about which technology is best. Too many folks tend to focus overly on the technology, and they end up misunderstanding customers’ problems and missing the transitions happening in the industry that could invalidate their approach, leading to an inability to adapt to the new state of the world.

But first-principles thinking isn’t all that useful by itself. We tend to pair it with backcasting, which basically means imagining an ideal or desired future outcome and working backwards to identify the different steps or actions needed to realize it. This ensures we converge on a meaningful solution that is not only innovative but also grounded in reality. It doesn’t make sense to spend time coming up with the perfect solution only to realize it’s not feasible to build because of a variety of real-world constraints such as resources, time, and regulation, or to build a seemingly perfect solution only to find out later that you’ve made it too hard for customers to adopt.

Every now and again we find ourselves in a situation where we need to make a decision but have no data, and in that scenario we employ minimum testable hypotheses, which give us a signal as to whether or not something makes sense to pursue with the smallest amount of energy expenditure.

All of this combined gives us agility and rapid iteration cycles to de-risk items quickly, and it has helped us adjust strategies with high confidence and make a lot of progress on very hard problems in a very short period of time.

Initially, you were focused on edge AI. What caused you to refocus and pivot to cloud computing?

We started with edge AI because at the time I was very focused on trying to solve a very particular problem that I had faced in trying to usher in a world of general-purpose autonomous robotics. Autonomous robotics holds the promise of being the biggest platform shift in our collective history, and it seemed like we had everything needed to build a foundation model for robotics, but we were missing the ideal inference chip with the right balance of throughput, latency, energy efficiency, and programmability to run said foundation model on.

I wasn’t thinking about the datacenter at the time because there were enough good companies focusing there, and I expected they would figure it out. We designed a very powerful architecture for this application space and were getting ready to tape it out, and then it became abundantly clear that the world had changed and the problem truly was in the datacenter. The speed at which LLMs were scaling and consuming compute far outstrips the pace of progress in computing, and when you factor in adoption it starts to paint a worrying picture.

It felt like this is where we should be focusing our efforts: bringing down the energy cost of AI in datacenters as much as possible without imposing restrictions on where and how AI should evolve. And so, we set to work on solving this problem.

Can you share the genesis story of co-founding Lemurian Labs?

The story starts in early 2018. I was working on training a foundation model for general-purpose autonomy, along with a model for generative multiphysics simulation to train the agent in and fine-tune it for different applications, and some other things to help scale into multi-agent environments. But very quickly I exhausted the amount of compute I had, and I estimated needing more than 20,000 V100 GPUs. I tried to raise enough to get access to the compute, but the market wasn’t ready for that kind of scale just yet. It did however get me thinking about the deployment side of things, and I sat down to calculate how much performance I would need for serving this model in the target environments, and I realized there was no chip in existence that could get me there.
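For context, that kind of sizing exercise can be reproduced with a standard back-of-envelope formula; the model size, token count, and utilization below are illustrative assumptions, not the actual figures behind the 20,000-V100 estimate.

```python
# Back-of-envelope training-compute sizing, similar in spirit to the estimate
# described above. All workload numbers are illustrative assumptions.

params = 100e9            # assumed model size: 100B parameters
tokens = 2.5e12           # assumed training tokens
flops_needed = 6 * params * tokens  # common ~6*N*D rule of thumb for training FLOPs

v100_peak = 125e12        # V100 peak FP16 tensor throughput, FLOP/s
mfu = 0.25                # assumed realized fraction of peak (model FLOPs utilization)
wall_clock_days = 30      # target time to train

sustained_per_gpu = v100_peak * mfu
gpus = flops_needed / (sustained_per_gpu * wall_clock_days * 24 * 3600)
print(f"~{gpus:,.0f} V100s to train in {wall_clock_days} days")  # ~18,500 under these assumptions
```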

A few years later, in 2020, I met up with Vassil – my eventual cofounder – to catch up, and I shared the challenges I had gone through in building a foundation model for autonomy. He suggested building an inference chip that could run the foundation model, and he shared that he had been thinking a lot about number formats, and that better representations would help not only in making neural networks retain accuracy at lower bit-widths but also in creating more powerful architectures.
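To give a flavor of why number formats matter, here is a minimal sketch of generic uniform quantization and the error it introduces at decreasing bit-widths; this is a textbook illustration, not Lemurian’s representation.

```python
import numpy as np

# Quantize weights to a small symmetric integer grid and measure the error.
# Generic uniform quantization, purely for illustration.

rng = np.random.default_rng(0)
weights = rng.normal(0, 1, 100_000).astype(np.float32)

def quantize(x: np.ndarray, bits: int) -> np.ndarray:
    levels = 2 ** (bits - 1) - 1          # e.g. 127 levels for 8-bit signed
    scale = np.abs(x).max() / levels
    return np.round(x / scale) * scale    # quantize, then dequantize

for bits in (16, 8, 4):
    err = np.abs(weights - quantize(weights, bits)).mean()
    print(f"{bits}-bit uniform quantization: mean abs error = {err:.5f}")
```

The error grows rapidly as bits shrink, which is why smarter representations that match the distribution of the values can preserve accuracy where a naive grid cannot.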

It was an intriguing idea, but it was way out of my wheelhouse. Yet it wouldn’t leave me, which drove me to spend months and months learning the intricacies of computer architecture, instruction sets, runtimes, compilers, and programming models. Eventually, building a semiconductor company began to make sense, and I had formed a thesis around what the problem was and how to go about solving it. Then, towards the end of the year, we started Lemurian.

You’ve spoken previously about the need to tackle software first when building hardware. Could you elaborate on your views of why the hardware problem is first and foremost a software problem?

What a lot of people don’t realize is that the software side of semiconductors is much harder than the hardware itself. Building a useful computer architecture for customers to use and benefit from is a full-stack problem, and if you don’t have that understanding and preparedness going in, you’ll end up with a beautiful-looking architecture that is very performant and efficient, but totally unusable by developers, which is what actually matters.

There are other advantages to taking a software-first approach as well, of course, such as faster time to market. That is crucial in today’s fast-paced world, where being too bullish on an architecture or feature could mean you miss the market entirely.

Not taking a software-first view generally results in not having de-risked the important things required for product adoption in the market, not being able to respond to changes in the market, for instance when workloads evolve in an unexpected way, and having underutilized hardware. All not great things. That’s a big reason why we care a lot about being software-centric, and why our view is that you can’t be a semiconductor company without really being a software company.

Can you discuss your immediate software stack goals?

When we were designing our architecture and thinking about the forward-looking roadmap and where the opportunities were to bring more performance and energy efficiency, it started becoming very clear that we were going to see a lot more heterogeneity, which was going to create a lot of issues on the software side. And we don’t just need to be able to productively program heterogeneous architectures, we have to deal with them at datacenter scale, which is a challenge the likes of which we haven’t encountered before.

This got us concerned, because the last time we had to go through a major transition was when the industry moved from single-core to multi-core architectures, and at that time it took 10 years to get software working and people using it. We can’t afford to wait 10 years to figure out software for heterogeneity at scale; it needs to be sorted out now. And so, we set to work on understanding the problem and what needs to exist in order for this software stack to come into being.

We’re currently engaging with many of the leading semiconductor companies and hyperscalers/cloud service providers, and we will be releasing our software stack in the next 12 months. It’s a unified programming model with a compiler and runtime capable of targeting any kind of architecture and orchestrating work across clusters composed of different kinds of hardware, and it is capable of scaling from a single node to a thousand-node cluster for the best possible performance.
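Lemurian has not published this API, but to illustrate the core idea of a unified programming model that schedules work across mixed hardware, here is a purely hypothetical sketch; every name and number in it is invented for illustration.

```python
# Purely hypothetical sketch of scheduling work across a heterogeneous
# cluster under one unified abstraction. Not Lemurian's API.

from dataclasses import dataclass

@dataclass
class Device:
    kind: str      # e.g. "gpu", "npu", "cpu"
    tflops: float  # assumed sustained throughput for the target workload

@dataclass
class Task:
    name: str
    flops: float   # estimated work in FLOPs

def schedule(tasks: list[Task], devices: list[Device]) -> dict[str, str]:
    """Greedily assign each task to whichever device finishes it earliest,
    regardless of device kind -- the 'unified' part of the model."""
    finish_time = {id(d): 0.0 for d in devices}
    assignment = {}
    for task in sorted(tasks, key=lambda t: -t.flops):  # largest tasks first
        best = min(devices,
                   key=lambda d: finish_time[id(d)] + task.flops / (d.tflops * 1e12))
        finish_time[id(best)] += task.flops / (best.tflops * 1e12)
        assignment[task.name] = best.kind
    return assignment

cluster = [Device("gpu", 100.0), Device("gpu", 100.0), Device("npu", 40.0), Device("cpu", 2.0)]
layers = [Task(f"layer{i}", flops=1e15) for i in range(8)]
print(schedule(layers, cluster))
```

The real challenge he describes is doing this transparently at datacenter scale, with a compiler and runtime making these placement decisions across thousands of nodes rather than a toy greedy loop.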
