Big Savings On Big Data
Motivation
Key Metrics
Reducing Compute Costs
Accelerating Development Iterations
Conclusion

By Anindya Saha & Han Wang

Image by DALL·E

In previous articles, we talked about Lyft’s ML platform, LyftLearn, which manages ML model training as well as batch predictions. With the amount of data Lyft has to process, it’s natural that the cost of operating the platform is very high.

When we talked about how we democratized distributed compute, we described a solution built on a set of key design principles.

In early 2022, we completed this migration. Now is the time to evaluate the impact of the design decisions over the last two years, both in increasing developer productivity and in lowering cost.

In this article, we define each run as executing a data/ML task using an ephemeral Spark/Ray cluster. The time and cost of runs are measured by their ephemeral Spark/Ray usage.

Runs are the way to use the LyftLearn big data system in both development and production. There are two main use cases in the development environment: running ad-hoc tasks and iterating in order to create a production workflow.

We will compare the metrics of runs between 2021 and 2022 in development (dev) and production (prod).

In 2022, we had an enormous increase in production usage.

Total number of runs (%) in production and development

The total number of runs increased, and prod runs increased more than dev runs. In later sections, we’ll explain why the increase isn’t proportional between dev and prod.

We also boosted users’ development speed:

Comparison of average minutes required for one run in Development vs Production

The average per-iteration time (the blue bars) on big data dropped from 31 minutes to 11 minutes, a reduction of roughly 65%.

Notice that the prod run time increased slightly because of new, heavier jobs. This also points to the fact that the large increase in prod runs is organic and isn’t due to breaking up large existing workloads.

More usage and faster iterations on big data usually require more compute resources and a higher cost. How much more did we spend in 2022 vs 2021?

Comparing the cost incurred in Production and Development

Surprisingly, in 2022, not only were we successful in controlling the overall cost, but we also managed to lower the cost of development.

The total dev cost dropped 32% even though dev usage increased slightly in 2022. How did we achieve that?

Comparing cost incurred per run in last 2 years for the development and production environments

We were able to reduce the average dev per-run cost from $25 to $10.7 (-57%).
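As a quick sanity check, the relative changes can be recomputed from the figures quoted in this article. This is only a small arithmetic sketch; the helper name pct_change is ours and not part of LyftLearn.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the article.
dev_cost_change = pct_change(25.0, 10.7)  # average dev cost per run, in USD
dev_time_change = pct_change(31.0, 11.0)  # average dev minutes per run

print(f"dev cost per run: {dev_cost_change:.1f}%")
print(f"dev time per run: {dev_time_change:.1f}%")
```

Both the per-run time and the per-run cost in development fell by well over half.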

Another data point worth mentioning: .

In the previous article, we mentioned that the LyftLearn platform enforces ephemeral clusters. In the LyftLearn notebook experience, users can declare cluster resources for each step of their workflow. In the image below, a user is requesting a Spark cluster with 8 machines, each with 8 CPUs and 32 GB of RAM. The cluster is ephemeral and only exists for the duration of the SparkSQL query.

Defining Spark cluster configuration
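Since the screenshot isn’t reproduced here, the request can be sketched in plain Python. This is not the LyftLearn notebook syntax; it only maps the numbers from the text onto standard Spark property names, under the simplifying assumptions that each machine hosts one executor and that memory overhead is ignored. The helper spark_conf is hypothetical.

```python
from typing import Dict


def spark_conf(machines: int, cpus: int, memory_gb: int) -> Dict[str, str]:
    """Map a per-step resource request onto standard Spark properties.

    Assumption: one executor per machine, no allowance for memory overhead.
    """
    return {
        "spark.executor.instances": str(machines),
        "spark.executor.cores": str(cpus),
        "spark.executor.memory": f"{memory_gb}g",
    }


# The request described in the text: 8 machines, each with 8 CPUs and 32 GB.
conf = spark_conf(machines=8, cpus=8, memory_gb=32)
total_cores = 8 * 8
total_memory_gb = 8 * 32

print(conf)
print(f"{total_cores} cores and {total_memory_gb} GB of RAM in total")
```

Being explicit like this makes the size of each request visible: the step above asks for 64 cores and 256 GB of RAM, and only for as long as the query runs.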

Using ephemeral clusters has contributed a large portion of the total savings. Managed platforms like AWS Elastic MapReduce tend to require a data scientist to spin up a cluster and then develop on top of that cluster. This leads to under-utilization (due to idling) during project iteration. Ephemeral clusters ensure users are allocated expensive resources only when necessary.
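To make the idling argument concrete, here is a back-of-the-envelope sketch. Every number in it is hypothetical and chosen only for illustration; none of them are Lyft figures.

```python
def cluster_cost(machines: int, hourly_rate: float, billed_hours: float) -> float:
    """Cost of keeping `machines` allocated for `billed_hours`."""
    return machines * hourly_rate * billed_hours


# All values below are made up for illustration.
MACHINES = 8
HOURLY_RATE = 0.50   # USD per machine-hour
SESSION_HOURS = 6.0  # how long a long-lived cluster stays up
ACTIVE_HOURS = 1.5   # time actually spent executing queries

long_lived = cluster_cost(MACHINES, HOURLY_RATE, SESSION_HOURS)
ephemeral = cluster_cost(MACHINES, HOURLY_RATE, ACTIVE_HOURS)
utilization = ACTIVE_HOURS / SESSION_HOURS

print(f"long-lived: ${long_lived:.2f}, ephemeral: ${ephemeral:.2f}")
print(f"utilization of the long-lived cluster: {utilization:.0%}")
```

With these made-up inputs the long-lived cluster is billed for four times as many machine-hours as the ephemeral one, because it sits idle while the user reads results and edits code.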

It’s also important to mention that LyftLearn discourages Spark autoscaling. Autoscaling can lead to instability or underutilization, and it’s less useful when the clusters are already ephemeral. We also found similar patterns discussed in this article published by Sync Computing.

The advantages of being explicit about compute resources are:

  1. Users are aware of the resources they really need for their use cases.
  2. Resource contention in the K8s clusters is reduced.

Many LyftLearn users are surprised by the spin-up time (2–5 seconds), which is possible thanks to Spark on Kubernetes with cached images. Ephemeral clusters also directly reduce maintenance, because different steps of a workflow can be executed using different images to separate packages that conflict with each other (e.g. packages requiring different versions of the same dependency).

Another big part of cost savings is choosing the tool that is most effective for the job. This is most evident with Presto and Hive. In this article, we shared the best practices for choosing between them:

Presto is good for aggregation and small-output scenarios. It shouldn’t take more than 10 minutes. If Presto is slow, try Hive.

Hive is slower but generally more scalable. Always try to save the output to files instead of dumping it into Pandas.
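The rule of thumb above can be written down as a small helper. This is a hypothetical sketch, not a LyftLearn API: the function name, its parameters, and the choice to send every large-output query to Hive are our own reading of the guidance. Only the 10-minute threshold comes from the text.

```python
from typing import Optional

PRESTO_TIME_LIMIT_MINUTES = 10  # threshold quoted in the text


def suggest_engine(small_output: bool, presto_minutes: Optional[float] = None) -> str:
    """Suggest "presto" or "hive" for a query.

    small_output:   True for aggregations or other small result sets.
    presto_minutes: observed runtime of an earlier Presto attempt, if any.
    """
    if not small_output:
        # Assumption: large outputs go straight to the more scalable engine.
        return "hive"
    if presto_minutes is not None and presto_minutes > PRESTO_TIME_LIMIT_MINUTES:
        # Presto turned out to be slow for this query, so fall back to Hive.
        return "hive"
    return "presto"


print(suggest_engine(small_output=True))
print(suggest_engine(small_output=True, presto_minutes=25))
print(suggest_engine(small_output=False))
```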

As more big data frameworks enter the data science landscape, we need to choose the best tool for each part of the job. One of the important roles of the LyftLearn platform is to give data practitioners the flexibility and simplicity to choose the best tool for each job.

For example, some data pipelines inside Lyft leverage Spark for preprocessing and Ray for the distributed machine learning portion. This is also specifically enabled by ephemeral clusters. (Watch our Data AI Summit 2022 Talk)

Another, less tracked form of savings is the hours saved through the operational efficiencies gained from the LyftLearn platform. The large reduction in dev run time and the higher ratio of prod to dev runs directly translate into data scientists spending more time on modeling and scientific computing. More importantly, more projects make it to production to generate real business value.

Our compute abstraction layer, built on top of the open-source project Fugue, plays a key role in accelerating development iterations. It optimizes big data workstreams in three ways:

With a backend-agnostic design, we can iterate on code locally before running it on clusters. Only well-tested code ends up running on clusters. This explains why in 2022 the increases in prod and dev runs weren’t proportional. A large portion of the iterations happened locally without using clusters.

This is one of the most important sources of LyftLearn savings.

Developing a complex Hive (Spark) query with hundreds of lines is one of the biggest and most common challenges for Lyft ML practitioners. Because of the Common Table Expression (CTE) syntax, breaking up a SQL query into small subqueries that can be run separately isn’t practical. Iterating on such queries requires re-running the whole query every time. Worse, when a complex query never finishes, the owner can’t even tell which step caused the problem. Retrying is inefficient and expensive too.

FugueSQL is a superset of traditional SQL with improved syntax and features: it doesn’t require CTEs. Instead, its assignment syntax makes a SQL query easy to break up and combine.

Breaking up and combining complex SQL queries using FugueSQL

In the above example, let’s assume the original Hive query has unknown issues. We can rewrite it in FugueSQL and break it up into multiple parts to iterate on. In the first cell, YIELD FILE will cache b to a file (saved by Spark) and make the reference available to the following cells. In the second cell, we can directly use b, which will be loaded from S3. Lastly, we can also print the result to verify it. In this way we can quickly debug issues. More importantly, with caching, finished cells do not have to be re-run in the following iterations.

When the multiple parts work end to end, we just copy-paste them together and remove the YIELD. Notice we also add a PERSIST to b, because it will be used twice in the following steps. This explicitly tells Spark to cache the result to avoid recomputing it.

FugueSQL should generate results equivalent to the original SQL, but it has significant advantages:

  1. Divide-and-conquer becomes possible for SQL, significantly reducing iteration time on complex problems.
  2. The final FugueSQL is often faster than the original SQL (if we explicitly cache the intermediate steps to avoid recompute).
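The divide-and-conquer idea can be tried out without a cluster. The sketch below does not use FugueSQL. It uses SQLite from the Python standard library as a stand-in, with a temporary table playing the role of the cached intermediate result b. The rides table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (city TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO rides VALUES (?, ?)",
    [("SF", 10.0), ("SF", 30.0), ("NYC", 20.0), ("NYC", 40.0), ("LA", 5.0)],
)

# Monolithic form: the intermediate result `b` only exists inside the CTE,
# so it cannot be inspected or reused between runs.
monolithic = conn.execute(
    """
    WITH b AS (SELECT city, AVG(fare) AS avg_fare FROM rides GROUP BY city)
    SELECT city, avg_fare FROM b
    WHERE avg_fare >= (SELECT AVG(avg_fare) FROM b)
    ORDER BY city
    """
).fetchall()

# Step 1: materialize `b` once. It can now be checked on its own.
conn.execute(
    "CREATE TEMP TABLE b AS "
    "SELECT city, AVG(fare) AS avg_fare FROM rides GROUP BY city"
)

# Step 2: iterate on the final query against the materialized `b`,
# which is referenced twice and is not recomputed.
stepwise = conn.execute(
    """
    SELECT city, avg_fare FROM b
    WHERE avg_fare >= (SELECT AVG(avg_fare) FROM b)
    ORDER BY city
    """
).fetchall()

print(monolithic == stepwise)
print(stepwise)
```

Both forms return the same rows, but in the stepwise form a failure or a slow step can be isolated, and only the step being edited has to be re-run.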

We can also easily reconstruct the traditional Hive SQL after we fix all the problems found during the iterations. The slowest and most expensive part is always the development iterations, which we can improve using the Fugue approach.

We don’t require users to modernize their entire workloads in one shot. Instead, we encourage them to migrate incrementally, refactoring where necessary.

There are many existing workloads written with small data tooling such as Pandas and scikit-learn. In a lot of cases, if one step is compute intensive, users can refactor their code to separate out the core computing logic, then use one Fugue transform call to distribute that logic.
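A minimal sketch of that refactoring pattern follows. The core logic is a plain Python function that can be tested locally, and a separate wrapper hands it to Fugue. The function names and sample rows are hypothetical, and the wrapper assumes pandas and fugue are installed and that Fugue’s transform function accepts a function annotated with List[Dict[str, Any]], a schema hint and an optional engine. It is deliberately not called here.

```python
from typing import Any, Dict, List


def add_cost_per_minute(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Core logic: plain Python with no dependency on Spark, Ray or Fugue."""
    return [{**r, "cost_per_minute": r["cost"] / r["minutes"]} for r in rows]


def run_with_fugue(rows: List[Dict[str, Any]], engine: Any = None):
    """Hand the same function to Fugue (assumes pandas and fugue are installed).

    Imports are deferred so the core logic above stays testable without any
    big data framework.
    """
    import pandas as pd
    from fugue import transform

    return transform(
        pd.DataFrame(rows),
        add_cost_per_minute,
        schema="*,cost_per_minute:double",  # input columns plus the new one
        engine=engine,  # None runs locally; a SparkSession would run on Spark
    )


# Made-up sample rows, used only to exercise the core logic locally.
sample = [
    {"run_id": 1, "cost": 12.0, "minutes": 4.0},
    {"run_id": 2, "cost": 9.0, "minutes": 6.0},
]
local_result = add_cost_per_minute(sample)
print(local_result)
```

Because the core function knows nothing about the execution engine, the same code can be unit tested on a laptop and later distributed by changing only the engine argument.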

Therefore, incremental adoption is also a natural way for users to adopt good coding practices and write high-quality code that is both scale agnostic and framework (Spark, Ray, Fugue, etc.) agnostic.

The metrics from 2021 to 2022 show both a productivity boost and cost savings, and they don’t even include the benefit of the human-hours saved by the improved development speed. Lyft’s top line also increased thanks to the ML models that were able to reach production with the support of the LyftLearn platform.

Developing big data projects can be very expensive in both time and money, but LyftLearn succeeded in bringing down costs by enforcing best practices, simplifying the programming model and accelerating iterations.

As always, Lyft is hiring! If you’re passionate about developing state-of-the-art systems, join our team.
