How the insane speed of AI development has changed an AI company — Fyma

Most of us have heard about the advances of GPT (Generative Pre-trained Transformer) models, which have finally delivered on the promise that AI evangelists like me have been preaching for around 10 years now. If I look at my own way of working, almost 70% of my daily work now involves some form of generative AI. Be it “soft” things like writing marketing texts, sales e-mails, or product descriptions, I can’t really be bothered to write them on my own anymore. Just feed it bullet points, and be sure to validate the result. I’m gonna come out and say this: this text WAS NOT written by ChatGPT. I enjoy writing about things I’m passionate about, which means you can enjoy all of my grammar mistakes that ChatGPT would never make.

Recently, however, I was talking to a data scientist working with computer vision object detection, and she was telling me how hard it still was to collect data for any random object. Sure, if you’re detecting people, there are hundreds of datasets available, but for super-specific things, you still have to suffer through the data collection process.

This got me thinking that I have not felt the pain of data collection for the last 3–4 months now, and how alien this problem felt to me. You see, ever since the release of GPT-3, DALL·E 2, Midjourney, and all the first image generation tools, I’ve been running a semi-secret project within Fyma. Let’s call it ObjectX as it doesn’t have a name… also, I’ve been calling it ObjectX.

The goal of ObjectX is to use Fyma’s high-performance computer vision pipeline (read more about this at Nvidia) to allow our customers to detect ANY object by just uploading 10 images of that object. Essentially, what this means:

  1. Fyma gets 10 images of the object to be detected.
  2. We automatically label them using our own algorithms (with future plans to hopefully use SAM: https://segment-anything.com/ — see the sketch after this list).
  3. We generate around 2000 images using GPT algorithms.
  4. We automatically train a computer vision object detection model.
  5. We can immediately deploy that model with Fyma’s computer vision pipeline.
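
To make step 2 a bit more concrete, here is a minimal sketch of what SAM-based auto-labelling could look like. This is illustrative rather than our actual labelling code: the file names are placeholders, and the “largest mask wins” heuristic is just for the demo.

```python
# Minimal sketch of step 2: auto-labelling seed images with SAM.
# Assumes the `segment_anything` package and a downloaded ViT-H checkpoint.
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

def label_seed_image(path: str):
    """Return a bounding box (x, y, w, h) for the most prominent object."""
    image = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    masks = mask_generator.generate(image)
    # Demo heuristic: assume the largest mask is the object of interest.
    largest = max(masks, key=lambda m: m["area"])
    return largest["bbox"]  # SAM reports boxes in XYWH format

# Label all 10 seed images (hypothetical file names).
boxes = [label_seed_image(f"seed_{i}.jpg") for i in range(10)]
```

In practice you would want something smarter than “largest mask”, but for a single prominent object in frame it gets you surprisingly far.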

What does this mean for our customers? Essentially, if anyone wants to detect any object and create any automation around it, all they need to do is:

  1. Buy an IP camera and add the video stream to Fyma (an example of the platform can be seen here).
  2. Take 10 images of the object they want to detect.
  3. Upload those 10 images to Fyma.
  4. Generate your own computer vision model.
  5. Define your automation rules (illustrated in the sketch after this list).
  6. Profit.
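
To illustrate, here is what that flow could look like in code against an imaginary HTTP API. Every endpoint, field, and URL below is made up for the example; it is not our real API.

```python
# Hypothetical sketch of the customer workflow. Every endpoint, field, and
# URL here is invented for illustration; this is NOT Fyma's real API.
import requests

API = "https://api.fyma.example/v1"          # placeholder base URL
HEADERS = {"Authorization": "Bearer <your-token>"}

# Steps 2-3: upload the 10 seed images and kick off model generation.
files = [("images", open(f"seed_{i}.jpg", "rb")) for i in range(10)]
model = requests.post(f"{API}/models", headers=HEADERS, files=files).json()

# Step 5: define a simple automation rule, e.g. fire a webhook whenever
# the new object is detected with reasonable confidence.
rule = {
    "model_id": model["id"],
    "camera_id": "cam-42",
    "trigger": {"class": "my_object", "min_confidence": 0.6},
    "action": {"type": "webhook", "url": "https://example.com/alerts"},
}
requests.post(f"{API}/rules", headers=HEADERS, json=rule)
```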

When I first started working on this, our first goal was to test how AI-generated images could be used for model training. The first object I tested was my breakfast, a bottle of yogurt I had lying on my desk:

My choice of object, as it had a distinct shape and unique graphics.

I took 10 images of the bottle, generated 200, and pushed them into our training pipeline. The results? Well, see for yourself:

The first results looked really promising.

Just because AI-generated images are always a crowd-pleaser, here are the input images I used to train our model. Oh, and I didn’t use a single real-world image in my training dataset.

“Fake” image dataset example.
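
To give a flavour of how such a generation step might look (this isn’t necessarily what we run in production; the model, prompt, and settings below are just an illustration), here is a minimal Stable Diffusion img2img sketch, via the diffusers library, that expands 10 seed photos into 200 variants:

```python
# Sketch: expanding 10 seed photos into ~200 training images with Stable
# Diffusion img2img. The model, prompt, and strength are illustrative
# assumptions, not a description of the production pipeline.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

seeds = [Image.open(f"seed_{i}.jpg").convert("RGB").resize((512, 512))
         for i in range(10)]

for i, seed in enumerate(seeds):
    for j in range(20):                      # 10 seeds x 20 variants = 200
        out = pipe(
            prompt="a yogurt bottle on a desk, photo",  # assumed prompt
            image=seed,
            strength=0.5,        # keep the shape and graphics, vary the rest
            guidance_scale=7.5,
        ).images[0]
        out.save(f"gen_{i}_{j}.png")
```

A lower strength keeps the distinct shape and graphics of the bottle intact while still varying background, lighting, and angle.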

That example was around 3 months ago, so where are we now? Well, we’ve gone from detecting bottles to testing this with real-world customers such as cities and airports, and from single-object models to multi-object models. How’s the accuracy? Well, it’s far below what a real image dataset can provide, but at the same time, it generalizes the model a lot better. We’re combining AI-generated datasets with synthetic datasets built from 3D models as well as real-world images, and most importantly, all of this happens automatically.
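To sketch what that automated training step might look like, here is a short example using YOLOv8 from the Ultralytics library as a stand-in detector; the architecture we actually train and the dataset layout in the config are assumptions for the example.

```python
# Sketch of automated training on a mixed dataset (AI-generated + 3D-rendered
# + real images). YOLOv8 is a stand-in; the actual detector isn't named here.
from ultralytics import YOLO

# objectx.yaml (hypothetical layout; Ultralytics accepts a list of train dirs):
#   train: [datasets/generated/images, datasets/render3d/images, datasets/real/images]
#   val: datasets/real_val/images
#   names: {0: my_object}
model = YOLO("yolov8n.pt")                   # start from pretrained weights
model.train(data="objectx.yaml", epochs=50, imgsz=640)
model.export(format="onnx")                  # hand off to a video pipeline
```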

An example of this is a recent model we trained to detect various airport equipment (things like luggage belt vehicles, stair vehicles, etc.). The “airport model”, as we call it, is a great example of a case where real-world images are difficult to come by due to privacy and security concerns. If we generate these images, however, we’re all good.

Training image from an actual in-production dataset.
More examples of an in-production dataset.

I guess the one thing left to show you is all of this working in the real world. Obviously, I can’t show a real-world airport from our product, but I can show a stock video (none of which was used to train the detection algorithm) running in our Fyma platform:

An AI that has taught an AI to detect objects…

Yeah… so basically, you can now detect any object by just taking 10 images of it. I was on an Estonian startup podcast around two years ago where one of the hosts asked if he could detect birds in his garden with Fyma. The answer was no back then. Now, all you have to do is catch those birds with your camera and you’re good to go.
