Leveraging Gemini-1.5-Pro-Latest for Smarter Eating


Learn how to use Google’s Gemini-1.5-pro-latest model to develop a generative AI app for calorie counting

Photo by Pickled Stardust on Unsplash

Have you ever wondered how many calories you consume when you eat your dinner, for example? I do that all the time. Wouldn’t it be wonderful if you could simply pass a picture of your plate through an app and get an estimate of the total number of calories before you decide how far in you want to dip?

The calorie counter app that I created can help you achieve this. It’s a Python application that uses Google’s Gemini-1.5-Pro-Latest model to estimate the number of calories in food items.

The app takes two inputs: a question about the food and an image of the food or food items, or simply, a plate of food. It outputs an answer to the question, the total number of calories in the image, and a breakdown of calories by each food item in the image.

In this article, I’ll explain the entire end-to-end process of building the app from scratch using Google’s Gemini-1.5-pro-latest (a large generative AI model released by Google), and how I developed the front end of the application using Streamlit.

It’s worth noting here that with advancements in the world of AI, it’s incumbent on data scientists to gradually shift from traditional deep learning to generative AI techniques in order to revolutionize their role. That is my main purpose in teaching this subject.

Let me start by briefly explaining Gemini-1.5-pro-latest and the Streamlit framework, as they are the major components in the infrastructure of this calorie counter app.

Gemini-1.5-pro-latest is an advanced AI language model developed by Google. Since it is the latest version, it has enhanced capabilities over previous versions in the form of faster response times and improved accuracy when used in natural language processing and in building applications.

It is a multimodal model that works with both text and images, an advancement over Google’s Gemini-pro model, which only works with text prompts.

The model works by understanding and generating human-like text based on the prompts given to it. In this article, the model will be used to generate text for our calorie counter app.

Gemini-1.5-pro-latest can be integrated into other applications to strengthen their AI capabilities. In this application, the model is prompted to break the uploaded image down into individual food items. Drawing on its contextual understanding of those food items, it estimates the number of calories in each one and then totals the calories for all items in the image.

Streamlit is an open-source Python framework that will manage the user interface. This framework simplifies web development so that throughout the project, you don’t need to write any HTML or CSS code for the front end.

Let us dive into building the app.

I’ll show you how to build the app in 5 clear steps.

1. Set up your folder structure

To start, go into your favorite code editor (mine is VS Code) and create a project folder. Call it Calories-Counter, for example. This is the current working directory. Create a virtual environment (venv), activate it in your terminal, and then create the following files: .env, calories.py, requirements.txt.

Here’s a recommendation for the look of your folder structure:

Calories-Counter/
├── venv/
│ ├── xxx
│ ├── xxx
├── .env
├── calories.py
└── requirements.txt

Please note that Gemini-1.5-Pro works best with Python versions 3.9 and greater.

2. Get the Google API key

Like other Gemini models, Gemini-1.5-pro-latest is currently free for public use. Accessing it requires that you obtain an API key, which you can get from Google AI Studio by going to “Get API key” on this link. Once the key is generated, copy it for subsequent use in your code. Save this key as an environment variable in the .env file as follows.

GOOGLE_API_KEY="paste the generated key here"

3. Install dependencies

Add the following libraries to your requirements.txt file.

  • streamlit
  • google-generativeai
  • python-dotenv

In the terminal, install the libraries in requirements.txt with:

python -m pip install -r requirements.txt

4. Write the Python script

Now, let’s start writing the Python script in calories.py. With the following code, import all required libraries:

# import the libraries
from dotenv import load_dotenv
import streamlit as st
import os
import google.generativeai as genai
from PIL import Image

Here’s how the various imported modules will be used:

  • dotenv: since this application will be configured from a Google API key environment variable, dotenv is used to load configuration from the .env file.
  • streamlit: creates an interactive user interface for the front end.
  • os: reads environment variables, such as the API key loaded from the .env file.
  • google.generativeai: gives us access to the Gemini model we’re about to use.
  • PIL: a Python imaging library used for opening and handling image files.

The following lines load the environment variables from the .env file and then configure the API key. load_dotenv() has to run first so that the key is available when os.getenv reads it.

load_dotenv()

genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
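
Because os.getenv returns None when a variable is missing, a misconfigured .env file only shows up later as a confusing API error. A small guard can fail early instead. This is a minimal sketch; the helper name require_api_key is hypothetical and not part of the original script.

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

In calories.py, this could be used after load_dotenv() as genai.configure(api_key=require_api_key()).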

Define a function that, when called, will load the Gemini-1.5-pro-latest model and get the response, as follows:

def get_gemini_reponse(input_prompt, image, user_prompt):
    # Load the model and send the input prompt, the image, and the user's question together
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    response = model.generate_content([input_prompt, image[0], user_prompt])
    return response.text

As you can see, the function takes three inputs: the input prompt that will be specified further down in the script, an image that will be supplied by the user, and a user prompt/question that will also be supplied by the user. All of that goes into the Gemini model, which returns the response text.

Here, the image is passed to Gemini-1.5-pro as raw bytes together with its MIME type, so the next thing to do is write a function that processes the uploaded image, converting it to bytes.

def input_image_setup(uploaded_file):
    # Check if a file has been uploaded
    if uploaded_file is not None:
        # Read the file into bytes
        bytes_data = uploaded_file.getvalue()

        image_parts = [
            {
                "mime_type": uploaded_file.type,  # Get the mime type of the uploaded file
                "data": bytes_data
            }
        ]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")
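
To see what this function hands to the model without launching Streamlit, it can be exercised with a stand-in object. The sketch below is illustrative only: FakeUpload is a hypothetical class that mimics the two members the function relies on (getvalue() and type), the bytes are a placeholder rather than a real image, and input_image_setup is repeated in condensed form so the snippet runs on its own.

```python
class FakeUpload:
    """Hypothetical stand-in for the object returned by st.file_uploader."""

    def __init__(self, data, mime_type):
        self._data = data
        self.type = mime_type

    def getvalue(self):
        return self._data


def input_image_setup(uploaded_file):
    # Same behaviour as the function above, condensed
    if uploaded_file is None:
        raise FileNotFoundError("No file uploaded")
    return [{"mime_type": uploaded_file.type, "data": uploaded_file.getvalue()}]


# Placeholder bytes, not a valid image file
parts = input_image_setup(FakeUpload(b"placeholder-bytes", "image/png"))
print(parts[0]["mime_type"], len(parts[0]["data"]))
```

The result is a one-element list, which is why get_gemini_reponse passes image[0] to the model.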

Next, specify the input prompt that will determine the behaviour of your app. Here, we’re simply telling Gemini what to do with the text and image that the user feeds into the app.

input_prompt="""
You are an expert nutritionist.
You should answer the question entered by the user in the input based on the uploaded image you see.
You should also look at the food items found in the uploaded image and calculate the total calories.
Also, provide the details of every food item with calories intake in the format below:

1. Item 1 - no of calories
2. Item 2 - no of calories
----
----

"""

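Since the prompt asks for a numbered "Item - calories" list, the itemised lines can be parsed and summed as a sanity check against the total the model reports. This is a rough sketch: parse_breakdown is a hypothetical helper, the sample text is made up for illustration, and it assumes the model follows the requested format, which it may not always do.

```python
import re

# Matches lines such as "1. Grilled chicken - 250 calories"
ITEM_LINE = re.compile(r"^\s*\d+\.\s*(.+?)\s*-\s*(\d+(?:\.\d+)?)")


def parse_breakdown(response_text):
    """Extract (item, calories) pairs from a numbered breakdown."""
    items = []
    for line in response_text.splitlines():
        match = ITEM_LINE.match(line)
        if match:
            items.append((match.group(1), float(match.group(2))))
    return items


# Made-up sample in the format requested by the prompt
sample = """Total: about 650 calories
1. Grilled chicken - 250 calories
2. Stir-fried rice - 300 calories
3. Salad - 100 calories"""

items = parse_breakdown(sample)
total = sum(calories for _, calories in items)
print(items, total)
```

If the parsed sum differs noticeably from the total stated in the response, that is a hint that the estimate deserves a second look.
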
The next step is to initialize Streamlit and create a simple user interface for your calorie counter app.

st.set_page_config(page_title="Gemini Calorie Counter App")
st.header("Calorie Counter App")
input = st.text_input("Ask any question related to your food: ", key="input")
uploaded_file = st.file_uploader("Upload a picture of your food", type=["jpg", "jpeg", "png"])
image = ""
if uploaded_file is not None:
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded Image.", use_column_width=True)  # show the image

submit = st.button("Submit & Process")  # creates a "Submit & Process" button

The above steps have all the pieces of the app. At this point, the user is able to open the app, enter a question and upload an image.

Finally, let’s put all the pieces together so that when the “Submit & Process” button is clicked, the user gets the required response text.

# Once the "Submit & Process" button is clicked
if submit:
    image_data = input_image_setup(uploaded_file)
    response = get_gemini_reponse(input_prompt, image_data, input)
    st.subheader("The Response is")
    st.write(response)
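
As written, clicking the button without an uploaded image raises FileNotFoundError inside input_image_setup. One option is to validate the inputs first and show a friendly message instead. The sketch below is an assumption about how that could look; check_inputs is a hypothetical helper, kept free of Streamlit calls so it can be tried on its own.

```python
def check_inputs(question, uploaded_file):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if uploaded_file is None:
        problems.append("Please upload an image of your food.")
    if not question or not question.strip():
        problems.append("Please enter a question about your food.")
    return problems


# Example: nothing has been provided yet
print(check_inputs("", None))
```

Inside the if submit: block, each returned message could be shown with st.warning, and the model call made only when the list is empty.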

5. Run the script and interact with your app

Now that the app development is complete, you can execute it in the terminal using the command:

streamlit run calories.py

To interact with your app and see how it performs, view your Streamlit app in your browser using the local URL or network URL generated.

This is what your Streamlit app looks like when it’s first opened in the browser.

Demo image of the initial display of the Calorie Counter App: Photo by author.

Once the user asks a question and uploads an image, here is the display:

Demo image of the Calorie Counter App with user input question and user uploaded image: Photo by author. The food image loaded in the app: Photo by Odiseo Castrejon on Unsplash

Once the user pushes the “Submit & Process” button, the response in the image below is generated at the bottom of the screen.

Demo image of the Calorie Counter App with the generated response: Photo by author

For external access, consider deploying your app using cloud services like AWS, Heroku, or Streamlit Community Cloud. In this case, let’s use Streamlit Community Cloud to deploy the app for free.

At the top right of the app screen, click ‘Deploy’ and follow the prompts to complete the deployment.

After deployment, you can share the generated app URL with other users.

As with other AI applications, the results output are the model’s best estimates, so before relying completely on the app, please note the following potential risks:

  • The calorie counter app may misclassify certain food items and thus give the wrong number of calories.
  • The app doesn’t have a reference point for estimating the size of the food portion from the uploaded image. This can lead to errors.
  • Over-reliance on the app can lead to stress and mental health issues, as one may become obsessed with counting calories and worry about results that may not be very accurate.

To help reduce the risks that come with using the calorie counter, here are possible enhancements that could be integrated into its development:

  • Adding contextual analysis of the image, which would help to gauge the size of the food portion being analysed. For instance, the app could be built such that a standard object like a spoon, included in the food image, could be used as a reference point for measuring the sizes of the food items. This would reduce errors in the resulting total calories.
  • Google could improve the diversity of the food items in their training set to reduce misclassification errors. They could expand it to include food from more cultures so that even rare African food items will be identified.
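
One lightweight way to experiment with the reference-object idea, without any extra image processing, is to mention the reference object in the prompt itself. The sketch below is hypothetical: with_reference_hint and its wording are not part of the app above, and whether the model actually makes good use of the hint is an assumption that would need to be checked on real photos.

```python
def with_reference_hint(base_prompt, reference_object="spoon"):
    """Append a size-reference instruction to an existing prompt."""
    hint = (
        f"If a {reference_object} is visible in the image, use it as a size "
        "reference when estimating the portion of each food item."
    )
    return base_prompt.rstrip() + "\n" + hint + "\n"


# Example with a shortened version of the prompt
print(with_reference_hint("You are an expert nutritionist."))
```

The returned string could be passed to get_gemini_reponse in place of input_prompt.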