A sentiment analysis model determines whether a text is positive, negative, or neutral. However, this process typically requires access to the unencrypted text, which can raise privacy concerns.
Homomorphic encryption is a type of encryption that allows computation on encrypted data without having to decrypt it first. This makes it well suited to applications where users' personal and potentially sensitive data is at stake (e.g., sentiment analysis of personal messages).
This blog post uses the Concrete-ML library, which allows data scientists to use machine learning models in fully homomorphic encryption (FHE) settings without any prior knowledge of cryptography. We provide a practical tutorial on how to use the library to build a sentiment analysis model on encrypted data.
The post covers:
- transformers
- how to use transformers with XGBoost to perform sentiment analysis
- how to do the training
- how to use Concrete-ML to turn predictions into predictions over encrypted data
- how to deploy to the cloud using a client/server protocol
Last but not least, we'll finish with a complete demo on Hugging Face Spaces to show this functionality in action.
Set up the environment
First, make sure your pip and setuptools are up to date by running:
pip install -U pip setuptools
Now we can install all the required libraries for this blog with the following command.
pip install concrete-ml transformers datasets
Using a public dataset
The dataset we use in this notebook can be found here.
To represent the text for sentiment analysis, we chose to use a transformer's hidden representation, as it yields high accuracy for the final model in a very efficient way. For a comparison of this representation against a more common approach like TF-IDF, please see this full notebook.
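As a quick illustration of the TF-IDF alternative mentioned above, here is a minimal, self-contained sketch using scikit-learn's TfidfVectorizer (the toy sentences are made up purely for illustration; the actual comparison is in the linked notebook):
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, only to show the featurization step
toy_texts = ["the flight was great", "the flight was delayed and the staff was unhelpful"]

# Each text becomes a sparse vector of TF-IDF weights over the vocabulary
vectorizer = TfidfVectorizer()
X_tfidf = vectorizer.fit_transform(toy_texts)
print(X_tfidf.shape)  # (2, vocabulary size)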
We can start by opening the dataset and visualizing some statistics.
from datasets import load_dataset
train = load_dataset("osanseviero/twitter-airline-sentiment")["train"].to_pandas()
text_X = train['text']
y = train['airline_sentiment']
y = y.replace(['negative', 'neutral', 'positive'], [0, 1, 2])
pos_ratio = y.value_counts()[2] / y.value_counts().sum()
neg_ratio = y.value_counts()[0] / y.value_counts().sum()
neutral_ratio = y.value_counts()[1] / y.value_counts().sum()
print(f'Proportion of positive examples: {round(pos_ratio * 100, 2)}%')
print(f'Proportion of negative examples: {round(neg_ratio * 100, 2)}%')
print(f'Proportion of neutral examples: {round(neutral_ratio * 100, 2)}%')
The output looks like this:
Proportion of positive examples: 16.14%
Proportion of negative examples: 62.69%
Proportion of neutral examples: 21.17%
The ratio of positive and neutral examples is fairly similar, but we have significantly more negative examples. Let's keep this in mind when selecting the final evaluation metric.
Now we can split our dataset into training and test sets. We'll use a fixed seed so that this code is perfectly reproducible.
from sklearn.model_selection import train_test_split
text_X_train, text_X_test, y_train, y_test = train_test_split(
    text_X, y, test_size=0.1, random_state=42
)
Text representation using a transformer
Transformers are neural networks that are often trained to predict the next words to appear in a text (this task is commonly called self-supervised learning). They can also be fine-tuned on specific subtasks so that they specialize and get better results on a given problem.
They are powerful tools for all kinds of Natural Language Processing tasks. In fact, we can leverage their representation of any text and feed it to a more FHE-friendly machine-learning model for classification. In this notebook, we will use XGBoost.
We start by importing the requirements for transformers. Here, we use the popular library from Hugging Face to get a transformer quickly.
The model we have chosen is a RoBERTa transformer, fine-tuned for sentiment analysis on tweets.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
device = "cuda:0" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("cardiffnlp/twitter-roberta-base-sentiment-latest")
transformer_model = AutoModelForSequenceClassification.from_pretrained(
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
)
This will download the model, which is now ready to be used.
Using the hidden representation of some text can be tricky at first, mainly because there are many different ways to approach it. Below is the approach we chose.
First, we tokenize the text. Tokenizing means splitting the text into tokens (sequences of specific characters that can also be words) and replacing each one with a number. Then, we send the tokenized text to the transformer model, which outputs a hidden representation (the output of the self-attention layers, often used as input to the classification layers) for each token. Finally, we average the representations over all tokens to get a text-level representation.
The result is a matrix of shape (number of examples, hidden size). The hidden size is the number of dimensions in the hidden representation; for the model used here, it is 768. The hidden representation is a vector of numbers that represents the text and can be used for many different tasks. In this case, we will use it for classification with XGBoost.
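To make the tokenization step more concrete, here is a small sketch showing what the tokenizer loaded above produces for a single made-up sentence:
# Tokenize one example sentence (the sentence itself is only for illustration)
example = "The flight was surprisingly pleasant"
token_ids = tokenizer.encode(example, return_tensors="pt")
print(token_ids.shape)  # (1, number of tokens), including special tokens
print(tokenizer.convert_ids_to_tokens(token_ids[0].tolist()))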
import numpy as np
import tqdm

def text_to_tensor(
    list_text_X_train: list,
    transformer_model: AutoModelForSequenceClassification,
    tokenizer: AutoTokenizer,
    device: str,
) -> np.ndarray:
    # Tokenize each text one by one
    tokenized_text_X_train_split = [
        tokenizer.encode(text_x_train, return_tensors="pt")
        for text_x_train in list_text_X_train
    ]

    # Send the model to the device
    transformer_model = transformer_model.to(device)
    output_hidden_states_list = [None] * len(tokenized_text_X_train_split)

    for i, tokenized_x in enumerate(tqdm.tqdm(tokenized_text_X_train_split)):
        # Pass the tokens through the transformer and keep the last hidden layer states
        output_hidden_states = transformer_model(
            tokenized_x.to(device), output_hidden_states=True
        )[1][-1]
        # Average over the token axis to get a text-level representation
        output_hidden_states = output_hidden_states.mean(dim=1)
        output_hidden_states = output_hidden_states.detach().cpu().numpy()
        output_hidden_states_list[i] = output_hidden_states

    return np.concatenate(output_hidden_states_list, axis=0)
list_text_X_train = text_X_train.tolist()
list_text_X_test = text_X_test.tolist()
X_train_transformer = text_to_tensor(list_text_X_train, transformer_model, tokenizer, device)
X_test_transformer = text_to_tensor(list_text_X_test, transformer_model, tokenizer, device)
This transformation of the text (text to transformer representation) needs to be executed on the client machine, since the encryption is done over the transformer representation.
Classifying with XGBoost
Now that we have our training and test sets properly built to train a classifier, the next step is the training of our FHE model. Here, it will be very straightforward, using a hyper-parameter tuning tool such as GridSearchCV from scikit-learn.
from concrete.ml.sklearn import XGBClassifier
from sklearn.model_selection import GridSearchCV

# Build the Concrete-ML model
model = XGBClassifier()

# A grid search to find good hyper-parameters
parameters = {
    "n_bits": [2, 3],
    "max_depth": [1],
    "n_estimators": [10, 30, 50],
    "n_jobs": [-1],
}

grid_search = GridSearchCV(model, parameters, cv=5, n_jobs=1, scoring="accuracy")
grid_search.fit(X_train_transformer, y_train)

print(f"Best score: {grid_search.best_score_}")
print(f"Best parameters: {grid_search.best_params_}")

best_model = grid_search.best_estimator_
The output is as follows:
Best score: 0.8378111718275654
Best parameters: {'max_depth': 1, 'n_bits': 3, 'n_estimators': 50, 'n_jobs': -1}
Now, let’s see how the model performs on the test set.
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Compute the predictions on the test set
y_pred = best_model.predict(X_test_transformer)
y_proba = best_model.predict_proba(X_test_transformer)

# Plot the confusion matrix
matrix = confusion_matrix(y_test, y_pred)
ConfusionMatrixDisplay(matrix).plot()

# Compute the accuracy
accuracy_transformer_xgboost = np.mean(y_pred == y_test)
print(f"Accuracy: {accuracy_transformer_xgboost:.4f}")
With the following output:
Accuracy: 0.8504
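Given the class imbalance noted earlier, it is also worth looking at per-class metrics rather than accuracy alone. Here is a small sketch using scikit-learn's classification_report on the predictions computed above:
from sklearn.metrics import classification_report

# Precision, recall and F1-score for each of the three classes
print(classification_report(y_test, y_pred, target_names=["negative", "neutral", "positive"]))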
Predicting over encrypted data
Now let's predict over encrypted text. The idea here is that we will encrypt the representation given by the transformer rather than the raw text itself. In Concrete-ML, you can do this very quickly by setting the parameter execute_in_fhe=True in the predict function. This is just a developer feature (mainly used to check the running time of the FHE model). We will see how to make this work in a deployment setting a bit further down.
import time

# Compile the model to get the FHE inference engine
start = time.perf_counter()
best_model.compile(X_train_transformer)
end = time.perf_counter()
print(f"Compilation time: {end - start:.4f} seconds")

# Write a custom example and compute its transformer representation
tested_tweet = ["AirFrance is awesome, almost as much as Zama!"]
X_tested_tweet = text_to_tensor(tested_tweet, transformer_model, tokenizer, device)

# Predict in the clear for reference
clear_proba = best_model.predict_proba(X_tested_tweet)

# Predict in FHE over the single tweet and measure the time it takes
start = time.perf_counter()
decrypted_proba = best_model.predict_proba(X_tested_tweet, execute_in_fhe=True)
end = time.perf_counter()
fhe_exec_time = end - start
print(f"FHE inference time: {fhe_exec_time:.4f} seconds")
The output becomes:
Compilation time: 9.3354 seconds
FHE inference time: 4.4085 seconds
It is also important to check that the FHE predictions are the same as the clear predictions.
print(f"Probabilities from the FHE inference: {decrypted_proba}")
print(f"Probabilities from the clear model: {clear_proba}")
This output reads:
Probabilities from the FHE inference: [[0.08434131 0.05571389 0.8599448 ]]
Probabilities from the clear model: [[0.08434131 0.05571389 0.8599448 ]]
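Beyond comparing the printed values by eye, the check can also be made programmatic. A minimal sketch using NumPy on the arrays computed above:
# The two sets of probabilities printed above are identical;
# np.allclose guards against tiny floating-point differences
assert np.allclose(decrypted_proba, clear_proba), "FHE and clear predictions differ"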
Deployment
At this point, our model is fully trained and compiled, ready to be deployed. In Concrete-ML, you can use a deployment API to do this easily:
from concrete.ml.deployment import FHEModelDev
fhe_api = FHEModelDev("sentiment_fhe_model", best_model)
fhe_api.save()
These few lines are enough to export all the files needed for both the client and the server. You can check out the notebook explaining this deployment API in detail here.
Full example in a Hugging Face Space
You can also have a look at the final application on Hugging Face Spaces. The client app was developed with Gradio, while the server runs with Uvicorn and was developed with FastAPI.
The process is as follows (a code sketch is given after this list):
- The user generates a new private/public key pair
- The user types a message that will be encoded, quantized, and encrypted
- The server receives the encrypted data and runs the prediction over encrypted data, using the public evaluation key
- The server sends back the encrypted predictions, and the client decrypts them using their private key
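To give an idea of what this looks like in code, here is a minimal sketch of the flow using Concrete-ML's FHEModelClient and FHEModelServer classes, assuming the "sentiment_fhe_model" folder exported above and that your Concrete-ML version exposes these method names:
from concrete.ml.deployment import FHEModelClient, FHEModelServer

# Client side: generate keys and encrypt the transformer representation of the tweet
client = FHEModelClient("sentiment_fhe_model", key_dir="keys")
client.generate_private_and_evaluation_keys()
evaluation_keys = client.get_serialized_evaluation_keys()
encrypted_input = client.quantize_encrypt_serialize(X_tested_tweet)

# Server side: run the prediction over encrypted data with the public evaluation key only
server = FHEModelServer("sentiment_fhe_model")
server.load()
encrypted_output = server.run(encrypted_input, evaluation_keys)

# Back on the client: decrypt the result with the private key
decrypted_predictions = client.deserialize_decrypt_dequantize(encrypted_output)
print(decrypted_predictions)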
Conclusion
We have presented a way to leverage the power of transformers, where the resulting representation is then used to:
- train a machine learning model to categorise tweets, and
- predict over encrypted data using this model with FHE.
The final model (transformer representation + XGBoost) reaches an accuracy of 85%, which is above that of the transformer used alone, at 80% accuracy (please see this notebook for the comparisons).
The FHE execution time per example is 4.4 seconds on a 16-core CPU.
The deployment files are used for a sentiment analysis app that allows a client to request sentiment analysis predictions from a server while keeping its data encrypted all along the chain of communication.
Concrete-ML (don't forget to star us on GitHub ⭐️💛) allows straightforward ML model building and conversion to its FHE equivalent, making it possible to predict over encrypted data.
We hope you enjoyed this post, and let us know your thoughts and feedback!
And special thanks to Abubakar Abid for his earlier advice on how to build our first Hugging Face Space!



