How Casavo Uses Deep Learning to Anonymise Images on Casavo’s Real Estate Platform
Introduction
Sharing Images with Potential Buyers
Anonymising Images using Deep Learning
Orchestration
Conclusion

Our Anonymisation Pipeline at work

As a real estate company, Casavo faced a significant challenge when it came to sharing images of properties with potential buyers. While sharing images is a vital part of the home-buying process, we needed to make sure that sensitive information, such as the current homeowner's identity and home address, remained anonymous.

To address this challenge, we decided to use deep learning to build a solution that lets us share images while keeping sensitive information private. In this article, we'll detail the deep learning technologies we used to achieve this goal. By sharing our experience, we hope to encourage other companies facing similar challenges to use deep learning to find innovative solutions to complex problems.

With Casavo's mobile app, sellers can complete a remote visit of their property autonomously by uploading pictures of their home and its floor plan. As part of this process, we ask sellers for their permission to share the photographs and details of their property with potential buyers before listing it on our platform. This allows us to increase the chances of finding the right match between seller and buyer earlier. If the seller agrees, their house pictures, together with other relevant information and the floor plan, are shared with all interested potential buyers as a preview.

Casavo’s Mobile App UI to share images with matched potential buyers

However, sellers often upload images that contain personal information, such as a home address, people's faces, or framed pictures, without realising the potential consequences of sharing such sensitive data. This can raise privacy concerns for both the seller and anyone appearing in the photographs, and can also harm the seller's chances of a successful sale.

Despite these issues, sharing images with potential buyers is a vital part of selling a property, because it gives them a better sense of the property's condition and layout. It is therefore crucial to find a way to anonymise sensitive information in these images while still providing a clear and accurate representation of the property. This is where deep learning comes in, offering an efficient and effective solution to the problem.

To address this challenge, we used a combination of deep learning models to identify and anonymise sensitive regions of property images. In the following sections, we'll explore the models used and how they were combined to achieve the desired results.

In order to anonymise the photographs uploaded by sellers, we used three pre-trained deep learning models, each processing images to perform a different task. The decision to use pre-trained models was based on two main factors.

Firstly, training deep learning models from scratch can be a time-consuming and resource-intensive process. By using pre-trained models, we saved the significant amounts of time and computing resources that training our own models would have required.

Secondly, pre-trained models have already been trained on large datasets, which gives them a significant advantage in terms of accuracy and performance. The models we use have learned to identify patterns and features in images that are useful for the specific task they were trained for. This means we could rely on the pre-trained models to anonymise images effectively, without spending additional resources on data collection and annotation.

One of the models used was CRAFT (Character-Region Awareness For Text detection), which identifies text inside an image and produces a heat-map that highlights the location of the text as a per-pixel probability. This model was particularly useful in identifying personal information, such as names and addresses, that may have been present in the floor plans uploaded by sellers.

Heat-maps generated by CRAFT — source

The heat-map created by CRAFT was passed through a binary threshold to compute a text segmentation mask, which was then used to determine which areas of the image contained text to be anonymised. We decided to use thresholded heat-maps instead of bounding boxes to avoid creating masks that were too wide and would remove other portions of the image. Each pixel was assigned a score between 0 and 1, representing the probability that the pixel contained text. Afterwards, all pixels whose values were above a chosen threshold were set to 1, while the others were set to 0. With this approach we obtained segmentation masks that neatly separate the text from the background.
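The thresholding step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than our production code: the function name `heatmap_to_mask`, the threshold of 0.5 and the toy score array are all assumptions made for the example, and a real heat-map would come from the CRAFT model itself.

```python
import numpy as np

def heatmap_to_mask(heatmap: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn a per-pixel text score map (values in [0, 1]) into a binary mask."""
    if heatmap.ndim != 2:
        raise ValueError("expected a 2-D heat-map")
    # Pixels strictly above the threshold become 1 (text), all others 0 (background).
    return (heatmap > threshold).astype(np.uint8)

# Toy 3x4 score map standing in for real CRAFT output (hypothetical values).
scores = np.array([
    [0.05, 0.10, 0.80, 0.90],
    [0.02, 0.60, 0.95, 0.40],
    [0.00, 0.03, 0.20, 0.10],
])
mask = heatmap_to_mask(scores, threshold=0.5)
print(mask)
```

Because the mask follows the shape of the high-score pixels, it stays tighter around the glyphs than a rectangular bounding box would.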

Detection and anonymisation of sensitive text in a floor plan

CRAFT is based on a fully convolutional neural network that uses the VGG16 architecture and detects words through word-level bounding boxes. It is known for its effectiveness in identifying text of various sizes, from large to small, and for its ability to generalise.

The second model that we employed is YOLO (You Only Look Once). Detecting people and faces in images relies on computer vision algorithms trained on large datasets of annotated images to identify specific objects and features inside an image. The most recent version of YOLO at the time of writing (v8) has been trained on massive amounts of image data and can accurately detect and classify objects inside an image in near real time.

Segmentation masks and bounding boxes computed by YOLOv5 — Source

When applied to the task of detecting people in images, YOLO v8 uses a combination of deep neural networks and computer vision techniques to analyse the image and identify the presence of people. We used this information to create a segmentation mask that isolates each person from the image background. Once again, the segmentation mask is a binary image that assigns a value of either 0 or 1 to every pixel, indicating whether it is part of a person or of the background.
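As a rough sketch of how per-instance segmentation results can be reduced to a single person mask, the snippet below merges only the instances labelled as a person. The `(label, confidence, mask)` tuple layout, the `person_mask` helper and the 0.5 confidence cut-off are assumptions for illustration; they are not the output format of any particular YOLO library.

```python
import numpy as np

def person_mask(detections, shape, min_confidence=0.5):
    """Merge the instance masks of confident 'person' detections into one binary mask.

    `detections` is an iterable of (label, confidence, instance_mask) tuples,
    a hypothetical layout chosen for this example.
    """
    mask = np.zeros(shape, dtype=bool)
    for label, confidence, instance_mask in detections:
        if label == "person" and confidence >= min_confidence:
            mask |= instance_mask.astype(bool)
    return mask.astype(np.uint8)

def single_pixel(shape, row, col):
    """Helper for the demo: a mask with exactly one pixel set."""
    m = np.zeros(shape, dtype=np.uint8)
    m[row, col] = 1
    return m

shape = (3, 3)
detections = [
    ("person", 0.90, single_pixel(shape, 0, 0)),  # kept
    ("chair", 0.95, single_pixel(shape, 1, 1)),   # ignored: not a person
    ("person", 0.30, single_pixel(shape, 2, 2)),  # ignored: low confidence
]
people = person_mask(detections, shape)
print(people)
```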

YOLO Mask Detection and Anonymisation in our pipeline

Once the masks created by the CRAFT and YOLO models were obtained, we combined them with a logical OR operator to obtain a single mask containing all pixels to be anonymised. These sensitive regions could include personal information, faces, or any other elements that might compromise the seller's privacy. We then used computer vision techniques, such as convolutions (to dilate the masks) and blurring, to anonymise these regions and make sure that the photographs were safe to share with potential buyers.

To make sure that only appropriate images were shared with potential buyers, we recognised the need for a further step in our pipeline: an NSFW (Not Safe For Work) detector. We added this by using the detector from the Stable Diffusion V2 model created by Stability AI, which detects potentially harmful or sensitive concepts inside an image (such as young children, explicit content and so on). The model is based on CLIP (Contrastive Language-Image Pre-Training), with a projection layer applied to those concepts and associated thresholds that determine whether an image is suitable for sharing with potential buyers.
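The combination step can be illustrated with a small NumPy-only sketch: OR the masks together, dilate the result so the blur covers a margin around each region, then replace the masked pixels with a blurred copy of the image. The helper names, the square dilation kernel and the box blur are assumptions chosen to keep the example dependency-free; a production pipeline would more likely rely on an image library for dilation and Gaussian blurring.

```python
import numpy as np

def combine_masks(*masks):
    """Logical OR of any number of same-shaped binary masks."""
    combined = np.zeros(masks[0].shape, dtype=bool)
    for m in masks:
        combined |= m.astype(bool)
    return combined

def dilate(mask, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square kernel."""
    h, w = mask.shape
    padded = np.pad(mask.astype(bool), radius, mode="constant", constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def box_blur(image, radius=2):
    """Mean filter over a square window, with edge padding (H x W or H x W x C)."""
    h, w = image.shape[:2]
    pad = [(radius, radius), (radius, radius)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    acc = np.zeros(image.shape, dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (2 * radius + 1) ** 2

def anonymise(image, mask, blur_radius=2):
    """Replace masked pixels with their blurred version; keep the rest untouched."""
    blurred = np.rint(box_blur(image, blur_radius))
    m = mask if image.ndim == 2 else mask[..., None]
    return np.where(m, blurred, image).astype(image.dtype)

# Toy inputs: one "text" pixel, one "person" pixel, and a 5x5 greyscale image.
text_mask = np.zeros((5, 5), dtype=np.uint8)
text_mask[2, 2] = 1
people_mask = np.zeros((5, 5), dtype=np.uint8)
people_mask[0, 0] = 1
image = np.arange(25, dtype=np.uint8).reshape(5, 5) * 10

region = dilate(combine_masks(text_mask, people_mask), radius=1)
result = anonymise(image, region, blur_radius=1)
```

Dilating before blurring is a deliberate margin of safety: a mask that hugs the text too tightly can leave legible edges behind.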

High level representation of our anonymisation pipeline

By incorporating the NSFW detector into our pipeline, we were able to prevent inappropriate images from being shared with potential buyers. The detector provided a further layer of protection, making sure that only images that were safe and appropriate for public viewing were shared on our platform after anonymisation.
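The idea of comparing an image embedding against concept embeddings with per-concept thresholds can be sketched as follows. This is a simplified illustration, not the actual detector: the three-dimensional vectors, the thresholds of 0.8 and the function names are made up for the example, and real CLIP embeddings have hundreds of dimensions and come from the model's image encoder.

```python
import numpy as np

def cosine_similarity(image_embedding, concept_embeddings):
    """Cosine similarity between one image vector and each concept vector (rows)."""
    image_unit = image_embedding / np.linalg.norm(image_embedding)
    concept_units = concept_embeddings / np.linalg.norm(
        concept_embeddings, axis=1, keepdims=True
    )
    return concept_units @ image_unit

def is_flagged(image_embedding, concept_embeddings, thresholds):
    """Flag the image if any concept similarity exceeds that concept's threshold."""
    similarities = cosine_similarity(image_embedding, concept_embeddings)
    return bool(np.any(similarities > thresholds))

# Hypothetical concept embeddings and thresholds (one row / entry per concept).
concepts = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
thresholds = np.array([0.8, 0.8])

safe_image = np.array([0.1, 0.1, 1.0])    # points away from both concepts
unsafe_image = np.array([0.9, 0.1, 0.1])  # points close to the first concept

print(is_flagged(safe_image, concepts, thresholds))
print(is_flagged(unsafe_image, concepts, thresholds))
```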

Deploying such a service in production posed significant challenges. We chose to use CPU inference, because the volume and throughput of images did not justify reserving a GPU. However, this resulted in inference times exceeding one minute per image, which ruled out a traditional REST interface: keeping a TCP connection open for that long was undesirable. To address this issue, we used the RabbitMQ message broker, a popular message-queuing tool. With RabbitMQ, we created consumer and producer queues of images (to be processed and already anonymised), which let us manage the flow of images between micro-services in our distributed architecture asynchronously. As a result, callers never block on a slow inference and the rest of the system's performance is unaffected.
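The decoupling that a message queue provides can be shown with a small in-process stand-in. Note that this sketch uses Python's standard `queue` and `threading` modules instead of RabbitMQ, purely so that it runs without a broker; the queue names and the `slow_anonymise` placeholder are hypothetical.

```python
import queue
import threading

def slow_anonymise(image_id: str) -> str:
    """Placeholder for the minute-long CPU inference step (hypothetical)."""
    return f"{image_id}:anonymised"

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Consume image ids from `inbox`, publish results to `outbox`."""
    while True:
        image_id = inbox.get()
        if image_id is None:  # sentinel value: stop the worker
            inbox.task_done()
            break
        outbox.put(slow_anonymise(image_id))
        inbox.task_done()

to_process = queue.Queue()  # images waiting to be anonymised
processed = queue.Queue()   # anonymised results

consumer = threading.Thread(target=worker, args=(to_process, processed), daemon=True)
consumer.start()

# The producer only enqueues work and returns immediately; it never waits
# on an open connection while inference runs.
for image_id in ["img-1", "img-2", "img-3"]:
    to_process.put(image_id)
to_process.put(None)

to_process.join()
consumer.join()
results = [processed.get() for _ in range(processed.qsize())]
print(results)
```

With a real broker the two queues live outside the process, so the producer and the anonymiser can be deployed, restarted and scaled independently.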

High-level overview of how our micro-services interact with the image anonymiser

To further optimise our service, we used KEDA, a Kubernetes Event-driven Autoscaling tool. With KEDA, we were able to turn the service on and off as needed. Specifically, KEDA activates the service container only when new images are present in the queue of images to be processed, and turns the container off again after a set amount of time has passed with no images in the queue. This allowed us to conserve resources and reduce costs, because the machine was only active when needed. Using RabbitMQ and KEDA together proved to be an efficient solution for managing our image processing service in a scalable and cost-effective manner.

In the future, we could enhance this micro-service by integrating generative AI as a form of post-processing. This would allow us to seamlessly replace any content that was identified in the image, as if it had never been there. If you're interested in exploring the possibilities of generative AI in the real estate industry, you may find my previous article on the topic informative and interesting.
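The scale-to-zero behaviour described above boils down to a simple decision rule, modelled below. This is only a toy model of the behaviour, not KEDA's API or configuration: the function name, its parameters and the 300-second cooldown are assumptions for the example.

```python
def desired_replicas(
    queue_length: int,
    idle_seconds: float,
    current: int,
    cooldown_seconds: float = 300.0,
) -> int:
    """Decide how many service replicas should run.

    - Messages waiting: make sure at least one replica is up.
    - Queue empty but the cooldown has not elapsed: keep the current state.
    - Queue empty for longer than the cooldown: scale down to zero.
    """
    if queue_length > 0:
        return max(current, 1)
    if idle_seconds < cooldown_seconds:
        return current
    return 0

print(desired_replicas(queue_length=3, idle_seconds=0, current=0))    # wake up
print(desired_replicas(queue_length=0, idle_seconds=10, current=1))   # stay up
print(desired_replicas(queue_length=0, idle_seconds=600, current=1))  # shut down
```

The cooldown avoids repeatedly starting and stopping the container when images arrive in short bursts.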

In conclusion, Casavo’s use of computer vision and machine learning technologies has enabled us to enhance our real estate services and offer a more seamless experience for our clients, enhancing the privacy of people in uploaded images and floor plans.

These results reflect our commitment to innovation and our dedication to creating a more transparent and efficient real estate marketplace. If you are passionate about transforming the industry and want to be part of a dynamic team that's leading the way, we invite you to explore our open positions and join us in revolutionising real estate. Together, we can use cutting-edge technology to make buying and selling homes simpler, faster, and more accessible for everyone. 🏠 🚀
