Using Machine Learning to Aid Survivors and Race through Time

By merve and Alara Dirik


On February 6, 2023, earthquakes of magnitude 7.7 and 7.6 hit southeastern Turkey, affecting 10 cities and resulting in more than 42,000 deaths and 120,000 injured as of February 21.

A few hours after the earthquake, a group of programmers started a Discord server to roll out an application called afetharita, literally meaning "disaster map". This application would serve search & rescue teams and volunteers to find survivors and bring them help. The need for such an app arose when survivors posted screenshots of texts with their addresses and what they needed (including rescue) on social media. Some survivors also tweeted what they needed so their relatives knew they were alive and needed rescue. Needing to extract information from these tweets, we developed various applications to turn them into structured data and raced against time while developing and deploying them.

When I got invited to the Discord server, there was quite a lot of chaos regarding how we (volunteers) would operate and what we would do. We decided to collaboratively train models, so we needed a model and dataset registry. We opened a Hugging Face organization account and collaborated through pull requests to build ML-based applications to receive and process information.

[Image: the Hugging Face organization page]

Volunteers in other teams told us there was a need for an application to post screenshots, extract information from the screenshots, structure it, and write the structured information to a database. We began developing an application that would take a given image, first extract the text, and from that text extract a name, telephone number, and address, then write this information to a database that would be handed to authorities. After experimenting with various open-source OCR tools, we settled on easyocr for the OCR part and Gradio for building an interface for the application. We were asked to build a standalone application for OCR as well, so we opened endpoints from the interface. The text output from OCR is parsed with a fine-tuned transformers-based NER model.
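For illustration, here is a minimal sketch of the OCR step with easyocr; the file name and the exact preprocessing are assumptions, not the production code:

```python
# A minimal sketch of the OCR step, assuming a locally saved screenshot;
# the exact preprocessing used in production may differ.
import easyocr

# Build a reader for Turkish (model weights are downloaded on first use)
reader = easyocr.Reader(["tr"])

# readtext returns (bounding_box, text, confidence) triples
results = reader.readtext("screenshot.png")

# Join the recognized fragments into one string for the downstream NER model
full_text = " ".join(text for _, text, _ in results)
print(full_text)
```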

[Image: the OCR application interface]

To collaborate on and improve the application, we hosted it on Hugging Face Spaces, and we received a GPU grant to keep the application up and running. The Hugging Face Hub team set up a CI bot for us so we would have an ephemeral environment for each pull request; we could see how a pull request would affect the Space, which helped us during pull request reviews.

Later, we received labeled content from various channels (e.g., Twitter, Discord) with raw tweets of survivors' calls for help, along with the addresses and personal information extracted from them. We started experimenting both with few-shot prompting of closed-source models and with fine-tuning our own token classification model using transformers. We used bert-base-turkish-cased as the base model for token classification and came up with the first address extraction model.
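A sketch of what inference with such a fine-tuned token classification model looks like through the transformers pipeline; the repo id is a placeholder and the tweet is illustrative:

```python
# A sketch of address extraction with a fine-tuned token classification model;
# the repo id below is a placeholder for the actual fine-tuned model.
from transformers import pipeline

ner = pipeline("token-classification",
               model="ADDRESS_NER_MODEL_ID",   # placeholder repo id
               aggregation_strategy="simple")  # merge word pieces into entity spans

tweet = "Mustafa Kemal Mahallesi 1234. Sokak No:5 Antakya, enkaz altındayız"
for entity in ner(tweet):
    print(entity["entity_group"], "->", entity["word"])
```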

[Image: address extraction NER model output]

The model was later used in afetharita to extract addresses. The parsed addresses would be sent to a geocoding API to obtain longitude and latitude, and the geolocation would then be displayed on the front-end map. For inference, we used the Inference API, which hosts the model for inference and is automatically enabled when the model is pushed to the Hugging Face Hub. Using the Inference API for serving saved us from pulling the model, writing an app, building a Docker image, setting up CI/CD, and deploying the model to a cloud instance, all of which would have been extra overhead work for the DevOps and cloud teams as well. The Hugging Face teams provided us with more replicas so that there would be no downtime and the application would be robust against heavy traffic.
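Serving via the Inference API amounts to a single HTTP request; a sketch is below, where MODEL_ID and the access token are placeholders:

```python
# A sketch of serving via the Inference API: one HTTP call, no model pulling
# or deployment. MODEL_ID and the access token are placeholders.
import requests

API_URL = "https://api-inference.huggingface.co/models/MODEL_ID"
HEADERS = {"Authorization": "Bearer hf_..."}  # your Hugging Face access token

def extract_address(tweet: str) -> list:
    response = requests.post(API_URL, headers=HEADERS, json={"inputs": tweet})
    response.raise_for_status()
    # For a token classification model: a list of entities with labels,
    # character spans, and confidence scores
    return response.json()
```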

[Image: backend pipeline diagram]

Later, we were asked whether we could extract what earthquake survivors need from a given tweet. We received data with multiple labels for multiple needs in a given tweet; these needs could be shelter, food, or logistics, as it was freezing cold there. We started experimenting first with zero-shot classification using open-source NLI models on the Hugging Face Hub, and with few-shot prompting of closed-source generative model endpoints. We tried xlm-roberta-large-xnli and convbert-base-turkish-mc4-cased-allnli_tr. NLI models were particularly useful because we could directly infer with candidate labels and change the labels as data drift occurred, whereas generative models could make up labels and cause mismatches in responses sent to the backend. Since we initially had no labeled data, anything would work.
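A zero-shot sketch with an NLI model from the Hub is below; the tweet and candidate labels are illustrative:

```python
# A zero-shot classification sketch with an NLI model from the Hub; candidate
# labels can be edited on the fly as data drift occurs.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

tweet = "Su ve battaniyeye ihtiyacımız var"  # "We need water and blankets"
labels = ["shelter", "food", "logistics", "rescue"]

# multi_label=True scores each label independently (a tweet can carry several needs)
result = classifier(tweet, candidate_labels=labels, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```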

Ultimately, we decided to fine-tune our own model, as it takes roughly three minutes to fine-tune BERT's text classification head on a single GPU. We ran a labeling effort to develop the dataset to train this model. We logged our experiments in the model cards' metadata so we could later build a leaderboard to keep track of which model should be deployed to production. As base models, we tried bert-base-turkish-uncased and bert-base-turkish-128k-cased and found they perform better than bert-base-turkish-cased. You can find our leaderboard here.
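A condensed sketch of such a fine-tuning run; the hyperparameters and label set are illustrative, and the datasets are assumed to be prepared beforehand:

```python
# A condensed sketch of fine-tuning the multi-label intent classifier.
# Hyperparameters are illustrative; train_ds and eval_ds are assumed to be
# tokenized datasets with multi-hot float labels, prepared beforehand.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

label_names = ["shelter", "food", "logistics"]  # illustrative subset of needs

model_name = "dbmdz/bert-base-turkish-128k-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=len(label_names),
    problem_type="multi_label_classification",  # sigmoid + BCE loss per label
)

args = TrainingArguments(
    output_dir="intent-model",
    per_device_train_batch_size=32,
    num_train_epochs=3,
    evaluation_strategy="epoch",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```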

[Image: intent classification model]

Considering the task at hand and the imbalance of our data classes, we focused on eliminating false negatives and created a Space to benchmark the recall and F1-scores of all models. To do this, we added the metadata tag deprem-clf-v1 to all relevant model repos and used this tag to automatically retrieve the logged F1 and recall scores and rank the models. We kept a separate benchmark set to avoid leakage into the train set and to consistently benchmark our models. We also benchmarked each model to identify the best threshold per label for deployment.
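A sketch of how models carrying the deprem-clf-v1 tag can be retrieved and ranked from their model card metadata with huggingface_hub; the metric keys ("recall", "f1") are assumptions about how the scores were logged:

```python
# A sketch of retrieving models tagged `deprem-clf-v1` and ranking them by the
# recall and F1 scores logged in their model card metadata. The model-index
# metric keys are assumptions.
from huggingface_hub import HfApi

api = HfApi()
rows = []
for m in api.list_models(filter="deprem-clf-v1", cardData=True):
    card = m.cardData or {}
    for entry in card.get("model-index", []):  # standard evaluation metadata
        for result in entry.get("results", []):
            scores = {met["type"]: met["value"] for met in result.get("metrics", [])}
            rows.append((m.modelId, scores.get("recall"), scores.get("f1")))

# Rank by recall first, since we care most about eliminating false negatives
rows.sort(key=lambda r: (r[1] or 0, r[2] or 0), reverse=True)
for repo, recall, f1 in rows:
    print(f"{repo}: recall={recall} f1={f1}")
```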

We also wanted our NER model evaluated, and crowd-sourced the effort while the data labelers were working to give us better and updated intent datasets. To evaluate the NER model, we set up a labeling interface using Argilla and Gradio, where people could input a tweet and flag the output as correct, incorrect, or ambiguous.
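A simplified, Gradio-only sketch of such a flagging interface (in practice we used Argilla alongside Gradio; the model id here is a placeholder):

```python
# A simplified sketch of a Gradio flagging interface for NER evaluation.
# The model id is a placeholder for the fine-tuned address extraction model.
import gradio as gr
from transformers import pipeline

ner = pipeline("token-classification",
               model="ADDRESS_NER_MODEL_ID",  # placeholder repo id
               aggregation_strategy="simple")

def predict(tweet):
    # HighlightedText expects the raw text plus character-level entity spans
    entities = [{"entity": e["entity_group"], "start": e["start"], "end": e["end"]}
                for e in ner(tweet)]
    return {"text": tweet, "entities": entities}

demo = gr.Interface(
    fn=predict,
    inputs=gr.Textbox(label="Tweet"),
    outputs=gr.HighlightedText(label="Extracted address entities"),
    allow_flagging="manual",
    flagging_options=["correct", "incorrect", "ambiguous"],  # volunteer verdicts
)
demo.launch()
```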

[Image: crowd-sourced labeling interface]

Later, the dataset was deduplicated and used to benchmark our further experiments.

Another machine learning team worked with generative models (behind a gated API) to get the exact needs (the labels were too broad) as free text and to pass that text as additional context with each posting. For this, they did prompt engineering, wrapped the API endpoints as a separate API, and deployed them on the cloud. We found that few-shot prompting with LLMs helps adapt to fine-grained needs in the presence of rapidly developing data drift, since the only thing we need to adjust is the prompt, and we don't need any labeled data for this.
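The idea is sketched below: `call_llm` stands in for whichever completion client wraps the gated API, and the example tweets and needs are illustrative:

```python
# A sketch of the few-shot prompting approach: fine-grained needs come back as
# free text, so adapting to data drift only requires editing the prompt.
# `call_llm` is a stand-in for the completion client wrapping the gated API.
FEW_SHOT_PROMPT = """Extract the specific needs from the tweet as a short comma-separated list.

Tweet: "Enkaz altında sesler geliyor, vinç ve ısıtıcı lazım"
Needs: crane, heater

Tweet: "Bebek maması ve battaniye bulamıyoruz"
Needs: baby formula, blankets

Tweet: "{tweet}"
Needs:"""

def extract_needs(tweet: str, call_llm) -> str:
    """Return the model's free-text needs list for a single tweet."""
    return call_llm(FEW_SHOT_PROMPT.format(tweet=tweet)).strip()
```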

These models are currently being used in production to create the points on the heat map below so that volunteers and search and rescue teams can bring the needs to survivors.

[Image: afetharita heat map]

We realized that if it weren't for the Hugging Face Hub and its ecosystem, we wouldn't have been able to collaborate, prototype, and deploy this fast. Below is our MLOps pipeline for the address recognition and intent classification models.

[Image: MLOps pipeline diagram]

There are dozens of volunteers behind this application and its individual components, who worked without sleep to ship them in such a short time.



Remote Sensing Applications

Other teams worked on remote sensing applications to assess the damage to buildings and infrastructure in order to direct search and rescue operations. The lack of electricity and stable mobile networks during the first 48 hours after the earthquake, combined with collapsed roads, made it extremely difficult to assess the extent of the damage and where help was needed. Search and rescue operations were also heavily affected by false reports of collapsed and damaged buildings due to the difficulties in communication and transportation.

To address these issues and create open-source tools that can be leveraged in the future, we started by collecting pre- and post-earthquake satellite images of the affected zones from Planet Labs, Maxar, and Copernicus Open Access Hub.

[Image: input satellite imagery]

Our initial approach was to rapidly label satellite images for object detection and instance segmentation, with a single category, "buildings". The aim was to evaluate the extent of damage by comparing the number of surviving buildings in pre- and post-earthquake images collected from the same area. To make it easier to train models, we started by cropping 1080×1080 satellite images into smaller 640×640 chunks. Next, we fine-tuned YOLOv5, YOLOv8, and EfficientNet models for building detection, and a SegFormer model for semantic segmentation of buildings, and deployed these apps as Hugging Face Spaces.
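A sketch of the tiling step with Pillow; the file naming and the exact overlap strategy are assumptions:

```python
# A sketch of the tiling step: cropping large scenes into 640×640 chunks, with
# a final edge-aligned tile so the borders are covered. File naming and the
# overlap strategy are assumptions.
from pathlib import Path
from PIL import Image

TILE = 640

def _positions(size: int) -> list[int]:
    # Non-overlapping offsets, plus one edge-aligned offset if needed
    pos = list(range(0, size - TILE + 1, TILE))
    if pos and pos[-1] + TILE < size:
        pos.append(size - TILE)
    return pos

def tile_image(path: str, out_dir: str = "tiles") -> None:
    Path(out_dir).mkdir(exist_ok=True)
    img = Image.open(path)
    width, height = img.size
    for top in _positions(height):
        for left in _positions(width):
            crop = img.crop((left, top, left + TILE, top + TILE))
            crop.save(f"{out_dir}/{Path(path).stem}_{top}_{left}.png")

tile_image("scene_1080x1080.png")  # a 1080×1080 scene yields four overlapping tiles
```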

[Image: building detection app]

Once again, dozens of volunteers worked on labeling, preparing data, and training models. In addition to individual volunteers, companies like Co-One volunteered to label satellite data with more detailed annotations for buildings and infrastructure, including no damage, destroyed, damaged, damaged facility, and undamaged facility labels. Our current objective is to release a detailed open-source dataset that can expedite search and rescue operations worldwide in the future.

[Image: output satellite imagery with detections]



Wrapping Up

For this extreme use case, we had to move fast and optimize over classification metrics, where even a one percent improvement mattered. There were many ethical discussions along the way, as even picking the metric to optimize over was an ethical question. We have seen how open-source machine learning and democratization enable individuals to build life-saving applications.
We are thankful to the community behind Hugging Face for releasing these models and datasets, and to the team at Hugging Face for their infrastructure and MLOps support.


