Detecting Cancer Growth Using AI and Computer Vision

AI for social good: applications in medical imaging

Cover image from unsplash.com

Introduction

Breast cancer is considered one of the deadliest types of cancer in women. As per the World Health Organization (WHO), in 2020 alone, around 2.3 million new cases of invasive breast cancer were diagnosed worldwide, resulting in 685,000 deaths.

Although developing countries account for roughly half of all breast cancer cases, they account for 62% of all deaths caused by the disease. Survival for at least 5 years after diagnosis ranges from more than 90% in high-income countries to 66% in India and 40% in South Africa.

Figure 1: Various steps in breast cancer metastases detection as performed by pathologists | Top Left: Image from Camelyon17 challenge | Top Right: Image from unsplash.com | Center: Image from unsplash.com | Bottom Left and Bottom Right: Images by the Author

A key step in determining the stage of the cancer is microscopic examination of the lymph nodes adjacent to the breast to establish whether the cancer has metastasised (a medical term meaning spread to other sites within the body). This step is not only sensitive, time-intensive and laborious, but also requires highly skilled medical pathologists. It impacts decisions related to treatment, including considerations about radiation therapy, chemotherapy, and the potential surgical removal of more lymph nodes.

With the advent and advancement of AI and computer vision techniques, particularly Convolutional Neural Networks (CNNs), we have been able to improve accuracy on a wide range of computer vision tasks such as image recognition, object detection, and segmentation. These advances have been useful in solving some of the most difficult healthcare problems, especially in regions with limited access to advanced medical facilities.

Building on that, in this article I will present a framework that leverages state-of-the-art CNNs and computer vision technologies to assist in the detection of metastases in lymph nodes. A successful solution holds great promise to reduce the workload of pathologists, while at the same time reducing the subjectivity in diagnosis.

Methodology and Approach

Given a whole slide image of lymph node sections, our objective is to generate a mask that indicates potential cancerous regions (cells with tumors) within the section. An example is depicted in Figure 2, which shows an image of a tissue on the slide alongside a mask where the yellow region marks the areas of the tissue that are cancerous.

Figure 2: Left: A WSI from the dataset | Right: Binary mask with yellow regions indicating cancerous regions | Images by the Author

Image segmentation is one of the classic computer vision tasks, where the objective is to train a neural network to output a pixel-wise mask of the image (similar to the mask in Figure 2). There are several deep-learning techniques available for image segmentation, which are described in detail in this paper. TensorFlow by Google also has a great tutorial that uses an encoder-decoder approach to image segmentation.

Instead of using an encoder-decoder, which is commonly used for image segmentation problems, we will treat this as a binary classification problem in which each custom-defined region on the slide is classified as healthy or tumorous by a neural network. These individual regions of a whole slide image can then be stitched together to form the desired mask.

We will use the standard ML process for building the CV model: data collection and preprocessing, model training, and inference.

Data Collection and Preprocessing

The dataset is sourced from the CAMELYON16 Challenge, which, as per the challenge website, comprises "a total of 400 whole-slide images (WSIs) of sentinel lymph nodes collected in Radboud University Medical Center (Nijmegen, the Netherlands), and the University Medical Center Utrecht (Utrecht, the Netherlands)".

Whole-slide images are stored in a multi-resolution pyramid structure, and each image file contains multiple downsampled versions of the original image. Each image in the pyramid is stored as a series of tiles to facilitate rapid retrieval of subregions of the image (see Figure 3 for an illustration).

More details about Whole Slide Imaging may be found here.

The ground truth for the slides is provided as WSI binary masks indicating the regions in the slides that contain cells with tumors (see Figure 2 above for an example).

Figure 3: Illustration of various magnification levels in whole-slide images (WSI). Image sourced from https://camelyon16.grand-challenge.org/Data/

WSIs in our dataset have 8 zoom levels that allow us to zoom the images from 1x all the way to 40x. Level 0 is considered the highest resolution (40x) and level 7 the lowest (1x).

Due to their enormous size (each WSI in our dataset is well over 2GB), standard image tools are incapable of reading and decompressing them into system RAM. We used the OpenSlide library's Python implementation to efficiently read the images in our dataset; it also provides an interface for navigating across the different zoom levels.
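As a minimal sketch of how this works (the file name, coordinates and patch size are placeholders, not values from the project), OpenSlide lets us inspect the pyramid and read a region at any zoom level:

```python
import openslide

# Open a whole-slide image (the path is a placeholder)
slide = openslide.OpenSlide("tumor_001.tif")

print(slide.level_count)        # number of levels in the pyramid (8 for our WSIs)
print(slide.dimensions)         # (width, height) at level 0, the highest resolution
print(slide.level_dimensions)   # (width, height) at every level
print(slide.level_downsamples)  # downsample factor of each level relative to level 0

# Read a 299 x 299 region at level 2.
# Note: the (x, y) location is always given in level-0 coordinates.
region = slide.read_region((30000, 25000), 2, (299, 299)).convert("RGB")
region.save("patch_level2.png")

slide.close()
```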

Figure 4: Images by the Author

Training a CNN on the complete dataset of 400 WSIs is computationally very expensive (imagine training on a 2GB x 400 = 800GB dataset). We had access to the free tier of Google Colab, which has limited GPU support. Therefore, we randomly subsampled 22 WSIs from the dataset. At first, a set of 22 images might seem like a tiny dataset for accurately training a convolutional neural network, but, as mentioned earlier, we extract small patches from each of these enormous WSIs and treat each patch as an independent image that can be used to train our model, as depicted in Figure 5.

Figure 5: Each WSI is further split into smaller patches to augment the dataset | Images by the Author

At the highest zoom level (level 0 = 40x zoom), each image is roughly 62,000 x 54,000 pixels; extracting 299 x 299 patches would give us about 35,000 individual images from each WSI. We extracted patches from each zoom level. As the zoom level number increases, the resolution decreases, and so does the number of patches we can extract from the WSI. At level 7, we can extract fewer than 200 patches per image.

Moreover, every WSI has a lot of empty area where no tissue cells are present. To maintain data sanity, we avoided patches that had less than 30% tissue cells, which was calculated programmatically using the intensity of the gray area.
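Below is a sketch of this patch-extraction step, assuming the grid-based tiling described above; the 0.8 grayscale cutoff used to separate tissue from the bright background is an illustrative value, and the helper names are my own:

```python
import numpy as np
import openslide

PATCH_SIZE = 299
TISSUE_THRESHOLD = 0.3  # keep patches with at least 30% tissue

def tissue_fraction(patch_rgb):
    """Estimate the fraction of a patch covered by tissue.

    The background of a WSI is close to white, so a pixel is counted as
    tissue when its grayscale intensity falls below a cutoff (0.8 here is
    an illustrative value, not the exact one used in the project).
    """
    gray = np.asarray(patch_rgb.convert("L"), dtype=np.float32) / 255.0
    return float(np.mean(gray < 0.8))

def extract_patches(slide_path, level):
    """Walk the slide in a non-overlapping grid and keep tissue-rich patches."""
    slide = openslide.OpenSlide(slide_path)
    downsample = slide.level_downsamples[level]
    width, height = slide.level_dimensions[level]
    patches = []
    for y in range(0, height - PATCH_SIZE + 1, PATCH_SIZE):
        for x in range(0, width - PATCH_SIZE + 1, PATCH_SIZE):
            # read_region expects level-0 coordinates
            location = (int(x * downsample), int(y * downsample))
            patch = slide.read_region(location, level, (PATCH_SIZE, PATCH_SIZE)).convert("RGB")
            if tissue_fraction(patch) >= TISSUE_THRESHOLD:
                patches.append(((x, y), patch))
    slide.close()
    return patches
```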

The dataset was constructed to have roughly the same number of patches containing healthy and tumorous cells. A train-validation split was then performed on this final dataset.
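A minimal sketch of the balancing and splitting step (the 20% validation fraction and the variable names are assumptions, not the project's exact values):

```python
import random
from sklearn.model_selection import train_test_split

# healthy_patches and tumor_patches are hypothetical lists of (patch, label)
# pairs built from the WSI masks. Downsample the majority class so both
# classes contribute roughly the same number of patches.
n = min(len(healthy_patches), len(tumor_patches))
balanced = random.sample(healthy_patches, n) + random.sample(tumor_patches, n)

# Hold out part of the balanced data for validation, stratified by label
# (the 20% fraction is an illustrative choice).
train_set, val_set = train_test_split(
    balanced,
    test_size=0.2,
    stratify=[label for _, label in balanced],
    random_state=42,
)
```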

Model Training

We built several CNN models, which were trained on the image patches generated using the mechanism described in the previous section.

Objective Function

Our primary optimization objective was sensitivity (recall), but we also closely monitored the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) to ensure that we were not producing an excessive number of false positives.

In the context of cancer detection, it is crucial to minimize the number of false negatives, i.e., instances where the model incorrectly classifies a cancerous sample as non-cancerous. A high number of false negatives could delay diagnosis and treatment for patients who do have cancer. Sensitivity (or recall) measures the proportion of actual positives that are correctly identified; by optimizing for high recall, we aim to correctly identify as many actual positive cases as possible.

However, focusing on sensitivity alone may lead the model to predict most samples as positive, thereby increasing the number of false positives (cases where a non-cancerous sample is classified as cancerous). This is undesirable, as it may lead to unnecessary medical interventions and cause undue anxiety for patients. This is where monitoring the AUC-ROC becomes extremely important.
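As a small, self-contained illustration of the two quantities we track (dummy labels and probabilities, not project results):

```python
import numpy as np
from sklearn.metrics import recall_score, roc_auc_score

# Dummy ground-truth labels (1 = tumorous, 0 = healthy) and model probabilities,
# purely to illustrate the metrics we optimize and monitor.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.35, 0.7, 0.6])

y_pred = (y_prob >= 0.5).astype(int)        # 0.5 is the default decision threshold
sensitivity = recall_score(y_true, y_pred)  # fraction of tumorous samples caught
auc = roc_auc_score(y_true, y_prob)         # threshold-independent ranking quality

print(f"Sensitivity/recall: {sensitivity:.3f}, AUC-ROC: {auc:.3f}")
```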

Model Building

We started off by constructing a baseline: a very simple architecture comprising 2 convolutional layers with max pooling and dropout for regularization. To improve over the baseline, we fine-tuned state-of-the-art image recognition models such as VGG16 and Inception v3 on our dataset.

As we had images available at different zoom levels, we trained separate models, each of which consumed images from one zoom level, to see whether viewing images at a particular zoom level enhances the performance of the network. Due to the limited number of extracted patches available at lower zoom levels, images at zoom levels 3, 4 and 5 were combined into a single training set. Separate models were built for images at zoom levels 0, 1 and 2.

Figure 6: Standard Inception v3 model appended with a Global Max Pool Layer and Sigmoid Activation. Inception v3 image sourced from: https://cloud.google.com/tpu/docs/inception-v3-advanced

Interestingly, the best-performing model was the Inception v3 model pre-trained on ImageNet weights with an additional Global Max Pooling layer (see Figure 6). The sigmoid activation function takes any real number and squashes it into the range between 0 and 1. This is particularly useful in our scenario, where we want to map predictions to probabilities of the two classes (0 and 1).
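A sketch of this architecture in Keras; everything beyond what is described above (for example, the input shape and whether the base network is kept trainable) is an assumption:

```python
import tensorflow as tf

# Inception v3 pre-trained on ImageNet, without its original classification head
base = tf.keras.applications.InceptionV3(
    weights="imagenet",
    include_top=False,
    input_shape=(299, 299, 3),
)

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalMaxPooling2D(),            # global max pool over the spatial feature map
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the patch is tumorous
])

model.summary()
```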

Model Configurations

We used cross-validation to find the best hyperparameters for the model. Figure 7 below shows the final configuration of our augmented Inception v3, including the optimizer, learning rate, rho, epochs and batch size used. By using class weights, we increased the model's focus on the minority class (tumorous cases), improving its ability to correctly identify and diagnose cancer cases, an essential requirement in this critical health context.

Figure 7: Model configurations and hyperparameters | Image by the Author
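A sketch of how such a configuration could be wired up in Keras, continuing from the model above; the optimizer settings (rho is interpreted here as the RMSprop decay parameter), class-weight ratio, epoch count and dataset names are placeholders standing in for the values listed in Figure 7:

```python
import tensorflow as tf

# Placeholder values standing in for the configuration in Figure 7
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-4, rho=0.9)

model.compile(
    optimizer=optimizer,
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.Recall(name="recall"), tf.keras.metrics.AUC(name="auc")],
)

# Class weights make errors on the minority (tumorous) class more costly
class_weight = {0: 1.0, 1: 3.0}  # illustrative ratio, not the tuned value

# train_dataset and val_dataset are hypothetical batched tf.data pipelines
# built from the extracted patches.
model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=10,
    class_weight=class_weight,
)
```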

Model Evaluation

We looked at the loss, AUC and recall for training runs with different hyperparameters and image patches sampled at different zoom levels.

As mentioned above, images at zoom levels 3, 4 and 5 were combined into a single training set, and separate models were built for images at zoom levels 0, 1 and 2. The charts below show the performance at the various zoom levels on the validation set. Performance was best at zoom level 1 in terms of AUC and recall, using the modified Inception v3.

Figure 8: Configurations and performance of the final fine-tuned model | Image by the Author

Inference

Once the model has been fine-tuned, we can use it to generate 'masks' for any new whole-slide image. To do this, we would first need to generate 299 x 299 resolution patches (the input size for the standard Inception v3 architecture) from the image at the zoom level we are interested in (either level 1 or level 2).

The individual patches are then passed through the fine-tuned model to classify each of them as containing tumorous or non-tumorous cells. The patches are then stitched together to generate the mask.
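A sketch of this stitching step, reusing the hypothetical extract_patches helper from the preprocessing sketch above; the simple [0, 1] scaling, the 0.5 threshold and the one-cell-per-patch mask resolution are illustrative choices:

```python
import numpy as np
import openslide

def build_mask(slide_path, model, level, patch_size=299, threshold=0.5):
    """Classify every tissue patch of a WSI and stitch the results into a coarse binary mask."""
    slide = openslide.OpenSlide(slide_path)
    width, height = slide.level_dimensions[level]
    slide.close()

    # One mask cell per patch; 1 marks a patch predicted to be tumorous
    mask = np.zeros((height // patch_size, width // patch_size), dtype=np.uint8)

    # extract_patches() is the hypothetical helper sketched earlier; it returns the
    # (x, y) grid position at `level` together with the RGB patch.
    for (x, y), patch in extract_patches(slide_path, level):
        batch = np.expand_dims(np.asarray(patch, dtype=np.float32) / 255.0, axis=0)
        prob = float(model.predict(batch, verbose=0)[0, 0])
        mask[y // patch_size, x // patch_size] = 1 if prob >= threshold else 0

    return mask
```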

Here are the outputs and the actual masks for two whole-slide images in our test set. As you can see, the masks output by our model reasonably resemble the actual masks.

Figure 9: Model results on two images in the test set | Image by the Author

Concluding remarks

In this post, we explored how computer vision models can be fine-tuned to detect cancer metastases on gigapixel pathology images. The image below summarizes the workflow for training the model and the inference process used to classify new images.

Figure 10: Summary of the training and inference workflow | Image by the Author

Integrated into the existing workflow of pathologists, this model can act as an assistive tool and can be of high clinical relevance, especially in organizations with limited resources; it can also serve as a first line of defence to diagnose the underlying disease in a timely manner.

Further work needs to be done to assess the impact on real clinical workflows and patient outcomes. Nonetheless, we maintain a positive outlook that carefully validated deep learning technologies, alongside thoughtfully crafted clinical tools, have the potential to improve the precision and accessibility of pathological diagnoses globally.

Do check out the source code on my GitHub: https://github.com/saranggupta94/detecting_cancer_metastasis.

You can find the final results of the CAMELYON competition here: https://jamanetwork.com/journals/jama/article-abstract/2665774
