Using Deep Learning to heal people suffering from cancer

DL is cool

Sometimes we happily use Deep Learning for trivial things like generating faces or turning horses into zebras. But most of the time, it’s a powerful tool that can help save lives.

At INSA Rouen, I worked in a team of students implementing a solution based on an article published by researchers, some of whom were my teachers. The article is called IODA: An input/output deep architecture for image labeling and was written by Julien Lerouge, Romain Herault, Clément Chatelain, Fabrice Jardin and Romain Modzelewski. Image labeling is the act of identifying zones in an image and saying: ‘this zone corresponds to the sky’ or ‘this zone corresponds to a pedestrian’. But what’s fantastic about their work is that it also does image segmentation (it also detects where the boundaries of the zones are).

Example of image segmentation


At this point, you’re probably wondering: it’s cool, but how does it heal people? Or: is it called IODA because of Star Wars?

The problem

Well, it doesn’t really *heal* people per se, but it helps doctors a lot. One of the treatments for cancer is chemotherapy: basically, you put poison into the patient’s body and hope it kills more cancerous cells than healthy cells. It’s brutal, the patient suffers, and sometimes it does more harm than good. You can see why doctors have to be so careful with the doses. The problem is that we don’t really know how much the patient can take. We need to evaluate the patient’s condition to adjust the doses.

One way of doing that is by looking at sarcopenia (the loss of skeletal muscle) and the amount of fat. It’s a good indicator and it’s simple: if a patient has some fat and isn’t losing muscle, they are in good condition.

On a scan, it’s difficult to estimate the amounts of fat and muscle. For now, radiologists spend precious minutes of their time using a coloring tool to fill in zones on the scan, while software computes various indicators from the colorized areas. It’s boring and very slow, and the ratio (time well spent)/(time spent) isn’t very high.

L3 slice: where are the muscles, where are the bones, where is the fat?


Colorized L3 slice (this is what we want to get): subcutaneous fat in green, muscle tissue in pink, visceral fat in red, bowels in grey, bones in white

An L3 slice is the scan slice taken at the level of the third lumbar vertebra. It’s the reference point in the body.


Here comes DL on its white horse

I like to see a deep neural network as a set of bricks with different properties that you snap together like Lego.

The most useful bricks are:

  • the fully-connected layers (or dense layers): the first layers to be invented, they learn a function from a dataset to solve your problem
  • the convolutional layers: they apply increasingly complex filters to your input. This reduces its size (it is boiled down to its most informative features). If your input is an image, you should really consider using them.
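
To make the two bricks a bit more concrete, here is a toy sketch in plain Python. It is only an illustration: the 4x4 "image", the hand-written edge filter and the dense weights are all made up, whereas a real network learns its filters and weights during training and relies on an optimized library.

```python
# Toy versions of the two "bricks", written in plain Python for readability.
# All sizes and values below are made up for illustration.

def dense(x, weights, biases):
    """Fully-connected layer: every output is a weighted sum of ALL inputs."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def conv2d(image, kernel):
    """Convolutional layer (single filter, no padding, stride 1):
    the same small kernel slides over every position of the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A 4x4 "image" with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# A hand-written 3x3 filter that responds to vertical edges.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

feature_map = conv2d(image, edge_kernel)   # 4x4 input -> 2x2 output
flat = [v for row in feature_map for v in row]

# One dense layer with 2 outputs on top of the 4 extracted features.
dense_out = dense(flat, weights=[[0.25] * 4, [-0.25] * 4], biases=[0.0, 1.0])
```

Notice that the convolution reuses the same 9 numbers at every position and that its output is smaller than its input. In real networks, strides or pooling are usually what shrink the image the most.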

Then, Deep Learning is really just cleaning your data (I’m writing that as if it were always easy), sticking layers together to get your model, throwing your training dataset at it and reaping the fruits of your GPU’s hard labor.

Let’s look at the architecture of IODA:

Architecture of IODA


We have 3 elements in the network (from left to right):

  • convolutional layers (in white) that take the scan as input
  • dense layers (in grey) forming an auto-encoder (I’ll explain what that is)
  • and then, layers that are specific to IODA (in dark grey): convolutional layers that can work backward. You feed them a compressed representation of an image and they output the image.

The auto-encoder

An auto-encoder is a technique used to learn a code that links two objects.
It has 3 parts:

  • the encoder (A), a neural network that takes the input and transforms it into an intermediate form
  • the intermediate form (d), which most of the time is a vector
  • the decoder (B), a neural network that takes the intermediate form and tries to recreate the input of A.

To use it, you give your input x to A, which produces an intermediate form d that is given to B, and then B tries to recreate the input (let’s call the output of B x’).
When you are training your neural network, you penalize the difference between x and x’ so that, step after step, the auto-encoder gets better at encoding and decoding without losing information.
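
The training loop described above can be sketched in a few lines of plain Python. This is a deliberately tiny stand-in, not the IODA model: A and B are single linear maps, the dataset is four made-up 2-D points, and the learning rate and number of steps are arbitrary choices.

```python
# Minimal linear auto-encoder in plain Python (no deep learning library).
# The data, the sizes, the learning rate and the number of steps are all
# illustrative assumptions.

# Toy dataset: 2-D points lying on the line x2 = 2 * x1, so a single
# number is enough to describe each point.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

a = [0.5, 0.5]   # encoder A: d = a[0] * x[0] + a[1] * x[1]
b = [0.5, 0.5]   # decoder B: x'[i] = b[i] * d

def encode(x):
    return a[0] * x[0] + a[1] * x[1]

def decode(d):
    return [b[0] * d, b[1] * d]

def reconstruction_loss():
    """Mean squared difference between x and x' over the dataset."""
    total = 0.0
    for x in data:
        x_rec = decode(encode(x))
        total += sum((xr - xi) ** 2 for xr, xi in zip(x_rec, x))
    return total / len(data)

learning_rate = 0.05
losses = [reconstruction_loss()]

for step in range(2000):
    grad_a = [0.0, 0.0]
    grad_b = [0.0, 0.0]
    for x in data:
        d = encode(x)
        x_rec = decode(d)
        err = [x_rec[i] - x[i] for i in range(2)]      # x' - x
        back = sum(err[i] * b[i] for i in range(2))    # error sent back to d
        for i in range(2):
            grad_b[i] += 2.0 * err[i] * d / len(data)
            grad_a[i] += 2.0 * back * x[i] / len(data)
    # Gradient descent: move the weights against the gradient of the loss.
    for i in range(2):
        a[i] -= learning_rate * grad_a[i]
        b[i] -= learning_rate * grad_b[i]
    losses.append(reconstruction_loss())
```

The list losses records the penalty after each step, so comparing losses[0] with losses[-1] shows how much better the pair (A, B) has become at rebuilding x from d.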

And what is even better is that x and x’ don’t have to be identical! For example, you can decide that x is a picture showing a horse and that x’ is the same picture but with the horse replaced by a zebra. For more information, I invite you to read this very good article.

Principle of an auto-encoder


So, when we feed IODA with scans on the A side and segmented (colorized) scans on the B side, the auto-encoder learns to create a colorized scan from a non-colorized one. It’s that simple (well, in fact it’s a bit more complicated, but maybe I’ll write another article about it). Once the radiologist has the colorized scan, they can feed it to software that computes the sarcopenia indicators from the areas of the colorized zones. With those indicators, the doctor can evaluate the patient’s condition and choose a more appropriate dosage.
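
To give an idea of what "computing indicators from the areas" can look like, here is a small sketch that measures the area of each zone in a segmented slice. The label codes, the 4x4 label map and the pixel size are hypothetical, and the actual indicators used by the clinical software are not described here.

```python
from collections import Counter

# Hypothetical label codes for a segmented slice (not taken from the paper).
LABELS = {0: "background", 1: "subcutaneous fat", 2: "muscle", 3: "visceral fat"}

def areas_by_label(label_map, pixel_area_mm2):
    """Count the pixels of each label and convert the counts to areas in mm^2."""
    counts = Counter(label for row in label_map for label in row)
    return {LABELS[label]: n * pixel_area_mm2 for label, n in counts.items()}

# Tiny made-up 4x4 segmentation; a real slice would be far larger.
segmented = [[0, 1, 1, 0],
             [1, 2, 2, 1],
             [1, 2, 3, 1],
             [0, 1, 1, 0]]

# Assumed pixel size of 1 mm x 1 mm, i.e. 1 mm^2 per pixel.
areas = areas_by_label(segmented, pixel_area_mm2=1.0)
```

Here areas["muscle"] is the muscle area of the toy slice. With a real segmentation, this is the kind of number that no longer has to be obtained by coloring the scan by hand.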

You might be wondering: but then, what are the other layers for?

Well, the dense layers in the auto-encoder work very badly when the input is large. The convolutional layers reduce the size of the input to solve that.
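
A quick back-of-the-envelope count shows why. The sizes below (a 512x512 grayscale image, 1000 dense units, 32 filters of 3x3, a 16x16 map after the convolutional part) are assumptions picked for illustration, not values from the IODA article.

```python
# Back-of-the-envelope parameter counts. Every size here is an assumption
# chosen for illustration, not a value from the IODA paper.

def dense_params(n_inputs, n_units):
    """One weight per (input, unit) pair, plus one bias per unit."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter has one small kernel per input channel, plus one bias."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

# Dense layer of 1000 units plugged directly on a 512x512 grayscale image.
direct = dense_params(512 * 512, 1000)

# A 3x3 convolution with 32 filters on the same image.
conv = conv_params(3, 3, 1, 32)

# Same dense layer, but after the convolutional part has (hypothetically)
# reduced the image to a 16x16 map with 32 channels.
after_conv = dense_params(16 * 16 * 32, 1000)
```

With these made-up sizes, the dense layer placed directly on the image needs about 262 million parameters, the convolution only 320, and the dense layer placed after the reduction about 8 million.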

This is just a tiny part of the published article, and I’d be happy to tell you more about it in a future blog post.

By Denis Vivies