[MUSIC PLAYING] LILY PENG: Hi everybody. My name is Lily Peng. I'm a physician by training and I work on the Google medical-- well, Google AI health-care team. I am a product manager. And today we're going to talk to you about a couple of projects that we have been working on in our group. So first off, I think you'll get a lot of this, so I'm not going to go over this too much. But because we apply deep learning to medical information, I kind of wanted to just define a few terms that get used quite a bit but are somewhat poorly defined. So first off, artificial intelligence-- this is a pretty broad term and it encompasses that grand project to build a nonhuman intelligence. Machine learning is a particular type of artificial intelligence, I suppose, that teaches machines to be smarter. And deep learning is a particular type of machine learning which you guys have probably heard about quite a bit and will hear about quite a bit more. So first of all, what is deep learning? So it's a modern reincarnation of artificial neural networks, which were actually invented in the 1960s. It's a collection of simple trainable units, organized in layers. And they work together to solve or model complicated tasks. So in general, with smaller data sets and limited compute, which is what we had in the 1980s and '90s, other approaches generally work better. But with larger data sets and larger model sizes and more compute power, we find that neural networks work much better. So there's actually just two takeaways that I want you guys to get from this slide. One is that deep learning trains algorithms that are very accurate when given enough data. And two, that deep learning can do this without feature engineering. And that means without explicitly writing the rules. So what do I mean by that? Well in traditional computer vision, we spend a lot of time writing the rules that a machine should follow to perform a certain prediction task. In convolutional neural networks, we actually spend very little time in feature engineering and writing these rules. Most of the time we spend in data preparation and numerical optimization and model architecture. So I get this question quite a bit. And the question is, how much data is enough data for a deep neural network? Well in general, more is better. But there are diminishing returns beyond a certain point. And a general rule of thumb is that we like to have about 5,000 positives per class. But the key thing is good and relevant data-- so garbage in, garbage out. The model will predict very well what you ask it to predict. So when you think about where machine learning, and especially deep learning, can make the biggest impact, it's really in places where there's lots of data to look through. One of our directors, Greg Corrado, puts it best. Deep learning is really good for tasks that you've done 10,000 times, and on the 10,001st time, you're just sick of it and you don't want to do it anymore. So this is really great for health care in screening applications where you see a lot of patients that are potentially normal. It's also great where expertise is limited. So here on the right you see a graph of the shortage of radiologists kind of worldwide. And this is also true for other medical specialties, but radiologists are sort of here. And we basically see a worldwide shortage of medical expertise. So one of the screening applications that our group has worked on is with diabetic retinopathy.
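To make the "no feature engineering" point above concrete, here is a minimal, illustrative sketch of a small convolutional network trained directly on raw pixels, with no hand-written rules. It assumes TensorFlow/Keras; the image size, five-class setup, and random data are hypothetical placeholders, not anything from the talk.

```python
# Minimal sketch: a small convolutional network trained directly on pixels,
# with no hand-written feature rules. The data here is random and purely
# illustrative; shapes and class count are hypothetical.
import numpy as np
import tensorflow as tf

num_classes = 5
images = np.random.rand(32, 64, 64, 3).astype("float32")  # fake image batch
labels = np.random.randint(0, num_classes, size=(32,))     # fake labels

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(images, labels, epochs=1)  # the network learns features from pixels
```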
We call it DR because it's easier to say than diabetic retinopathy. And it's the fastest growing cause of preventable blindness. All 450 million people with diabetes are at risk and need to be screened once a year. This is done by taking a picture of the back of the eye with a special camera, as you see here. And the picture looks a little bit like that. And so what a doctor does when they get an image like this is they grade it on a scale of one to five from no disease, so healthy, to proliferative disease, which is the end stage. And when they do grading, they look for sometimes very subtle findings, little things called microaneurysms, which are outpouchings in the blood vessels of the eye. And that indicates how badly your diabetes is affecting your vision. So unfortunately in many parts of the world, there are just not enough eye doctors to do this task. So with one of our partners in India, or actually a couple of our partners in India, there is a shortage of 127,000 eye doctors in the nation. And as a result, about 45% of patients suffer some sort of vision loss before the disease is detected. Now as you recall, I said that this disease was completely preventable. So again, this is something that should not be happening. So what we decided to do was we partnered with a couple of hospitals in India, as well as a screening provider in the US. And we got about 130,000 images for this first go around. We hired 54 ophthalmologists and built a labeling tool. And then the 54 ophthalmologists actually graded these images on this scale, from no DR to proliferative. The interesting thing was that there was actually a little bit of variability in how doctors call the images. And so we actually got about 880,000 diagnoses in all. And with this labeled data set, we put it through a fairly well known convolutional neural net. This is called Inception. I think a lot of you guys may be familiar with it. It's generally used to classify cats and dogs for our photo app or for some other search apps. And we just repurposed it to do fundus images. So the other thing that we learned while we were doing this work was that while it was really useful to have this five-point diagnosis, it was also incredibly useful to give doctors feedback on housekeeping predictions like image quality, whether this is a left or right eye, or which part of the retina this is. So we added that to the network as well. So how well does it do? So this is the first version of our model that we published in a medical journal in 2016, I believe. And right here on the left is a chart of the performance of the model in aggregate over about 10,000 images. Sensitivity is on the y-axis, and then 1 minus specificity is on the x-axis. So sensitivity is the percentage of the time when a patient has the disease and you get that right-- when the model calls the disease. And then specificity is the proportion of patients that don't have the disease that the model or the doctor got right. And you can see you want something with high sensitivity and high specificity. And so up and to the right-- or up and to the left is good. And you can see here on the chart that the little dots are the doctors that were grading the same set. So we get pretty close to the doctor. And these are board-certified US physicians. And these are ophthalmologists, general ophthalmologists by training.
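As a rough illustration of repurposing an Inception-style classifier for fundus images, the sketch below adds a five-class DR-grade head plus auxiliary "housekeeping" heads for image quality and laterality, as described above. It assumes TensorFlow/Keras; the head names, sizes, and losses are assumptions for illustration, not the published architecture.

```python
# Hedged sketch: an Inception backbone with one head for the five-point DR
# grade and auxiliary heads for image quality and left/right eye.
# Head names and sizes are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(
    include_top=False, weights=None,          # weights="imagenet" would also work
    input_shape=(299, 299, 3), pooling="avg")

features = base.output
dr_grade = tf.keras.layers.Dense(5, activation="softmax", name="dr_grade")(features)
quality = tf.keras.layers.Dense(1, activation="sigmoid", name="image_quality")(features)
laterality = tf.keras.layers.Dense(1, activation="sigmoid", name="left_or_right")(features)

model = tf.keras.Model(inputs=base.input,
                       outputs=[dr_grade, quality, laterality])
model.compile(
    optimizer="adam",
    loss={"dr_grade": "sparse_categorical_crossentropy",
          "image_quality": "binary_crossentropy",
          "left_or_right": "binary_crossentropy"})
model.summary()
```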
In fact if you look at the F score, which is a combined measure of sensitivity and precision, we're just a little better than the median ophthalmologist in this particular study. So since then we've improved the model. So last year, around December 2016, we were sort of on par with generalists. And then this year-- this is a new paper that we published-- we actually used retinal specialists to grade the images. So they're specialists. We also had them argue when they disagreed about what the diagnosis was. And you can see when we train the model using that as the ground truth, the model predicted that quite well as well. So this year we're sort of on par with the retina specialists. And this weighted kappa thing is just agreement on the five-class level. And you can see that, essentially, we're sort of in between the ophthalmologists and the retina specialists, in fact kind of in between the retinal specialists. Another thing that we've been working on beyond improving the models is actually trying to have the network explain how it's making a prediction. So again, taking a play out of the playbook from the consumer world, we started using this technique called show me where. And this is where, using an image, we actually generate a heat map of where the relevant pixels are for this particular prediction. So here you can see a picture of a Pomeranian. And the heat map shows you that there is something in the face of the Pomeranian that makes it look Pomeranian-y. And on the right here, you kind of have an Afghan hound, and the network's highlighting the Afghan hound. So using this very similar technique, we applied it to the fundus images and we said, show me where. So this is a case of mild disease. And I can tell it's mild disease because-- well, it looks completely normal to me. I can't tell that there is any disease there. But a highly trained doctor would be able to pick out little things called microaneurysms where the green spots are. Here's a picture of moderate disease. And this is a little worse because you can see some bleeding at the ends here. And actually, I don't know if I can point it out, but there's bleeding there. And the heat map-- so here's a heat map. You can see that it picks up the bleeding. But there are two artifacts in this image. So there is a dust spot, just like a little dark spot. And then there is this little reflection in the middle of the image. And you can tell that the model just ignores them, essentially. So what's next? We trained a model. We showed that it's somewhat explainable. We think it's doing the right thing. What's next? Well, we actually have to deploy this into health-care systems. And we're partnering with health-care providers and companies to bring this to patients. And actually Dr. Jess Mega, who is going to speak after me, is going to have a few more details about this effort. So I've given the screening application. And here's an application in diagnosis that we're working on. So in this particular example, we're talking about a disease-- well, we're talking about breast cancer, but we're talking about metastases of breast cancer into nearby lymph nodes. So when a patient is diagnosed with breast cancer and the primary breast cancer is removed, the surgeon spends some time taking out what we call lymph nodes so that we can examine them to see whether or not the breast cancer has metastasized to those nodes. And that has an impact on how you treat the patient. So reading these lymph nodes is actually not an easy task.
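The talk does not name the exact "show me where" technique, so the sketch below uses plain input gradients as a stand-in: a heat map of how strongly each pixel influences the predicted class. It assumes a single-output TensorFlow/Keras classifier; the tiny demo model exists only so the example runs end to end.

```python
# Hedged sketch of a gradient-based "show me where" heat map.
import numpy as np
import tensorflow as tf

def saliency_map(model, image):
    """Return an (H, W) heat map of |d score / d pixel| for the top class."""
    x = tf.convert_to_tensor(image[None, ...])       # add a batch dimension
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x)                             # shape (1, num_classes)
        score = tf.reduce_max(preds[0])              # score of the top class
    grads = tape.gradient(score, x)                  # d score / d pixel
    heat = tf.reduce_max(tf.abs(grads[0]), axis=-1)  # collapse color channels
    return (heat / (tf.reduce_max(heat) + 1e-8)).numpy()

# Tiny stand-in classifier so the sketch runs; a real fundus model would go here.
demo = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
heat = saliency_map(demo, np.random.rand(64, 64, 3).astype("float32"))
print(heat.shape)  # (64, 64): one relevance value per input pixel
```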
And in fact, in about 24% of biopsies, when they went back to look at them, there was a change in nodal status. Which means that if it was positive, it was read as negative, and if it was negative, it was read as positive. So that's a really big deal. It's one in four. The interesting thing is that there was another study published that showed that a pathologist with unlimited time, not overwhelmed with data, actually is quite sensitive, so 94% sensitivity in finding the tumors. When you put a time constraint on the patient-- or sorry, on the provider, on the pathologist-- the sensitivity drops. And people will start overlooking where little metastases may be. So in this picture there's a tiny metastasis right there. And it's usually small things like this that are missed. And this is not surprising given that so much information is in each slide. So one of these slides, if digitized, is about 10 gigapixels. And that's literally a needle in a haystack. The interesting thing is that pathologists can actually find 73% of the cancers if they spend all their time looking for them, with zero false positives per slide. So we trained a model that can help with this task. It actually finds about 95% of the cancer lesions and it has eight false positives per slide. So clearly an ideal system is one that is very sensitive, using the model, but also quite specific, relying on the pathologist to actually look over the false positives and call them out as false positives. So this is very promising and we're working on validation in the clinic right now. In terms of reader studies, how this actually interacts with the doctor is really quite important. And clearly there are applications to other tissues. I talked about lymph nodes, but we have some early studies that actually show that this works for prostate cancer, as well, for Gleason grading. So in the previous examples we talked about how deep learning can produce algorithms that are very accurate. And they tend to make calls that a doctor might already make. But what about predicting things that doctors don't currently do from imaging? So as you recall from the beginning of the talk, one of the great things about deep learning is that you can train very accurate algorithms without explicitly writing rules. So this allows us to make completely new discoveries. So the picture on the left is from a paper that we published recently where we trained deep-learning models to predict a variety of cardiovascular risk factors. And that includes age, self-reported sex, smoking status, blood pressure, things that doctors generally consider right now to assess the patient's cardiovascular risk and make proper treatment recommendations. So it turns out that we can not only predict many of these factors, and quite accurately, but we can actually directly predict a five-year risk of a cardiac event. So this work is quite early, really preliminary, and the AUC for this prediction is 0.7. What that number means is that if given two pictures, one picture of a patient that did not have a cardiovascular event and one picture of a patient who did, it is right about 70% of the time. Most doctors are around 50% of the time, because it's kind of random-- like it's hard to do based on a retinal image alone. So why is this exciting? Well normally when a doctor tries to assess your risk for cardiovascular disease, there are needles involved. So I don't know if anyone has gotten blood cholesterol screening.
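The AUC interpretation given above (an AUC of 0.7 means the model ranks a patient who had an event above one who did not about 70% of the time) can be written out directly as a pairwise comparison. The scores below are made-up numbers purely to illustrate that definition.

```python
# Toy illustration of AUC as the probability of correctly ranking a random
# event/no-event pair. The scores are fabricated for illustration only.
import itertools

event_scores    = [0.8, 0.6, 0.55]   # hypothetical model scores: had an event
no_event_scores = [0.3, 0.5, 0.65]   # hypothetical model scores: no event

pairs = list(itertools.product(event_scores, no_event_scores))
correct = sum(1.0 if e > n else 0.5 if e == n else 0.0 for e, n in pairs)
auc = correct / len(pairs)
print(f"Pairwise AUC on these toy scores: {auc:.2f}")  # about 0.78 here
```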
You fast the night before and then we take some blood samples and then we assess your risk. So again, I want to emphasize that this is really early on. But these results support the idea that we may be able to use something like an image to make new predictions that we couldn't make before. And this might be done in sort of a noninvasive manner. So I've given a few examples, three examples of how deep learning can really increase both availability and accuracy in health care. And one of the things that I want to also acknowledge here is that the reason this has become more and more exciting is, I think, because TensorFlow is open source. So this kind of open standard for general machine learning is being applied everywhere. So I've given examples of work that we've done at Google, but there's a lot of work that's being done across the community at other medical centers that is very similar. And so we're really excited about what this technology can bring to the field of health care. And with that, I'd like to introduce Jess Mega. Unlike me, she is a real doctor. And she's the chief medical officer at Verily. JESSICA MEGA: Well thank you all for being here. And thank you Lily for kicking us off. I think the excitement around AI and health care could not be greater. As you heard, my name is Jess Mega. I'm a cardiologist and am so excited to be part of the Alphabet family. Verily grew out of Google and Google X. And we are focused solely on health care and life sciences. And our mission is to take the world's health information and make it useful so that patients live healthier lives. And the example that I'll talk about today focuses on diabetes and really lends itself to the conversation that Lily started. But I think it's very important to pause and think about health data broadly. Right now, any individual who's in the audience today has several gigabytes of health data. But if you think about health in the years to come and think about genomics, molecular technologies, imaging, sensor data, patient-reported data, electronic health records and claims, we're talking about huge sums of data. And at Verily and at Alphabet, we're committed to staying ahead of this so that we can help patients. The reason we're initially focusing some of our efforts on diabetes is that this is an urgent health issue. About 1 in 10 people has diabetes. And when you have diabetes, it affects how you handle sugar, or glucose, in the body. And if you think about prediabetes, the condition before someone has diabetes, that's one in three people. That would be the entire center section of the audience today. Now, when your body handles glucose in a different way, you can have downstream effects. You heard Lily talk about diabetic retinopathy. People can have problems with their heart and kidneys, and peripheral neuropathy. So this is the type of disease that we need to get ahead of. But we have two main issues that we're trying to address. The first one is an information gap. So even the most adherent patient with diabetes-- and my grandfather was one of these-- would check his blood sugar four times a day. And I don't know if anyone today has been able to have any of the snacks. I actually had some of the caramel popcorn. Did anyone have any of that? Yeah, that was great, right? Except probably our biology and our glucose are going up and down. So if I didn't check my glucose in that moment, we wouldn't have captured that data.
So we know biology is happening all of the time. When I see patients in the hospital as a cardiologist, I can see someone's heart rate, their blood pressure, all of these vital signs in real time. And then people go home, but biology is still happening. So there's an information gap, especially with diabetes. The second issue is a decision gap. You may see a care provider once a year, twice a year, but health decisions are happening every single day. They're happening weekly, daily, hourly. And how do we decide to close this gap? At Verily we're focusing on three key missions. And this can be true for almost every project we take on. We're thinking about how to shift from episodic and reactive care to much more proactive care. And in order to do that and to get to the point where we can really use the power of AI, we have to do three things. We have to think about collecting the right data. And today I'll be talking about continuous glucose monitoring. How do you then organize this data so that it's in a format that we can unlock and activate and truly help patients? So whether we do this in the field of diabetes that you'll hear about today or with our surgical robots, this is the general premise. The first thing to think about is the collection of data. And you heard Lily say garbage in, garbage out. We can't look for insights unless we understand what we're looking at. And one thing that has been absolutely revolutionary is thinking about extremely small biocompatible electronics. So we are working on next-generation sensing. And you can see a demonstration here. What this will lead to, for example, with extremely small continuous glucose monitors-- where we're partnering to create some of these tools-- is more-seamless integration. So again, you don't just have a few glucose values; we understand how your body-- or the body of someone with type 2 diabetes-- is handling sugar in a more continuous fashion. It also helps us understand not only what happens at a population level but what might happen on an individual level when you are ingesting certain foods. And the final thing is to really try to reduce the cost of devices so that we can really democratize health. The next aim is, how do we organize all of this data? And I can speak both as a patient and as a physician. The thing that people will say is, data's amazing, but please don't overwhelm us with a tsunami of data. You need to organize it. And so we've partnered with Sanofi on a company called Onduo. And the idea is to put the patient in the center of their care and help simplify diabetes management. This really gets to the heart of someone who is going to be happier and healthier. So what does it actually mean? What we try to do is empower people with their glucose control. So we turned to the American Diabetes Association and looked at the glucose ranges that are recommended. People then get a graph that shows what their day looks like and the percentage of time that they are in range-- again, giving a patient or a user that data so they can be the center of their decisions-- and then finally tracking steps through Google Fit. The next goal then is to try to understand how glucose is pairing with your activity and your diet. So here there's an app that prompts for a photo of the food. And then using image recognition and using Google's TensorFlow, we can identify the food. And this is where the true personal insights start to become real.
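The "percentage of time in range" idea mentioned above reduces to a simple calculation over continuous glucose readings. Below is a minimal sketch assuming a commonly cited 70-180 mg/dL target band and made-up readings; it is not Onduo's actual logic.

```python
# Hedged sketch: fraction of CGM readings inside a target glucose range.
# The 70-180 mg/dL band and the sample day are illustrative assumptions.
def time_in_range(readings_mg_dl, low=70, high=180):
    """Fraction of CGM readings inside [low, high]."""
    in_range = [low <= g <= high for g in readings_mg_dl]
    return sum(in_range) / len(in_range)

# A made-up day of readings, one every couple of hours:
day = [95, 110, 150, 190, 210, 175, 140, 120, 100, 85, 160, 185]
print(f"Time in range: {time_in_range(day):.0%}")  # 75% for this toy day
```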
Because if you eat a certain meal, it's helpful to understand how your body ends up relating to it. And there's some really interesting preliminary data suggesting that the microbiome may change the way I respond to a banana, for example, versus the way you might respond. And that's important to know because all of a sudden those general recommendations that we make as a doc-- so if someone comes to see me in clinic and they have type 2 diabetes I might say, OK, here are the things you need to do. You need to watch your diet, exercise, take your oral medications. I need you to also take insulin, exercise. You've got to see your foot doctor, your eye doctor, your primary-care doctor, and the endocrinologist. And that's a lot to integrate. And so what we try to do is also pair all of this information in a simple way with a care lead. This is a person that helps someone on their journey as this information is surfaced. And if you look in the middle of what I'm showing you here-- what the care lead and what the person are seeing-- you'll see a number of different lines. And I want us to drill down and look into that. This is showing you the difference between the data you might see in an episodic glucose example and what you're seeing with the continuous glucose monitor enabled by this new sensing. And so let's say we drill down into this continuous glucose monitor and we look at a cluster of days. This is an example. We might start to see patterns. And as Lily mentioned, this is not the type of thing that an individual patient, care lead, or physician would end up digging through, but this is where you start to unlock the power of machine-learning models. Because what we can start to see is a cluster of different mornings. We'll make a positive assumption that everyone's eating incredibly healthy here at Google I/O, so maybe that's a cluster of the red mornings. But we go back into our regular lives and we get stressed and we're eating a different cluster of foods. But instead of, again, giving general advice, we can use different models to point out, it seems like something is going on. With one patient, for example, we were seeing a cluster around Wednesdays. So what's going on on Wednesdays? Is it that the person is stopping by a particular location, or maybe there's a lot of stress that day? But again, instead of giving general care, we can start to target care in the most comprehensive and actionable way. So again, thinking about what we're talking about: collecting data, organizing it, and then activating it and making it extremely relevant. So that is the way we're thinking about diabetes care, and that is the way AI is going to work. We heard this morning in another discussion, we've got to think about the problems that we're going to solve and use these tools to really make a difference. So what are some other ways that we can think about activating information? And we heard from Lily that diabetic retinopathy is one of the leading causes of blindness. So even if we have excellent glucose care, there may be times where you start to have end-organ damage. And I had mentioned that elevated glucose levels can end up affecting the fundus and the retina. Now we know that people with diabetes should undergo screening. But earlier in the talk I gave you the laundry list of what we're asking patients who have diabetes to do.
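The cluster-of-days idea described above (for example, a group of similar Wednesday mornings) can be illustrated with ordinary clustering over daily glucose traces. This is a toy sketch with synthetic data and k-means; it is an assumption for illustration, not the actual pipeline behind the product.

```python
# Hedged sketch: cluster days of CGM data to surface recurring patterns.
# Each row is one (synthetic) day's hourly glucose trace.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
calm  = rng.normal(110, 10, size=(7, 24))   # seven fake "calm" days
spiky = rng.normal(150, 30, size=(7, 24))   # seven fake "spiky" days
days = np.vstack([calm, spiky])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(days)
print("Cluster assignment per day:", labels)
# Days landing in the same cluster can then be inspected together
# (for example: do they all fall on the same weekday?).
```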
And so what we're trying to do with this collaboration with Google is figure out how we actually get ahead of the problem and think about an end-to-end solution, so that we realize and bring down the challenges that exist today. Because with getting screened, one issue is accessibility, and the other is having access to optometrists and ophthalmologists. And this is a problem in the United States as well as in the developing world. So this is not just a local problem. This is something that we think very globally about when we think about the solution. We looked at this data earlier and this idea that we can take algorithms and increase both the sensitivity and specificity of diagnosing diabetic retinopathy and macular edema. And this is data that was published in "JAMA" as Lily nicely outlined. The question then is, how do we think about creating this product? Because the beauty of working at places like Alphabet and working with partners like you all here today is we can think about what problem we are solving and create the algorithms. But we then need to step back and say, what does it mean to operate in the space of health care and in the space of life science? We need to think about the image acquisition, the algorithm, and then delivering that information both to physicians as well as patients. So what we're doing is taking this information and now working with some of our partners. There's a promising pilot that's currently ongoing both here as well as in India, and we're so encouraged to hear the early feedback. And there are two pieces of information I wanted to share with you. One is that, looking at these early observations, we're seeing higher accuracy with AI than with a manual grader. And the thing that's important as a physician-- I don't know if there are any other doctors in the room, but the piece I always tell people is there's going to be room for health-care providers. What these tools are doing is merely helping us do our job. So sometimes people ask me, are technology and AI going to replace physicians or replace the health-care system? And the way I think about it is, it just augments the work we do. If you think about the stethoscope-- so I'm a cardiologist, and the stethoscope was invented about 200 years ago. It doesn't replace the work we do. It merely augments the work we do. And I think you're going to see a similar theme as we continue to think about ways of bringing care in a more effective way to patients. So the first thing here is that the AI was performing better than the manual grader. And then the second thing is to think about that base of patients. How do we truly democratize care? And so the other encouraging piece from the pilot was this idea that we could start to increase the base of patients treated with the algorithm. Now as it turns out, I would love to say that it's really easy to do everything in health care and life science. But as it turns out, it takes a huge village to do this kind of work. So what's next? What is on the path to clinical adoption? And this is what makes it incredibly exciting to be a doctor working with so many talented technologists and engineers. We need to now partner with different clinical sites that I noted here. We also partner deeply with the FDA, as well as regulatory agencies in Europe and beyond. And one thing at Verily that we've decided to do is to be part of what's called the FDA precertification program.
We know that bringing new technologies and new algorithms into health care is critical, but we now need to figure out how to do that in a way that's both safe and effective. And I'm proud of us at Alphabet for really staying ahead of that and partnering with groups like the FDA. The second thing that's important to note is that we partner deeply at Verily with Google as well as other partners like Nikon and Optos. All of these pieces come together to try to transform care. But I know that if we do this correctly, there's a huge opportunity not only in diabetes but really in this entire world of health information. It's interesting to think about, as a physician who spends most of my time taking care of patients in the hospital: how can we start to push more of the access to care outside of the hospital? But I know that if we do this well and if we stay ahead of it, we can close this gap. We can figure out ways to become more preventative. We can collect the right information. We can create the infrastructure to organize it. And most importantly, we will figure out how to activate it. But I want everyone to know here, this is not the type of work that we can do alone. It really takes all of us together. And we at Verily, we at Google, and we at Alphabet look forward to partnering with all of you. So please help us on this journey. Lily and I will be here after these talks. We're happy to chat with all of you. And thank you for spending time at I/O. [MUSIC PLAYING]