Modernizing Cancer Care with Machine Learning

Computer science is a growing field in medical research worldwide. As technology advances and new machine learning capabilities are developed, their use in health care becomes more common. Researchers at Dartmouth-Hitchcock’s Norris Cotton Cancer Center continue to be at the forefront of machine learning advancements. Saeed Hassanpour, PhD, a member of the Cancer Population Science Research Program at Norris Cotton Cancer Center, is also an associate professor of Biomedical Data Science and of Epidemiology at Dartmouth’s Geisel School of Medicine, and an associate professor of Computer Science at Dartmouth’s College of Arts and Sciences. Hassanpour and his team work closely with clinicians to develop new machine learning technologies that can serve as valuable clinical tools toward the shared goal of improving patient care.

What are machine learning and artificial intelligence?

In very general terms, artificial intelligence (AI) is a system or program that interacts with its environment and takes optimal actions to reach a predetermined goal. Machine learning is a branch of AI that encompasses almost all new advancements and applications. “There are many different approaches to building an artificial intelligence system,” explains Hassanpour. “One of them is to tell the machine what to do in every situation. But as soon as the tasks get too complex, this type of system will break down. Machine learning is a data-driven approach that learns from the data and tries to find indicative patterns in it. As a result, it’s very generalizable. As long as we have high-quality data that’s representative of a certain condition, we can apply these models to identify the features that indicate that condition. The results can then be used to determine a diagnosis and treatment plan.”

Dr. Hassanpour, a computer and data science engineer who worked for Microsoft as a researcher before returning to academic medicine, studies how to apply new advances in AI methodology to the biomedical setting. For example, facial recognition technology uses the same principles used for processing pathology or radiology images, including detection, segmentation, and classification. “We’re collaborating with clinicians to identify pressing clinical needs—tasks that are burdensome or tricky such as reading certain pathology slides or determining certain classifications. We use their clinical insight to customize machine learning approaches to help find a resolution,” says Hassanpour.

Problems resolved

Hassanpour and his team recently published two of their clinical studies, each of which led to a successfully developed and validated new technology.

In one study, they created a machine learning approach to identify the risk that atypical ductal hyperplasia (ADH) breast lesions may upgrade to cancer. ADH is a breast lesion associated with a four- to five-fold increase in the risk of breast cancer. It is primarily found using mammography and identified on core needle biopsy. Currently, surgical removal is recommended for all ADH cases to determine if the lesion is cancerous. The team’s machine learning approach can identify 98 percent of all malignant cases prior to surgery. “Our results suggest there are robust clinical differences between women at low versus high risk for ADH upgrade to cancer based on core needle biopsy data that allowed our machine learning model to reliably predict malignancy upgrades,” he says.
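A figure such as "98 percent of all malignant cases" reads as a sensitivity: the share of truly malignant cases that the model flags. The sketch below shows how that metric is computed. The function name `sensitivity` and the labels are invented for illustration and are not the study's code or data.

```python
def sensitivity(y_true, y_pred):
    """Fraction of truly positive (malignant) cases the model flagged as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in y_true")
    return true_pos / (true_pos + false_neg)


# Illustrative labels only: 1 = upgraded to cancer, 0 = did not upgrade.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(sensitivity(y_true, y_pred))  # 3 of 4 malignant cases caught -> 0.75
```

Sensitivity says nothing about false alarms on benign cases, so it is normally reported alongside a measure such as specificity.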

Another challenge is that determining lung adenocarcinoma tumor patterns and subtypes requires a pathologist’s visual examination of lobectomy slides. This classification helps to determine prognosis and treatment for lung cancer. However, Hassanpour explains that it’s a difficult and subjective task. His team developed a deep neural network to classify different types of lung adenocarcinoma on histopathology slides, and found that the model performed on par with three practicing pathologists. “Clinical implementation of our system would be able to assist pathologists with accurate classification of lung cancer subtypes,” predicts Hassanpour. “Our machine learning method is also fast and can process a slide in less than one minute, so it could help triage patients before examination by physicians and greatly assist pathologists in the visual examination of slides.”

AI in the clinic

The technology may be available, but introducing it into clinical practice presents its own challenges. “A big question in the field is, are these models generalizable to data coming from different data sources? What if you apply it to someone from a different environment or somewhere where they have totally different hardware or different preparation of the slides, or maybe a different cohort (different countries or ethnicities)? Is the model generalizable?” asks Hassanpour.

There is an ongoing clinical trial to validate accuracy and usefulness on a larger scale at different institutions. “We’re developing an AI model focused on the automatic classification of histology slides from polyps in colorectal screening,” notes Hassanpour. “In collaboration with my colleagues in pathology, epidemiology and the NH Colonoscopy Registry, we’re getting access to data from multiple institutions in different states and have been able to show that not only is our model accurate at our own institution but accurate with data from other organizations. It’s something others have raised questions about, and we’re one of the first groups to have access to the appropriate datasets to address this. We are excited to publish these findings soon.”
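One simple way to probe this kind of generalizability is to report accuracy separately for each contributing institution instead of pooling all cases. The sketch below assumes each case carries a site tag. The site names, class labels, and the helper `accuracy_by_site` are hypothetical and do not describe the team's actual evaluation.

```python
from collections import defaultdict


def accuracy_by_site(records):
    """Compute accuracy per site from (site, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for site, truth, pred in records:
        total[site] += 1
        correct[site] += int(truth == pred)
    return {site: correct[site] / total[site] for site in total}


# Invented example cases; "internal" and "external_a" are placeholder site names.
records = [
    ("internal", "class_a", "class_a"),
    ("internal", "class_b", "class_b"),
    ("external_a", "class_a", "class_a"),
    ("external_a", "class_b", "class_a"),
]
print(accuracy_by_site(records))
```

A large gap between the internal and external figures would suggest the model has adapted to site-specific factors such as scanner hardware or slide preparation.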

The team’s AI tool could make colorectal screening options more accessible to rural populations who don’t have access to highly specialized experts. It could also reduce the burden that large volumes of screening have on pathologists. The team is also partnering with global oncology specialists to introduce this technology to underdeveloped countries that have very limited personnel resources.

The use of AI is becoming more prominent in medical image analysis, especially in fields that use visual and image data, such as dermatology, ophthalmology, pathology, and particularly radiology. In pathology, physical specimens are still mostly looked at under the microscope. “While not widely digitized, here at the Department of Pathology and Laboratory Medicine at Dartmouth-Hitchcock, we have a very active digitizing workflow ahead of its peers. It’s one of our strengths in addition to having a highly collaborative clinical environment,” notes Hassanpour. The digitizing workflow gave his team the unique opportunity to access sufficient data for development and evaluation of their results.

Gaining trust and changing views

Hassanpour’s team recognizes the critical importance of building models that can be explained to clinicians, so that both the result a model generates and the components that influenced its decision can be reviewed and confirmed by the clinician and explained to the patient. “If we want to use these models in practice, they need to be accessible and gain the trust of the clinicians and also patients,” says Hassanpour. “Before iPhone, there were technologists that tried to make handheld devices that had good technology but were not easy to use so they never took off. Once they built a very nice user interface, suddenly a lot of people subscribed to the idea of having that in their pocket. There are a lot of tech-savvy doctors who are willing to try, but to make these models a standard tool in clinical practice, we need to make them easier for everyone to use.”

These days, Hassanpour notes, there is a lot of excitement about the potential of AI in health care. Physicians and machine learning experts have started to partner in these efforts. “If clinicians see that this is a technology in their service, they are all on board,” says Hassanpour. Clinicians who are exposed to machine learning technology and technical experts who are exposed to the challenges of the biomedical domain are looking for ways to team up. That partnership is what will generate excitement about the potential of making health care more cost-efficient and accessible.

“There’s a great deal that can go wrong, especially in medical practice, so the system needs to be well-designed, highly tested and validated before it can be deployed,” confirms Hassanpour. “Right now we’re rigorously testing these models using multi-institutional datasets to validate that we have solid technology. Once we have that, we can figure out the interaction part—the best way to not block the clinical flow but be helpful and make everything more efficient and reduce burden. There are two different areas we need to focus on. Not just building a model, putting it on the shelf and publishing a paper, but evaluating across different institutions to show the model applies to diverse datasets, and making the model easy to interpret and explain so it’s actually useful in practice.”

Machine learning is as good as the data

Currently, patient data is siloed, especially across hospitals. Even within one institution, images from radiology can be totally separate from genetic testing results, which are separate from pathology reports, and so forth. The records are not organized around the patient in an efficient way. Machine learning models are all data-driven, so it’s important to use the highest quality and most complete data possible.

Ideally, Hassanpour would love to have comprehensive, securely stored data for every patient that different institutions can contribute to and collaborate around. “At Dartmouth-Hitchcock, data is stored in a rather comprehensive way and is easy to access, but that’s one organization,” says Hassanpour. “If a patient is new to the area or visiting multiple institutions, that gets even more complicated. If we want to make accurate assessments, we need to have high-quality, comprehensive data. We’re moving in the right direction but have a long way to go. These models can be used for screening, diagnosis, prognosis, and more. They have a lot of potential but they are as good as the data we feed them.”

Advancements in machine learning practices are very much a team effort. “I thank our collaborators in pathology, radiology and clinical departments, biomedical data science, epidemiology, computer science and everyone in our lab,” says Hassanpour. “We have great clinical collaborators who bring their insight and expertise to help develop and evaluate our models and give us direction. We have talented technical collaborators and bright students and trainees—we are very excited about the future.”