Developing deep learning technologies for medical image classification
Deep learning technologies can assist in medical image classification, such as helping identify variations of brain diseases or cancers from CT scans. However, traditional deep learning approaches are difficult to interpret and often require large amounts of annotated data. Additionally, handling high-resolution medical images is challenging given limited compute and memory resources.
Dr. Tianbao Yang, associate professor in the Department of Computer Science and Engineering at Texas A&M University, recently received more than $1 million from the National Science Foundation to develop deep learning technologies for medical image classification that leverage both patients' images and the associated free-text reports for self-supervised learning. Self-supervised learning is a machine learning paradigm that enables machines to learn from unlabeled data without human supervision, with the potential to dramatically reduce the cost of human labeling. In addition, his algorithm can make deep models interpretable and improve training by sampling from a patient's multiple CT scans for computation.
“Since reading many slices, or images, of a CT scan is time-consuming and costly, we expect our system to be able to quickly flag slices with critical findings, bringing early attention to important data that can save time during diagnostics for radiologists,” said Yang. “Our system could also provide a second opinion, with radiologist-style interpretations, for less experienced residents or trainees.”
Yang turned to maximization of the area under the receiver operating characteristic curve (AUC) to accomplish this task. AUC is a metric for measuring the performance of a classifier that is widely used in medicine; deep AUC maximization trains deep neural networks by directly optimizing this metric, a technique particularly well suited to imbalanced data classification.
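To make the metric concrete, here is a minimal pure-Python sketch (not from Yang's system) of what AUC measures: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. Because it compares ranks rather than counts, it stays informative even when one class vastly outnumbers the other, as with rare findings in CT scans.

```python
def auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative),
    with ties counted as half. Pairwise computation, for illustration only."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Even with 1 positive against 9 negatives, ranking the positive first
# yields a perfect AUC of 1.0 -- accuracy alone would reward always
# predicting "negative" here.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.1, 0.05, 0.2, 0.1, 0.0]
print(auc(labels, scores))
```

This exhaustive pairwise form is quadratic in the number of examples; it is meant only to show what the metric means, not how it is optimized at scale.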
In 2020, Yang developed a large-scale optimization algorithm for learning deep neural networks by maximizing AUC directly, which achieved first place in the Stanford CheXpert competition. As part of leading the effort on deep AUC maximization, his team has also developed an open-source library, LibAUC, which has been downloaded more than 41,000 times.
By using self-supervised deep AUC maximization and the database, the algorithm can identify differences and abnormalities with limited or no annotations, bypassing this step in the diagnostic process. Additionally, many existing approaches for handling numerous high-resolution CT scans reduce the image resolution to lower computational costs. In contrast, this approach addresses the issue through multi-instance learning and leverages advanced optimization techniques to sample instances for computation without compromising predictive performance. Lastly, self-supervised learning algorithms based on deep AUC maximization are better suited for handling imbalanced data, a common scenario in the medical domain.
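The multi-instance idea above can be sketched in a few lines, under simplifying assumptions: treat each CT scan as a "bag" of slice-level scores, label the bag positive if any slice is, and, to bound compute, score only a random sample of slices per step and aggregate with max-pooling. The function below is a hypothetical illustration of that sampling scheme, not the optimization method from Yang's work, which additionally corrects for the variance that sampling introduces.

```python
import random

def bag_score(slice_scores, sample_size=4, rng=None):
    """Multi-instance view of a CT scan: the scan (bag) is abnormal if any
    slice (instance) is. Instead of scoring every high-resolution slice,
    score a random subset and max-pool, trading exactness for memory/compute."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility in this demo
    sampled = rng.sample(slice_scores, min(sample_size, len(slice_scores)))
    return max(sampled)

# With sample_size >= number of slices, this reduces to plain max-pooling.
print(bag_score([0.1, 0.9, 0.2], sample_size=3))
```

The design trade-off this illustrates: sampling keeps memory and compute bounded per scan regardless of resolution or slice count, while max-pooling preserves the "any abnormal slice makes the scan abnormal" semantics of multi-instance learning.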