Malignant Mesothelioma subtyping via sampling driven multiple instance prediction on tissue image and cell morphology data

Artificial Intelligence in Medicine 2023 September

Mark Eastwood, Silviu Tudor Marc, Xiaohong Gao, Heba Sailem, Judith Offman, Emmanouil Karteris, Angeles Montero Fernandez, Danny Jonigk, William Cookson, Miriam Moffatt, Sanjay Popat, Fayyaz Minhas, Jan Lukas Robertus


Malignant mesothelioma is a difficult-to-diagnose and highly lethal cancer usually associated with asbestos exposure. It can be broadly classified into three subtypes: epithelioid, sarcomatoid, and a hybrid biphasic subtype in which significant components of both of the other subtypes are present. Early diagnosis and identification of the subtype inform treatment and can help improve patient outcomes. However, the subtyping of malignant mesothelioma, and specifically the recognition of transitional features from routine histology slides, has a high level of inter-observer variability. In this work, we propose an end-to-end multiple instance learning (MIL) approach for malignant mesothelioma subtyping. It uses an adaptive instance-based sampling scheme for training deep convolutional neural networks on bags of image patches, which allows learning on a wider range of relevant instances than max or top-N based MIL approaches. We also investigate augmenting the instance representation to include aggregate cellular morphology features from cell segmentation. The proposed MIL approach enables identification of malignant mesothelial subtypes of specific tissue regions. From this, a continuous characterisation of a sample according to the predominance of sarcomatoid vs epithelioid regions is possible, thus avoiding the arbitrary and highly subjective categorisation by currently used subtypes. Instance scoring also enables studying tumor heterogeneity and identifying patterns associated with different subtypes. We have evaluated the proposed method on a dataset of 234 tissue micro-array cores, obtaining an AUROC of 0.89±0.05 for this task. The dataset and developed methodology are available for the community at: