Searching molecular structure databases with tandem mass spectra using CSI: FingerID

Published in Proceedings of the National Academy of Sciences, 2015

Kai Dührkop, Huibin Shen, Marvin Meusel, Juho Rousu, Sebastian Böcker. (2015). "Searching molecular structure databases with tandem mass spectra using CSI: FingerID." Proceedings of the National Academy of Sciences, 112(41), 12580. https://www.pnas.org/content/112/41/12580.short

Link: [pdf]

Abstract

Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin.
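The final step described in the abstract, searching a structure database with a predicted fingerprint, can be illustrated with a minimal sketch. This is not the scoring used by CSI: FingerID. It assumes fingerprints are represented as sets of "on" bit positions and swaps in plain Tanimoto (Jaccard) similarity as the ranking score. The function names, candidate names, and fingerprint bits below are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is the set of bit positions that are "on".
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by bits in either set."""
    if not a and not b:
        return 0.0  # convention for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_candidates(
    predicted: Fingerprint, candidates: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Score every candidate against the predicted fingerprint, best first."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in candidates.items()]
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical predicted fingerprint for an unknown compound.
predicted = frozenset({1, 4, 7, 9})

# Hypothetical candidate structures, as if retrieved from a structure database.
candidates = {
    "candidate_A": frozenset({1, 4, 7, 9, 12}),
    "candidate_B": frozenset({1, 4}),
    "candidate_C": frozenset({2, 3, 5}),
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setting the candidate sharing the most bits with the predicted fingerprint ranks first. A real search would restrict candidates beforehand, for example to structures with a matching molecular formula, and would account for the uncertainty of each predicted fingerprint bit.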