4.8 Article

Machine Vision Methods, Natural Language Processing, and Machine Learning Algorithms for Automated Dispersion Plot Analysis and Chemical Identification from Complex Mixtures

Journal

ANALYTICAL CHEMISTRY
Volume 91, Issue 16, Pages 10509-10517

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.analchem.9b01428

Funding

  1. NIH [1P30ES023513-01A1, U01 EB0220003-01, UG3-OD023365]
  2. NIH National Center for Advancing Translational Sciences (NCATS) [UL1 TR000002]
  3. US Department of Veterans Affairs
  4. NSF [1343479]
  5. Post-9/11 GI-Bill

Gas-phase trace chemical detection techniques such as ion mobility spectrometry (IMS) and differential mobility spectrometry (DMS) are used in many settings, from evaluating the health of patients to detecting explosives at airports. These devices separate the chemical compounds in a mixture and provide information for identifying specific chemical species of interest, and they operate well both in controlled lab environments and in the field. Commercial versions of these devices are frequently tailored to niche applications (e.g., explosives detection) because instrumentation hardware and data analysis software are difficult to reconfigure. For researchers to adapt these tools quickly to new purposes and broader panels of chemical targets, it is critical to develop new algorithms and methods for generating libraries of sensor responses. Microelectromechanical system (MEMS) technology has been used to fabricate miniaturized DMS devices that are easier to deploy; however, corresponding advances in data analytics have lagged. DMS generates complex three-dimensional dispersion plots for both the positive and the negative ions in a mixture. Although spectra of single chemicals are straightforward to interpret, both visually and algorithmically, dispersion plots from complex mixtures with many chemical constituents are exceedingly challenging to interpret. This study uses image processing and computer vision steps to automatically identify features in DMS dispersion plots. A bag-of-words approach, adapted from natural language processing and information retrieval, is used to cluster and organize these features. Finally, a support vector machine (SVM) learning algorithm is trained on these features to detect and classify specific compounds in the resulting conceptualized representations of the data. Using this approach, we maintain a high level of correct chemical identification even as a gas mixture increases in complexity with interfering chemicals present.
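The bag-of-words step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that local feature descriptors have already been extracted from each dispersion plot, uses plain k-means as the clustering method, and works on toy 2-D descriptors. All names here (`build_codebook`, `encode`, `blob`) are hypothetical. Each plot ends up as a normalized histogram of "visual word" counts, which is the kind of fixed-length vector an SVM could then be trained on; the SVM stage itself is omitted.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, d):
    """Index of the codeword closest to descriptor d."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], d))

def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster descriptors into k 'visual words' with plain k-means (an assumption)."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def encode(codebook, descriptors):
    """Represent one plot as a normalized histogram of codeword counts."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Toy data: two well-separated groups of 2-D descriptors standing in for
# features from two different compounds (purely illustrative values).
rng = random.Random(1)

def blob(cx, cy, n):
    return [(cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1)) for _ in range(n)]

training = blob(0, 0, 30) + blob(5, 5, 30)
codebook = build_codebook(training, k=2)
hist_a = encode(codebook, blob(0, 0, 10))  # plot dominated by the first group
hist_b = encode(codebook, blob(5, 5, 10))  # plot dominated by the second group
```

Because every plot maps to a vector of the same length regardless of how many features it contains, plots of simple and complex mixtures can be fed to the same classifier.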