Exploring chemometrics with synthetic data and machine learning
Disclaimer: I'm learning as I go here — I have no formal background in analytical chemistry or chemometrics. This is very much a "figure it out as you build it" project, and nothing here should be taken as state of the art. If you spot something wrong or know a better way, I'd love to hear about it!
What happens when molecules overlap, and how separating them fixes identification.
From raw CDF files to a realistic data generator using real elution profiles and mass spectra.
Using singular value decomposition and a random forest to estimate the number of overlapping molecules, with 98.5% accuracy.
Upcoming
Upcoming — hopefully that works! 😱😂
Upcoming
By Jonas Berdoz · Data sources: Copenhagen Soft Camel Cheese GC-MS dataset, MassBank mass spectral library