In this interview, Data Scientist Federico Paruzzo discusses how Bruker has used deep learning to develop sigreg, the first machine-learning-based command available in Bruker’s TopSpin software. Sigreg performs parameter-free automatic signal region detection for 1H NMR spectra and sets the stage for full automation of spectral analysis. Federico will share details on this new method, discuss its performance, and compare it with other approaches available.
How is NMR signal data typically integrated?
With NMR, it is important to detect and integrate each signal region in the spectrum. You can then use this information to quantify your compounds, for example, or to do relaxation measurements. There are currently several ways to do this using TopSpin.
Manual integration is the most commonly used integration method. In TopSpin, this can be performed through the manual-integration window by selecting and integrating each signal region in the spectrum separately.
While this process is simple and widely used, it can be time-consuming and sometimes frustrating. It takes about 20 to 25 seconds to integrate a simple NMR spectrum, which adds up quickly if you need to deal with tens of spectra in a day, and more complicated spectra take even longer.
There are methods to do this integration automatically, such as the auto-integration command available in TopSpin. However, the result is not optimal and does not quite match what we would do manually.
This command depends on many parameters, and if you fine-tune all of them, you can get a better result. However, this fine-tuning is time-consuming and cumbersome, which can prevent the use of this command for the automatic integration of many different spectra.
Another alternative is the apbk command. This is a new command introduced in TopSpin to do automatic phase and baseline correction of spectra of X nuclei.
Now, you might point out that this command is not meant to be used on 1H spectra, which is correct. That is why you need to force the apbk command to work on 1H spectra using the flag “-f”. By doing that, you obtain automatically detected signal regions. But again, the result is far from what you would select manually. This is not surprising, as the apbk command was not designed for 1H spectra, so it is not a viable alternative to manual integration.
Image Credit: Shutterstock/angellodeco
How is Bruker using deep learning to improve automated NMR integration?
The challenge for our team was to ask, “Can we do better? Can we develop a command which integrates the signal regions the same way a user would, without requiring the fine-tuning of so many parameters?”
To do this, we trained a deep neural network to interpret NMR spectra using supervised learning. This requires two things: a lot of training data, in the form of NMR spectra with labels for the property we want to learn (in this case, signal regions), and a deep neural network to learn from it.
We train the network by feeding it the training set. Once the network is trained, we can take a new spectrum, give it to the network, and it will output the predicted labels.
As a training set, we used 500,000 artificially generated 1H NMR spectra. We used different base frequencies (from 80 to 800 MHz) to generate spectra, as well as a wide range of signal-to-noise ratios and solvent intensities.
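To make the idea of a labeled synthetic spectrum concrete, here is a minimal sketch in Python with NumPy. It is not the generator used for sigreg, whose details are not given in the interview. The Lorentzian line shape, the function names, the point count, the ppm range, and the rule for marking signal regions (a fixed number of half-widths around each peak) are all assumptions made for illustration.

```python
import numpy as np

def lorentzian(ppm, center_ppm, fwhm_hz, base_freq_mhz, height=1.0):
    """Lorentzian line with a full width at half maximum given in Hz."""
    # 1 ppm corresponds to base_freq_mhz Hz, so convert the half-width to ppm.
    hwhm_ppm = 0.5 * fwhm_hz / base_freq_mhz
    return height * hwhm_ppm**2 / ((ppm - center_ppm) ** 2 + hwhm_ppm**2)

def make_labeled_spectrum(peaks, base_freq_mhz, noise_sigma, n_points=16384,
                          ppm_range=(-1.0, 12.0), label_halfwidths=10.0, seed=0):
    """Return (ppm, spectrum, labels) for peaks given as (center_ppm, fwhm_hz, height)."""
    rng = np.random.default_rng(seed)
    ppm = np.linspace(ppm_range[0], ppm_range[1], n_points)
    spectrum = np.zeros(n_points)
    labels = np.zeros(n_points, dtype=bool)
    for center_ppm, fwhm_hz, height in peaks:
        spectrum += lorentzian(ppm, center_ppm, fwhm_hz, base_freq_mhz, height)
        # Hypothetical labeling rule: mark a window of +/- label_halfwidths
        # half-widths around each peak as "signal region".
        half_window_ppm = label_halfwidths * 0.5 * fwhm_hz / base_freq_mhz
        labels |= np.abs(ppm - center_ppm) <= half_window_ppm
    # Add Gaussian noise on top of the noise-free spectrum.
    spectrum += rng.normal(0.0, noise_sigma, n_points)
    return ppm, spectrum, labels

# Two singlets at an assumed base frequency of 400 MHz.
ppm, spectrum, labels = make_labeled_spectrum(
    peaks=[(7.5, 2.0, 1.0), (2.1, 2.0, 0.6)],
    base_freq_mhz=400.0,
    noise_sigma=0.01,
)
print(labels.sum(), "of", labels.size, "points labeled as signal")
```

Varying `base_freq_mhz`, `noise_sigma`, and the peak list over many random draws is one plausible way to cover the ranges of base frequency and signal-to-noise ratio mentioned above.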
To learn from this data, we decided to build a convolutional neural network inspired by the U-Net. The U-Net is a fully convolutional neural network used for image segmentation in biomedical applications.
By combining the net with the training set, we created sigreg, the very first machine-learning-based command available in TopSpin. Sigreg allows you to do fully automatic, parameter-free signal region detection in 1H NMR spectra.
How did you test the limits of your model?
To test the limits of this algorithm, we created a simple artificial spectrum made of only one singlet centered at 7.5 ppm. We then tested the limits of detection of the model by varying signal-to-noise, solvent intensity, and line width.
Testing against signal-to-noise. We kept the intensity of the signal constant and changed the noise value in order to match different values of signal-to-noise ratios.
Sigreg performs well at signal-to-noise ratios of 100 and 20. At a signal-to-noise ratio of 10, sigreg is still able to detect the peak, but this value is borderline because it is the lower limit we imposed in our training set.
As a result, at a signal-to-noise ratio lower than 10, sigreg is no longer able to detect the peak. We will keep developing this algorithm, so I will not be surprised if we can go to lower values of signal-to-noise in the future.
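The signal-to-noise test described above (constant peak intensity, varying noise) can be sketched as follows. This is an illustration only. It assumes signal-to-noise is defined as peak height divided by the standard deviation of the noise, which may differ from the definition used by Bruker, and the function names and spectral grid are hypothetical.

```python
import numpy as np

def noise_sigma_for_snr(peak_height, target_snr):
    """Noise standard deviation that gives the target SNR (assumed: height / sigma)."""
    return peak_height / target_snr

def singlet_with_snr(target_snr, center_ppm=7.5, hwhm_ppm=0.005,
                     n_points=8192, seed=0):
    """Single Lorentzian of height ~1 plus Gaussian noise scaled to target_snr."""
    rng = np.random.default_rng(seed)
    ppm = np.linspace(0.0, 10.0, n_points)
    signal = hwhm_ppm**2 / ((ppm - center_ppm) ** 2 + hwhm_ppm**2)
    sigma = noise_sigma_for_snr(1.0, target_snr)
    return ppm, signal + rng.normal(0.0, sigma, n_points)

def estimate_snr(ppm, spectrum, noise_window=(0.0, 5.0)):
    """Estimate SNR as maximum intensity over the noise std in a signal-free window."""
    noise = spectrum[(ppm >= noise_window[0]) & (ppm <= noise_window[1])]
    return spectrum.max() / noise.std()

# Keep the peak fixed and change only the noise level.
for snr in (100, 20, 10, 5):
    ppm, spectrum = singlet_with_snr(snr)
    print(f"target SNR {snr:>3}: estimated {estimate_snr(ppm, spectrum):.1f}")
```

At low target values the estimate is biased upward, because the maximum of the spectrum then includes a sizeable noise contribution. This is one reason detection near a signal-to-noise ratio of 10 is a borderline case.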
Testing against solvent intensity. We slightly shifted our peak of interest from 7.5 to 7.3 ppm and added a second peak with higher intensity that simulates the presence of a solvent peak.
We tested the limits of detection by keeping the signal-to-noise of our peak of interest constant and varying the solvent intensity. Sigreg works very well with solvent peaks that are ten or a hundred times larger than the peak of interest.
When the solvent peak becomes three orders of magnitude larger than our peak of interest, sigreg is still able to detect our signal. However, at this limit the detected region is much broader, meaning that sigreg becomes less accurate. If the solvent is more than a thousand times larger than our signal of interest, then sigreg is no longer able to detect the signal.
Testing against line width. Again, we used a single peak and kept the intensity constant. We also kept the signal-to-noise level unchanged and changed only the line width of our peak. Sigreg performed well for a wide range of line widths, from 5 to 500 Hz.
How does sigreg perform with experimental NMR spectra?
To evaluate the performance of our model, we ran sigreg on 100 experimental NMR spectra. The signals in these spectra were labeled by our NMR experts.
We found that the number of signals detected by our experts correlates well with the number of signals detected by sigreg.
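Comparing the number of detected signals requires turning a detection result into a list of regions. The sketch below shows one way to do that, assuming the detection is available as a point-wise boolean mask over the spectrum. The interview does not describe sigreg's actual output format, so the mask representation and the function name are assumptions.

```python
import numpy as np

def mask_to_regions(mask):
    """Convert a boolean mask into half-open (start, end) index intervals."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False so regions touching either edge are closed properly.
    padded = np.concatenate(([False], mask, [False]))
    # Indices where the mask switches between False and True.
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    # Even entries are region starts, odd entries are region ends.
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

expert_mask = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]
model_mask = [0, 1, 1, 1, 0, 0, 0, 0, 1, 1]

expert_regions = mask_to_regions(expert_mask)
model_regions = mask_to_regions(model_mask)
print(len(expert_regions), "expert regions,", len(model_regions), "model regions")
```

A region count alone says nothing about whether the regions overlap, which is why a count comparison is only a first check.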
Image Credit: Shutterstock/Lisa-S
How does this compare to the other commands?
Auto integration gives reasonable results, but the agreement with the experts is much lower compared to the agreement between sigreg and the experts. Apbk, instead, tends to underpick heavily, meaning it detects fewer signals than the experts. This is not surprising, as apbk was not developed to work on 1H NMR spectra.
With spectrometers that range from 80 MHz to 1.2 GHz, we at Bruker are also very interested in a command that performs well over a wide range of base frequencies. Twenty-five of our spectra were obtained at 80 MHz with the new Bruker Fourier 80 benchtop NMR instrument, and 75 were obtained at 300 MHz and higher. Sigreg has also been shown to be less dependent on the base frequency than the other two commands.
How does calculating the F1 score help confirm the accuracy of sigreg?
Even if the number of peaks gives us an idea of how this algorithm works, it does not really give an idea of the accuracy. And that is why, to estimate the performance of the model, we decided to calculate the F1 score for each spectrum.
If you are not familiar with the concept, the F1 score is a metric used in statistical analysis to evaluate the accuracy of binary classification models. The strength of the F1 score lies in the fact that it depends on both precision and recall. Precision tells us how many of the detected signal regions are real signal regions. This is given by the true positives (the signal regions detected as signal regions) over the sum of the true positives and the false positives (the noise regions detected as signal regions).
Recall, in turn, tells us how many of the real signal regions are detected by the model. This is given by the true positives over the sum of the true positives and the false negatives, where false negatives are the signal regions detected as noise.
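The definitions above translate directly into code. The F1 score is the harmonic mean of precision and recall, F1 = 2 × precision × recall / (precision + recall). The sketch below compares a predicted mask with an expert-labeled mask point by point. Whether the evaluation in this work was done per point or per region is not stated, so the point-wise comparison and the function name are assumptions.

```python
import numpy as np

def precision_recall_f1(predicted, reference):
    """Point-wise precision, recall, and F1 for two boolean signal masks."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.sum(predicted & reference))    # signal detected as signal
    fp = int(np.sum(predicted & ~reference))   # noise detected as signal
    fn = int(np.sum(~predicted & reference))   # signal detected as noise
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

reference = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]  # expert labels
predicted = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0]  # model output

precision, recall, f1 = precision_recall_f1(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy example there are four true positives, one false positive, and one false negative, so precision, recall, and F1 all come out at 0.80.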
We calculated the F1 score for all 100 spectra. Sigreg gives better results compared to the other two commands, in terms of F1 score. We have an average F1 score of 94.8% using sigreg, with most of the spectra having an F1 score over 95%.
Automatic integration, on the other hand, has a lower average F1 score of 87.1%. The F1 scores for the individual spectra are also much more spread out than for sigreg, with some of the spectra scoring lower than 60%.
The lowest average was given by apbk, at 80%, with a much larger spread. This is still a remarkable result for apbk, considering this command was not developed to work on 1H NMR spectra.
We can also check how these results depend on the base frequency by looking at the results that we obtained at 80 MHz. Auto integration provides some of the best results at low frequency. Apbk, on the other hand, gives its worst results with 80 MHz spectra. Sigreg is the only one that gives comparable results from 80 to 800 MHz.
How easy is it to use sigreg?
Using sigreg is very simple. All you need to do is open your dataset in TopSpin and type “sigreg,” and you obtain your signal region detection in just a few milliseconds. There are no parameters to set up. You can also easily include it in your automatic routines by using the macro ‘SIGREG’ in your AU programs.
Sigreg works with complex spectra. We have shown that it also works very well at detecting broad peaks, with noisy spectra, and with spectra with a large solvent peak. When it comes to phase distortion, sigreg is also able to detect peaks in spectra with phase errors, provided that the phase distortion is reasonable.
Sigreg is included in the latest version of TopSpin. We hope you will test it out and send us your feedback.