Categories
Publications

Algorithm for evaluation of background correction algorithms

Chromatographic signals comprise three components: (i) low-frequency baseline drift, (ii) high-frequency noise, and (iii) the chromatographic peaks, which occupy the intermediate frequency range. The first two contributions together form the “background” of the signal. Often, there is more background than chromatographic information, as every data point contains a background contribution. In such cases, or if the frequency of the background is very similar to that of the relevant signals, the data can become difficult to interpret. For example, peak detection may be hindered, and errors may occur in classification, discrimination and, especially, quantification [1].

We reported earlier that the past decade has seen the development of a plethora of background correction algorithms [1]. While this is a welcome trend, it is currently unclear which algorithm is useful in which case. There is thus a need for a tool to compare these algorithms objectively, so that users can select the appropriate algorithm for their application.

Scientists often use simulated data for such comparisons, yet such data are controversial because they are regarded as poor representations of realistic cases. Experimental data avoid this issue, but are difficult to generate for each type of chromatographic application. Moreover, with experimental data it is impossible to know the true value of, for example, the area of any given peak in the chromatogram.

In this light, Leon Niezen developed a data simulation tool to generate realistic data for use in algorithm comparison studies. The tool was designed to combine experimental baseline drift and noise signals with carefully modelled chromatographic peaks. For the latter, Niezen modelled experimental chromatographic peaks with distribution functions (Figure 1).
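To illustrate the idea, the sketch below adds modelled peaks to a background and keeps the peak-only signal as ground truth. It is a minimal illustration, not Niezen's tool: the exponentially modified Gaussian (EMG) is one of the distributions named in Figure 1, but the function names, all parameter values, and the synthetic drift and noise (stand-ins for the experimentally measured background) are assumptions.

```python
import math
import random


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian scaled so that its total area is `area`."""
    lam = 1.0 / tau
    expo = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * t)
    z = (mu + lam * sigma ** 2 - t) / (math.sqrt(2.0) * sigma)
    return area * 0.5 * lam * math.exp(expo) * math.erfc(z)


def simulate(n=2001, t_end=10.0, seed=0):
    """Return time axis, full signal, and the peak-only (ground truth) signal."""
    rng = random.Random(seed)
    dt = t_end / (n - 1)
    time = [i * dt for i in range(n)]
    # Hypothetical stand-ins for drift and noise; the tool described above
    # uses experimentally recorded background signals instead.
    drift = [0.05 * t + 0.2 * math.sin(0.4 * t) for t in time]
    noise = [rng.gauss(0.0, 0.01) for _ in time]
    # Two modelled peaks with known areas of 1.0 and 0.5 (assumed values).
    peaks = [emg(t, 1.0, 4.0, 0.10, 0.2) + emg(t, 0.5, 6.0, 0.12, 0.3)
             for t in time]
    signal = [d + e + p for d, e, p in zip(drift, noise, peaks)]
    return time, signal, peaks


time, signal, peaks = simulate()
dt = time[1] - time[0]
# Numerical area of the peak-only signal; close to the known total of 1.5.
true_area = sum(peaks) * dt
print(f"true total peak area: {true_area:.4f}")
```

Because the peaks are generated from a known model, their true area (here 1.0 + 0.5) remains available for comparison after background correction, which is not the case for purely experimental data.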

Figure 1. A) Fits of the Modified Pearson VII and exponentially modified Gaussian (EMG) distributions to experimental data, B) Akaike information criterion (AIC) values for the five best distribution models for each of the fitted peaks and C) zoomed-in fits and residuals for five individual peaks. Reproduced, with permission, from [2].
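For readers unfamiliar with the AIC used to rank the models in panel B, the sketch below shows the common least-squares form, AIC = n ln(RSS/n) + 2k, where n is the number of data points, RSS the residual sum of squares and k the number of fitted parameters. Whether [2] used exactly this form is an assumption; the function name and the example numbers are hypothetical.

```python
import math


def aic_least_squares(observed, fitted, n_params):
    """AIC for a least-squares fit, assuming Gaussian residuals."""
    n = len(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical data: model A fits better with fewer parameters than model B.
observed = [1.0, 2.0, 3.0, 4.0]
fit_a = [1.1, 1.9, 3.1, 3.9]   # 3 parameters (assumed)
fit_b = [1.2, 1.8, 3.2, 3.8]   # 4 parameters (assumed)

aic_a = aic_least_squares(observed, fit_a, n_params=3)
aic_b = aic_least_squares(observed, fit_b, n_params=4)
# The model with the lower AIC is preferred.
print("preferred model:", "A" if aic_a < aic_b else "B")
```

The penalty term 2k means that a model with more parameters is only preferred when it reduces the residuals enough to justify the extra flexibility.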

Niezen then applied the tool to evaluate a large number of existing background-correction algorithms. By varying the signal properties, he was able to discern the strengths and weaknesses of the algorithms as a function of those properties. An example is shown in Figure 2.

Figure 2. A) Root mean square error (RMSE) surfaces obtained for the various drift-correction methods in combination with the sparsity-assisted signal smoothing (SASS) algorithm [3]. Methods are indicated by the coloured dots. B) Bottom view (lowest values) resulting from the overlaid RMSE surfaces. Reproduced, with permission, from [2].
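The scoring step behind such a comparison can be sketched as follows, assuming the RMSE is computed between the background-corrected signal and the known peak-only signal. The two "corrections" are deliberately naive, hypothetical stand-ins and are not among the algorithms compared in [2]; all values are assumed.

```python
import math


def rmse(estimate, truth):
    """Root mean square error between two equally long signals."""
    if len(estimate) != len(truth):
        raise ValueError("signals must have the same length")
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))


# Simulated case with a known answer: one Gaussian peak on a linear drift.
time = [i * 0.01 for i in range(1001)]
truth = [math.exp(-0.5 * ((t - 5.0) / 0.2) ** 2) for t in time]
drift = [0.3 + 0.05 * t for t in time]
signal = [p + d for p, d in zip(truth, drift)]

# Naive correction A: subtract a straight line through the two end points.
slope = (signal[-1] - signal[0]) / (time[-1] - time[0])
corrected_a = [s - (signal[0] + slope * (t - time[0]))
               for s, t in zip(signal, time)]

# Naive correction B: subtract the minimum value as a constant offset.
offset = min(signal)
corrected_b = [s - offset for s in signal]

rmse_a = rmse(corrected_a, truth)
rmse_b = rmse(corrected_b, truth)
print(f"RMSE A: {rmse_a:.4f}, RMSE B: {rmse_b:.4f}")
```

Repeating such a calculation over a grid of signal properties (for example noise level and drift magnitude) would yield one RMSE surface per method, which is the kind of comparison shown in panel A.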

The tool has been made publicly available and the article is open access [2]. Aside from being useful to scientists, the work will also be of significant importance to the automation (‘AutoLC’) project that is currently starting in Amsterdam. The work by Niezen was funded by the UNMATCHED project, which is supported by BASF, Covestro, DSM, and Nouryon, and receives funding from the Dutch Research Council (NWO) in the framework of the Innovation Fund for Chemistry and from the Ministry of Economic Affairs in the framework of the “PPS-toeslagregeling”.

References

[1] Recent applications of chemometrics in one- and two-dimensional chromatography
T.S. Bos, W.C. Knol, S.R.A. Molenaar, L.E. Niezen, P.J. Schoenmakers, G.W. Somsen, B.W.J. Pirok, J. Sep. Sci. 43(9-10), 2020, 1678-1727, DOI: 10.1002/jssc.202000011 [OPEN ACCESS]

[2] Critical comparison of background correction algorithms used in chromatography
L.E. Niezen, P.J. Schoenmakers, B.W.J. Pirok, Anal. Chim. Acta 1201, 2022, 339605, DOI: 10.1016/j.aca.2022.339605 [OPEN ACCESS]

[3] Sparsity-assisted signal smoothing (revisited)
I. Selesnick, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing – Proceedings, 2017, DOI: 10.1109/ICASSP.2017.7953017


Challenges in Obtaining Information from 1D- and 2D-LC

Earlier this June, CAST member Bob Pirok (Van ‘t Hoff Institute for Molecular Sciences) and Johan Westerhuis (Swammerdam Institute for Life Sciences) published their vision on the current challenges in data analysis for one-dimensional (1D) and two-dimensional (2D) chromatography [1].

In their article, the authors discuss the caveats of the data-analysis strategies that are typically employed to process data obtained from 1D and 2D chromatography, as well as the importance of data pre-processing and its associated challenges. Highlighting one of the conclusions of an earlier review [2], the authors again emphasize that no current studies provide an objective numerical comparison of background correction metrics.


Figure 1. Comparison of commonly applied methods to assess the area of a peak. Reprinted from [1] with permission.
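As a point of reference, one widely used way to determine a peak area is to integrate the signal above a straight baseline drawn between the peak start and end, using the trapezoidal rule. The sketch below is an assumption about what such a method looks like and is not taken from [1]; the function name, the integration limits and the test signal are hypothetical.

```python
import math


def peak_area_linear_baseline(time, signal, start, end):
    """Trapezoidal area above a straight line from index `start` to `end`."""
    t0, t1 = time[start], time[end]
    y0, y1 = signal[start], signal[end]

    def baseline(t):
        return y0 + (y1 - y0) * (t - t0) / (t1 - t0)

    area = 0.0
    for i in range(start, end):
        left = signal[i] - baseline(time[i])
        right = signal[i + 1] - baseline(time[i + 1])
        area += 0.5 * (left + right) * (time[i + 1] - time[i])
    return area


# Gaussian peak (height 1, sigma 0.2) on a linear drift; values are assumed.
time = [i * 0.01 for i in range(1001)]
signal = [math.exp(-0.5 * ((t - 5.0) / 0.2) ** 2) + 0.3 + 0.05 * t
          for t in time]

area = peak_area_linear_baseline(time, signal, start=300, end=700)
expected = 0.2 * math.sqrt(2.0 * math.pi)  # analytical Gaussian area
print(f"integrated: {area:.5f}, analytical: {expected:.5f}")
```

The result depends on where the integration limits are placed and on how well a straight line describes the background under the peak, so different area-determination methods can give different answers for the same peak.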

Pirok and Westerhuis furthermore explain the difficulties with common curve resolution methods such as matched filtering (a.k.a. curve-fitting) and derivative-based approaches. While multi-dimensional separations increase the likelihood of resolution, the authors note that this by no means makes it easier to obtain information from these datasets. The authors also discuss some key opportunities currently being pursued by scientists around the globe. The article is freely available [1].
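A derivative-based approach can be sketched in its simplest form as a search for points where the first derivative changes sign from positive to non-positive. This is an illustrative assumption rather than the method discussed in [1]; the function name, the height threshold and the test signal are hypothetical.

```python
import math


def find_apexes_by_derivative(signal, min_height=0.0):
    """Return indices where the finite-difference slope turns from + to <= 0."""
    apexes = []
    for i in range(1, len(signal) - 1):
        rising = signal[i] - signal[i - 1] > 0
        falling = signal[i + 1] - signal[i] <= 0
        if rising and falling and signal[i] >= min_height:
            apexes.append(i)
    return apexes


# Noise-free test signal with two Gaussian peaks at t = 4 and t = 6.
time = [i * 0.01 for i in range(1001)]
signal = [math.exp(-0.5 * ((t - 4.0) / 0.2) ** 2)
          + 0.5 * math.exp(-0.5 * ((t - 6.0) / 0.2) ** 2)
          for t in time]

apexes = find_apexes_by_derivative(signal, min_height=0.1)
print([round(time[i], 2) for i in apexes])
```

On real data, differentiation amplifies high-frequency noise, so each noise spike can produce a sign change; this is one reason such approaches typically require smoothing and carefully chosen thresholds.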

Figure 2. The availability of an additional dimension of data through the detector (in this case DAD) certainly helps to distinguish the peaks, but does not make it easier to extract the information from the data.

References

[1] Challenges in Obtaining Relevant Information from One- and Two-Dimensional LC Experiments
B.W.J. Pirok & J.A. Westerhuis, LC-GC North America 38(6), 2020, 8-14

[2] Recent applications of chemometrics in one- and two-dimensional chromatography
T.S. Bos, W.C. Knol, S.R.A. Molenaar, L.E. Niezen, P.J. Schoenmakers, G.W. Somsen, B.W.J. Pirok, J. Sep. Sci. 43(9-10), 2020, 1678-1727, DOI: 10.1002/jssc.202000011