Categories
Publications

Unsupervised LC method development with AutoLC

In an international and interdisciplinary collaboration, CAST members Tijmen Bos, Stef Molenaar, Jim Boelrijk, Leon Niezen and Bob Pirok have demonstrated unsupervised LC method development with AutoLC. This is the first automated LC-MS method development workflow. It was applied to a complex antibody digest sample. The work was recently published in Analytical Chemistry as the cover article [1].

The majority of liquid chromatography (LC) methods are still developed in a conventional manner, that is, by analysts who rely on their knowledge and experience to make method development decisions. To reduce this reliance on individual expertise, several tools based on design-of-experiment workflows or on retention modeling from experimental data and/or chemical structure information have been developed and even commercialized.

However, these approaches are generally difficult to scale with sample complexity and require significant user input to operate. Consequently, high-resolution separation technology and multi-dimensional systems have not been economically feasible for routine use. To improve the accessibility of state-of-the-art separation technology, the Pirok group at the University of Amsterdam is developing a workflow capable of unsupervised method development.

This has led to the present demonstration of a novel, open-source algorithm for automated and interpretive method development of LC(−mass spectrometry) separations (“AutoLC”). The scientists constructed a closed-loop workflow that interacted directly with the LC system and ran unsupervised in an automated fashion. 

The first demonstration of AutoLC was published as the front-cover feature article in Analytical Chemistry.

Schematic overview of the generic workflow employed by the AutoLC algorithm using retention modeling (top, blue) or Bayesian optimization (BO; bottom, pink).

The team tested the algorithm using two newly designed method development strategies. The first utilized retention modeling, whereas the second used a Bayesian-optimization machine learning approach. In both cases, the algorithm could arrive within 4–10 iterations (i.e., sets of method parameters) at an optimum of the objective function, which included resolution and analysis time as measures of performance.
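To make the idea of an objective function that weighs resolution against analysis time more concrete, here is a minimal sketch. It is not the objective function used in AutoLC; the function names, the target resolution of 1.5, and the linear time penalty are all assumptions chosen for illustration.

```python
from typing import Sequence


def resolution(t1: float, t2: float, w1: float, w2: float) -> float:
    """Resolution of two peaks from retention times and baseline peak widths."""
    return 2.0 * abs(t2 - t1) / (w1 + w2)


def objective(times: Sequence[float], widths: Sequence[float],
              max_time: float, target_rs: float = 1.5) -> float:
    """Hypothetical score in [0, 1] (an assumption, not the AutoLC objective).

    Fraction of neighbouring peak pairs that reach target_rs, multiplied by
    a linear penalty that shrinks as the last peak elutes closer to max_time.
    """
    peaks = sorted(zip(times, widths))      # order peaks by retention time
    pairs = list(zip(peaks, peaks[1:]))     # neighbouring peak pairs
    if not pairs:
        return 0.0
    resolved = sum(
        resolution(a[0], b[0], a[1], b[1]) >= target_rs for a, b in pairs
    )
    last_elution = peaks[-1][0]
    time_factor = max(0.0, 1.0 - last_elution / max_time)
    return (resolved / len(pairs)) * time_factor
```

In a closed-loop setting, each iteration would run one set of method parameters, evaluate a score of this kind on the resulting chromatogram, and let the optimizer propose the next set.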

Retention modeling was found to be more efficient while depending on peak tracking, whereas Bayesian optimization was more flexible but limited in scalability. We have deliberately designed the algorithm to be modular to facilitate compatibility with previous and future work (e.g., previously published data handling algorithms).
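The dependence of retention modeling on peak tracking can be illustrated with a small sketch. It assumes the common log-linear retention model, ln k = ln k0 - S·φ, fitted to isocratic data for a single analyte; the article does not state which retention model AutoLC uses, so the model choice and the function names below are assumptions.

```python
import math
from typing import Sequence, Tuple


def fit_log_linear(phi: Sequence[float], k: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ln k = ln k0 - S * phi for one analyte.

    phi: organic-modifier fractions of the scouting runs.
    k:   retention factors of the SAME analyte in those runs, which is
         why the peaks must be tracked across chromatograms first.
    Returns (ln_k0, S).
    """
    n = len(phi)
    y = [math.log(v) for v in k]
    mean_x = sum(phi) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in phi)
    sxy = sum((x - mean_x) * (yy - mean_y) for x, yy in zip(phi, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, -slope


def predict_k(ln_k0: float, s: float, phi: float) -> float:
    """Predict the retention factor at a new modifier fraction."""
    return math.exp(ln_k0 - s * phi)
```

Once such a model is fitted for every tracked analyte, candidate methods can be evaluated in silico, which is consistent with retention modeling needing few experiments. Bayesian optimization, by contrast, only needs a score per experiment and no peak assignment.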

The AutoLC framework was tested on an antibody digest sample. A) example of a generic scouting measurement, B) proposed optimum at the 4th iteration. Reproduced with permission of [1].

The degree of separation is often quantified as the resolution between chromatographic peaks, which can be written as a product of retention, selectivity and chromatographic efficiency. Currently, the AutoLC framework largely focuses on retention, but ongoing efforts are shifting to include selectivity. Support for method validation is the logical next step.
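For reference, one common textbook form of this product is the Purnell equation, shown here as an illustration rather than as the expression used in the AutoLC paper:

```latex
R_s = \frac{\sqrt{N}}{4} \cdot \frac{\alpha - 1}{\alpha} \cdot \frac{k_2}{1 + k_2}
```

Here N is the plate number (efficiency), α = k_2 / k_1 is the selectivity, and k_2 is the retention factor of the later-eluting peak of the pair.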

AutoLC leverages earlier studies and interdisciplinary expertise

The AutoLC framework is the product of several years of scientific studies conducted within public-private partnerships by the group of Pirok. These projects focused on relevant aspects such as peak tracking [2,3], machine learning [4], and gradient deformation [5]. The AutoLC framework was designed to be modular so as to leverage work published by the wider scientific community. Currently, the development of the framework is supported by several grants from the Dutch Research Council (NWO). It is the prime topic of the UPSTAIRS project.

The present study was conducted in collaboration with Dr. Bernd Ensing (Computational Chemistry, University of Amsterdam), Dr. Saer Samanipour (Analytical Chemistry, University of Amsterdam), Dr. Patrick Forré (Institute for Informatics, University of Amsterdam), as well as scientists from Gustavus Adolphus College.

Special acknowledgement to Peter Schoenmakers

In the article, the authors acknowledged Prof. Peter Schoenmakers for his founding contributions. In one of his first papers, published in 1978 on gradient selection for RPLC method development, Schoenmakers already envisaged the use of scouting data to facilitate automated method development [6].

Schoenmakers was the promotor of Bob Pirok, who first published on this topic in a 2016 paper that investigated the theoretical possibility of leveraging these concepts for 2D-LC [7]. That study marked the start of the research line that ultimately led to the present publication of AutoLC.

References

  1. Chemometric Strategies for Fully Automated Interpretive Method Development in Liquid Chromatography, T.S. Bos, J. Boelrijk, S.R.A. Molenaar, B. van ‘t Veer, L.E. Niezen, D. van Herwerden, S. Samanipour, D.R. Stoll, P. Forré, B. Ensing, G.W. Somsen, B.W.J. Pirok, Anal. Chem., 2022, 94(46), 16060–16068, DOI: 10.1021/acs.analchem.2c03160.
  2. Peak-Tracking Algorithm for Use in Automated Interpretive Method-Development Tools in Liquid Chromatography, B.W.J. Pirok, S.R.A. Molenaar, L.S. Roca, P.J. Schoenmakers, Anal. Chem., 2018, 90(23), 14011–14019, DOI: 10.1021/acs.analchem.8b03929.
  3. Peak-tracking algorithm for use in comprehensive two-dimensional liquid chromatography – application to monoclonal antibody peptides, S.R.A. Molenaar, T.A. Dahlseid, G. Leme, D.R. Stoll, P.J. Schoenmakers, B.W.J. Pirok, J. Chromatogr. A, 2021, 1639, 461922, DOI: 10.1016/j.chroma.2021.461922.
  4. Bayesian Optimization of Comprehensive Two-dimensional Liquid Chromatography Separations, J. Boelrijk, B.W.J. Pirok, B. Ensing, P. Forré, J. Chromatogr. A, 2021, 1659, 462628, DOI: 10.1016/j.chroma.2021.462628.
  5. Reducing the influence of geometry-induced gradient deformation in liquid chromatographic retention modelling, T.S. Bos, L.E. Niezen, M.J. den Uijl, S.R.A. Molenaar, S. Lege, P.J. Schoenmakers, G.W. Somsen, B.W.J. Pirok, J. Chromatogr. A, 2021, 1635, 461714, DOI: 10.1016/j.chroma.2020.461714.
  6. Gradient selection in reversed-phase liquid chromatography, P.J. Schoenmakers, H.A.H. Billiet, R. Tijssen, L. De Galan, J. Chromatogr. A, 1978, 149, 519–537, DOI: 10.1016/S0021-9673(00)81008-0.
  7. Program for the interpretive optimization of two-dimensional resolution, B.W.J. Pirok, S. Pous-Torres, C. Ortiz-Bolsico, G. Vivó-Truyols, P.J. Schoenmakers, J. Chromatogr. A, 2016, 1450, 29–37, DOI: 10.1016/j.chroma.2016.04.061.

Simulated Impact of Machine Learning on 2D-LC Optimization

Simplified method development for 1D- and 2D-LC separations has long been sought. Past CAST publications, as well as those by many other groups, have investigated the classical approach based on empirical retention models, which is also used extensively for 1D separations. Meanwhile, machine-learning tools have emerged as an alternative across STEM fields. It is thus not surprising that their application has been of interest to several groups in the chromatographic community.

PhD candidate Jim Boelrijk (Institute of Informatics) studied the feasibility of using Bayesian optimization for method development in 2D-LC separations. He did so together with dr. Patrick Forré from the Institute of Informatics at the University of Amsterdam, and with dr. Bernd Ensing (Computational Chemistry) and CAST member dr. Bob Pirok (Analytical Chemistry) from the Van ‘t Hoff Institute for Molecular Sciences.

For any machine learning tool to operate effectively within the context of method optimization, the use of a chromatographic response function or objective function is of paramount importance. Such functions quantify a particular quality descriptor that represents the performance of the separation method. Well-known examples in the field of 2D separations are peak capacity and orthogonality. However, maximised peak capacity or orthogonality does not necessarily translate into a high information yield. Resolution has also been investigated, but its use is impaired by scaling issues. Consequently, the present study employed the concept of connected components (Figure 1).

Figure 1. Example of labelling of a chromatogram by the chromatographic response function. Blue dots denote components separated with resolutions higher than 1 from all other peaks; red dots denote peaks that are within proximity to neighbors and are clustered together, illustrated by the red lines.
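The connected-components idea can be sketched in a few lines of code. Peaks are treated as nodes of a graph, any two peaks with a resolution below 1 are linked, and the score is the number of resulting clusters. The 2D resolution formula used here (a Euclidean combination of the per-dimension resolutions for Gaussian peaks) and all names are assumptions for illustration, not necessarily the definitions used in the study.

```python
import math
from typing import List, Tuple

# Hypothetical peak description: (t1, t2, sigma1, sigma2), i.e. retention
# time and Gaussian standard deviation in the first and second dimension.
Peak = Tuple[float, float, float, float]


def resolution_2d(a: Peak, b: Peak) -> float:
    """Assumed 2D resolution: Euclidean norm of the per-dimension resolutions."""
    rs1 = abs(a[0] - b[0]) / (2.0 * (a[2] + b[2]))
    rs2 = abs(a[1] - b[1]) / (2.0 * (a[3] + b[3]))
    return math.hypot(rs1, rs2)


def count_connected_components(peaks: List[Peak], threshold: float = 1.0) -> int:
    """Count clusters of peaks linked by a resolution below the threshold."""
    parent = list(range(len(peaks)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if resolution_2d(peaks[i], peaks[j]) < threshold:
                parent[find(i)] = find(j)   # merge the two clusters

    return len({find(i) for i in range(len(peaks))})
```

A fully separated peak forms its own component, so a higher count means more information can be extracted from the chromatogram. Peaks that overlap only indirectly, through a chain of neighbours, still end up in the same cluster.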

In the simulated method development cycles, Bayesian optimization yielded a larger number of separated peak clusters (connected components) than the random and grid search algorithms (Figure 2).

Figure 2. Comparison of the random search, grid search and Bayesian optimization algorithm for sample A (top-left), B (top-right), C (bottom-left) and D (bottom-right) for 100 trials. The vertical black dashed line shows the maximum observed in the grid search (out of 11,664 experiments), while the blue and orange bars denote the best score out of 104 iterations for the random search and Bayesian optimization algorithm, respectively.

The study by Boelrijk demonstrated that Bayesian optimization is a viable method for optimization of chromatographic experiments with many method parameters, and therefore also for direct experimental optimization of simple to moderate separation problems. This study was conducted under a simplified chromatographic reality (generated compounds, Gaussian peaks, and equal analyte concentrations). Boelrijk is therefore interested in continuing this research by working towards actual direct experimental optimization.