PLS-Optimal: A stepwise D-Optimal design based on latent variables
Linnaeus University, Faculty of Science and Engineering, School of Natural Sciences.
Linnaeus University, Faculty of Science and Engineering, School of Natural Sciences.
Linnaeus University, Faculty of Science and Engineering, School of Natural Sciences. ORCID iD: 0000-0001-9382-9296
2012 (English). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, Vol. 52, no 4, 975-983 p. Article in journal (Refereed), Published
Abstract [en]

Several applications, such as risk assessment within REACH or drug discovery, require reliable methods for the design of experiments and efficient testing strategies. Keeping the number of experiments as low as possible is important from both a financial and an ethical point of view, as exhaustive testing of compounds requires significant financial resources and animal lives. With a large initial set of compounds, experimental design techniques can be used to select a representative subset for testing. Once measured, these compounds can be used to develop quantitative structure–activity relationship models to predict properties of the remaining compounds. This reduces the required resources and time. D-Optimal design is frequently used to select an optimal set of compounds by analyzing data variance. We developed a new sequential approach to apply a D-Optimal design to latent variables derived from a partial least squares (PLS) model instead of principal components. The stepwise procedure selects a new set of molecules to be measured after each previous measurement cycle. We show that application of the D-Optimal selection generates models with a significantly improved performance on four different data sets with end points relevant for REACH. Compared to those derived from principal components, PLS models derived from the selection on latent variables had a lower root-mean-square error and a higher Q² and R². This improvement is statistically significant, especially for the small number of compounds selected.

Place, publisher, year, edition, pages
2012. Vol. 52, no 4, 975-983 p.
National Category
Environmental Sciences; Probability Theory and Statistics
Research subject
Natural Science, Environmental Science
Identifiers
URN: urn:nbn:se:lnu:diva-18208
DOI: 10.1021/ci3000198
OAI: oai:DiVA.org:lnu-18208
DiVA: diva2:513860
Available from: 2012-04-03 Created: 2012-04-03 Last updated: 2016-11-15 Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text: http://pubs.acs.org/doi/abs/10.1021/ci3000198

Search in DiVA

By author/editor
Sahlin, Ullrika; Öberg, Tomas