Case study: Feature engineering inspired by domain experts on real world medical data
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). Region Kalmar, Sweden. (DISA-IDP) ORCID iD: 0000-0002-8370-2950
Linnaeus University, Faculty of Health and Life Sciences, Department of Medicine and Optometry. Region Kalmar län. (DISA-IDP) ORCID iD: 0000-0003-3106-0754
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA-IDP) ORCID iD: 0000-0002-7565-3714
2023 (English). In: Intelligence-Based Medicine, ISSN 2666-5212, Vol. 8, article id 100110. Article in journal (Refereed). Published
Abstract [en]

Data mining projects for knowledge discovery that rely on health data produced in daily health care and stored in electronic health records (EHR) can be time consuming. This study exemplifies that involving a data scientist improves classification performance. We performed a case study comprising two real-world medical research projects, comparing feature engineering and knowledge discovery on the basis of classification performance. The first project (P1) comprised 82,742 patients and addressed the research question “Can we predict patient falls by use of EHR data?”; the second project (P2) included 23,396 patients and focused on “Negative side effects of antiepileptic drug consumption on bone structure”.

The study yielded three salient results. (i) It is valuable for medical researchers to involve a data scientist when conducting research based on real-world medical data. This finding was supported by an analysis of classification metrics obtained with iteratively engineered features. The features were generated by domain experts and computer scientists in collaboration with medical researchers. We named this process domain knowledge-driven feature engineering (KDFE).

Classification performance was evaluated with the area under the receiver operating characteristic curve (AUROC). (ii) Domain experts benefit from KDFE in quantitative terms. Compared with the baseline, the average AUROC obtained with the engineered features rose from 0.62 to 0.82 for P1 and from 0.61 to 0.89 for P2 (p-values << 0.001). (iii) The engineered features were represented in a systematic structure, which is the foundation of a theoretical model for automated KDFE (aKDFE).
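
As an illustration of the metric only, AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it in that pairwise form for two sets of classifier scores. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study; they only mimic a comparison between a baseline and an engineered feature set.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not study data): 1 = outcome occurred.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
baseline_scores = [0.2, 0.6, 0.4, 0.7, 0.3, 0.5, 0.8, 0.65]
engineered_scores = [0.1, 0.3, 0.2, 0.6, 0.5, 0.7, 0.9, 0.8]

print("baseline AUROC:  ", auroc(labels, baseline_scores))
print("engineered AUROC:", auroc(labels, engineered_scores))
```

The pairwise form is quadratic in the number of cases, so it is suitable for small illustrations only; a rank-based implementation would be preferable for cohorts of the size reported here.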

To our knowledge, this is the first study to show, using quantitative measures, that KDFE adds value to real-world medical research. However, the method is not limited to the medical domain. Other areas with similar data properties should also benefit from KDFE.

Place, publisher, year, edition, pages
Elsevier, 2023. Vol. 8, article id 100110
Keywords [en]
Feature engineering, Medical registry research, Knowledge discovery in databases (KDD), Quantitative measures, Electronic health record (EHR), Domain knowledge
National Category
Information Systems
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-125163; DOI: 10.1016/j.ibmed.2023.100110; Scopus ID: 2-s2.0-85173225229; OAI: oai:DiVA.org:lnu-125163; DiVA, id: diva2:1805029
Available from: 2023-10-16. Created: 2023-10-16. Last updated: 2023-11-07. Bibliographically approved

Open Access in DiVA

fulltext (5603 kB), 10 downloads
File information
File name: FULLTEXT01.pdf; File size: 5603 kB; Checksum (SHA-512):
525bbb73ed8a4d35fd0ca23cee1f24abf0590803c19adac3282db9aea9332ae12b910416bc1b51d4aaae510e90fd163f622ce15bd0f330e76efebc6a311f686b
Type: fulltext; Mimetype: application/pdf

Authority records

Björneld, Olof; Carlsson, Martin; Löwe, Welf
