Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Geunho | - |
dc.contributor.author | Lee, Hyun Beom | - |
dc.contributor.author | Jung, Byung Hwa | - |
dc.contributor.author | Nam, Hojung | - |
dc.date.accessioned | 2024-01-20T01:03:52Z | - |
dc.date.available | 2024-01-20T01:03:52Z | - |
dc.date.created | 2021-09-05 | - |
dc.date.issued | 2017-07 | - |
dc.identifier.issn | 2211-5463 | - |
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/122591 | - |
dc.description.abstract | Mass spectrometry (MS) data are used to analyze biological phenomena based on chemical species. However, these data often contain unexpected duplicate records and missing values due to technical or biological factors. These 'dirty data' problems increase the difficulty of performing MS analyses because they lead to performance degradation when statistical or machine-learning tests are applied to the data. Thus, we have developed missing values preprocessor (MVP), an open-source software for preprocessing data that might include duplicate records and missing values. MVP uses the property of MS data in which identical chemical species present the same or similar values for key identifiers, such as the mass-to-charge ratio and intensity signal, and forms cliques via graph theory to process dirty data. We evaluated the validity of the MVP process via quantitative and qualitative analyses and compared the results from a statistical test that analyzed the original and MVP-applied data. This analysis showed that using MVP reduces problems associated with duplicate records and missing values. We also examined the effects of using unprocessed data in statistical tests and examined the improved statistical test results obtained with data preprocessed using MVP. | - |
dc.language | English | - |
dc.publisher | WILEY | - |
dc.subject | MULTIPLE IMPUTATION | - |
dc.title | MVP - an open-source preprocessor for cleaning duplicate records and missing values in mass spectrometry data | - |
dc.type | Article | - |
dc.identifier.doi | 10.1002/2211-5463.12247 | - |
dc.description.journalClass | 1 | - |
dc.identifier.bibliographicCitation | FEBS OPEN BIO, v.7, no.7, pp.1051 - 1059 | - |
dc.citation.title | FEBS OPEN BIO | - |
dc.citation.volume | 7 | - |
dc.citation.number | 7 | - |
dc.citation.startPage | 1051 | - |
dc.citation.endPage | 1059 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.identifier.wosid | 000404762600014 | - |
dc.identifier.scopusid | 2-s2.0-85020498214 | - |
dc.relation.journalWebOfScienceCategory | Biochemistry & Molecular Biology | - |
dc.relation.journalResearchArea | Biochemistry & Molecular Biology | - |
dc.type.docType | Article | - |
dc.subject.keywordPlus | MULTIPLE IMPUTATION | - |
dc.subject.keywordAuthor | dirty data | - |
dc.subject.keywordAuthor | duplicate record | - |
dc.subject.keywordAuthor | mass spectrometry | - |
dc.subject.keywordAuthor | missing value | - |
dc.subject.keywordAuthor | MS data preprocessor | - |
dc.subject.keywordAuthor | R package | - |
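
The abstract describes grouping records that share similar key identifiers into cliques before resolving duplicates and missing values. The following is a minimal Python sketch of that general idea, not the MVP implementation itself (MVP is distributed as an R package). The record layout (`mz`, `intensity`), the tolerance `tol`, the greedy assignment of overlapping cliques, and the mean-based merge rule are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def build_graph(mz_values, tol):
    """Link records i and j when their m/z values differ by at most tol."""
    adj = {i: set() for i in range(len(mz_values))}
    for i, j in combinations(range(len(mz_values)), 2):
        if abs(mz_values[i] - mz_values[j]) <= tol:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def merge_records(records, tol=0.01):
    """Merge each clique into one record (hypothetical policy).

    Larger cliques are handled first; a record is assigned to the first
    clique that claims it. Per sample, the merged intensity is the mean of
    the non-missing values, or None if every member is missing there.
    """
    mz = [r["mz"] for r in records]
    cliques = sorted(maximal_cliques(build_graph(mz, tol)),
                     key=lambda c: (-len(c), c))
    used, merged = set(), []
    for clique in cliques:
        members = [i for i in clique if i not in used]
        if not members:
            continue
        used.update(members)
        columns = zip(*(records[i]["intensity"] for i in members))
        merged.append({
            "mz": mean(mz[i] for i in members),
            "intensity": [
                mean(v for v in col if v is not None)
                if any(v is not None for v in col) else None
                for col in columns
            ],
        })
    return merged


# Toy input: the first two rows are near-duplicates with complementary gaps.
records = [
    {"mz": 100.000, "intensity": [10.0, None, 30.0]},
    {"mz": 100.005, "intensity": [None, 20.0, 30.0]},
    {"mz": 250.100, "intensity": [5.0, None, 7.0]},
]
merged = merge_records(records, tol=0.01)
```

In this toy input the first two rows fall within the tolerance and collapse into one record whose gaps are filled from each other, while the third row is left alone and keeps its missing value. A real pipeline would also need a rule for values that remain missing after merging, which this sketch does not attempt.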