Journal
BIOINFORMATICS
Volume 24, Issue 2, Pages 243-249
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btm574
Funding
- NLM NIH HHS [1 R01 LM009758-01] Funding Source: Medline
Motivation: Duplicate publication impacts the quality of the scientific corpus and has been difficult to detect, and studies thus far have been limited in scope and size. Using text similarity searches, we were able to identify signatures of duplicate citations among a body of abstracts.
Results: A sample of 62 213 Medline citations was examined and a database of manually verified duplicate citations was created to study author publication behavior. We found that 0.04% of the citations with no shared authors were highly similar and are thus potential cases of plagiarism. 1.35% with shared authors were sufficiently similar to be considered a duplicate. Extrapolating, this would correspond to 3500 and 117 500 duplicate citations in total, respectively.
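The extrapolation in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the two rates are percentages (0.04% and 1.35%), and the function names are hypothetical. The abstract does not state the size of the corpus used for the extrapolation, so the code only derives the size implied by the quoted totals.

```python
# Figures quoted in the abstract. Reading the rates as percentages is an assumption.
SAMPLE_SIZE = 62_213
RATE_NO_SHARED_AUTHORS = 0.04 / 100   # highly similar, no shared authors
RATE_SHARED_AUTHORS = 1.35 / 100      # sufficiently similar, shared authors
EXTRAPOLATED_NO_SHARED = 3_500
EXTRAPOLATED_SHARED = 117_500


def expected_in_sample(rate: float, sample_size: int = SAMPLE_SIZE) -> float:
    """Number of sampled citations a given rate corresponds to."""
    return rate * sample_size


def implied_corpus_size(extrapolated_total: int, rate: float) -> float:
    """Corpus size at which the rate would yield the extrapolated total."""
    return extrapolated_total / rate


if __name__ == "__main__":
    pairs = [
        ("no shared authors", RATE_NO_SHARED_AUTHORS, EXTRAPOLATED_NO_SHARED),
        ("shared authors", RATE_SHARED_AUTHORS, EXTRAPOLATED_SHARED),
    ]
    for label, rate, total in pairs:
        print(
            f"{label}: ~{expected_in_sample(rate):.0f} in sample, "
            f"implied corpus ~{implied_corpus_size(total, rate):,.0f}"
        )
```

Under that reading, the rates correspond to roughly 25 and 840 citations in the sample, and both extrapolated totals imply a corpus of about 8.7 million citations. That agreement between the two figures is what suggests the rates are percentages.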