Text Mining for plagiarism at arXiv

There’s an article in Nature that shows yet another way technology is transforming scholarly communication: this time, by quantifying the incidence of plagiarism.

Nature 444, 524-525 (30 November 2006) (subscription required)

Examining 280,000 documents in the arXiv archive, researchers report that blatant deception is quite rare (667 cases, or about 0.2% of the archive). Substantial text reuse (defined as significant matching text with at least one author in common between the two papers) was quite common (approximately 10% of the archive), but this can be explained by the fact that conference abstracts and later publications of the same work are often both in the database.
