The Free Scientific Resource: Evaluating the Accuracy of Wikipedia

Several weeks ago, I came across an article on about how Wikipedia is becoming a scientific resource, whether we like it or not. Scientists are reading Wikipedia, the article said, and it’s affecting how they write. The article cited a study by researchers from MIT and Pitt that found statistical evidence of language in peer-reviewed articles being influenced by Wikipedia articles relevant to the topic. They concluded that journal articles referenced in Wikipedia are subsequently cited more than other similar articles, and that on a semantic level, Wikipedia is influencing the language of scientific journal articles at an astounding rate.

I was intrigued by the idea that reading Wikipedia affects how we later write about a subject. When I start writing about a new topic, the first thing I do is head to Wikipedia to gather a basic understanding before I dive into journal articles. I’ll skim through the overview and most relevant subsections, then check out the references to see what I should continue reading. However, the findings of the study imply that even though I don’t directly use information or language from Wikipedia in my work, it’s still subtly influencing how I write.

The debate over Wikipedia’s credibility is as old as the website itself. There have been several studies that reported Wikipedia to be as consistent and accurate as universally accepted resources, such as this 2005 Nature study that placed it on par with Encyclopedia Brittanica, but these are often controversial and Wikipedia itself often denies the results. The doubts and mistrust are rooted in the fundamental principles of the site: The content is developed collectively, and anyone is welcome to chime in. Theoretically, I could log in and edit the Rolling Stones page to say that Mick Jagger is passionate about real-time cell-based assays. A change so obviously false would probably be rolled back by Wikipedia’s dedicated team of editors, but is the same assumption true for the complex details involved in a scientific article?

If scientists are being influenced by Wikipedia – however slightly – we need to consider the quality of the articles they’re reading. Out of curiosity, I decided to check out the Wikipedia article for Programmed Cell Death Protein 1 (PD-1). As a non-expert in PD-1 protein structure, function and clinical significance, I didn’t see many issues with it. I noticed that T cell was incorrectly hyphenated (T-cell) in a few places, but that seemed like an oversight in copyediting and not a systematic issue with credibility. Luckily, I know a few Promega scientists who work with PD-1 every day, so I reached out to two of them for a more thorough review. I sent them a PDF of the article with simple instructions: “Read the article and note any comments you have about incorrect, misleading, outdated or missing information.”

The reviewers didn’t hold back in their criticisms. (Name redacted for privacy)

The first response I received began with some terse criticism of the overall page content. “It is hard to verify everything cited in this article, and I believe some of them are still controversial,” the reviewer wrote. “Even when one lab published some data, it does not mean the whole field supports it…Certainly it is not hard to find mistakes and outdated statements. It raised the questions [of] how much we can trust the content unless we are already very familiar with the topic.”

The whole article is, to be blunt, far behind the current knowledge and research about PD-1. Of the 40 references with years attached, only seven of them were dated 2016 or later, while well over half of the PubMed listings containing “PD-1” have been published in that same time. From the revision history attached to the article, it appears that several dedicated editors occasionally remove or add references, but there is little energy put into ensuring that each detail is still as accepted as it was when first submitted.

Other comments pointed out how even sentences that weren’t technically incorrect were unintentionally misleading. (Name redacted for privacy)

On a “nuts and bolts” level, the article is already shaky. First, the article lists the incorrect number of amino acids in the protein (268 listed, actually 288), and later says that PD-1 is closely related to CTLA-4, which one reviewer disagreed with. The “Function” section of the article incorrectly states that IFN- γ promotes “T cell inflammatory activity,” though it’s also unclear why this irrelevant sentence was included in the article at all. (Even if the sentence was correct and relevant, “T cell activation” is actually the process being referenced repeatedly throughout the article.) Later, it says that “PD-1 negatively regulates T cell responses,” which is a misleading way to say that it is correlated with reduced T cell proliferation and IL-2 secretion.

The reviewers’ biggest problems were with the “Clinical Significance” section. Almost every detail about anti-PD-1 therapeutics was outdated by several years. Nivolumab (Opdivo – Bristol Myers Squibb) has recently been approved by the FDA for use in several cancers not listed in the article, including MSI-H colorectal cancer (the Wikipedia article states that nivolumab showed no effect in colon cancer). Pidilizumab, which has a full paragraph dedicated to it, was found in 2016 to not be a PD-1 antibody at all. BMS-936559 is an anti-PD-L1 antibody, not an anti-PD-1 antibody as the article states, and several other drugs described as “in early stage development” were actually approved by the FDA in 2016 and 2017. It appears that this section has not been updated with new information since late 2014.

My favorite note on the document was an underlined sentence with the comment, “This statement is difficult to believe. No reference??”

So, the PD-1 article is clearly in bad shape, but what does evaluating one of approximately a million scientific articles on Wikipedia tell us? If scientists’ writing is affected by what they read on Wikipedia, it’s evident that there is a strong need for Wikipedia to be accurate and trustworthy. If we want expert information, we need to get it from experts. I wouldn’t claim to be extremely well-informed about PD-1, and that’s why I won’t be making changes to its respective article. However, there are scientists who dedicate much of their time and energy to learning about this protein and staying informed about recent developments. If we, as scientists, are going to be reading Wikipedia, we also owe it to ourselves and each other to invest energy into its quality. Wikipedia is most effective when we use our individual strengths to continuously raise the quality of the articles most relevant to our expertise.

To get into the spirit, I will be spending this afternoon editing the article about the pinnacle of Central Illinois cuisine, the Horseshoe sandwich.

The following two tabs change content below.
Jordan Villanueva
Jordan Villanueva studied writing and biology at Northwestern University before joining Promega in 2017. As a science writer, he's most interested in the human side of science - the stories and people behind the journal articles. Research interests include immunology and neuroscience, as well as the COVID-19 pandemic. When he isn't working, Jordan loves turning sourdough baking into a science. It's just a symbiotic culture of yeast and lactic acid bacteria, right?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.