NOTE: Enjoy this slightly nerdier post as I experiment with different Provoked styles. This is for everyone trying to conduct or understand scientific research to reach their own well-reasoned conclusions. Also, a new angle on well-trodden controversies.
If you've got even a passing interest in higher education, you've likely been following the saga of Harvard University's former president, Dr. Claudine Gay. She stepped down amidst a storm of controversy over alleged antisemitism on campus and a staggering number of plagiarism accusations that afflicted half of her scientific articles (20? 30? 40? 50 times?).
But the plot thickens. After Dr. Gay's departure, a billionaire who played a key role in her downfall found himself in the spotlight when his wife was accused of plagiarizing Wikipedia for her Ph.D. (10? 20? 30 times?).
The irony. Those who were gleefully cheering for Dr. Gay's removal are now defending the billionaire's wife, brushing off her plagiarism as a mere "punctuation oversight." Meanwhile, Dr. Gay's supporters, who had previously excused her repeated slip-ups, are now on the hunt for the billionaire's wife. And just to add a dash of Hollywood to the mix, Brad Pitt had a wild crush on her!
Who needs Netflix when you've got this kind of drama?
But while these stories are fascinating and terrifying and emblematic of the financially lucrative "culture wars," they're overshadowing a crucial issue: how should we cite the work of others? My collaborator, Dr. Patrick McKnight, and I have been mulling over this question.
So, we decided to kickstart a conversation about the mysterious art of citation. How the hell do you decide what articles, book chapters, and books to cite when writing? None of us have been trained in this arena, and, based on our cumulative experience as scientists, editors, authors, and consumers, neither has anyone else.
-------
When it comes to citing literature, there's only one rule that's set in stone: don't leave out citations where due. This golden rule stems from the age-old practice of acknowledging others' work, showing that others explored the same territory, and guiding readers to original sources. But does this rule apply all the time? And when did this practice of citing even begin? I'm not sure. What I do know is that citations have become a big deal in today's academic world, and yet, we're all a bit clueless about the rules of the game. Let's break this down.
The allure of being cited. As scientists, we're driven by the desire to make an impact. We want to influence our peers, policy makers, and industry leaders. Our career progression, funding, and reputation hinge on these acknowledgments. We even have metrics that measure our impact based on who cites our work. Regardless of our role in a paper, we get credit for inspiring and influencing others. The hope is that our work will add a brick to the edifice of knowledge.
The journey from a research idea to published article can span years. It involves developing the idea, conducting the study, submitting the work for publication, and enduring the grueling process of peer review. After all this effort, every scientist dreams of their work being read. But if you ask any scientist, they'll tell you: being read isn't enough. They want to influence, to persuade. Citations are a tangible measure of this influence. They represent scientists building on the work of their predecessors. It seems straightforward: more citations equal more impact. But the reality is far from simple.
Rules? What rules?
When it comes to citations, there's a lot of freestyling involved. We cite when we feel like it or when we think it's necessary. But when does that feeling or necessity arise? I've noticed that early career researchers tend to cite with abandon. They'd cite the origin of every word if they could, either out of fear of plagiarism accusations or a burning desire to appear well-read.
As we progress in our careers, we learn to recognize seminal literature, but our loyalty to some works and disregard for others can seem arbitrary. Sometimes, a reviewer will call us out on an omission, and our response is usually to just add the citation and move on. Often, the omitted work is that of the reviewer themselves. From these practices, we can infer a few unwritten rules.
Defensive citation is common. As you mature, you become less defensive and more strategic. This isn't so much a rule as it is a practice that evolves with your comfort level.
Editors and reviewers have a say in citations. I can recall at least 20 instances where a reviewer or editor insisted I cite a work I'd left out. These omissions didn't really affect the final product, but they did bump up the citation count for those authors. This isn't a rule, but rather a byproduct of an unchecked process.
No one really knows who to cite. Colleagues in the field cite each other, and newcomers cite more randomly. But do we actually know who to cite? Google researchers found that scientists are citing older material more frequently now. Why? Maybe because it's easier to find and read older articles online? If so, we feel compelled to cite the original material, and we do so more often. Is there a rule here? Nope. Still no rules.
The quality of a work doesn't dictate whether it's cited. We cite works even if we're skeptical about the findings. Here's the kicker: a citation counts the same whether the work is trash or treasure. There should be a category for "hate cites": when you cite a work because you're critical of it. Even if these citations are about how terrible the work is, the work gets a boost from the increased attention.
In the mid-1990s, Martin Seligman published a paper on the "effectiveness of psychotherapy" using data from a Consumer Reports survey. According to Google Scholar, that paper has been cited 3,393 times as of January 8th, 2024. A sizeable portion of those citations are negative. In other words, people cited the paper NOT because it was good science but rather because it was poor science. Nobody worth their weight in spit would consider effectiveness a defensible conclusion from a non-randomized, single-group, post-test-only design with data collected via a supermarket magazine mail-in request. Suffice it to say, not all citations are indicators that a work is a net positive contribution; sometimes, the citation may be an indictment.
The Passover Question
What remains from this discussion is a clear lack of guidelines. We practice what we practice, teach students arbitrary rules that we rarely abide by, and then reward one another with stature based upon a system that is often devoid of rational thinking. Actually, I don't believe that last part completely—at least not without qualification. The rational structure is "every person for him or herself."
The literature is full of tedious citing, even by those anointed as eminent. Consider arguably the most important anxiety researcher of the 20th century—David Barlow. I did a Google search of his articles since 2015 and picked one paragraph from the first page of a random paper of his (out of more than 700!). Here is that opening paragraph:
Emotion regulation is an important set of processes by which an individual manages and responds to their emotions (Gross & Muñoz, 1995). Previous researchers have shown particular interest in the regulation of distressing negative emotional states, such as sadness or anxiety (Campbell-Sills, Barlow, Brown, & Hofmann, 2006; Gross, 1998). As a consequence, emotion regulation has been increasingly incorporated into conceptualizations of psychopathology development and maintenance (Aldao & Nolen-Hoeksema, 2010; Kring & Sloan, 2010) and has also become a focus of treatment (e.g., Barlow, Allen, & Choate, 2004; Hayes & Feldman, 2004; Mennin, 2004).
Hundreds of articles are published on emotion regulation every year. This example calls for the Passover Question of Science: why these references over any others?
Two of the citations are from his own lab.
Of the other six articles, three of the researchers are from Yale, two are from Stanford, and the last is from Berkeley.
Do we choose the best scientists from the best universities?
Do we do due diligence to find the best work, regardless of author and institution?
And what about when this work was published? Do we focus on the seminal work? The most recent work? Or do we focus on something less deliberate and purposeful—such as whatever happens to have a PDF available in a Google Scholar search?
For now, there is no solution for how to play the citation game. What we know for sure is that if you publish more, you'll get more citations. The more you publish where other people publish, the more likely you are to get cited. Conversely, publish in an obscure area where very few readers (if any) conduct research or publish and you will likely acquire few citations outside self-citations and friends, family, and the voracious mailman. Moreover, if you are the product of a prestigious university, you gain more attention. Prestige begets prestige irrespective of merit.
Measuring the problem to find a solution
Dr. Patrick McKnight and his MRES group are measurement fanatics. Without good measurement, we are lost. Lord Kelvin (1872) once said:
[a]ccurate and minute measurement seems to the non-scientific imagination, a less lofty and dignified work than looking for something new. But nearly all the grandest discoveries of science have been but the rewards of accurate measurement and patient long-continued labour in the minute sifting of numerical results.
With that in mind, I invented a few metrics to show you how silly citations can be. These metrics or numbers get calculated the way Bill James calculates Major League Baseball statistics: by raw counts and ratios. Take a look at a few I computed for two sentences opening an article in a prestigious journal by leading scientists on "the costly pursuit of self-esteem" (a short script showing the arithmetic follows the list):
The pursuit of self-esteem has become a central preoccupation in American culture (Baumeister, Campbell, Krueger, & Vohs, 2003; Heine, Lehman, Markus, & Kitayama, 1999; Pyszczynski, Greenberg, & Solomon, 1997; Sheldon, Elliot, Kim, & Kasser, 2001). Hundreds of books offer strategies to increase self-esteem, childrearing manuals instruct parents on how to raise children with high self-esteem (Benson, Galbraith, & Espeland, 1998; Glennon, 1999; P. J. Miller, 2001), and schools across the United States have implemented programs aimed at boosting students’ self-esteem in the hopes of reducing problems such as high dropout rates, teenage pregnancy, and drug and alcohol abuse (Dawes, 1994; McElherner & Lisovskis, 1998; Mecca, Smelser, & Vasconcellos, 1989; Seligman, 1998).
METRIC 1: Citations (11) per word (108) = 11/108 = 0.10. Not sure what this number conveys other than nearly 10% of the text is muddled by citations. A 1:1 ratio would indicate that half the space gets filled by citations. Here, we have a .10 ratio.
METRIC 1b: Citations (11) per word (108) plus number of Citations (11) = 11/(108+11) = .09. 9% of the space gets devoted to citations.
METRIC 2: Researchers (27) per word (108) = 27/108 = 0.25. Every fourth word carries with it a winning researcher. That is, a scientist lucky enough to be cited.
METRIC 3: Citations (11) to Researchers (27) = 11/27 = 0.41. Citations per researcher named; flip it over and roughly 2.5 researchers share credit on each citation. Not sure this is still relevant since fewer researchers publish solo-authored work. Why would you? The more hard-working, intelligent, challenging lobes, the better.
METRIC 4: Sentences (2) to Citations (11) = 2/11 = 0.18. 18% of the space is taken up by citations as opposed to thoughts.
METRIC 5: Sentences (2) to Researchers (27) = 2/27 = 0.07.
METRIC 5b: Researchers (27) to Sentences (2) = 27/2 = 13.5!!! We have over 13 winners per sentence. That seems like a lot.
METRIC 6: Readers of every word of each sentence = ????. One wonders how often readers skip text in these two sentences as the first few lines are overwhelming to read through.
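None of this arithmetic requires anything fancy. Here is a minimal Python sketch of the counting; the raw tallies (words, citations, researchers, sentences) are entered by hand from the passage above rather than parsed automatically, so treat it as a toy, not a tool.

```python
# Raw counts, tallied by hand from the two opening sentences quoted above.
words = 108        # words in the two sentences
citations = 11     # parenthetical citations
researchers = 27   # individual authors named across those citations
sentences = 2

# Bill James-style ratios: nothing more than raw counts divided by raw counts.
metrics = {
    "Metric 1  (citations per word)":              citations / words,
    "Metric 1b (citations per word + citation)":   citations / (words + citations),
    "Metric 2  (researchers per word)":            researchers / words,
    "Metric 3  (citations per researcher)":        citations / researchers,
    "Metric 4  (sentences per citation)":          sentences / citations,
    "Metric 5  (sentences per researcher)":        sentences / researchers,
    "Metric 5b (researchers per sentence)":        researchers / sentences,
}

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```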
Design your own. The problem is that we don't have guidance for ourselves or colleagues to properly acknowledge existing work. Yet, we use these citations as a high-stakes outcome in our profession.
How should we fix this problem?
We could devise our own metrics instead of leaning on established norms. Take one of those norms, the h-index. It's a nifty tool that quantifies a researcher's productivity and influence. Imagine having an h-index of 17. This means you've published at least 17 papers, each of which has been cited at least 17 times. It's a testament to the reach and resonance of your work. As of now, my h-index stands at 113, indicating that I've published at least 113 articles, each cited at least 113 times. Compare this with Dr. Claudine Gay, a Harvard graduate, Stanford professor, and current Harvard faculty member, who earns over $900,000 annually (5.5x my salary) and has an h-index of 14. Food for thought.
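For the curious, the h-index is trivial to compute once you have a list of per-paper citation counts. Here is a minimal sketch; the citation counts at the bottom are invented purely for illustration.

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank          # this paper still clears the bar
        else:
            break             # every later paper has fewer citations
    return h

# Hypothetical citation counts for six papers:
print(h_index([25, 8, 5, 3, 3, 0]))  # -> 3
```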
However, citation count isn't the ultimate measure of a person's impact. The real test lies in the content of their work. We should be reading and critically evaluating each other's research. This involves approaching each piece with an open mind, a healthy dose of skepticism, and a readiness to provide both praise and constructive feedback.
Additional Ideas In No Particular Order
When making a scientifically testable claim, cite the relevant research (e.g., "Older adults tend to experience more dementia symptoms; Lyketsos et al., 2002"). If no evidence exists, clarify that it's your opinion.
Cite research that heavily influences your ideas (e.g., "We applied cognitive-behavioral principles to dementia; Pinquart & Sorensen, 2006"). The term "heavily influences" is subjective, but it's a useful guideline.
Cite research that has explored similar questions to yours (e.g., "Previous research tested if behavior therapy helps dementia patients; Teri et al., 1997"). The challenge is determining how similar the research questions need to be to warrant citation.
Journal citation limits force you to be selective, but they don't guide your selection process (e.g., should you cite the oldest, newest, or largest-sample studies?). Some solutions improve readability but don't address the core issue of lax standard practice.
Citations are often used to support arguments and show that our statements are factual, or at least not made up. If a statement is empirically supported, it might carry more weight than those without empirical backing.
Provocations
This conversation highlights the murky waters of citation practices. We cite for various reasons, many of which are unspoken or hidden. Here's another angle to consider: do we cite for others' benefit or our own? We might use citations to assert our knowledge over the uninformed. I remember a grad student from another department presenting a talk and citing fictitious articles. The audience listened attentively, never questioning the authenticity of the references or their relevance to his points.
Be careful about judging others until this mess is sorted out. And be careful of motivated reasoning, where you hold different standards for people you like and don’t like, whose views you share and disagree with.
The Justification Experiment: What if every citation had to include a justification for its inclusion? Would this prevent the use of citations as mere rhetorical devices?
The AI Experiment: What if a machine-learning algorithm could suggest the most relevant citations for a paper? Would this make the citation process more efficient and accurate?
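To make that provocation concrete, here is a toy sketch of one way such a system might work: rank candidate references by TF-IDF cosine similarity between the paragraph being drafted and each candidate abstract. The titles and abstracts below are invented placeholders, and a real system would need far more than surface similarity.

```python
# Toy sketch of an automated citation-suggestion step: rank candidate
# references by textual similarity to the paragraph being written.
# The candidate labels and abstracts are invented placeholders, not real papers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

draft = "Emotion regulation strategies shape the course of anxiety and depression."

candidates = {
    "Candidate A (emotion regulation review)": "A review of emotion regulation processes and strategies.",
    "Candidate B (memory consolidation)": "Sleep-dependent memory consolidation in rodents.",
    "Candidate C (anxiety treatment trial)": "A randomized trial of cognitive therapy for anxiety and depression.",
}

# Vectorize the draft sentence and every candidate abstract together.
texts = [draft] + list(candidates.values())
tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)

# Similarity of the draft (row 0) to each candidate (rows 1..n).
scores = cosine_similarity(tfidf[0], tfidf[1:]).ravel()

for (label, _), score in sorted(zip(candidates.items(), scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {label}")
```

Of course, surface similarity would simply reproduce the biases described above: whatever is easiest to find gets recommended, and cited, most often.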
Like statistics, citations quantify uncertainty; they can either support a claim or suppress the uninformed. Their role has grown significantly over the years, making it crucial for scientists and the public to understand what we're measuring and why it matters.
So, how can we untangle this mess?
Dr. Todd B. Kashdan is an author of several books including The Upside of Your Dark Side (Penguin) and The Art of Insubordination: How to Dissent and Defy Effectively (Avery/Penguin) and Professor of Psychology and Leader of The Well-Being Laboratory at George Mason University.