MOTIVATION
A captivating aspect of science is how different fields of study interact and influence one another. Many significant advances have emerged from the synergistic interaction of multiple disciplines. For instance, quantum mechanics is a theory that brought together Planck’s idea of quantized energy levels, Einstein’s photoelectric effect, and Bohr’s model of the atom.
The degree to which the ideas and artifacts of a field of study are helpful to the world is a measure of its influence.
Developing a better sense of the influence of a field has several benefits, such as understanding what fosters greater innovation and what stifles it, what a field is successful at understanding and what remains elusive, or who the most prominent stakeholders benefiting are and who is being left behind.
Mechanisms of field-to-field influence are complex, but one notable marker of scientific influence is citations. The extent to which a source field cites a target field is a rough indicator of the degree of influence of the target on the source. We note here, though, that not all citations are equal and that citations are subject to various biases. Nonetheless, meaningful inferences can be drawn at an aggregate level; for instance, if the proportion of citations from field x to a target field y has markedly increased compared to the proportion of citations from other fields to the target, then it is likely that the influence of x on y has grown.
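As a rough illustration of the aggregate comparison described above, the following minimal sketch computes each source field's share of the citations made to a target field in two periods, and then the change in one field's share. The field names, the counts, and the `citation_shares` helper are hypothetical and chosen purely for illustration; they are not data or code from the study.

```python
from collections import Counter

def citation_shares(citations, target):
    """Return each source field's share of all citations made to `target`.

    `citations` is an iterable of (source_field, target_field) pairs,
    one pair per citation.
    """
    counts = Counter(src for src, tgt in citations if tgt == target)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {src: n / total for src, n in counts.items()}

# Toy citation lists for two periods (made-up numbers).
early = [("x", "y")] * 10 + [("z", "y")] * 90
late = [("x", "y")] * 40 + [("z", "y")] * 60

early_shares = citation_shares(early, "y")
late_shares = citation_shares(late, "y")

# Change in x's share of the citations to y between the two periods.
change = late_shares["x"] - early_shares["x"]
print(f"x's share of citations to y: {early_shares['x']:.2f} -> {late_shares['x']:.2f}")
```

Comparing shares rather than raw counts keeps the comparison meaningful even when the total number of citations to the target grows over time.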
WHY NLP?
While studying influence is helpful for any field of study, we focus on Natural Language Processing (NLP) research for one critical reason.
NLP is at an inflection point. Recent developments in large language models have captured the imagination of the scientific world, industry, and the general public.
Thus, NLP is poised to exert substantial influence despite significant risks. Further, language is social, and its applications have complex social implications. Therefore, responsible research and development require engagement with a wide swathe of literature (arguably, more so for NLP than other fields).
By tracing hundreds of thousands of citations, we systematically and quantitatively examine broad trends in the influence of various fields of study on NLP and NLP’s influence on them.
We use Semantic Scholar’s field of study attribute to categorize papers into 23 fields, such as math, medicine, or computer science. A paper can belong to one or many fields. For instance, a paper that targets a medical application using computer algorithms might be in medicine and computer science. NLP itself is an interdisciplinary subfield of computer science, machine learning, and linguistics. We categorize a paper as NLP when it is in the ACL Anthology, which is arguably the largest repository of NLP literature (albeit not a complete set of all NLP papers).
- 209m papers and 2.5b citations from various fields (Semantic Scholar): for every citation, the field of study of the citing and cited paper.
- Semantic Scholar’s field of study attribute to categorize papers into 23 fields, such as math, medicine, or computer science.
- 77K NLP papers from 1965 to 2022 (ACL Anthology)
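
The categorization described above can be sketched as follows: each paper carries a set of fields, a paper found in the ACL Anthology is labeled NLP, and each citation adds one count to every (citing field, cited field) pair. The record layout, the paper identifiers, and the two counting choices marked in the comments are assumptions made for illustration. The source does not say how multi-field papers are weighted, and this is not the Semantic Scholar API.

```python
from collections import Counter
from itertools import product

# Hypothetical paper records: paper id -> set of fields of study.
paper_fields = {
    "p1": {"Medicine", "Computer Science"},
    "p2": {"Linguistics"},
    "p3": {"Computer Science"},
    "p4": {"Mathematics"},
}

# Hypothetical set of paper ids found in the ACL Anthology.
acl_anthology_ids = {"p3"}

def fields_of(paper_id):
    """Fields used for counting; ACL Anthology papers are labeled NLP.

    Assumption: an Anthology paper is treated as NLP only, rather than
    keeping its other field labels.
    """
    if paper_id in acl_anthology_ids:
        return {"NLP"}
    return paper_fields.get(paper_id, set())

def field_pair_counts(citations):
    """Count citations per (citing field, cited field) pair.

    Assumption: a paper with several fields counts once for each field.
    """
    counts = Counter()
    for citing_id, cited_id in citations:
        for pair in product(fields_of(citing_id), fields_of(cited_id)):
            counts[pair] += 1
    return counts

# Hypothetical citations as (citing paper, cited paper) pairs.
citations = [("p3", "p2"), ("p3", "p4"), ("p1", "p3")]
counts = field_pair_counts(citations)
```

Here the single citation from the two-field paper `p1` to the NLP paper `p3` is counted once for Medicine and once for Computer Science; a fractional weighting would be an equally plausible choice.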


