This essay, a sort of encyclopedish entry on the evolution of the idea of information, appeared in Matthew Fuller, ed., Software Studies: A Lexicon (Cambridge MA: MIT Press, 2008). The book grew out of the brilliant Software Studies Workshop held at Rotterdam’s Piet Zwart Institute on 25–26 Feb 2006. That event crystallized a lot of scattered thinking that had been in the air for several years, to say the least. But, more than that, it had an unusual impact: it sparked the creation of an academic study program at UC San Diego, an excellent academic journal (Computational Culture: A Journal of Software Studies), and — thanks to Doug Sery, the editor who spearheaded it — the Lexicon itself inaugurated a new MIT Press series dedicated to software studies.
Information
“Information” can describe everything from a precise mathematical property of communication systems, to discrete statements of fact or opinion, to a staple of marketing rhetoric, to a world-historical phenomenon on the order of agriculture or industrialization. The frequency and disparity of its use by specialists and lay people alike, to describe countless general and specific aspects of life, make it difficult to analyze; no single academic discipline or method can offer an adequate explanation of the term or the concept, to say nothing of the phenomena it encompasses.
A typical approach to a problem of this kind is to address it on the level of the word as such: to gather examples of its use, codify their meanings, and arrange them into a taxonomy, whether “synchronic” (limited to a specific period — say, current usage) or “diachronic” (as they have transformed over time). This has been done, of course, with varying degrees of success. One prominent American-English dictionary defines the word in slightly less than two hundred words. These efforts are admirable, but popular claims that we live in an “information society” (or, even more grandly, in an “information age”) suggest, in their inclusiveness, that information is more than the sum of the word’s multiple meanings. Apparently, it — the word or, more properly, the category — is sui generis, and in a particularly compelling way. What qualities make it so?
The word itself dates in English to the late fourteenth century, and even so many centuries ago it was used in ways that mirror current ambiguities. The Oxford English Dictionary cites early attestations (in, among other sources, Chaucer’s Canterbury Tales) as evidence for defining it variously as “The action of informing” and the “communication of instructive knowledge” (I.1.a); “Communication of the knowledge or ‘news’ of some fact or occurrence” (I.2); and “An item of training; an instruction” (I.1.b) — generally, an action in the first cases and a thing in the last. Even the still-unresolved ambiguity of whether it is singular or plural seems to date to the early sixteenth century (“an item of information or intelligence,” curiously “with an and pl[ural]” [I.3.b]).
As the word came into wider use in the centuries leading up to the twentieth, it took on a variety of additional meanings. Of these, the most striking trend was its increasingly legalistic aspect. This included informal usages (for example, related to or derived from “informing” on someone [I.4]) as well as narrow technical descriptions of charges lodged “in order to the [sic] institution of criminal proceedings without formal indictment” (I.5.a). This inconsistency — in one instance referring to particular allegations of a more or less precise factual nature and, in another, to a formal description of a class or type of assertion — is still central to current usage of the word; so are connotations that information relates to operations of the state.
Yet it was in the twentieth century that the word was given decisively different meanings. The first of these modern attestations appears in the work of the British statistician and geneticist R. A. Fisher. In his 1925 article, “Theory of Statistical Estimation,” published in Proceedings of the Cambridge Philosophical Society,1 he described “the amount of information in a single observation” in the context of statistical analysis. In doing so, he appears to have introduced two crucial aspects to “information”: that it is abstract yet measurable, and that it is an aspect or byproduct of an event or process.
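For reference, Fisher’s measure has a standard modern statement, given here in textbook notation rather than Fisher’s own (a sketch, not a quotation from his 1925 article): for an observation X drawn from a distribution whose density f(x; θ) depends on a parameter θ,

```latex
% Fisher information (modern textbook form): the expected squared
% sensitivity of the log-likelihood to the parameter \theta.
I(\theta) = \mathbb{E}\left[ \left( \frac{\partial}{\partial\theta}
            \log f(X;\theta) \right)^{2} \right]

% For n independent observations the measure simply adds,
% which is what makes "the amount of information in a single
% observation" a well-defined quantity:
I_{n}(\theta) = n \, I(\theta)
```

Both qualities the essay names are visible here: the quantity is abstract yet measurable, and it attaches to an event (the observation) rather than to any meaning.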
“Fisher information” has had ramifications across the physical sciences, but its most famous and most influential elaboration was in the applied context of electronic communications. These (and related) definitions differ from Fisher’s work, but they remain much closer to his conception than to any earlier meanings.2 Three years after Fisher’s paper appeared, the American-born electronics researcher Ralph V. L. Hartley, who had studied at Oxford University at almost exactly the same time that Fisher studied at Cambridge (1909–1913) before returning to the United States, published a seminal article in the Bell System Technical Journal.3 In it, he built upon the work of the Swedish-American engineer Harry Nyquist (who was working mainly at AT&T and Bell Laboratories), specifically on Nyquist’s 1924 paper “Certain Factors Affecting Telegraph Speed,”4 which sought in part to quantify what Nyquist called “intelligence” in the context of a communication system’s limiting factors. Hartley’s 1928 article, “Transmission of Information,” fused aspects of Fisher’s conception of information with Nyquist’s technical context (albeit without citing either of them, or any other source). In it, he specifically proposed to “set up a quantitative measure whereby the capacities of various systems to transmit information may be compared.” He also added another crucial aspect by explicitly distinguishing between “physical as contrasted with psychological considerations” — meaning by the latter, more or less, “meaning.” According to Hartley, information is something that can be transmitted but has no specific meaning.
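In modern notation (Hartley’s own symbols and choice of logarithm base differed), the measure he proposed is usually rendered as follows: a message consisting of n successive selections from an alphabet of s equally available symbols carries an amount of information

```latex
% Hartley's measure, modern rendering: information grows with the
% number of selections n and with the logarithm of the alphabet
% size s, independently of what the symbols mean.
H = n \log s = \log s^{n}
```

With a base-2 logarithm, for instance, a single selection from 16 equally available symbols yields log2(16) = 4 bits. The quantity depends only on the range of possible selections, never on their sense, which is precisely the “physical as contrasted with psychological” line Hartley drew.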
It was on this basis that, decades later, Claude Shannon, the American mathematician and geneticist turned electrical engineer, made the best known of all modern contributions to the development of the idea of information.5 At no point in his works did he ever actually define “information”; instead, he offered a model of how to quantitatively measure the reduction of uncertainty in receiving a communication, and he referred to that measure as “information.” Shannon’s two-part 1948 article, “A Mathematical Theory of Communication,”6 and its subsequent reprinting with a popularizing explanation in his and Warren Weaver’s book, The Mathematical Theory of Communication,7 are widely heralded as the founding moment of what has since come to be known as “information theory,” a subdiscipline of applied mathematics dealing with the theory and practice of quantifying data.
Shannon’s construction, like those of Nyquist and Hartley, took as its context the problem presented by electronic communications, which by definition are “noisy,” meaning that a transmission does not consist purely of intentional signals. Thus, they pose the problem of how to distinguish the intended signal from the inevitable artifacts of the systems that convey it, or, in Shannon’s words, how to “reproduc[e] at one point either exactly or approximately a message selected at another point.” Shannon was especially clear that he didn’t mean meaning:
Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.8
In The Mathematical Theory of Communication, he and Weaver explained that “information is a measure of one’s freedom of choice when one selects a message” from a universe of possible solutions.9 In everyday usage, “freedom” and “choice” are usually seen as desirable — the more, the better. However, in trying to decipher a message they have a different consequence: The more freedom of choice one has, the more ways one can render the message, and the less sure one can be that a particular reproduction is accurate. Put simply, the more freedom one has, the less one knows.
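That trade-off is exactly calculable: Shannon’s measure is the entropy of the distribution of possible messages, H = -Σ p_i log2 p_i. A minimal sketch of the calculation in Python (the function name and the example distributions are illustrative, not drawn from the essay or from Shannon’s text):

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# One message chosen from two equally likely alternatives: 1 bit.
print(entropy([0.5, 0.5]))    # 1.0

# Eight equally likely alternatives: more "freedom of choice,"
# hence more uncertainty for the receiver.
print(entropy([1 / 8] * 8))   # 3.0

# A heavily skewed choice: the receiver is nearly certain in advance,
# so a message resolves very little uncertainty.
print(entropy([0.99, 0.01]))  # ~0.081
```

The equiprobable cases maximize the measure: the freer the choice among messages, the higher the entropy, and the less a receiver can presume in advance about which message was selected.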
Small wonder that the author of such a theory would view efforts to apply his ideas in other fields as “suspect.”10 Of course, if Shannon sought to limit the application of his “information” to specific technical contexts — for example, by warning in his popularizing 1949 book that “the word information, in this theory, is used in a special sense that must not be confused with its ordinary usage” — he failed miserably. The applications of his work in computational and communication systems, ranging from read-write operations in storage devices to the principles guiding the design of sprawling networks, have had pervasive effects since their publication.11 Those effects offer quite enough reason for “nonspecialists” to take a strong interest in information, however it is defined; their interests, and the “popular” descriptions that result, surely carry at least as much weight as Shannon’s mathematical prescription.
However disparate these prescriptions and descriptions may be, both typically have one general and essential thing in common: mediation. Where Shannon’s information is an abstract measure, analogous to the negative space around a sculpture in a crate, the common experience of what is often called information is indirect, distinguished from some notional immediate or immanent experience by mediation — say, through a commodity (hardware, software, distribution, or subscription) and/or an organization (a manufacturer, a developer, or a “resource”). So, to the growing list of paradoxes that have marked information for centuries — whether it is an action or a thing, singular or plural, an informal assertion of fact or a procedure for making a formal statement, its ambivalent relationship to operations of state, and so on — we can add some modern ones: It is abstract yet measurable, it is significant without necessarily being meaningful, and, last but not least, it is everywhere and nowhere.
It’s tempting, of course, to ask how a single category that has come to encompass such a babel of ideas could be very useful; the underlying assumption of such a question is that a word’s worth is measured by the consistency or specificity of its meanings. That assumption is false: very common words — “stuff,” say, or “power” — are useful because they are indiscriminate or polysemic. But those are very different qualities;12 for now — which may be very early in terms of historical periodization — information is (or does) both.
On the one hand, it seems to proffer an indiscriminate lumping-together of everything into a single category in common phrases such as “information society,” “information age,” and “information economy.” And those phrases, in turn, are fairly specific compared to the wild (and wildly contradictory) implications attributed to information in commercial communications (for example, advertising and marketing). In those contexts, at one extreme, information appears as a cudgel — a driving, ubiquitous, relentless, inevitable, almost malevolent historical force that overturns assumptions, disrupts and threatens institutions, and forces adaptation. At the other extreme it appears as a carrot — an enticing, endless, immaterial garden of delights in which instantaneous access to timeless knowledge promises the opportunity of transformation for individuals and for the globe as a whole. On the other (equally woolly) hand, information is widely thought to mark a historical divide, for example, in the urban-legend-like claim that people today are exposed to more information in some small unit of time than their indeterminate ancestors were in their lifetime.13 What remains unclear in these popular claims is whether information itself is new in the sense of a recent historical invention (akin to nuclear fission, for example) or, rather, whether its pervasiveness is new.
Even if we limit ourselves to more sober usages, we are still left with a category that variously includes assertions of specific fact or belief; some type of assertion made in a specific (for example, technical) context; a statement or instruction to be acted upon or executed; a kind of knowledge or communication, maybe vaguely related to “intelligence”; a specific communication, which, additionally, may or may not mean something; an aspect of communication that specifically means nothing; an aspect of specific or general communications that can be measured; and, more loosely, archives and catalogs, facts and factoids, static and streaming data, opinions and ideas, accounts and explanations, answers to questions; and/or virtually any combination thereof.
As noted, the theory of information has played a pivotal role in systems automation and integration, a dominant — maybe the dominant — development in postindustrial social and technical innovation. Given the dizzying complexity, breadth, and interdependence of these developments, a single category that provides (if only illusorily) a common reference point for myriad social actors, from individuals right up to nations, might be useful precisely because it is tautological. The reduction to a single term, which itself might mean anything or literally nothing, offers a sort of lexical symbiosis in which technical and popular usages inform each other: Technical usages derive implications of broad social relevance from popular usages, and popular usages derive implications of rigor and effectiveness from technical usages. Yet what’s hardest to hear through this cacophony is what might be most useful of all: Gregory Bateson’s enigmatic and epigrammatic definition of information as “the difference that makes a difference.”14
Notes
1. R. A. Fisher, “Theory of Statistical Estimation,” Proceedings of the Cambridge Philosophical Society 22 (1925), 709.
2. For example, Norbert Wiener, widely credited as the father of cybernetics — that is, the study of feedback systems in living organisms, machines, and organizations — noted in his 1948 book Cybernetics that “the definition [of information]… is not the one given by R. A. Fisher for statistical problems, although it is a statistical definition” (III.76).
3. R. V. L. Hartley, “Transmission of Information,” Bell System Technical Journal 7 (July 1928), 540.
4. Harry Nyquist, “Certain Factors Affecting Telegraph Speed,” Bell System Technical Journal 3 (April 1924), 324–346.
5. Shannon’s PhD dissertation, “An Algebra for Theoretical Genetics” — an application of his “queer algebra,” in the words of Vannevar Bush — was written at MIT in 1940 under the direction of Barbara Burks, an employee of the Eugenics Record Office at Cold Spring Harbor Laboratory; Shannon was recruited by Bell Labs to research “fire-control systems” — automated weapon targeting and activation — “data smoothing,” and cryptography during World War II. See Eugene Chiu et al., “The Mathematical Theory of Claude Shannon: A Study of the Style and Context of His Work up to the Genesis of Information Theory.”
6. Claude Shannon, “A Mathematical Theory of Communication.”
7. Claude Shannon and Warren Weaver, The Mathematical Theory of Communication.
8. Ibid., 379.
9. Ibid., 99.
10. David Ritchie, “Shannon and Weaver: Unraveling the Paradox of Information,” Communication Research 13, no. 2.
11. As this account suggests (and as one should expect), Shannon’s work was just one result of many interwoven conceptual and practical threads involving countless researchers and practitioners working across many fields and disciplines. In the two decades that separated Hartley’s 1928 article and Shannon’s publications, myriad advances had already had immense practical impact — for example, on the conduct and outcome of World War II, in fields as diverse as telegraphy, radiotelegraphy, electromechanical systems automation and synchronization, and cryptography. More generally, an important aspect and a notable result of that war were the unparalleled advances in systems integration across government, industry, and academia, from basic research through procurement, logistics, and application. Shannon’s work, as Voltaire might have put it, “had to be invented.”
12. “There is always a moment when, the science of certain facts not yet being reduced to concepts, the facts not even being grouped together organically, these masses of facts receive that signpost of ignorance: ‘miscellaneous.’” Marcel Mauss, “Techniques of the Body,” in Zone 6: Incorporations, 454.
13. For example: “[T]oday’s children…have access to more information in a day than their grandparents did in a lifetime” (House of Representatives, Excellence in Teaching: Hearing Before the Committee on Education and the Workforce, 106th Cong., 2nd Session, June 1, 2000 [Indianapolis, IN, serial no. 106-110], available here); “[a] person today is exposed to more information in one day than a person living in the 1700s was exposed to in an entire lifetime” (“James” of MIT’s Center for Reflective Community Practice, whose “experience” was “captured” by Invent Media [n.d.], available [here](http://www.inventmedia.com/clients/mitfellows/james/soundfellows.html)); “[t]oday’s students are exposed to more information in a day than a person living in the Middle Ages was exposed to in a lifetime” (“Goal 1 Report,” Technology Planning Committee, Howard County [MD] Public School System [2001], available here); “[w]ith the use of satellites, television and computers, you and I receive more information in one day of our lives than our ancestors of several generations ago used to receive in 1000 days!” (Barbara Deangelis, Real Moments, quoted — as “credulously regurgitating factoids” — in Kirkus Reviews [1 August 1994], available [here](http://www.magusbooks.com/catalog/searchxhtml/detail_0440507294/choice_A/category_/isbook_0/catlabel_All+Magusbooks+Categories/search_Deangelis,+Barbara/index.htm)); and “a student is exposed to more information in one day than a person living in the Middle Ages was exposed to in a lifetime” (New Jersey State Department of Education, Division of Educational Programs and Student Services, Plan for Educational Technology Task Force, “Educational Technology in New Jersey: A Plan for Action” [Dec. 1992], available [here](http://ftp.msstate.edu/archives/nctp/new_jersey.txt)). This “meme” seems to have gained currency among American educational technologists in the late 1980s through the mid-1990s.
14. Gregory Bateson, Steps to an Ecology of Mind, xxv–xxvi.