5  The 1990s

5.1 1994: The Tokyo School

In the early 1990s, the term resurfaced in the context of statistics. It appeared in the title of a 1994 essay by the Japanese statistician Noboru Ohsumi on the application of hypermedia to the problem of organizing data, “New Data and New Tools: A Hypermedia Environment for Navigating Statistical Knowledge in Data Science,” an elaboration of an essay published two years earlier (Ohsumi 1992, 1994).1 In these essays, Ohsumi described the by now familiar litany of problems associated with data impedance, although this time the focus was on the production of data resulting from its analysis and storage, not its consumption in so-called raw form:

In research organizations handling statistical information, the volume of stored information resources, including research results, materials, and software, is increasing to the point that conventional separate databases and information management systems have become insufficient to deal with the amount. Increasing diversification in the media used these days interferes with the rapid retrieval and use of the information needed by users. A new system that realizes a presentation environment based on new concepts is needed to inform potential users of the value and effectiveness of using the vast amount of diverse data (Ohsumi 1992: 375).

For research facilities around the world, the products of classical data science—the database and data processing software—had become a sorcerer’s apprentice, creating new problems with each solution. Organizations were drowning in the data sets they produced or acquired, the software used to process them, the print and digital libraries of reports and articles resulting from their analyses, and a host of other materials. The requirements, approach, and design goals of Ohsumi’s proposed system, the Meta-Stat Navigator, are strikingly similar to those of a contemporary system designed to solve the information problems of another scientific organization: Berners-Lee’s World Wide Web, famously developed at CERN in 1989 (Berners-Lee and Fischetti 2008). Of course, the latter quickly obviated Ohsumi’s proposal and became synonymous with the Internet, invented decades earlier.

The significance of Meta-Stat for our purposes is that this kind of work was clearly understood as data science at this point in history. Data science continued to be connected with the processing and representation of data, and remained distinct from data analysis, but with this important development: statisticians had become embedded in these technologies, and their work had changed significantly as a result. This change in working conditions drew data analysis and data science closer together.

Here we may locate with some precision a crucial transformation in the meaning of the term, associated with its adoption by a new set of users. One clue to this change is the opportunity Ohsumi observed amid the challenges posed by the data deluge:

… the information handled by the statistical sciences lies on the boundaries of various other sciences and clarifies the relationships and nature of information that joins these sciences. Development of a system that fully organizes and integrates strategic information is essential (Ohsumi 1992: 375).

The Meta-Stat “system” was designed to realize the opportunity opened up by the central position statisticians had come to occupy among the prolifically data-generating sciences and the computational environment in which these data were made available. Data science, in this view, is meta-statistics: an encompassing concern for understanding data, conceived as a universal medium, and its relationship to knowledge.

This perspective was shared by Ohsumi’s senior compatriot and fellow statistician, Chikio Hayashi, whom Ohsumi described as “the pioneer and founder of data science” (Ohsumi 2004: 1).2

In 1993, at a round table discussion during the fourth conference of the International Federation of Classification Societies, held in Paris (IFCS-93), Hayashi uttered the phrase “Data Science” and was then asked to explain it. At the next conference (IFCS-96), he presented an answer; the conference itself was named “Data Science, Classification, and Related Methods” to emphasize the importance of the term. His definition is as follows:

Data science is not only a synthetic concept to unify [mathematical] statistics, data analysis and their related methods but also comprises their results. It includes three phases, design for data, collection of data, and analysis on data. Data science intends to analyze and understand actual phenomena with “data.” In other words, the aim of data science is to reveal the features of the hidden structure of complicated natural, human and social phenomena with data from a different point of view from the established or traditional theory and method. This point of view implies multidimensional, dynamic and flexible ways of thinking (Hayashi 1998a: 41).

Hayashi went on to describe the sequence of design, collection, and analysis as a primary and iterative “structure finding” process in which data are transformed from a state of “diversification,” given the inherent “multifariousness” of the phenomena they represent, to one of “conceptualization or simplification” (41). The discovery of structure is accomplished with what we would recognize today as the methods of exploratory data analysis and unsupervised learning. In effect, Hayashi’s definition abstracted the design goals of Ohsumi’s Meta-Stat system and presented them as “a new paradigm” of science, one that would encompass statistics, data analysis, and their vast output of data within a unified, process-oriented framework—data science (40).

In addition to Hayashi’s own definition, it is also helpful to see how the field was defined by the editors (who included Hayashi) of the proceedings of IFCS-96:

The volume covers a wide range of topics and perspectives in the growing field of data science, including theoretical and methodological advances in domains relating to data gathering, classification and clustering, exploratory and multivariate data analysis, and knowledge discovery and seeking.

It gives a broad view of the state of the art and is intended for those in the scientific community who either develop new data analysis methods or gather data and use search tools for analyzing and interpreting large and complex data sets. Presenting a wide field of applications, this book is of interest not only to data analysts, mathematicians, and statisticians but also to scientists from many areas and disciplines concerned with complex data: medicine, biology, space science, geoscience, environmental science, information science, image and pattern analysis, economics, statistics, social sciences, psychology, cognitive science, behavioral science, marketing and survey research, data mining, and knowledge organization [Hayashi (1998b): v; emphasis added].

Of interest here is the use of “data science” as a big tent, an inclusive rubric under which to group a series of domains (which map roughly onto a process) as well as a broad range of disciplines and levels, from tool builders to scientists and from practice to theory. This passage is also significant for including the methods of machine learning within the scope of data science and for listing data mining among the sciences concerned with “complex data,” suggesting the prominence of these approaches at that time. We will see that not all definitions proceeding from this community were as inclusive.

Hayashi assigned a revolutionary and almost messianic role to data science. In his vision, the statistical sciences had lost their way. Mathematical statisticians had come to overvalue abstract inference and precision, and by choosing to work with the artificial data required to pursue these goals were “prone to be removed from reality” (40). Data analysts, although working with real data, had “come to manipulate or handle only existing data without taking into consideration both the quality of data and the meaning of data … to make efforts only for the refinement of convenient and serviceable computer software and to imitate popular ideas of mathematical statistics without considering the essential meaning” (40). As a result of these divergent attitudes toward data, and the disregard of both for the scientist’s engagement with the primary, existential relationship between data and phenomena, the field had become stagnant and lacking in innovation. Data science emerged as a savior, unifying a divided people, showing their way out of the wilderness, and restoring prosperity and prestige to their community.

If Hayashi’s criticisms of data analysis sound like those leveled today against data scientists, it is because the issues data science was meant to resolve are recurring and systemic. So too is the separation between data analysis and mathematical statistics, which was recognized by Box, and later Tukey, in the 1970s. In his response to Parzen—who, we noted, sought to overcome a methodological split between the two subfields—Tukey wrote:

I concur with the general sentiments expressed by George Box in his Presidential Address … that we have great need for the whole statistician in one body—for the analyst of data as well as for the probability model maker—and the inferential theorist/practitioner. One cannot, however, make a whole man by claiming that one can subsume one important class of mental activity under another class whose style and purposes are not only different but incompatible. To be “whole statisticians” as Box might put it, or to be “whole statistician-data analysts” as I might, means to be single persons who can take quite different views and adopt quite different styles as the needs change. As the title of my paper of yesterday put it, “we need both exploratory and confirmatory”! The twain can—and should—meet, but they need to remain a pair (or two distinct parts of a larger team) if they are to do what they should and can (Tukey 1979: 122).

Tukey implied a solution to the schism later observed by Hayashi: better organization, not a utopian “new man” or a synthetic science per se. This recalls the division of labor proposed by Naur, though here the focus is on different roles within that division. Implicit in this approach is the view that the problem with statistics was not epistemic but organizational.

Here it is helpful to recall a property of Kuhn’s concept of paradigm—an obvious lens through which to observe our topic—which is often overlooked by those who use the term: it refers not to an abstract body of ideas that succeed on the basis of their intrinsic rationality or truth value, but to the successful practical application of ideas, by means of novel methods and tools, in a way that may be imitated. The concept has both epistemic and social dimensions. Viewed in this light, the question of whether data science is in fact a science—our main question—becomes a matter of determining whether it solves important problems in new ways, by means of an assemblage of ideas, methods, and tools that may be grasped and imitated by others. Hence, although Tukey and Hayashi may appear divergent in their approaches to overcoming these problems, they represent the two aspects of a scientific paradigm, the one conceptual, the other practical. This should not be viewed as contradictory.

Following the IFCS meetings, as well as two meetings of the Japan Statistical Society that held “special sessions on data science,” Ohsumi developed Hayashi’s definition as well as its rationale (Ohsumi 2000: 331). We call this the Tokyo school of data science, given the association of both Hayashi and Ohsumi with the Institute of Statistical Mathematics in Tokyo, Japan. In a paper that explicitly addressed the relationship between data analysis and data science, and which is perhaps the first of several to claim the flag of data science for statistics, Ohsumi declared that because of its privileging of “mathematical methodologies” over an engagement with data acquisition, data analysis had become “a canary that has forgotten to sing,” referring to a Japanese children’s song that contemplates a silent bird’s fate (332).3 Amplifying Hayashi, he asserted that “[h]ow data are gathered is the key to defining the relevant information and making it easy to understand and analyze” (331). In making this point, Ohsumi referred to a new figure on the scene, one that contradicted the principles he proposed:

In my opinion, this viewpoint on the meaning of data science is fundamentally different from data mining (DM) and knowledge discovery (KD). These concepts are not of practical use because they neglect the problems of “data acquisition” and its practice (332).

It is significant that Ohsumi excludes these new fields—or field, since the two so frequently co-occur, along with the variant KDD, “knowledge discovery in databases”—from his definition of data science, since many today would consider data mining and data science synonymous.

5.2 Data Mining and Knowledge Discovery

The paradox is instructive: the name “data mining,” as used here, made its appearance in the late 1980s and early 1990s as a rubric that included a set of practices motivated by precisely the same conditions that led the Tokyo school to propose the field of data science in the first place. Among these conditions was the relatively sudden appearance of vast amounts of data stored in databases—one of the fruits of classical data science—owing to the rise of relational databases and personal computing in the 1980s, and a suite of tools to work with data, from spreadsheets to programming languages to statistical software packages. Whereas many statisticians viewed these developments with alarm, being acutely aware of the epistemic disruptions they produced for the received workflow of data analysis, the data mining community embraced them as an opportunity to convert data into value. Coming mainly from the field of computer science, data miners developed a set of methods that included the application of machine learning algorithms to the data found in databases in various contexts, from science to industry (such as point-of-sale records generated as a by-product of computerized cash registers and credit card use). The relationship between machine learning and data mining was also mutually beneficial—data miners supplied machine learning projects with the large sets of data required for this class of algorithms to perform well. This relationship was greatly reinforced with the rise and development of the Web and social media platforms, which generated enormous amounts of behavioral data.

Although the two fields—for simplicity, let’s call them data analysis and data mining—were responding to the same conditions of data surplus and impedance, their philosophical orientations could not have been more opposed. This difference is clearest in their respective evaluations of data provenance, the source and conditions under which data are produced. For the data miner, data provenance is largely irrelevant to the possibility of converting data into value. Data are data, regardless of how they are generated, and the same methods may be applied to them regardless of source, so long as their structure is understood. Indeed, for the data miner, data exist much as natural resources do, as a given part of the environment, which helps explain the success of the metaphor of mining over competing variants, such as harvesting, which implies intentional creation. For the data analyst, as Hayashi and Ohsumi took such pains to emphasize, provenance is, or should be, everything, echoing the statistician’s orthodox preference for experimental over observational data.

This difference was expressed by Ohsumi in his definition of data science in opposition to data mining:

Owing to the qualitative and quantitative changes in data [produced by the conditions described above], it is, indeed, becoming increasingly difficulty to grasp all aspects of a dataset in explaining various phenomena. Therefore, new techniques, such as DM, KD, complexity, and neural networks, are being proposed. However, the potential of these methods to solve any of these problems is questionable (332).

Ohsumi went on to characterize the way data had changed by listing the new kinds of data with which the statistician is confronted. These prominently include data sets found in databases as by-products of various processes, such as passively accumulated data (e.g. from point-of-sale devices), unstructured data (included in text fields), and aggregated data generated “spontaneously and accumulating automatically in the electronic data collection environment” (332-333). He explained his concern with data mining:

When it comes to analyzing these datasets, people discuss DM and related techniques. However, the important questions to answer are: what dataset is necessary to explicate a certain phenomenon, why is it necessary, how to design its acquisition, and how difficult the whole process is. This is more important than the dataset itself. Books on DM do contain terms such as “data preparation”, “getting the data”, “sampling procedures”, and “data auditing”, but there is an assumption that the dataset is given and the procedure may start with analysis. Fiddling with a dataset once it is collected is merely a self-contained play of data handling [Ohsumi (2000): 333; emphasis added].

Although his evaluation of data mining seems to be woefully off base—a great deal of Google’s success, to take one example, was founded on their embrace of data mining at the time of Ohsumi’s essay—in fact his concern is not with the success of predictive analytics per se, but with solving what he considered to be the central problem of data science, that of understanding how data are generated in the first place. Given some of the issues that classifiers have encountered with respect to racial bias, for example, he cannot be said to have been wrong.

5.3 The IFCS

5.4 1997: The US

In the late 1990s and early 2000s, a series of papers and presentations by academic statisticians in the US echoed the theme of malaise expressed by the Tokyo school: the field of statistics was suffering from an image problem and needed to redefine itself in order to meet the challenges of a variety of existential threats. These threats included the rise of computational methods and large amounts of data, the emergence of non-traditional predictive methods and areas of research, such as data mining, that were taking the limelight from statistics, and an unflattering public image. Interestingly, given the lack of reference to the Tokyo school in these sources, many of the proposed responses included expanding the scope of statistics to encompass these new methods and rebranding the field as “data science.” Frequently associated with this suggestion was a concern for updating university curricula for teaching statistics and, apparently for the first time, the creation of a new kind of statistician, the “data scientist”—recalling the mythical figure of the “whole statistician” discussed by Tukey in the 1970s. In these exhortations to the community of statisticians, the data scientist emerged as the “new man” of a reborn statistical science, one who would overcome the field’s crisis of recognition.

The first of these exhortations appears to have come from Jon R. Kettenring in his Presidential Address to the American Statistical Association on 12 August 1997, where he expressed concern for the image of statistics:

Looking ahead, image reconstruction must be one of our top priorities. It must be understood that statistics is the data science of the 21st century—essential for the proper running of government, central to decision making in industry, and a core component of modern curricula at all levels of education. I would like to see ASA make image reconstruction a part of its strategic plan. And I suspect we may need some professional help if we are to succeed (Kettenring 1997: 1230, emphasis added).

In this talk, Kettenring clearly expressed the ambivalence of the statistics community toward the rise of computational methods that we saw in the Tokyo school. On the one hand, computer science was welcomed and its contributions were viewed as opportunities:

I believe the time has come for us to acknowledge that computer science needs to take its place alongside mathematics (and probability) as fundamental linch pins of statistics and as disciplines that undergird our research and instruction. Examples of relevant topics in computer (and computing) science include databases and database management, algorithm design, computational statistics, artificial intelligence and machine learning (Kettenring 1997: 1232).

On the other hand, data mining—the most notable development coming from the computational field—was excluded from this vision. Not only was it absent from the list of relevant topics above, it was singled out for criticism, echoing the stance of the Tokyo school. In highlighting software development and data mining as “two opportunities that sit at the intersection of computer science and statistics,” Kettenring wrote:

The second area is the current hot topic of data mining, which statisticians might reasonably think of as data analysis of very large databases. In fact, if you dig down and look at what is involved in data mining, you will find a variety of statistical components, such as statistical graphics and cluster analysis. Moreover, there is a great opportunity to bring a variety of statistical concepts to bear—modeling, sampling, robust estimation, outlier detection, dimensionality reduction, etc. Nevertheless, there are new opportunities as well, and we would be wise to pay very close attention and to become seriously involved with these developments.

In fact, if we don’t, there is risk that we will be blown away by the momentum that is flowing in this direction. As Jerry Friedman pointed out in his keynote address at Interface ’97, we are no longer the only game in town. Many other data oriented sciences are competing with us for customers and students.

Also it should not go unnoticed—speaking of image reconstruction!—that the very term, data mining, has captured the fancy of many people, especially in the business community. It is grabbing headlines that statisticians would kill for. The image of data mining is that something powerful is going on there. The reality? Well that may be rather different. It may even turn out that the phrase of the day in the 21st century is

lies, damned lies, and data mining.

But for now I believe we should take advantage of the momentum before it fades into another missed opportunity for statistics (Kettenring 1997: 1232).

As an aside, it is worth noting that Friedman, in the keynote referenced above, expressed the same ambivalence toward data mining. It is at once lauded as “having a major impact in business, industry, and science,” affording “enormous research opportunities for new methodological developments,” and derided as a “device to sell computer hardware and software” (Friedman 1997).

Kettenring’s message was echoed three months later by the academic statistician C. F. Jeff Wu in a lecture delivered at the University of Michigan entitled “Statistics = Data Science?” (D. Wu 1997). Note that although Wu was not the first to suggest that statistics be renamed to, and redefined as, data science, he is often regarded as the first in the US to do so (Donoho 2017). Regarding image, Wu pointed out that statisticians were perceived as either accountants or involved with simple descriptive statistics—and prone to lying with these statistics, as the saying goes—when in fact their work comprised everything from data collection to modeling and analysis to solving problems and making decisions. As a remedy, he implored his colleagues to “think big” and embrace the changes and challenges that we have seen already—the rise of large and complex data sets, the use of neural networks and data mining methods, and the emergence of new fields such as computer vision. In addition to suggesting a name change for the field, he appears to have been the first to suggest renaming the role of statistician to data scientist, along with a college curriculum that would embrace all phases of data science and be profoundly interdisciplinary.

Wu’s proposal is stronger than Kettenring’s. Kettenring, whose language suggests that he was aware of the prior existence of data science as a field, believed that statistics should embrace and encompass data science as the rightful heir to its fruits and labors. In contrast, Wu asserted that statistics should expand its scope and rename itself “data science,” because it “is likely the remaining good name reserved for us” (C. F. J. Wu 1997: slide 12). In both cases, however, the usage of the term data science included much of computer science, which is consistent with the classical definition that developed in the 1960s and ’70s.

Practical plans for revamped curricula to train this new kind of statistician—the data scientist—appear after these appeals. The most well-known is found in Cleveland’s essay, “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics” (Cleveland 2001), even though he referred to his target student as the “data analyst.” Consistent with previous definitions of data science as statistics augmented by computational methods and tools, he proposed a curriculum comprising six areas, along with percentages denoting the amount of time and resources that should be devoted to each: Multidisciplinary Investigations (25%), Models and Methods for Data (20%), Computing with Data (15%), Pedagogy (15%), Tool Evaluation (5%), and Theory (20%). This distribution is more or less consistent with the definitions we have seen proposed by other statisticians, including the Tokyo school. As a measure of how radical this suggestion was, consider Donoho’s remarks, made over a decade and a half later:

Several academic statistics departments that I know well could, at the time of Cleveland’s publication, fit 100% of their activity into the 20% Cleveland allowed for theory. Cleveland’s article was republished in 2014. I cannot think of an academic department that devotes today 15% of its effort on pedagogy, or 15% on computing with data. I can think of several academic statistics departments that continue to fit essentially all their activity into the last category, theory (Donoho 2017: 750).

Radical as it may have appeared, the curriculum was fairly conservative. The area of “models and methods,” for example, focused exclusively on statistical data models, as opposed to algorithmic ones (to use Breiman’s terms, discussed below), while the “computing with data” area focused narrowly on the infrastructure to support these models. Cleveland’s bias toward traditional statistical modeling is made clear in his explanation of it:

The data analyst faces two critical tasks that employ statistical models and methods: (1) specification—the building of a model for the data; (2) estimation and distribution—formal, mathematical-probabilistic inferences, conditional on the model, in which quantities of a model are estimated, and uncertainty is characterized by probability distributions (Cleveland 2001: 22).

That the one is subordinate to the other is evident in Cleveland’s remark: “A collection of models and methods for data analysis will be used only if the collection is implemented in a computing environment that makes the models and methods sufficiently efficient to use” (23). He did make a passing reference to algorithmic models, but his example came from their use to support stochastic modeling:

Historically, the field of data science has concerned itself only with one corner of this large domain [i.e. computing with data]—computational algorithms. Here, even though effort has been small compared with that for other areas, the impact has been large. One example is Bayesian methods, where breakthroughs in computational methods took a promising intellectual current and turned it into a highly practical, widely used general approach to statistical inference (23).

If this passage does not make clear that he did not have the wider field of data mining and knowledge discovery in mind, the following one does:

Computer scientists, waking up to the value of the information stored, processed, and transmitted by today’s computing environments, have attempted to fill the void. One current of work is data mining. But the benefit to the data analyst has been limited, because knowledge among computer scientists about how to think of and approach the analysis of data is limited, just as the knowledge of computing environments by statisticians is limited (23; emphasis added).

So Cleveland continued the practice of the Tokyo school of keeping the contributions of data miners at arm’s length, for their lack of training in traditional statistics or their lack of attention to data design, and of confining the role of computational expertise to knowledge of “environments.”

Regarding the argument being made here, that the term “data science” has been in continuous circulation since the 1960s, and not independently coined in various contexts, Cleveland’s remarks, similar to those of Kettenring, indicate that the term was already known to statisticians, even as they sought to appropriate it for their own purposes. The sentence—“Historically, the field of data science has concerned itself only with one corner of this large domain …”—indicates awareness of prior usage, as well as a definition that aligns with what we have called classical data science. In any case, as we have seen, CODATA, representing classical data science, began publishing the Data Science Journal in 2001, the time of Cleveland’s proposal.

The task of educating data scientists was also addressed at this time at a workshop on undergraduate education in statistics sponsored by the American Statistical Association in 2000. In a report of the proceedings published in The American Statistician, the following understanding of data science was expressed:

… what is needed is a broader conception of statistics, a conception that includes data management and computer skills that assist in managing, exploring, and describing data. The terms “data scientist” or “data specialist” were suggested as perhaps more accurate descriptions of what should be desired in an undergraduate statistics degree. Data specialists would be concerned with the “front end” of a data analysis project: designing and managing data collection, designing and managing databases, manipulating and transforming data, performing exploratory and “basic” analysis (Higgins 1999). Data scientists (or specialists) might share some course work with computer science majors, but where a computer scientist studies compilers and assembly language, a data scientist studies data analysis and statistics [Bryce et al. (2001): 12; citation in original].

The reference to Higgins is from a paper where he asserted the need for statisticians to pay more attention to “the non-mathematical part of statistics,” so that undergraduate programs may respond to the “explosion in the amount of data available to society” (Higgins 1999: 1). In this area he included “designing scientific studies in a team-oriented environment, ensuring protocol compliance, ensuring data quality, managing the storage/transmission/retrieval of data, and providing descriptive and graphical analyses of data” (1). Here, again, we see a definition of data science as an improved version of statistics, in response to the persistent condition of data impedance, that would include data management and computing skills (and data design)—but not the methods of data mining. Consistent with the implied structural relationship between statistics and data science, data science was sometimes described as a part of statistics. Tellingly, both Bryce et al. and Higgins equivocated on the use of “data scientist” and suggested “data specialist” as an alternate name, perhaps so as not to overshadow the role of the data analyst. In any case, the choice of term was motivated by marketing; as Higgins wrote:

Guttman expressed the opinion that the term statistics carries such a negative connotation that it might be wise to rename our departments something like “Department of Data Science” or “Department of Information and Data Science.” In this vein, I have suggested the term “data specialist” (Higgins 1999).4


  1. According to Ohsumi, “the term ‘data science’ appeared for the first time” in 1992, at a research exchange meeting between French and Japanese data analysts (so-called) at Montpellier University II in France (Ohsumi 2000: 331). He also claims to have “argued the urgency of the need to grasp the concept ‘data science’” in 1992 (329).↩︎

  2. In his eulogy for Professor Hayashi, Ohsumi explained in some detail the motivation for his mentor’s first usage of data science:

    In the last ten years of his life, Professor Hayashi asserted the importance of “data science as a theory of scientific methods.” The starting point was the meaning of the words “data science;” this arose in 1992, when the professor was discussing the titles and introductions of a collection of papers to be published at the 2nd Japanese-French Scientific Seminar. The professor used a Japanese phrase that translates as “data science” and for the rest of his life explored the concept behind this phrase. When the papers at the seminar were published, it was proposed for the first time that the term referring to quantification be standardized as “quantification method” and the subsequent English terms were standardized as such. I can recall the professor reprimanding us with the words “you are talking about different concepts” when we referred to “deta kagaku” while another group called it “deta no kagaku” (data science).

    The basis of data science is an extremely straightforward concept; the fact that it needed to be expressed is an indication of the chaos that exists today in statistical data analysis and the deteriorating research environment. I am certain that the professor had intended to lead the way in achieving a breakthrough on these problems [Ohsumi (2002): 5; emphasis added].

    ↩︎
  3. The specific reference is to a poem, later set to music, written by the Japanese poet Saijoo Yaso (西条八十), who lived from 1892 to 1970. According to Miriam Davis, “The moral of the song is that if the canary loses its song it is not worth its existence so it should make the most of the gift of song it has been given.” (Davis, n.d.)↩︎

  4. Higgins here referred to a quote from Guttman found in a paper by Kettenring (Kettenring 1997a).↩︎