Thank you very much, Brad [Wheeler].
I am pleased to welcome all of you, on behalf of Indiana University, to this international workshop on Big Data and Extreme-scale Computing. This workshop, funded by the National Science Foundation, begins the second series of BDEC workshops, hence "BDEC2."
It is the first of six workshops that will be held around the globe over the next two years to focus on the issue of how to establish a shared cyberinfrastructure for science in a data-saturated world.
I want to extend our most grateful thanks to the workshop sponsors, the National Science Foundation and Intel. And I extend a special welcome to Manish Parashar, Distinguished Professor of Computer Science at Rutgers University and the director of the National Science Foundation’s Office of Advanced Cyberinfrastructure—as well as Robert Wisniewski, senior principal engineer and chief software architect for extreme-scale computing at Intel.
The list of presenters and attendees for this workshop reads like something of a “Who’s Who” of Big Data and extreme-scale computing.
This morning, you will hear a keynote address from Dan Reed, now senior vice president for academic affairs at the University of Utah and formerly a faculty member and administrator at the universities of Iowa, Illinois, and North Carolina, who is well known for his major contributions to high-performance computing and national science policy.
Later, you will hear from Rick Stevens, associate director at Argonne National Laboratory and the co-founder of the Computation Institute, a joint initiative between The University of Chicago and Argonne National Laboratory—and from his colleague, Professor Ian Foster of the University of Chicago, who currently directs the Computation Institute. I also want to welcome two of their Computation Institute colleagues, IU alumna Kate Keahey, and IU alumnus Pete Beckman, who helped found Indiana University’s Extreme Computing Laboratory and is now at Argonne.
In addition, you will hear this morning from a distinguished panel of international experts, including Sergi Girona and Rosa Badia of the Barcelona Supercomputing Center; one of Japan’s foremost supercomputing authorities, Satoshi Matsuoka, now the director of the RIKEN Center for Computational Science; and Haohuan Fu, deputy director of China's National Supercomputing Center, and a professor at Tsinghua University, an institution, incidentally, with which Indiana University has a number of important partnerships. In fact, I was at Tsinghua a few months ago to help formally inaugurate a new multiyear partnership between our art museums—the first collaboration of its kind between university art museums in the United States and China.
I also want to extend my thanks to all those who have helped to organize this event, including
- Associate Professor Judy Qiu, from IU’s School of Informatics, Computing, and Engineering;
- Distinguished Professor of Informatics, Computing, and Physics, Geoffrey Fox, also from IU’s School of Informatics, Computing, and Engineering;
- Jack Dongarra, Distinguished Professor of Computer Science at the University of Tennessee, where he also directs the Innovative Computing Laboratory;
- as well as Jack’s colleagues at the ICL, associate director Terry Moore and program manager Tracy Rafferty.
All of us at Indiana University are grateful to all of you for joining us and for your attention to the important topics of this two-day workshop.
And if I may take a point of personal privilege, as the politicians say, I would note that I have known Rick since 1985, when I first visited Argonne, where I also met Jack. I got to know Ian, Pete, and Paul Messina there later as well, and, through Paul, Geoffrey. I visited Argonne many times after this, including as a Fulbright Senior Fellow for six weeks in 1988 before I moved to the United States. I was then honored to serve for a number of years on the Committee of Visitors for the Mathematics and Computer Science Division at Argonne when Rick was Director. I am sure people here from other institutions around the world will allow me to say just how much I admire all of the extraordinary accomplishments, in high-performance computing and other areas, for which Argonne has been responsible over many decades under outstanding leadership, including that of a number of people present in this room today.
A transnational public infrastructure for big data
Indiana University Bloomington is honored to host this important workshop here in our Cyberinfrastructure Building, the initial planning for which I oversaw when I was still vice president for information technology. And we are pleased to be doing so on the eve of our Bicentennial Year, which we will be celebrating in 2019-2020.
Of course, the year 2020 has also been designated as something of a milestone in the world of Big Data and Fog Computing: it has been estimated that by 2020 there will be 50 billion interconnected smart devices in the world, roughly five times more than the projected population of the planet in 2020.[1] Furthermore, the volume of digital data is expected to reach 40 zettabytes in 2020.[2] By 2025, the global datasphere is estimated to grow to 163 zettabytes.
The goal of the BDEC2 project, then, is to help address the daunting challenge of creating a transnational public infrastructure that can provide ubiquitous but appropriate access to this shared continuum of scalable computing and data resources.
This is a tremendously important problem both scientifically and for societies all around the world.
For the scientific community, where, among its many other implications, Big Data is leading to a democratization of exploratory research, a major part of the challenge is to determine how to draw sound inferences and conclusions from this abundance of data.
As a society, we face the challenge of how to make the best use of the insights and conclusions we obtain from Big Data. How do we maximize societal value and minimize harm to our society? How do we incorporate Big Data and AI analyses in ways that liberate and enable human achievement, rather than letting them become tools of control and repression?
Given that the liberal arts are a major part of Indiana University's essential identity—as they are for many of your home institutions—IU has many faculty across many disciplines, including in the humanities and social sciences, who are concerned about the ethical and societal problems that can be caused by the way in which Big Data research is harnessed.
At IU, we have numerous faculty members who work in data science, network science, AI, machine learning, and Big Data issues more generally across a wide variety of disciplines, and many of them are here today. Our recent establishment of a program in Intelligent Systems Engineering in our School of Informatics, Computing, and Engineering, is enabling us to considerably expand these activities.
I worked in some of these areas earlier in my academic career and served as Principal Investigator of the Analysis and Visualization of Instrument-Driven Data, or AVIDD, project. The AVIDD system was a cyberinfrastructure resource funded by the NSF in 2001 to create at IU a system for handling data from advanced scientific instruments—data that, at that time, seemed like "Big Data," but which, of course, has grown exponentially in scale in the ensuing years.
In my years as IU’s vice president for information technology and then as vice president for research, IU grew its computing, storage, and networking capabilities, and its expertise, to support our faculty and the scientific community with advanced cyberinfrastructure, even before the NSF's 2003 Blue Ribbon Panel, which Dan Atkins chaired, popularized the term. That work continues today through our own continually growing cyberinfrastructure, and through clouds connected by the more than 20 major high-speed national and international networks managed by the IU Global Network Operation Center.
Conclusion
The series of six BDEC2 workshops that begins today has the major objectives of:
- drafting a design for a distributed services platform for science,
- organizing and developing an international demonstration of the feasibility and potential of this platform based on a prototype implementation of this design,
- and developing a corresponding "shaping strategy" that addresses all the relevant stakeholders and that moves the community as a whole toward convergence on a final, standard specification.
I am very pleased that Indiana University is able to host one of these workshops, and, knowing so many of the people in this room as I do, I am sure that enormous and positive progress will be made.
Again, welcome to Indiana University Bloomington.
Source Notes
1. Zaigham Mahmood, ed., Fog Computing: Concepts, Frameworks and Technologies, (Springer, 2018), 3.
2. Big Data and Extreme-scale Computing (BDEC) Project Prospectus, Web, Accessed November 27, 2018, URL: https://www.dropbox.com/s/lr59xo8rtmtzyuy/BDEC2_Prospectus.pdf?dl=0.