Monday, June 20 - Campanile Room | ||
8:00 - 8:30am Campanile Room | Registration and Light refreshments | |
8:30 - 8:45am Campanile Room |
Opening Dr. Hridesh Rajan and Dr. Pavan Aduri |
|
8:45 - 9:15am Campanile Room | Welcome, and why are we here? Dean Schmittmann |
|
About Dr. Schmittmann: Beate Schmittmann has served as dean of the College of Liberal Arts and Sciences at Iowa State University since April 2, 2012. She is leading several key Liberal Arts and Sciences initiatives to promote research, student success, and college advancement. Most notable among these are the college's Signature Themes, which build on existing strengths to grow and sustain an internationally competitive profile in selected research areas. Strategic faculty hiring, seed grant funding, and enhanced support for proposal writing form an integral part of these efforts. She also furthers LAS's reputation as a student-centered college through an integrated approach to student recruitment, academic advising, and career services; a focus on high-quality teaching and innovative pedagogy, especially in the STEM fields; and efforts to support the success of traditionally underrepresented students. She has built a strong external relations effort, in which development, alumni relations, and strategic communications support and enhance one another. Schmittmann is a Fellow of the American Association for the Advancement of Science, a Fellow of the American Physical Society, and a winner of the organization's 2010 Jesse W. Beams Award. Her research interests focus on statistical and biological physics. She has authored or co-authored more than 100 peer-reviewed articles and one book. Schmittmann earned a diploma (M.S.) in physics from RWTH Aachen University in her native Germany (1981), and a Ph.D. in physics from the University of Edinburgh, Scotland (1984). Prior to joining ISU, she had been a member of the physics faculty at Virginia Tech, Blacksburg, since 1991 and had served as the physics department chair since 2006. |
|
9:15 - 10:00am Campanile Room | Perspectives on Data Driven Discovery Dr. Sarah Nusser |
|
Data driven science represents a paradigm shift in how we engage with discovery. It affects organizations from all sectors, and is having profound impacts on nearly every field of endeavor and on how we think about research collaboration. In this presentation, we will discuss features of data driven discovery and the role data science plays in enabling this transformation, how the NSF Big Data Innovation Hub program is helping to speed the exchange of knowledge across sectors, and what lies ahead with the promise of open data.
About Dr. Nusser: Dr. Sarah Nusser is Vice President for Research and a Professor in the Department of Statistics at Iowa State University. As VPR, she works with faculty to spearhead the Data Driven Science Initiative at ISU. She is a co-PI for the Midwest Big Data Hub, which seeks to accelerate partnerships that foster the continued development of data driven discovery. Prior to joining the Office of the Vice President for Research in 2014, Dr. Nusser served as the director of the Center for Survey Statistics and Methodology at Iowa State University for 15 years, where she conducted research in statistical sampling and measurement error models for both land-based and human population surveys. |
|
10:00 - 10:30am Campanile Room |
Big Data - the education and engagement challenge Dr. Wolfgang Kliemann | |
Across the US, science seems to be in a phase of having to justify its approach, its results, and its value for decision making in community, state, and federal contexts. Communicating research and science is one of the priority topics at many meetings of university associations and of science communities. Big Data has the potential to widen the gap between scientific information and its naive, intuitive perception, simply because Big Data is as invisible and untouchable as subatomic particles, yet it can serve as the basis for decision making in industries and governments at all levels. In this presentation we will explore some approaches for the scientific community to use data, and Big Data, as a tool for interaction with the public of all age groups to foster better understanding of research and science, their potential and their limitations.
About Dr. Kliemann: Professor of mathematics Wolfgang Kliemann joined the Office of the Vice President for Research on July 1, 2014, as an associate vice president for research. Wolfgang came to Iowa State in 1983. He served as associate dean for research in the College of Liberal Arts and Sciences from 2000 to 2001, as associate vice provost for research from 2001 to 2005, and as chair of the Department of Mathematics from 2008 through 2013. |
|
10:30 - 11:00am Campanile Room | Break - refreshments provided | |
11:00 - 12:30pm Campanile Room |
Big Data in Context Panel | |
Dr. Arne Hallam (moderator), Dr. Xiaoqiu Huang, Dr. Kevin Kane, Dr. Eric Rozier, and Dr. Jason Wille. |
||
12:30 - 2:00pm Campanile Room | Lunch - Box lunches provided | |
2:10 - 3:10pm Campanile Room | Big Data in Context Panel (continued) | |
Dr. Joe Colletti,
Dr. Gary Mirka (moderator), Dr. Sree Nilakanta and Dr. Arun Somani |
||
3:10 - 3:40pm Campanile Room | Break - refreshments provided | |
3:40 - 5:30pm Campanile Room |
Introduction to Statistics Dr. Kris De Brabanter |
|
This module will provide summer school participants with a gentle introduction to probability and statistics concepts and prepare them for later modules in this summer school.
About Dr. De Brabanter: Dr. Kris De Brabanter is an assistant professor in the Department of Statistics at Iowa State University. His research interests are in mathematical statistics, nonparametric regression, analysis of big data sets, machine learning, model selection methods, density estimation, and nonparametric inference. |
|
Tuesday, June 21 - South Ballroom, Campanile Room | ||
8:00 - 8:30am South Ballroom | Light refreshments | |
8:30 - 10:30am South Ballroom |
Introduction to Python Dr. Steve Kautz |
|
This module is aimed at introducing the audience to the Python programming language and programming concepts.
About Dr. Kautz: Dr. Kautz holds an M.S. in computer science and a Ph.D. in mathematics from Cornell University. Prior to joining the teaching faculty at Iowa State he spent 10 years on the faculty of Randolph College of Lynchburg (Virginia) and then 8 years as a senior software engineer for NewMonics (later acquired by Aonix, Inc), the developers of the PERC(tm) virtual machine, a platform for real-time Java. Dr. Kautz's time as an engineer was divided between work on the virtual machine itself, including contributions to several implementations of threads, and consulting services for customers. As part of the latter effort he developed a one-week, hands-on course on concurrent Java that has been presented to teams of developers worldwide over an 8-year period. Dr. Steve Kautz is currently a lecturer of Computer Science at Iowa State University. |
|
10:30 - 11:00am South Ballroom | Break - refreshments provided | |
11:00 - 12:30pm South Ballroom |
Introduction to Python (continued) Dr. Steve Kautz |
|
12:30 - 2:00pm South Ballroom | Lunch - Box lunches provided | |
2:10 - 3:10pm Campanile Room |
Introduction to R Dr. Heike Hofmann |
|
This module introduces R, a widely used language and environment for statistical computing and graphics. This module is a prerequisite for the visualization module.
About Dr. Hofmann: Dr. Heike Hofmann is a professor in the Department of Statistics at Iowa State University. Her areas of interest are Data Visualization, Multivariate Categorical Data Analysis, Statistical Computing, Exploratory Data Analysis, and Interactive Statistical Graphics. |
|
3:10 - 3:40pm Campanile Room | Break - refreshments provided | |
3:40 - 5:30pm Campanile Room |
Introduction to R (continued) Dr. Heike Hofmann |
|
Wednesday, June 22 - South Ballroom, Campanile Room | ||
8:00 - 8:30am South Ballroom | Light refreshments | |
8:30 - 10:30am South Ballroom |
Data Acquisition Dr. Adisak Sukul |
|
This module will introduce participants to challenges in and solutions to data acquisition. This module will build on the Python module.
About Dr. Sukul: Dr. Adisak Sukul obtained his Ph.D. in Computer Science from Chulalongkorn University. Following his Ph.D., he was a visiting researcher at Iowa State University, a lecturer in the computer science department at King Mongkut's Institute of Technology Ladkrabang, and assistant director of the Computer Service Center at King Mongkut's Institute of Technology Ladkrabang. He was also the EIFL - OA/FOSS Country Coordinator for Thailand, coordinating the Open Access and the Free and Open Source Software working groups for EIFL (Electronic Information for Libraries), a global non-profit organization serving developing countries. Dr. Sukul has over 14 years of experience in IT project management and system architecture, and has co-founded three software companies in Thailand. Dr. Sukul has also consulted on e-Library and institutional repository development projects for various organizations including the Thailand House of Representatives, the Bangkok Metropolitan Administration, and a number of libraries and universities. Dr. Adisak Sukul is currently a lecturer of Computer Science at Iowa State University. |
|
10:30 - 11:00am South Ballroom |
Break - refreshments provided | |
11:00 - 12:30pm South Ballroom |
Data Processing Dr. Adisak Sukul |
|
12:30 - 2:00pm South Ballroom | Lunch - Box lunches provided | |
2:10 - 3:10pm Campanile Room |
Management, Access, and Use of Big and Complex Data Dr. Beth Plale |
|
Part I - Data Pipelines in e-Science: What is a data pipeline? Data rarely show up ready to use for whatever exploratory purpose a science researcher may have in mind. From creation to use, data undergo numerous steps, some of which are end products in themselves. This session discusses the data lifecycle, data pipelines, e-Science, cyberinfrastructure, Big O notation, and data analysis.
About Dr. Plale: Beth A. Plale is a Full Professor of Informatics and Computing at Indiana University, where she directs the Data To Insight Center and serves as Science Director of the Pervasive Technology Institute. Dr. Plale's research interests are in Big Data, long-term preservation and curation of scientific and scholarly data, large-scale data management, metadata and provenance, data trustworthiness and security, and data-driven cyberinfrastructure and cloud computing. Plale is deeply engaged in interdisciplinary research and education in earth and environmental sciences, digital humanities, health, and social sciences. Professor Plale's postdoctoral studies were at Georgia Institute of Technology, and her PhD in computer science is from the State University of New York at Binghamton. Her deep interest in technology for societal change arises in part from the MBA she received while spending a handful of years working in Southern California as a software developer. Plale is founder and Co-director of the HathiTrust Research Center, which provisions analysis to nearly 14 million digitized books from research libraries, past chair of the Technical Advisory Board (TAB) of the 3,500+ member international Research Data Alliance (RDA), and vice-chair of RDA/US. She is a Department of Energy (DOE) Early Career Awardee and a past Fellow of the Midwest university consortium CIC's Academic Leadership Program. |
|
3:10 - 3:40pm Campanile Room | Break - refreshments provided | |
3:40 - 5:30pm Campanile Room |
Management, Access, and Use of Big and Complex Data (continued) Dr. Beth Plale
Part II - Pipelines in Business: This session introduces the business perspective of data pipelines. It draws inspiration from the 2011 talk "Data Without Limits" by Werner Vogels, CTO of Amazon, which discusses data pipelines in the context of business computing. He argues that cloud computing is core to a business model "without limits". The pipeline he proposes is: collect | store | organize | analyze | share. Vogels talks about MapReduce extensively during his discussion of analysis.
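As a rough illustration of the five stages named in the Part II description (collect, store, organize, analyze, share), the following Python sketch chains them as plain functions and uses a tiny MapReduce-style word count for the analyze stage. All function names and the sample records are hypothetical, chosen only for illustration; they are not taken from the talk or from the module materials.

```python
from collections import defaultdict


def collect():
    # Hypothetical raw input; a real pipeline might read logs or sensor feeds.
    return ["Big Data", "data pipelines", "big pipelines"]


def store(records):
    # Stand-in for durable storage: here, just an in-memory copy.
    return list(records)


def organize(records):
    # Normalize case and split each record into words.
    return [record.lower().split() for record in records]


def map_phase(documents):
    # Map step: emit a (word, 1) pair for every word seen.
    return [(word, 1) for doc in documents for word in doc]


def shuffle(pairs):
    # Shuffle step: group the emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Reduce step: sum the values collected for each key.
    return {key: sum(values) for key, values in groups.items()}


def analyze(documents):
    # The analysis stage, expressed as map -> shuffle -> reduce.
    return reduce_phase(shuffle(map_phase(documents)))


def share(counts):
    # Publish the result in a stable, sorted form.
    return sorted(counts.items())


def run_pipeline():
    # collect | store | organize | analyze | share
    return share(analyze(organize(store(collect()))))


if __name__ == "__main__":
    print(run_pipeline())
```

Each stage takes the previous stage's output as its only input, so any one of them could be swapped for a scalable implementation without changing the others.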
|
|
Thursday, June 23 - South Ballroom, Campanile Room | ||
8:00 - 8:30am South Ballroom | Light refreshments | |
8:30 - 10:30am South Ballroom |
Applied Text Mining Dr. Drew Zhang |
|
This module will introduce techniques for applied text mining. It will also explore popular tools for text mining such as NLTK and spaCy.
About Dr. Zhang: Zhu ("Drew") Zhang is an associate professor of Information Systems in the College of Business, Iowa State University. He obtained his Ph.D. in Computer and Information Science from the University of Michigan. His core expertise is in natural language processing, web search/mining, and applied machine learning. |
|
10:30 - 11:00am South Ballroom | Break - refreshments provided | |
11:00 - 12:30pm South Ballroom |
Applied Text Mining (continued) Dr. Drew Zhang |
|
12:30 - 2:00pm South Ballroom | Lunch - Box lunches provided | |
2:10 - 3:10pm Campanile Room |
Big Data Visualization Dr. Heike Hofmann
This module is designed to help you get started with creating elegant and high-quality graphics in R, based on the ggplot2 package. The module will be data centric, with many different data sets that illustrate the techniques used for different problems. The module will be a mix of instruction and follow-up exercises. You are encouraged to bring your own laptop, with software already loaded.
About Dr. Hofmann: Dr. Heike Hofmann is a professor in the Department of Statistics at Iowa State University. Her areas of interest are Data Visualization, Multivariate Categorical Data Analysis, Statistical Computing, Exploratory Data Analysis, and Interactive Statistical Graphics. |
|
3:10 - 3:40pm Campanile Room | Break - refreshments provided | |
3:40 - 5:30pm Campanile Room |
Big Data Visualization (continued) Dr. Heike Hofmann |
|
Friday, June 24 - South Ballroom | ||
8:00 - 8:30am South Ballroom | Light refreshments | |
8:30 - 10:30am South Ballroom | Machine Learning I: Introduction Dr. Kris De Brabanter
This module will introduce machine learning concepts and explain their usage via practical examples. The machine learning modules will use the R language.
About Dr. De Brabanter: Dr. Kris De Brabanter is an assistant professor in the Department of Statistics at Iowa State University. His research interests are in mathematical statistics, nonparametric regression, analysis of big data sets, machine learning, model selection methods, density estimation, and nonparametric inference. |
|
10:30 - 11:00am South Ballroom | Break - refreshments provided | |
11:00 - 12:30pm South Ballroom |
Machine Learning II: basic to advanced methods Dr. Kris De Brabanter
This module will continue by introducing more advanced machine learning concepts such as support vector machines and k-means clustering.
|
|
12:30 - 2:00pm South Ballroom | Lunch - Box lunches provided | |
2:10 - 3:10pm South Ballroom |
Introduction to Scalable Tools for Big Data Dr. Robert Dyer |
|
This module will provide a gentle introduction to scalable tools for Big Data Analytics such as Apache Hadoop and Spark. It will also identify key challenges in effectively using such tools and common pitfalls.
About Dr. Dyer: Robert Dyer is an Assistant Professor in the Department of Computer Science at Bowling Green State University. He received his Ph.D. from Iowa State University in 2013. His research areas are in Software Engineering, Big Data applications, and Programming Languages. Currently his research focuses on the Boa project, which provides a domain-specific language and infrastructure to allow researchers to easily mine a very large number of software repositories. Robert has served on the organizing committee for ICSE and the program committees for Modularity and OOPSLA Artifacts, and has reviewed for journals such as Empirical Software Engineering. He is a member of ACM SIGSOFT and SIGPLAN and is the ACM SIGSOFT Webinar Coordinator. |
|
3:10 - 3:40pm South Ballroom | Break - refreshments provided | |
3:40 - 5:30pm South Ballroom |
Introduction to Scalable Tools for Big Data (continued) Dr. Robert Dyer |
|
5:30pm - 5:45pm South Ballroom |
Big Data Summer School Closing Session |