  • Workshop: Introduction to Data Science & Interdisciplinary Research Teams

  • Monday, May 21, 2018 - Friday, May 25, 2018
    Purdue University


    Registration for this workshop is full. We encourage you to enroll in the free online short course, Introduction to R for Data Science.

    Workshop Audience

    All undergraduate students*, graduate students, and postdocs are welcome to register for this workshop. CSoI member students will be given priority registration. No training in computer science or expertise in any particular area is needed. Students from all data-driven disciplines are welcome.

    The intended audience for the workshop is students at all levels who have not (yet) delved into a data science experience but want to begin working in this area. As an outcome of this experience, students will be acquainted with several tools for data analysis.

    In keeping with its diversity mission, the Center encourages women and members of other underrepresented groups, including U.S. citizens, nationals, and permanent residents, to apply. The spirit of this workshop is to bring together students and postdocs from multiple fields to lower barriers to understanding the language and approaches across multiple disciplines and data science.

    * Undergraduate students must be U.S. citizens or permanent residents. Registration for the workshop does not guarantee acceptance.

    Workshop Purpose

    Dr. Mark D. Ward will introduce participants to some of the first principles and concepts of data analysis. Students who complete the workshop will learn several technologies, including skills for data wrangling and data visualization. We will use the R platform for data analysis and will discuss strategies for reproducible research. As time permits, we will discuss how to use R to interact with SQL databases, how to scrape and parse XML, how to apply data visualization techniques, and how to use LaTeX. See the Pre-Workshop section below to access the online short course, Introduction to R for Data Science, in preparation for the R and data analysis sessions held during the workshop.

    All of the topics will be offered in team-oriented projects. We will choose several sources of data, so that the workshop will have an interdisciplinary appeal. This enables students to develop a sense of community and confidence about introductory topics in data analysis.

    Dr. Tsai-wei Wu will lead a session on data visualization, focusing on the ParaView tool. The ParaView tutorial will cover basic ParaView usage: creating sources, 3D manipulation, applying filters and pipelines, and finally the ParaView Python shell. A simple Python script will be provided to generate the VTK data to be viewed in ParaView. See the Pre-Workshop section below to access a one-hour introductory video and slides by Dr. Vetria Byrd, to be viewed before Dr. Wu's presentation on Wednesday, May 23.
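
    For attendees who want a feel for what generating VTK data from Python can look like, here is a minimal sketch. It is not the script used in the workshop. It assumes the legacy ASCII VTK format with a STRUCTURED_POINTS dataset, and the function names, the scalar name "distance", and the file name demo_field.vtk are all hypothetical choices made for illustration.

```python
import math


def build_vtk_structured_points(nx, ny, nz, spacing=1.0):
    """Return legacy-VTK ASCII text for a scalar field on a regular grid.

    The scalar at each grid point is its distance from the grid center
    (an arbitrary demo field, chosen only so there is something to view).
    """
    lines = [
        "# vtk DataFile Version 3.0",
        "Hypothetical demo scalar field",
        "ASCII",
        "DATASET STRUCTURED_POINTS",
        f"DIMENSIONS {nx} {ny} {nz}",
        "ORIGIN 0 0 0",
        f"SPACING {spacing} {spacing} {spacing}",
        f"POINT_DATA {nx * ny * nz}",
        "SCALARS distance float 1",
        "LOOKUP_TABLE default",
    ]
    cx, cy, cz = (nx - 1) / 2, (ny - 1) / 2, (nz - 1) / 2
    # Legacy VTK expects x to vary fastest, then y, then z.
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                d = math.sqrt((i - cx) ** 2 + (j - cy) ** 2 + (k - cz) ** 2)
                lines.append(f"{d * spacing:.6f}")
    return "\n".join(lines) + "\n"


def write_vtk(path, nx=10, ny=10, nz=10):
    """Write the demo field to a .vtk file that ParaView can open."""
    with open(path, "w") as f:
        f.write(build_vtk_structured_points(nx, ny, nz))


if __name__ == "__main__":
    write_vtk("demo_field.vtk")  # hypothetical output file name
```

    Running the script writes demo_field.vtk to the current directory. The file can then be opened in ParaView, where filters such as a slice or contour can be applied to the "distance" field.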

    Workshop Instructors

    Lead Instructor, Data Science and R: Dr. Mark D. Ward, Associate Professor and Associate Director of Center for Science of Information, Purdue University

    Data Viz Instructor: Dr. Tsai-wei Wu, Data Visualization Specialist, Purdue University

    Professional Development Instructor: Dr. Carolyn Johnson, Director, Diversity Resource Office, Purdue University

    TA: Tyler Netherly, Statistics and Math, Purdue University

    TA: Elizabeth Bell, Statistics, Purdue University

    TA: Yucong Zhang, PhD candidate, Department of Statistics, Purdue University

    Workshop Co-Chair: Brent T. Ladd, Director of Education, Center for Science of Information, Computer Science, Purdue University

    Workshop Co-Chair: Kelly Andronicos, Director of Diversity, Center for Science of Information, Computer Science, Purdue University


    Agenda for the Week (tentative)

    Pre-Workshop Course & Materials:

    Prior to the workshop, attendees will complete a brief four-week online course, free of charge, introducing them to the R environment and to working with data sets. The course runs during the weeks before the workshop starts and requires a commitment of approximately 4 hours per week. It is presented online through tutorials recorded by Dr. Mark D. Ward, and ensures that all students start the workshop with a comparable level of experience using R.

    Complete the Introduction to R for Data Science course. These tutorials and exercises are available through the CSoI Learning Hub. An email listserv will be available soon for use during the course, starting April 25.

    View the Introduction to Data Visualization one-hour webinar video (slides also available) prior to Wednesday, May 23.

    Data Viz Intro video

    Data Viz Intro slides

    Interdisciplinary Research Teams:

    Each attendee is a member of a team based partly on interests and background, and partly on available team spots. We do our best to form teams with a balance among postdocs, graduate students, and undergraduate students. Each team has one member who submitted a potential project with data that the team can work on. Read the one-page overview, Best Practices for Successful Formation of Interdisciplinary Science Teams.

    Team Project Titles and Members:

    You can read about all six projects in one-page summaries describing the main idea or problem and the data-related efforts of each project. Team members are shown on the workshop roster.

    Project A: The Polarization of Information on the Web

    Project B: Data Shadows – Exploring Personal Data

    Project C: Changes in Forest Communities of the Eastern United States

    Project D: The Estimation of Surface Heat Fluxes Using Weather Station Data

    Project E: The Role of Migrants in Building City Resilience for Emergency Response and Disaster Risk Reduction

    Project F: Towards Cyber-Physical Vetting in Critical Infrastructures

    Research Teams Funding Opportunity:
    Following the workshop, teams will have an opportunity to submit funding proposals to the Center to continue their research collaborations. You can view currently active and past student-postdoc research teams that received funding from the Center under Research Teams.


    Lodging will be provided and paid for by the Center for all non-Purdue students. Students from outside Purdue requiring lodging will be staying at First Street Towers.

    Off-campus students should plan to arrive at Purdue by Sunday evening, May 20th. The workshop will conclude at noon on Friday, May 25th.


    Lunch will be catered onsite Monday through Thursday. Additionally, all attendees will be issued a dining card for use in the Wiley dining hall for breakfast and dinner Monday, May 21st through Thursday, May 24th, and all meals on Friday, May 25th will be at the Wiley dining hall. Wiley is directly across from the Purdue Co-Rec Gymnasium. Any additional food or drinks that you want to pay for yourself can be purchased from the Lawson Computer Science "Port" Cafe between 8am and 2pm (this cafe is on the first floor of the building where our workshop is held). Note that we do not provide any meals for Sunday arrivals and Saturday departures, and the dining halls are closed on those days. However, there are several options available two blocks west of First Street Towers in the Purdue West shopping plaza, or about 1 mile east along State Street in the Chauncey Hill Mall area.


    Travel to/from Purdue

    Note that you will want the Purdue West/Follett's location for shuttle drop-off/pick-up to and from the airport. This is along McCutcheon Drive by Follett's bookstore, two blocks directly west of the First Street Towers dorm.

    Wireless Network and Server Accounts

    On Monday morning, when you arrive at Lawson 3102 for the workshop, you will receive with your name badge: 1) a login for the Scholar server we will be using to access software and data, and, for non-Purdue attendees, 2) a login for the Purdue wireless network (PAL 3.0) and instructions for connecting.

    Travel Reimbursement

    Students and postdocs from outside Purdue are eligible to apply for reimbursement of airfare and shuttle expenses to attend the full-week workshop. Use the link above, and follow the instructions and forms provided. If you have any questions regarding travel reimbursement, contact Robynne McCormick.


    This time in May is the quietest on the Purdue campus. Not a lot is happening this week, other than our workshop! Most evenings your team is encouraged to spend time together. In addition to eating dinner and meeting as a team during evenings, you might be interested in checking out the Purdue Co-Rec, one of the nicest facilities of its kind in the country. Check the summer hours before heading there. It is a short walk from your dormitory. Your dining card will give you access to the Co-Rec.

    There are a number of unique shops, bars, and dining options at the east end of campus, about a 15-minute walk directly east from First Street Towers. You can also find out what else is happening in the area during the evenings at . Although Purdue is a very safe campus, we strongly recommend walking at night as a group, and never alone, especially in areas off campus.