Science of Information for Biological Data
- Student Workshop

July 7-11, 2014
Purdue University, West Lafayette Campus

Return to this page for updated information, links to resources, etc.
Last updated Wed Jul 13, 2022 11:55 AM EDT
Registration is closed for this workshop.
Please view the 2015 workshop page.

Workshop Audience:

Graduate and undergraduate students in the life sciences, computer science, statistics, mathematics, electrical engineering, or similar fields who are interested in learning data science approaches and tools for the analysis and visualization of biological data.

Workshop Purpose:

The spirit of the workshop is to bring together students from multiple fields to lower the barriers to understanding the language and approaches of both biology and computer science/data science. Students will gain an understanding of the approaches, methods, and tools of another field of study, and will have the opportunity to gain experience working in small interdisciplinary groups focused on cutting-edge biology research. Students will be encouraged to work toward potential solutions for data analysis and visualization by the end of the workshop.

Life Science - Information Theory Project Categories

These are the areas of emphasis for the workshop: three biology research challenges with potential information-based solutions.

  1. High-throughput analysis of the information content of proteins

    A precise temporal-spatial organization of proteins within cells is required for basic cellular activities (e.g., proliferation) and for complex multicellular processes (e.g., embryo development). Importantly, it is well known that the intracellular localization of proteins is specified by information encoded in the order in which their constituents are linked, i.e., in their amino acid sequence.

    However, current methods for the analysis of protein cargo information often rely on manual inspection of amino acid sequences and on the use of scattered algorithms and resources. One obvious limitation of this “manual” approach is the inability to perform high-throughput analysis. Although complete collections of protein sequences from entire organisms exist, we lack the tools to search these databases for cargoes carrying specific sorting information.

    This is a major deficiency, as reading and interpreting the sorting sequence code of proteins would have a profound impact on many areas of the life sciences. For example, it would shed light on the molecular mechanisms of several genetic diseases and would eventually allow us to control the fate of proteins in cells. Students will be challenged to develop an approach for analyzing protein sequence databases to identify known and novel sorting signals. Special emphasis will be given to the identification of potential protein targets in genetic diseases and neurodegenerative disorders, and to information theory approaches aimed at fully cracking the protein sequence code.

  2. Quantitative Modeling of the Effect of Anti-tumor Agents in vivo

    Very often, new anti-cancer drugs that are successfully tested in vitro (on cells grown in Petri dishes) fail to perform efficiently in vivo (e.g., in animal models). Although multiple reasons may account for these failures, the lack of proper models to guide the transition between the two settings remains a major contributing cause of unsuccessful trials.

    This challenge involves the development of mathematical/statistical models of anti-cancer drug action that explicitly take into consideration tumor size, drug concentration, and tumor cell heterogeneity. Students will be provided with real data from current investigations on a novel agent against bladder cancer.

  3. Monitoring and Quantifying Abnormal Cell Behavior

    A key to the development of novel therapeutics and the investigation of disease mechanisms is our ability to quantitatively perceive cell abnormalities. For example, when investigators try different drug prototypes to counteract the manifestations of a disease, they need to rank them from best to worst. This information is critical for choosing which prototype features to adopt and which to avoid during the development of a new generation of therapeutic agents. A similar rationale applies to investigations of the effect of different mutations in certain genes: which mutations lead to the most severe abnormalities?

    The students will be challenged to develop methods to describe and quantify abnormalities in organelle and cell morphology or cell dynamics, in both patient cells and model organism cells.
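
As a rough illustration of the first challenge above, the following Python sketch scans a small set of sequences for two well-documented sorting signals (the C-terminal KDEL motif and the tyrosine-based YXXØ motif, where Ø is a bulky hydrophobic residue) and computes a simple per-column information content for a set of aligned motif instances. The toy sequences, the function names, and the choice of signals are assumptions made for this example only; they are not workshop materials, and the information measure omits small-sample corrections.

```python
import math
import re
from collections import Counter

# Hypothetical toy sequences, invented for illustration (not real proteins).
SEQUENCES = {
    "toy_protein_1": "MKLSTAAGHWQEEKDEL",
    "toy_protein_2": "MSTYQRLAAGPDEKLLI",
    "toy_protein_3": "MAGGSPLLVTNNQRSAA",
}

# Two known sorting signals written as regular expressions (an assumed,
# deliberately tiny signal set; a real analysis would use many more).
SIGNALS = {
    "KDEL (C-terminal)": re.compile(r"KDEL$"),
    "YXXphi": re.compile(r"Y..[LIMFV]"),
}


def find_signals(sequences, signals):
    """Return (protein, signal, start index, matched text) for every hit."""
    hits = []
    for name, seq in sequences.items():
        for label, pattern in signals.items():
            for match in pattern.finditer(seq):
                hits.append((name, label, match.start(), match.group()))
    return hits


def column_information(aligned):
    """Information content in bits for each column of equal-length sequences.

    Computed as log2(20) minus the Shannon entropy of the residues observed
    in that column, assuming a uniform background over 20 amino acids.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    info = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(math.log2(20) - entropy)
    return info


if __name__ == "__main__":
    for hit in find_signals(SEQUENCES, SIGNALS):
        print(hit)
    print(column_information(["KDEL", "KDEL", "HDEL", "RDEL"]))
```

Columns where every sequence carries the same residue score the maximum of log2(20) bits, while variable columns score lower. This is one simple way to see which positions of a candidate signal carry the most sorting information.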


Faculty workshop leaders: Claudio Aguilar, PhD, Biology; Mark D. Ward, PhD, Statistics
Center contact: Brent T. Ladd, Education Director

Lodging and Meals:

Lodging will be provided and paid for by the Center for all off-campus students. Off-campus students requiring lodging will stay at First Street Towers. The office is open 24 hours, so students can check in whenever they arrive. For assistance, call First Street Towers at 765-494-0200. If you take an airport shuttle to Purdue, the shuttle pickup/drop-off spot is often at the Purdue West shopping area next to Follett's along McCutcheon Drive. From there, First Street Towers is only one block east along State Street.

Off-campus students should plan to arrive at Purdue by Sunday evening. The workshop will conclude in the early afternoon on Friday.

Meals (breakfast, lunch, and dinner) will be provided for everyone during the workshop. Students staying in the dorms will receive dinner cards at check-in; all others should pick up their cards from Brent Ladd in HAAS 202F on campus on July 1 or 2.

Travel to/from Purdue

Workshop Location:

Lawson Computer Science Building, Room 2150

Campus locations for breakfast & dinner:
  • Earhart Dining Hall (ERHT), open for breakfast 6:30-8:30 am and for dinner 5-7 pm.
  • Pappy’s Sweet Shop, open until 8 pm, located in the Purdue Union.
  • Villa Pizza, open until 8 pm, located in the Purdue Union.

Post-Workshop Activities

Post-workshop - Interdisciplinary Team Grants:

Students will have the opportunity to submit proposals to the Center for funding to continue team collaborations, with the intention of producing co-presentations and papers at future conferences.


Funded under grant agreement CCF-0939370 Center for Science of Information from the National Science Foundation and with support from the Department of Computer Science, Purdue University.