Over the last decade, biology has been transformed into a data-driven
science. Through innovations in sequencing, high-throughput
microscopy, mRNA expression arrays, protein-protein and protein-DNA
binding assays, and numerous other high-throughput methods, it is now
possible to query simultaneously the activities of thousands of genes
and their products under a wide variety of experimental conditions.
The resulting data pose an exciting challenge for the field of machine
learning. Many of the model organisms (most notably S. cerevisiae) are
of sufficient complexity to render detailed mathematical modeling
intractable. However, it is still possible to try to learn
quantitative models which are rich enough to fit data, yet simple
enough to generalize and to be interpretable. Work by numerous groups
suggests that this approach holds promise for more complex eukaryotes
(e.g., C. elegans, S. pombe, or D. melanogaster).
Qualitatively new challenges to the machine learning community include
the integration of heterogeneous datasets, such as sequence, binding,
and expression data; the creation of models which are interpretable
even to those not trained in probabilistic reasoning or statistical
learning theory; and the presentation of the resulting models in a way
useful to bench biologists as well as computational biologists.
This three-day workshop is designed to encourage interaction among
innovators in computational biology and innovators in machine
learning; to illuminate recent successes as well as pressing
challenges; and to inspire the development of novel, biologically
relevant, and biologically interpretable machine learning approaches
to the current problems in biology.