### Machine Learning for Computational Imaging

Faculty: Ulugbek Kamilov

Computational imaging often deals with the problem of forming images free of artifacts and noise. REU students will work on advanced image restoration algorithms that integrate optimization and machine learning. We have developed a family of such techniques that use learned information, such as natural image features, to generate clean images from corrupted ones. REU students will have an opportunity to learn about real-world problems in biomedical imaging, study cutting-edge imaging technology, and contribute to this exciting research area.

Familiarity with image processing and machine learning. Proficiency with Python or Matlab.
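
To give a flavor of how optimization and a denoiser can be combined for restoration, here is a minimal sketch on a 1D signal. It is an illustration under stated assumptions, not the group's actual method: the function names `box_denoiser` and `pnp_restore` are hypothetical, the data-fidelity term is assumed to be a simple least-squares fit to the noisy measurement, and a fixed moving-average filter stands in for the learned image prior.

```python
import numpy as np

def box_denoiser(x, width=5):
    """Moving-average filter; a stand-in for a learned denoiser (assumption)."""
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")      # replicate edges to keep the length
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")

def pnp_restore(y, denoiser, step=0.5, n_iters=50):
    """Alternate a gradient step on 0.5 * ||x - y||^2 with a denoising step."""
    x = y.copy()
    for _ in range(n_iters):
        grad = x - y                          # gradient of the data-fidelity term
        x = denoiser(x - step * grad)         # the denoiser plays the role of the prior
    return x

# Synthetic example: a smooth signal corrupted by Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 200)
clean = np.sin(t)
noisy = clean + 0.3 * rng.normal(size=t.size)

restored = pnp_restore(noisy, box_denoiser)
mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_restored = float(np.mean((restored - clean) ** 2))
print(f"MSE noisy: {mse_noisy:.4f}, MSE restored: {mse_restored:.4f}")
```

In a learning-based variant, `box_denoiser` would be replaced by a trained model, while the surrounding iteration stays the same.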

### Large-Scale Optimization for Machine Learning

Faculty: Ulugbek Kamilov

Optimization algorithms play an essential role in modern machine learning (ML). The choice of optimization algorithm determines whether sufficiently good performance can be obtained in hours or in days. Increasingly, optimization is becoming large-scale because modern ML models have millions of parameters and are trained over extremely large datasets. REU students will work on the development of novel optimization algorithms for large-scale ML. The Computational Imaging Group (CIG) at WashU has recently developed several new algorithms, and students will contribute to this exciting area by extending our current results. Several applications will be considered, including efficient training of deep neural nets.

Proficiency with Python or Matlab. Familiarity with machine learning. Mathematical maturity to understand optimization algorithms and their analysis.

### Privacy-preserving Medical Computation

Faculty: Ning Zhang

Recent advances in genome sequencing technology have made it possible to generate highly detailed genotypes inexpensively. The ubiquitous access to genomic information has led us towards a new era of medical research and personalized medicine. On the other hand, privacy concerns are a major barrier to broader sharing of sensitive medical information. Given the current global challenges in healthcare, open access for medical discovery is a pressing issue of societal importance.

In this project, REU students will work collaboratively with PhD students to explore the use of trusted execution environments, such as Intel SGX, for privacy-preserving computation.

C++, basic security concepts

### High Performance Computing to Benefit Science

Faculty: Jeremy Buhler, Roger Chamberlain, Ron Cytron

Many scientific disciplines benefit from advances in computational techniques. Scientific instruments can see measured signals more effectively, analysis can benefit from substantially larger volumes of data, and ultimately, we can ask and answer deeper questions about the world around us.

In this project we work with experts in multiple domains of science (e.g., astrophysics, biology, nuclear chemistry) and seek to improve the performance and power efficiency of computations that benefit scientific inquiry. This often includes the exploitation of non-traditional computer architectures used to accelerate an application.

Programming ability; C/C++ experience strongly preferred; experience with cloud computing is a plus.

### Machine Learning for Interactive Data Visualization

Faculty: Alvitta Ottley

Advances in data visualization research over the past decades have made visualization systems broadly usable across many domains. These advances have improved not only our understanding of data but also our understanding of people and how they use data visualization systems. In this project, we will explore possibilities for next-generation data visualization. We will apply machine learning techniques to make predictions about users’ goals and examine methods of supporting data exploration.

Proficiency in web programming and JavaScript; prior experience with D3 is a plus but not required. Familiarity with machine learning and statistics would be beneficial.

### Oppression as a control system

Faculty: Neal Patwari

In this project, we apply models and theory from the study of feedback systems to develop tools for the study of equitable algorithms. There is an emerging area of “algorithmic fairness”, developed to study algorithms like those used in predictive policing. This project extends the existing definitions of fairness. Ibram X. Kendi defines an anti-racist policy as one that produces or sustains equity between racial groups. We extend this definition to algorithms (which are policies) and to other systems of oppression, i.e., sexism, ableism, heterosexism, classism, and others. To know if an algorithm produces or sustains equity, one must additionally study the feedback mechanisms between the algorithm, its users, other people, and other systems in the context in which the algorithm operates. Feedback systems have been studied in control theory and dynamical systems theory, and we will apply these tools. The REU project may involve different aspects of this goal, from literature surveys, to building models of algorithms in context, to using temporal data to perform model identification, to analysis and simulation-based results.

Some background in (1) oppression or critical theories and (2) control or systems.

### What is the value of data?

Faculty: Netanel Raviv and Roman Garnett

Data is one of the most important assets in the information age, but rigorous methods for estimating its quality are lacking. Existing methods for quality estimation mostly focus on size or specific utilities; these are inadequate for most applications in the data economy since data is often collected laboriously and distributed broadly. According to one recent radical suggestion, data is of high quality if the set of models which explain it well has low variance, and of low quality otherwise. In this project we will revisit, revise, implement, and test this notion both in theory and in practice, aiming for a universal and practical measure of data quality. The project is mathematical in nature, but includes a software component.

Mathematical maturity is crucial, especially in probability, statistics, and linear algebra. Familiarity with mathematical programming tools is an advantage.
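
As a rough illustration of the variance-based notion of data quality described above, the sketch below scores a dataset by how much fitted models disagree with one another. This is only one possible reading, under explicit assumptions: the "set of models which explain the data well" is approximated by least-squares fits on bootstrap resamples, the spread is summarized by the trace of the coefficient covariance, and the function name `model_variance` is hypothetical rather than part of the project.

```python
import numpy as np

def model_variance(X, y, n_models=200, seed=0):
    """Total variance of least-squares coefficients across bootstrap resamples.

    Assumption: bootstrap fits approximate the set of models that explain
    the data well. Lower values are read as higher data quality.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coefs = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)                  # resample rows with replacement
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        coefs.append(w)
    coefs = np.asarray(coefs)                             # shape (n_models, n_features)
    return float(np.trace(np.cov(coefs, rowvar=False)))   # sum of per-coefficient variances

# Two synthetic datasets sharing the same inputs but with different label noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y_clean = X @ w_true + 0.05 * rng.normal(size=100)
y_noisy = X @ w_true + 2.0 * rng.normal(size=100)

score_clean = model_variance(X, y_clean)
score_noisy = model_variance(X, y_noisy)
print(f"clean: {score_clean:.6f}, noisy: {score_noisy:.6f}")
```

Under this toy measure, the dataset with less label noise yields models that agree more closely, so it receives the lower (better) score.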

### Active Learning for Math Education

Faculty: Dennis Barbour

The ability to learn is related to cognitive function. Cognitive function is conventionally evaluated with a one-time test battery. One-off testing reinforces trait-based (static) interpretations of cognitive ability. Cognitive function can actually wax or wane with other variables (state-based or dynamic). Inequities in educational environments can lead to biased interpretations of this variation in function. These biases can reinforce inequities, disadvantaging students further. This burden is disproportionately borne by underrepresented minorities.

We use active machine learning to construct hyperefficient multidimensional cognitive tests. Using these rapid tests in the classroom allows teachers to understand each day how prepared each student might be for that day’s lesson. This information can also be used to automate lesson design itself on a per-student basis. This project involves working on Python machine learning code, analyzing the results of prior tests, and integrating our algorithms into Unity-based cognitive tests for widespread deployment.

Either Python programming OR Unity development; some machine learning.

### Uncovering the “Hidden Half” of plants

Faculty: Tao Ju

Roots, the “hidden half” of a plant, play many important roles including physical support of the plant, uptake of water and nutrients, and stabilization of the soil. Their functions, as well as their amazingly complex structures, have intrigued biologists for centuries. With advanced imaging techniques such as CT and MRI, biologists are finally able to “see” these underground forms in 3D. However, computational methods are needed to extract relevant information from the images, such as identifying root branches, measuring their length and shape, understanding their organization and architecture, and analyzing them over time.

As part of an NSF-funded collaboration between three institutions, the REU student will join an interdisciplinary team of computer scientists, mathematicians, and biologists to build automated algorithms and interactive graphical tools for image-based analysis of plant roots. The central theme of the algorithms will be using geometric skeletons, a popular shape descriptor in computer graphics and vision, to model plant roots and to infer biological information.

Experience with C++ and Python, familiarity with OpenGL, good foundation in algorithms and data structures (particularly those related to graphs).

### Tracking Burnout among Physicians using EHR-based Audit Log Files

Faculty: Chenyang Lu

We are looking for one or more motivated undergraduate students with expertise in applying machine learning algorithms to large volumes of clinical activity data from physicians. Specifically, we are interested in developing models that (a) characterize and predict phenotypes or markers of physician clinical activity that contribute to burnout, and (b) assess the downstream effects of burnout on clinical outcomes (such as decision making and medical errors).

Development efforts will involve the creation of temporal/sequential models and exploratory analysis for developing phenotypes for clinical activity load and burnout.

The ideal candidate will have prior experience in working with large datasets (millions of rows of data), strong programming skills, and at least some exposure to machine learning/predictive modeling.