CSE REU Projects for 2024

Miniature Urban Autonomous Driving

Faculty: Yevgeniy Vorobeychik

We are developing an urban autonomous driving platform, WU-mini-city. The goal for this summer is to develop and hone the basic self-driving capabilities of a miniature autonomous car (MIT Racecar/J architecture), including lane following and obstacle avoidance. A significant practical challenge is that many such tasks require non-trivial computation. The research task is to develop approaches that are efficient enough, and low enough in latency, to run on the computing platforms on board the miniature vehicles. We will explore deep reinforcement learning approaches as well as more conventional model-based approaches to solve these problems.

Skills Required: Strong programming and mathematical background. Knowledge of machine learning foundations, both conceptual and practical. Significant experience with Python, including Python ML libraries. Prior experience with deep learning, reinforcement learning, and/or robotics is a strong plus.

Predicting Child Maltreatment

Faculty: Yevgeniy Vorobeychik

Child abuse and neglect can have devastating long-term consequences, negatively affecting a child’s developmental, mental, and social well-being, leading to heightened rates of mortality and morbidity. The goal of this project is to develop a computational model of family engagements with child protective services (CPS) across all children born to the same mother, considering familial attributes as well as prior maltreatment responses by CPS. In addition, the student will develop a machine learning approach to predict maltreatment as well as the impact of CPS interventions on child welfare.

Skills Required: Strong mathematical and programming skills. Knowledge of and experience with Python. Prior experience with machine learning is a plus.

Explainable Planning and Scheduling

Faculty: William Yeoh

In human-aware planning and scheduling systems, when the agent recommends a plan or schedule to a human user, it is often the case that the user might not understand why the recommendation is good, for example, compared to an alternative in the user’s mind. In such a scenario, there is a need for the agent to explain its recommendation to the user, providing them with the necessary information to understand properties of the recommendation (e.g., optimality, feasibility, etc.).

In this REU project, students will have the opportunity to investigate solution approaches from a wide spectrum, ranging from symbolic logic-based approaches that use knowledge representation and reasoning (KR) to data-driven approaches that use large language models (LLMs), as well as neuro-symbolic approaches that combine the benefits of both.

Skills Required: Strong programming skills. Familiarity with logic and/or LLMs is a plus.

MechWorks platform for Earthquake Engineering

Faculty: Chris Gill

We are developing a new concurrency platform atop multicore, GPU, and FPGA devices, which integrates real-time scheduling and control for earthquake engineering experiments that will be conducted in collaboration with colleagues at Purdue University. Depending on their interests and background, undergraduate students will be involved in different aspects of the project that may include C++ programming, task set generation and testing, performance and scalability assessment, and porting code among different hardware devices. Depending on outcomes of the project over the summer, REU students may have opportunities to co-author papers with faculty and graduate students working on the project, which has happened in previous years of our collaborations.

Skills Required: Background in C or C++ programming and/or familiarity with multicore, GPU, and/or FPGA platforms is preferred. Background in CUDA, control theory, and/or real-time systems is beneficial but not required.

Hyperdimensional computing and linear codes

Faculty: Netanel Raviv

Hyperdimensional Computing (HDC) is an emerging computational paradigm for representing compositional information as high-dimensional vectors, and has promising potential in applications ranging from learning with neural networks to neuromorphic computing. In a radical shift from traditional information processing methods, in HDC one represents objects as random binary vectors, and uses algebraic operations on those vectors to store data structures and apply learning algorithms.

Recently, linear error-correcting codes have been shown to provide incredible speedups in a variety of learning and information processing tasks in HDC. In this project we will explore the theory and practice of using linear codes in HDC, and examine the connection to associative memories. We will develop algorithms, prove bounds, and test them in real-world learning applications.
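As a rough illustration of the HDC representation described above, the sketch below stores a small key-value record as a single binary vector and then queries it. It assumes one common set of conventions that this listing does not specify: binding by XOR, bundling by coordinate-wise majority vote, and similarity measured as the fraction of matching bits. The dimension, the field names, and all function names are hypothetical choices made for the example, and linear codes are not used here.

```python
import random

DIM = 10_000  # vector dimensionality; an illustrative choice, not from the project


def random_hv(rng):
    """Draw a random binary hypervector of length DIM."""
    return [rng.getrandbits(1) for _ in range(DIM)]


def bind(a, b):
    """Bind two hypervectors with coordinate-wise XOR (its own inverse)."""
    return [x ^ y for x, y in zip(a, b)]


def bundle(vectors):
    """Superimpose hypervectors by majority vote per coordinate.

    An odd number of inputs avoids ties.
    """
    n = len(vectors)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*vectors)]


def hamming_similarity(a, b):
    """Fraction of coordinates on which two hypervectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
keys = {name: random_hv(rng) for name in ("color", "shape", "size")}
values = {name: random_hv(rng) for name in ("red", "round", "small")}

# Store the record {color: red, shape: round, size: small} in one vector.
record = bundle([
    bind(keys["color"], values["red"]),
    bind(keys["shape"], values["round"]),
    bind(keys["size"], values["small"]),
])

# Query "what is the color?": unbind with the key, then look for the
# stored value that is most similar to the noisy result.
noisy = bind(record, keys["color"])
best = max(values, key=lambda name: hamming_similarity(noisy, values[name]))
print(best, hamming_similarity(noisy, values["red"]))
```

With three bundled pairs, the unbound vector agrees with the correct value on about three quarters of its coordinates in expectation, while unrelated values agree on about half, so a nearest-neighbor lookup recovers the stored value.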

Skills Required: An ideal candidate will have strong mathematical skills, especially in linear algebra, finite fields, and probability. The student must be comfortable reading and writing proofs, be able to think critically and algorithmically, and be capable of implementing simple learning algorithms in Python. Familiarity with error-correcting codes or associative memories (Hopfield networks) is appreciated.

Automating cyber-physical attack and defense

Faculty: Ning Zhang

Cyber-physical systems, such as autonomous vehicles, are revolutionizing sectors of our society from manufacturing to transportation. While the industry is excited about the potential of such systems with pervasive connectivity, security in these safety-critical cyber-physical systems remains a major concern for users, developers, and lawmakers. The goal of this project is to develop the platform and attacks to enable verification of system defenses.

In this project, the REU student will work with Ph.D. students on one or more of the following tasks: (1) reproducing existing attacks and defenses on real CPS platforms, including cyber and physical methods, and (2) building a hardware-in-the-loop simulation. Depending on their interests, REU students can choose to develop their knowledge of the kernel, hacking, and/or adversarial machine learning.

We are looking for multiple REU students to work on this project.

Skills Required: Strong programming skills and familiarity with C++ or Python. Some understanding of the Linux systems stack (e.g., scheduling, network, perf subsystems) or adversarial machine learning is preferred. Prior experience with robots, simulation, or physical-world realizable adversarial examples is a plus.

Understanding Large Language Models

Faculty: Chenguang Wang

Large language models (LLMs) such as GPT-4, Gemini, and Llama have emerged as a transformative technology in the field of natural language processing. These models, powered by deep learning techniques, can understand and generate human-like text, revolutionizing applications such as machine translation, question-answering systems, and text summarization. One fundamental challenge in the field is that we still lack an understanding of these models’ important properties, behaviors, and emerging abilities such as mathematical reasoning and creative writing. This leads to severe issues such as hallucination and unsafe behavior, which hinder their deployment in critical applications. In this project, we will explore both theoretical and empirical approaches toward a better understanding of LLMs. Based on the findings, we also plan to develop next-generation models.

Skills Required: Strong mathematics and programming skills. Research experience or publications in relevant fields. Excellent GPA/Rank.

High Performance Computing to Benefit Scientific Observation

Faculty: Roger Chamberlain, Jeremy Buhler, Ron Cytron

The task of gathering data from scientific instruments — such as telescopes, particle sensors, and DNA sequencers — can benefit from advances in computational techniques. Instruments can report their observations faster and more reliably, and large volumes of observation data can be processed in real time within the instrument to make rapid discoveries and direct follow-up observations. Ultimately, more powerful computation can dramatically increase the ability of our instruments to answer questions about the world around us.

In this project, we work with experts in multiple domains of science (e.g., astrophysics, aerosol chemistry, molecular biology) to improve the performance and power efficiency of computations that benefit scientific observation. Because computations must occur within or adjacent to the observing instrument, they must meet strict size, weight, and power (SWaP) constraints as well as achieve goals for application throughput and latency. To succeed within these constraints, we often turn to low-power multicores and to non-traditional computer architectures (FPGAs, GPUs) to accelerate an application.

Skills Required: Programming ability; C/C++ experience is a plus but not strictly required.

Game-Theoretic Approaches for Baseball

Faculty: Yevgeniy Vorobeychik

Quantitative analytics has transformed sports over the last decade. To date, these analytic approaches are statistical at their core, characterizing what is and what was, while using this information to drive decisions about what to do in the future. However, as we often view team sports, such as soccer, hockey, and baseball, as pairwise win-lose encounters, it seems natural to model these as zero-sum games.

In this project, the students will develop a game-theoretic approach for computing pitching strategies for an inning of a baseball game and, subsequently, for an entire game. These approaches will extend past work on solving a single at-bat modeled as a stochastic game. In particular, we will explore efficient abstraction techniques for an inning of a game that leverage the fact that it can be decomposed into a series of at-bats.
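To make the zero-sum framing concrete, here is a minimal sketch that solves a one-shot 2x2 zero-sum matrix game in closed form. It is far simpler than the stochastic-game model of an at-bat mentioned above and is not the project's method. The payoff numbers, pitch types, and function name are hypothetical: each entry is an assumed probability that the pitcher records an out, given the pitch thrown (row) and the pitch the batter anticipates (column).

```python
from fractions import Fraction


def solve_2x2_zero_sum(a, b, c, d):
    """Solve the zero-sum game [[a, b], [c, d]] for the row (maximizing) player.

    Returns (p, q, value), where p is the probability of playing row 1,
    q is the probability that the column player plays column 1, and
    value is the expected payoff to the row player under optimal play.
    """
    row_mins = (min(a, b), min(c, d))
    col_maxs = (max(a, c), max(b, d))
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        # Saddle point: both players have an optimal pure strategy.
        p = Fraction(1) if row_mins[0] == maximin else Fraction(0)
        q = Fraction(1) if col_maxs[0] == minimax else Fraction(0)
        return p, q, maximin
    # No saddle point: each player mixes so the opponent is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical out probabilities (illustrative numbers only).
# Rows: pitcher throws fastball / curveball.
# Columns: batter anticipates fastball / curveball.
a, b = Fraction(3, 5), Fraction(4, 5)
c, d = Fraction(9, 10), Fraction(1, 2)

pitch_mix, batter_mix, game_value = solve_2x2_zero_sum(a, b, c, d)
print(pitch_mix, batter_mix, game_value)
```

Under these assumed payoffs, neither pitch is best against both batter guesses, so the pitcher's optimal strategy is randomized. The mix is chosen so that the batter gains nothing by anticipating one pitch over the other.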

Skills Required: Strong mathematical skills and strong Python programming skills, as well as an appreciation (or at least good knowledge) of baseball. Some background in game theory and/or sports analytics is a plus.

Computer Vision for Plant Ecology

Faculty: Nathan Jacobs

The field of computer vision has changed dramatically in the past ten years, making it possible to automatically extract detailed information from vast sets of images. While challenges remain, this presents an opportunity to improve our understanding of earth processes and their impact on plants and animals. The Multimodal Vision Research Laboratory (MVRL) is developing computer vision techniques to address pressing problems in ecology in close collaboration with domain experts. The focus of this project is on developing methods to assist botanists in processing herbarium specimens (plants that are collected in the field, dried, and stored in archives). There is a vast global effort to digitize these specimens to support research on climate change and climate resilience, but automated tools are needed to better make use of this imagery. We anticipate two students working in collaboration on different aspects of the problem (e.g., triaging images to support routing to the appropriate expert, improving estimates of geographic location for specimens without precise GPS, etc.).

Skills Required: Experience with Python programming, data wrangling, and machine learning is required. Experience with computer vision, image processing, and deep learning would be helpful but is not required.