My main research interests can be summarized under Machine Learning with Graphs. This includes Graph Mining, Gaussian Processes, Random Walks, Information Retrieval, Within-Network Relational Learning, Statistical Relational Learning (SRL), and the application of ML/SRL methods to solve problems in Robotics, Biology, Medicine, and Agriculture.


Benchmark Data Sets for Graph Kernels

This graph database contains collected benchmark data sets for the evaluation of graph kernels. The data sets were collected by Kristian Kersting, Nils M. Kriege, Christopher Morris, Petra Mutzel, and Marion Neumann with partial support of the German Science Foundation (DFG) within the Collaborative Research Center SFB 876 “Providing Information by Resource-Constrained Data Analysis”, project A6 “Resource-efficient Graph Mining”.

Research Projects

PhD Thesis

Before joining Wash U, I was a PhD Student in the Knowledge Discovery and Machine Learning research group at the University of Bonn (2010–2015). I also worked on projects in the STREAM (Statistical Relational Activity Mining) Group at Fraunhofer IAIS and the BIT (Bonn-Aachen International Center for Information Technology) at the University of Bonn.

My PhD topic was “Learning with Graphs using Kernels from Propagated Information”, supervised by Prof. Dr. Kristian Kersting. The members of my PhD committee were Prof. Dr. Stefan Wrobel, Prof. Dr. Christian Bauckhage, and Prof. Dr. Klaus Greve.


Classification of plant diseases – SmartDDS (Smart Disease Detection System), German Federal Office for Agriculture and Food (BLE)

“Images are GRAPHS!”

Modern communication and sensor technology coupled with powerful pattern recognition algorithms for information extraction and classification allow the development and use of integrated systems to tackle environmental problems. This integration is particularly promising for applications in crop farming, where such systems can help to control growth and improve yields while minimizing harmful environmental impacts. Thus, the vision of sustainable agriculture for anybody, anytime, and anywhere in the world can be brought within reach. Our research focuses on plant disease classification based on cell phone images, a novel way to supply farmers with personalized information and processing recommendations in real time. Several statistical image features and a novel scheme of measuring local textures of leaf spots yield promising results. We have data for two important agricultural crop varieties, wheat and sugar beet, to evaluate the classification of disease symptoms caused by various fungi or bacteria.


Task-dependent robot grasping – First–MM (Flexible Skill Acquisition and Intuitive Robot Tasking for Mobile Manipulation in the Real World), FP7 EU Project

“Point clouds are GRAPHS!”

Robot grasping is a critical and difficult problem in robotics. The problem of simply finding a stable grasp is difficult enough, but to perform a useful grasp, we must also consider other aspects of the task: the object, its properties, and any task-related constraints. The choice of grasping region is highly dependent on the category of the object, and the automated prediction of object category is the problem we focus on here. In our research, we consider manifold information and semantic object parts in a graph kernel to predict the categories of a large variety of household objects such as cups, pots, pans, bottles, and various tools. The similarity-based category prediction is achieved by employing propagation kernels on graph representations of 3D point clouds of objects. Our work highlights the importance of moving towards structured machine learning approaches in order to achieve the dream of autonomous and intelligent robot grasping: learning to map low-level visual features to good grasping points under consideration of object–task affordances and high-level world knowledge.
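
To make the idea of comparing graphs via propagated information more concrete, here is a minimal sketch in the spirit of propagation kernels. It is an illustration under simplifying assumptions, not the implementation used in the project: graphs are given as dense adjacency matrices with integer node labels (for point clouds, these could be nearest-neighbor graphs with semantic part labels), node label distributions are spread with a row-normalized random walk, and nodes are binned with a single random projection per iteration. The function name propagation_kernel and all parameter names and defaults are made up for this example.

```python
from collections import Counter

import numpy as np


def propagation_kernel(graphs, num_labels, t_max=3, bin_width=0.01, seed=0):
    """Gram matrix for a list of (adjacency, labels) pairs (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    transitions, dists = [], []
    for adj, labels in graphs:
        adj = np.asarray(adj, dtype=float)
        deg = adj.sum(axis=1, keepdims=True)
        # Row-normalized random-walk transition matrix (isolated nodes guarded).
        transitions.append(adj / np.maximum(deg, 1.0))
        # One-hot label distributions as the starting point of the propagation.
        dists.append(np.eye(num_labels)[np.asarray(labels)])

    n = len(graphs)
    K = np.zeros((n, n))
    for _ in range(t_max + 1):
        # One random projection per iteration, shared by all graphs, so that
        # equal node distributions land in the same bin in every graph.
        w = rng.standard_normal(num_labels)
        b = rng.uniform(0.0, bin_width)
        counts = [
            Counter(np.floor((P @ w + b) / bin_width).astype(int).tolist())
            for P in dists
        ]
        # Kernel contribution: number of node pairs that share a bin.
        for i in range(n):
            for j in range(i, n):
                k_ij = sum(c * counts[j][h] for h, c in counts[i].items())
                K[i, j] += k_ij
                K[j, i] = K[i, j]
        # Propagate the label distributions one step along the edges.
        dists = [T @ P for T, P in zip(transitions, dists)]
    return K
```

The resulting matrix can be passed to any kernel method that accepts a precomputed Gram matrix, for example a support vector machine for category prediction. The bin width controls how similar two node distributions must be to count as a match, and would need to be tuned for real data.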


Traffic frequency prediction – Frequenzatlas (Traffic Frequency Prediction for Germany), German Association of Outdoor Advertising (FAW)

“Street networks are GRAPHS!”

Triggered by a market-relevant application that involves making joint predictions of pedestrian and public transit flows in urban areas, our research addresses the question of how to utilize hidden common cause relations among variables of interest in order to improve performance in the two related regression tasks. Specifically, we propose stacked Gaussian process learning, a meta-learning scheme in which a base Gaussian process is enhanced by adding the posterior covariance functions of other related tasks to its covariance function in a stage-wise optimization. The idea is that the stacked posterior covariances encode the hidden common causes among variables of interest that are shared across the related regression tasks. Stacked Gaussian process learning is efficient, capable of capturing shared common causes, and can be implemented with any kind of standard Gaussian process regression model such as sparse approximations and relational variants. Our experimental results on real-world data from the market-relevant application show that stacked Gaussian process learning can significantly improve the prediction performance of a standard Gaussian process.
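
The stacking step can be sketched in a few lines. The following is a simplified illustration, not the project code: it assumes a squared-exponential base kernel with fixed hyperparameters (the stage-wise hyperparameter optimization described above is omitted), and the helper names rbf, posterior_cov_fn, stacked_kernel, and gp_predict_mean as well as the weights argument are hypothetical. The posterior covariance function of a related task is computed from that task's training inputs and then added to the base covariance of the target task.

```python
import numpy as np


def rbf(A, B, length=1.0, var=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return var * np.exp(-0.5 * d2 / length**2)


def posterior_cov_fn(X, kernel, noise):
    """Posterior covariance function of a GP trained on inputs X."""
    L = np.linalg.cholesky(kernel(X, X) + noise * np.eye(len(X)))

    def k_post(A, B):
        # k(A, B) - k(A, X) (K + noise * I)^{-1} k(X, B), via the Cholesky factor
        va = np.linalg.solve(L, kernel(X, A))
        vb = np.linalg.solve(L, kernel(X, B))
        return kernel(A, B) - va.T @ vb

    return k_post


def stacked_kernel(base, others, weights):
    """Base covariance plus weighted posterior covariances of related tasks."""

    def k(A, B):
        return base(A, B) + sum(w * o(A, B) for w, o in zip(weights, others))

    return k


def gp_predict_mean(X, y, X_new, kernel, noise):
    """Standard GP regression predictive mean at X_new."""
    K = kernel(X, X) + noise * np.eye(len(X))
    return kernel(X_new, X) @ np.linalg.solve(K, y)
```

A usage sketch for two related tasks: build k_post = posterior_cov_fn(X_transit, rbf, 0.1) from the inputs of one task, form k = stacked_kernel(rbf, [k_post], [1.0]), and call gp_predict_mean(X_pedestrian, y_pedestrian, X_new, k, 0.1) for the other. Because a sum of valid covariance functions is again a valid covariance function, the stacked model remains an ordinary Gaussian process and works with standard inference code.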