Publications

Evaluating Shared Memory Heterogeneous Systems Using Traverse-compute Workloads

Published in Open-Source Computer Architecture Research 2023, 2023

Abstract

Redwood: Flexible and Portable Heterogeneous Tree Traversal Workloads

Published in 2023 IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Abstract

Shared memory heterogeneous systems are now mainstream, with nearly every mobile phone and tablet containing integrated processing units. However, developing applications for such devices is difficult as workloads must be decomposed across different processing units, and the decomposition must be flexible to account for the growing diversity of devices, each with different relative processing unit throughput. Furthermore, many devices require distinct programming front ends, requiring significant effort to write cross-platform applications. In this work, we identify a pragmatic class of applications, which we call traverse-compute applications, that are ideal for shared memory heterogeneous systems. These applications have a flexible heterogeneous decomposition where CPUs excel at traversing a tree structure, while accelerators excel at node computations. Leveraging this insight, we present Redwood: a framework for writing heterogeneous traverse-compute workloads. Redwood provides a simple processing unit abstraction and a tree traversal library that enables heterogeneous optimizations. Using Redwood, we implement Grove, a benchmark suite containing nine pragmatic tree traversal applications, e.g., k-nearest neighbors. We instantiate Redwood for three different heterogeneous programming platforms: CUDA, SYCL, and High-Level Synthesis; we use Grove to evaluate five shared memory heterogeneous systems. Our evaluation highlights the importance of flexible heterogeneous decomposition as the optimal parameters differ widely across platforms and applications. However, once optimally configured, heterogeneous implementations can provide up to 13.53x speedups (geomean of 3.01x) over homogeneous implementations, showcasing the potential of heterogeneous computing for these workloads.

A Modular Architecture for Procedural Generation of Towns, Intersections and Scenarios for Testing Autonomous Vehicles

Published in 2020 IEEE Intelligent Vehicles Symposium (IV), 2020

Abstract

Simulation-based testing is critical for ensuring the safety of autonomous vehicles. Autonomous vehicles are enabled by deep learning techniques, which require a large quantity of data. With simulation testing, we can create rare events for testing and training autonomous vehicles. Procedural generation of roads and modeling of driving behaviors in an easily extensible architecture ensure that we can create rare scenarios at scale with minimal artistic burden. In this paper, we present CruzWay, a system that both supports and creates these scenarios. With CruzWay, we can procedurally generate town-sized road networks or road intersections. CruzWay supports generation of road meshes as well as navigation meshes from SUMO road network files. CruzWay can generate cars as well as pedestrians run by behavior trees (BTs) in this environment. The self-contained, modular nature of BTs, in combination with procedural roads, allows us to create a large number of scenarios.

Yanwen Xu
