Boxuan Li

Master's Student

Carnegie Mellon University

Biography

I am an MCDS (Master of Computational Data Science) student at CMU (Carnegie Mellon University). Before pursuing a master's degree, I was an analyst in the Core Engineering Department at Goldman Sachs (GS) for two years. My team built in-house monitoring tools on top of a graph database.

I am a Committer and Technical Steering Committee Member for JanusGraph, an open-source project that is one of the most popular graph databases. I was a maintainer of coala, another open-source project. I was a Google Summer of Code 2018 student, a Google Code-in 2018 mentor, and a Google Summer of Code 2019 mentor.

Before joining GS, I received my bachelor's degree in Computer Science from The University of Hong Kong (HKU). I was supervised by Dr. Heming Cui in the Systems and Networking Group and Dr. Reynold Cheng in the Data Engineering Group. I also spent a semester as an exchange student at the University of Toronto, where I met and worked with Prof. Peter Marbach.

Interests

  • Distributed Systems
  • Graph Databases

Education

  • Master of Computational Data Science, 2021-2022

    Carnegie Mellon University

  • BEng in Computer Science (First Class Honors), 2015-2019

    The University of Hong Kong

  • Exchange Student, 2018

    University of Toronto

  • Summer Student, 2016

    University of California, Berkeley

Work Experience

Adventure in industry

Software Engineer

Goldman Sachs

Jul 2019 – Jul 2021 Hong Kong

Tech Stack: Java, JanusGraph, Terraform, Hadoop, Cassandra, MongoDB, Elasticsearch, Spring

• Contributed to building a large graph-based topology monitoring system used by 30+ teams

• Developed a Terraform provider that lets users manage graph resources as Infrastructure as Code (IaC), reducing the manual effort of comparing and updating the graph by 90%

• Optimized core graph queries, reducing average query latency by 50%

• Implemented a Spark Streaming application to transform and ingest ~10M process telemetry records daily, enabling resiliency monitoring and quick incident troubleshooting

• Built a framework on top of Hadoop MapReduce to run OLAP queries against ~20M vertices and edges, reducing latency of analytical queries by 95%

• Built a RESTful microservice on Kubernetes that auto-scales based on event metrics, reducing hardware resource usage by half on average

• Refactored the entire Java codebase with Spring, enabling inversion of control and dependency injection

• Independently mentored an intern to develop a Maven plugin to generate and validate IaC resources

Software Engineer Intern

YITU Technology

May 2017 – Aug 2017 Shanghai, China

Tech Stack: C++, MongoDB, Python

• Developed cold and hot backup modules for a large, distributed image recognition platform with 1 billion data entries

• Built RESTful web services in C++ to process image feature extraction, storage, and retrieval requests at high QPS

• Implemented an automated performance and failover testing pipeline in Python, reducing manual effort by 80%

Publications

(2020). MC-Explorer: Analyzing and Visualizing Motif-Cliques on Large Networks. 36th IEEE International Conference on Data Engineering (ICDE 2020) Demo Track.

(2020). Stable community structures and social exclusion. International Conference on Social Informatics.

(2018). PLOVER: Fast, Multi-core Scalable Virtual Machine Fault-tolerance. 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18).

Research Projects

PLOVER: Virtualized State Machine Replication System

PLOVER is the first virtualized SMR (VSMR) system to achieve fast, multi-core scalable virtual machine fault tolerance

Discovering Maximal Motif Cliques in Large Heterogeneous Information Networks

A maximal motif-clique discovery algorithm

MC-Explorer: Analyzing and Visualizing Motif-Cliques on Large Networks

A web application for motif clique search and interactive graph analysis

Stability Analysis of Information Communities under Perturbations

A mathematical model for analyzing the stability of Nash equilibria in community networks

Awards

Silver Medal in IEEE-CIS Fraud Detection Competition

Rank 124/6381 (Top 2%)

Silver Medal in Instant Gratification Competition

Rank 34/1832 (Top 2%)

Dean’s Honours List

HKU Foundation Scholarships for Outstanding Students