Ann Arbor Algorithms



We specialize in the design, implementation and integration of algorithms and software systems in computer vision, machine learning, signal processing and large-scale data processing. We are also experienced in system optimization: identifying and removing bottlenecks in space, time and accuracy. We help our customers rapidly evaluate and adopt the latest developments in these fields by customizing open-source software or by reimplementing algorithms from scratch.

We speak English, Python and C++, and deliver our software as modularized code packages, portable executables, Docker containers and AWS services.


When we introduce a new software technology to our customers, we also help train their existing engineers or new recruits and work closely with them, so that when our job is done there are people to carry on development and maintenance. We provide CPT/OPT training opportunities to students who want to pursue a career in software engineering or data science, or to apply the latest deep-learning technologies to their field of study.

We now offer a one-day, hands-on beginner TensorFlow training program that covers image annotation and basic model training (see codebase).


We have advised multiple startup companies on the design of technology and product roadmaps, the design of system architectures, the selection of platforms and toolchains, etc. We identify and interview candidates for our customers and help them build their engineering teams.


We maintain close relationships with academia and are involved in leading research in machine learning and its applications. Our current academic clients/collaborators include universities, particularly historically black colleges and universities.

Case Studies

Training Deep Convolutional Models

As both GPUs per system and TFLOPs per GPU grow rapidly, efficiently preprocessing and streaming training data to keep the GPUs busy is becoming an increasingly challenging problem. We developed PicPac, a C++ library to efficiently manage and stream massive amounts of training data. PicPac fully utilizes the high IOPS of SSD/NVMe storage to support out-of-core random shuffling and stratified sampling, and implements a plug-in framework of data transformation and augmentation to support various training tasks. PicPac's Python API is easy to use and is compatible with TensorFlow, PyTorch, MXNet and Caffe.
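To illustrate what stratified sampling means for a training stream, here is a minimal sketch in plain Python. It is not PicPac's API: the function name `stratified_batches` and its parameters are hypothetical, and it only shuffles an in-memory list of record indices, on the assumption that the records themselves stay on disk and are read by index.

```python
import random
from collections import defaultdict


def stratified_batches(labels, batch_size, seed=0):
    """Yield endless batches of record indices, cycling through the classes.

    labels[i] is the class of record i. Each class keeps its own shuffled
    index pool, which is reshuffled whenever it has been fully consumed.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for index, label in enumerate(labels):
        pools[label].append(index)
    classes = sorted(pools)
    for label in classes:
        rng.shuffle(pools[label])
    cursors = {label: 0 for label in classes}

    while True:
        batch = []
        for slot in range(batch_size):
            # Round-robin over classes so rare classes are not starved.
            label = classes[slot % len(classes)]
            pool = pools[label]
            if cursors[label] == len(pool):
                rng.shuffle(pool)  # start a new pass over this class
                cursors[label] = 0
            batch.append(pool[cursors[label]])
            cursors[label] += 1
        yield batch


# Hypothetical, highly imbalanced dataset: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
first_batch = next(stratified_batches(labels, batch_size=8, seed=1))
```

With two classes and a batch size of 8, every batch contains four indices from each class, regardless of how imbalanced the dataset is.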

Medical Imaging and Lesion Detection

We are experienced in deep learning with DICOM medical images, both 2D and 3D. We have developed deep-learning models to detect and segment lung cancer, breast cancer, multiple myeloma and other lesions. Our solutions are based on PicPac and have ranked high in multiple competitions. See our demo of carotid artery plaque segmentation and 3D reconstruction.

Example of lung nodule detection.

Content-Based Image Search Engine

We developed KGraph, one of today's fastest libraries for approximate nearest neighbor search (benchmark), and Donkey, a NoSQL feature vector database and toolkit for developing nearest neighbor search engines. Donkey supports KGraph and Locality Sensitive Hashing for indexing and provides an HTTP/RESTful API.
Leveraging KGraph, Donkey and the latest deep-learning models for feature extraction, we have helped a client in the UK implement a content-based image search engine that indexes tens of millions of images on a single server.
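For readers unfamiliar with Locality Sensitive Hashing, the sketch below shows the general idea using random hyperplanes for cosine similarity. It is a generic textbook variant in plain Python, not the implementation used in Donkey or KGraph; the class name `HyperplaneLSH` and all of its parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Approximate nearest neighbor index using random-hyperplane hashing."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        self.vectors = {}
        self.tables = []
        for _ in range(num_tables):
            planes = [[rng.gauss(0, 1) for _ in range(dim)]
                      for _ in range(num_bits)]
            self.tables.append((planes, defaultdict(list)))

    @staticmethod
    def _key(planes, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in planes)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        for planes, buckets in self.tables:
            buckets[self._key(planes, vec)].append(item_id)

    def query(self, vec, k=1):
        # Gather items sharing a bucket with the query in any table,
        # then rank only those candidates by exact cosine similarity.
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(self._key(planes, vec), []))
        ranked = sorted(candidates,
                        key=lambda i: cosine(vec, self.vectors[i]),
                        reverse=True)
        return ranked[:k]


index = HyperplaneLSH(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [-1.0, 0.0, 0.0])
nearest = index.query([1.0, 0.0, 0.0], k=1)
```

Vectors that point in similar directions tend to fall on the same side of most hyperplanes and so share buckets, which lets a query examine a small candidate set instead of the whole collection. More bits per table make buckets smaller and faster to scan; more tables recover neighbors that a single table would miss.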

Next Generation Sequencing

A2Genomics is our cloud platform for high-throughput sequencing data analysis and pattern discovery. Our pipeline efficiently processes massive NGS datasets, runs multiple algorithms including PCA, SVD, DESeq, k-means, SOM and WGCNA, and generates publication-quality visualizations.

Collaborative Filtering

We have helped a leading Chinese internet radio app with 70+ million users design and implement a recommendation system that mines user behavior and makes personalized recommendations online.

Radio Commercial Search and Discovery

We have helped our client in China develop audio fingerprinting algorithms and implement a system that indexes millions of hours of radio broadcast audio covering 100+ cities. The system provides an online search-by-example service and automatically discovers repeated audio clips in order to monitor for new advertisements.

Semantic Segmentation

We have been training deep-learning models for our customers since 2015, and the techniques we use have gone through many iterations, from Caffe to TensorFlow and PyTorch, and from FCN to U-Net and Mask R-CNN. Most of our tasks involve semantic segmentation of various kinds of data, e.g. ECG signals, CT/MRI volumes and video clips. The following video demonstrates our semantic segmentation capability.


CMS names 25 innovators advancing in AI Health Outcomes Challenge

Gold Medal in Data Science Bowl 2018

Silver Medal in Data Science Bowl 2017

Gold Medal in Data Science Bowl 2016

About Us

Ann Arbor Algorithms was founded by Wei Dong, PhD. Dr. Dong earned his PhD in computer science from Princeton University in 2011 and started doing business as Ann Arbor Algorithms in 2014.


Please contact Dr. Wei Dong for all inquiries.
Phone: 609-423-5844
Address: 2723 S. State St. Suite 150, Ann Arbor MI 48104