Running machine learning (ML) models in Lens Studio and on Spectacles expands creators' ability to design powerful and robust AR experiences. However, one of the biggest constraints on running ML models on device is power consumption. To minimize power consumption, the best option is to run ML models on a digital signal processor (DSP), the most power-efficient compute block (IP) available.

Until now, SnapML has supported only float models. To run on the DSP, however, models must be quantized. We are therefore introducing quantization support through SnapML.

There are three major benefits for creators to use quantized models:

  • Faster inference

  • Smaller model size

  • Increased power efficiency

Quantization reduces model size while also improving a model's inference speed, thanks to fixed-point computation and lower memory bandwidth usage.
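To make the size and precision trade-off concrete, here is a minimal sketch of 8-bit affine quantization, the scheme commonly used for int8 inference. The function names and the sample tensor are illustrative, not part of the SnapML API: each float is mapped to an int8 value through a scale and zero point, shrinking storage 4x relative to float32 while keeping the reconstruction error within one quantization step.

```python
import numpy as np

def quantize_int8(x):
    # Affine (asymmetric) quantization: map floats to int8 via a scale and zero point.
    scale = (x.max() - x.min()) / 255.0
    zero_point = int(round(-128 - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    # Recover an approximation of the original floats.
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([-1.0, 0.0, 0.5, 2.1], dtype=np.float32)
q, scale, zp = quantize_int8(x)
x_hat = dequantize_int8(q, scale, zp)

# int8 storage is 4x smaller than float32, and the round-trip error
# stays within one scale step of the original values.
assert np.all(np.abs(x - x_hat) <= scale)
```

Fixed-point hardware such as a DSP operates directly on the int8 values, which is where the speed and power benefits come from.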

We have updated our SnapML frameworks to include a post-training quantization pipeline in Lens Studio, consisting of Python libraries that enable users to quantize their models.
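The exact pipeline API is specific to Lens Studio, but for orientation, standard TensorFlow Lite post-training full-integer quantization looks roughly like the sketch below. The tiny Keras model and the random representative data are stand-ins for a creator's trained network and calibration samples.

```python
import numpy as np
import tensorflow as tf

# A tiny example model standing in for the creator's trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Representative samples let the converter calibrate activation ranges.
def representative_data():
    for _ in range(10):
        yield [np.random.rand(1, 4).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Request full-integer (int8) kernels, as required for DSP execution.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `.tflite` file contains int8 weights and activations and can then be imported into Lens Studio.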

With this update, SnapML now supports quantized models through the TensorFlow Lite importer. Creators will be able to leverage quantized model templates from our template library, including a new Multi Class Classification Template, and will have access to our GitHub repo with the quantized training code to get started.