Announcing librunecoral - Runes on Google Coral
A few years ago, Google launched Coral.ai, a platform (both hardware and software tools) for building local AI.
Google defines Coral as:
a complete toolkit to build products with local AI. Our on-device inferencing capabilities allow you to build products that are efficient, private, fast and offline.
image credit: coral.ai
Coral is a suite of hardware products that is ready to run Google's TensorFlow Lite out of the box. Our team is a big fan of these products because they are great both for prototyping and for productionizing use cases.
One of the key differentiators in their stack is the presence of an Edge TPU processor.
Coral provides a complete platform for accelerating neural networks on embedded devices, and at its heart is the Edge TPU: a small-yet-mighty, low-power ASIC that provides high-performance neural net inferencing.
Edge TPU is Google’s purpose-built ASIC designed to run AI at the edge. It delivers high performance in a small physical and power footprint, enabling the deployment of high-accuracy AI at the edge. (ref)
As described on Wikipedia, a Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning, particularly using Google's own TensorFlow software.
HOT-G’s Rune on Coral
The magic sauce behind our company's mission is supporting edge computing on as many devices as possible over time. So far, our use cases have focused on running Runes on mobile, browser, desktop, Arduino, and Linux-based systems such as the Raspberry Pi. We are working on bringing support for other platforms over time, including ones that require integration with blockchains.
Today, we are excited to launch support for Google Coral - Runes can now be compiled to run on Google Coral devices.
A few key benefits of using Rune (which is open source) are the following:
You can build your edge AI app and target multiple types of devices we support
Perform MLOps on the edge - Rune provides semantics for configurable and programmable data pipelines, all of which run on the edge
Get security, monitoring, and eventually observability with our Hammer Forge product (currently in closed beta; public launch at the end of November 2021).
If you are interested in early access to our observability, monitoring, and security platform for edge computing with Runes, drop a note to akshay@hotg.ai
In the remainder of this post, we will describe the technical details of how we ported Rune to Coral.
Librunecoral
Over the last few weeks, we have spent a lot of time and effort making librunecoral. It originally started as a simple wrapper over libcoral - Google's library for leveraging hardware acceleration on its Coral devices - but after the initial proof of concept it evolved into our replacement for the tflite crate used by the Rune project.
This wasn't just another case of Not Invented Here syndrome, though. Over the last 9 months of working with TensorFlow Lite via the tflite crate, we've encountered some shortcomings that can be resolved by creating our own bindings to TensorFlow Lite tailored to the Rune project's needs.
Here is a taste of what we were seeing:
Google does not provide API/ABI compatibility guarantees between libedgetpu, libcoral, and libtensorflow-lite (google-coral/libedgetpu#26)
There is no way to statically link to libedgetpu or libcoral on all our target platforms using CMake (specifically, iOS)
The tflite crate links to an older version of TensorFlow Lite and can't be cross-compiled to mobile platforms (boncheolgu/tflite-rs#49)
Rune doesn't need access to the full TensorFlow Lite API, so we were able to reduce the API surface quite a bit:
// Returns an int with all the backends that are available
int availableAccelerationBackends();

// Load a model, using its "mimetype" to figure out what format the model is in
// (only "application/tflite-model" is accepted at this time), and then create
// an interpreter for running the model
RuneCoralLoadResult create_inference_context(
    const char *mimetype, const void *model, size_t model_len,
    const RuneCoralAccelerationBackend backend,
    RuneCoralContext **inferenceContext);

// Returns the number of opcodes currently used
size_t inference_opcount(const RuneCoralContext *const inferenceContext);

// Returns the number of input tensors of the current inference context,
// and updates tensors to point to them
size_t inference_inputs(const RuneCoralContext *const inferenceContext,
                        const RuneCoralTensor **tensors);

// Returns the number of output tensors of the current inference context,
// and updates tensors to point to them
size_t inference_outputs(const RuneCoralContext *const inferenceContext,
                         const RuneCoralTensor **tensors);

// Frees all the resources allocated for a context
void destroy_inference_context(RuneCoralContext *inferenceContext);

// Runs inference on the model with the inputs provided and collects the outputs
RuneCoralInferenceResult infer(RuneCoralContext *context,
                               const RuneCoralTensor *inputs, size_t num_inputs,
                               RuneCoralTensor *outputs, size_t num_outputs);
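To give a feel for how these calls fit together, here is a minimal usage sketch in C. Treat it as an illustration rather than copy-paste sample code: the header name, the meaning of the return values, the backend constant, and the way data is attached to a RuneCoralTensor are assumptions here, since those details live in the actual librunecoral headers.

#include <stdio.h>
#include <stdlib.h>
#include "runecoral.h"   // assumed header name for the librunecoral C API

int main(void) {
    // Read a .tflite model into memory (error handling trimmed for brevity).
    FILE *f = fopen("model.tflite", "rb");
    fseek(f, 0, SEEK_END);
    size_t model_len = (size_t)ftell(f);
    rewind(f);
    void *model = malloc(model_len);
    fread(model, 1, model_len, f);
    fclose(f);

    // Create an inference context for the model. The backend value 0 is a
    // placeholder; pass whichever RuneCoralAccelerationBackend you want.
    RuneCoralContext *ctx = NULL;
    RuneCoralLoadResult loaded = create_inference_context(
        "application/tflite-model", model, model_len,
        (RuneCoralAccelerationBackend)0, &ctx);
    (void)loaded;          // real code would check this result
    if (ctx == NULL) {
        fprintf(stderr, "failed to load the model\n");
        return 1;
    }

    // Ask the context for its input and output tensors.
    const RuneCoralTensor *inputs = NULL;
    const RuneCoralTensor *outputs = NULL;
    size_t num_inputs = inference_inputs(ctx, &inputs);
    size_t num_outputs = inference_outputs(ctx, &outputs);
    printf("%zu inputs, %zu outputs\n", num_inputs, num_outputs);

    // Fill the input tensors with data, then run inference. The fields of
    // RuneCoralTensor are not shown in this post, so that step is elided,
    // and the const cast below is only to keep the sketch short.
    RuneCoralInferenceResult result = infer(
        ctx, inputs, num_inputs,
        (RuneCoralTensor *)outputs, num_outputs);
    (void)result;          // real code would check this, too

    destroy_inference_context(ctx);
    free(model);
    return 0;
}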
So, what goodies does librunecoral bring for you?
1) We can now support a lot more platforms:
x86 Windows (x86_64-pc-windows-msvc)
x86 Linux (x86_64-unknown-linux-gnu)
ARM Linux (aarch64-unknown-linux-gnu)
ARM Android (aarch64-linux-android)
iOS (aarch64-apple-ios)
x86 macOS (x86_64-apple-darwin)
M1 macOS (aarch64-apple-darwin) - coming soon.
2) We provide hardware acceleration where available. We currently support GPU and TPU acceleration on Linux/Android, with support for other platforms in the works (see the backend-probing sketch after this list).
3) We now statically link everything into Rune, so building and distributing your application with Runes will be less painful. You just need to copy around one executable!
4) We now get to use the latest and greatest TensorFlow Lite in all your Runes.
5) We now have a roadmap to do more cleanups and provide ML pipelines optimized for your use case.
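To make point 2 concrete, a device can be probed for acceleration support at runtime through availableAccelerationBackends(). Below is a short sketch that assumes the returned int is a bitmask; the flag constants are hypothetical stand-ins for the real RuneCoralAccelerationBackend values.

#include <stdio.h>
#include "runecoral.h"   // assumed header name for the librunecoral C API

// Hypothetical flag values purely for illustration; the real constants are
// defined by the librunecoral headers.
#define BACKEND_CPU (1 << 0)
#define BACKEND_GPU (1 << 1)
#define BACKEND_TPU (1 << 2)

int main(void) {
    int backends = availableAccelerationBackends();

    // Prefer the most capable backend this device offers, falling back to CPU.
    if (backends & BACKEND_TPU) {
        printf("Edge TPU acceleration available\n");
    } else if (backends & BACKEND_GPU) {
        printf("GPU acceleration available\n");
    } else {
        printf("falling back to CPU inference\n");
    }
    return 0;
}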
Over the coming weeks, we will migrate our mobile apps to the main Rust Rune runtime and deprecate the old C++ rune_vm, so as to unify our codebase.
In part 2, we will also share benchmarks of the new hardware-accelerated pipelines on various embedded devices, so stay tuned!
Resources
librunecoral - The crate can be found here: https://lib.rs/crates/hotg-runecoral
The GitHub ticket tracking the open source work on Coral support - https://github.com/hotg-ai/rune/issues/269
Coral.ai products - https://coral.ai/products/
We are excited and humbled to be getting a lot of attention recently from our social media posts and references. If you are interested in learning about our ideas on the future of TinyML, AI, edge computing, and decoupling intelligence, feel free to subscribe below. Thank you for following along with our team and supporting us on this beautiful journey together! We’re beyond grateful for your readership and support.
Follow us on our socials:
Twitter @hotg_ai and @hammer_otg | LinkedIn | Discord | Discourse