Intel’s AI Ecosystem and Portfolio

As a technology leader, Intel offers a complete AI ecosystem that extends far beyond today’s AI, and the company is committed to fuelling the AI revolution deep into the future. AI is a top priority for Intel, as demonstrated through in-house research, development, and key acquisitions. FPGAs play an important role in this commitment.

Intel’s comprehensive, flexible, and performance-optimised AI portfolio of machine learning and deep learning products covers the entire spectrum from hardware platforms to end-user applications:

• Intel Math Kernel Library for Deep Neural Networks (Intel MKL-DNN)
• Compute Library for Deep Neural Networks
• Deep Learning Accelerator Library for FPGAs
• Frameworks such as Caffe and TensorFlow
• Tools like the Deep Learning Deployment Toolkit from Intel

Intel AI Portfolio

The Intel Xeon Platinum and Xeon Phi ranges of processors are an ideal starting point for training deep learning models. Once a model has been trained, hardware such as Intel FPGAs and the Movidius Neural Compute Stick can then deliver rapid inference.

The portfolio spans every layer of the stack, from end-user experiences down to hardware:

• Tools: Intel Deep Learning Deployment Toolkit, Intel Computer Vision SDK, Movidius Neural Compute Stick, Saffron Technology
• Frameworks: Apache Spark, TensorFlow, MXNet, Theano, Microsoft CNTK, Torch, Caffe
• Libraries: Intel Distribution for Python, Intel DAAL, Intel Nervana Graph, Intel MKL, Intel MLSL, Movidius MvTensor Library, Associative Memory Base
• Hardware: compute, memory and storage, networking, and visual intelligence

What is an FPGA?

Field programmable gate arrays (FPGAs) are programmable integrated circuits containing logic elements, DSP blocks, on-die memory, and flexible I/O. These building blocks enable the developer or end user to implement any number of functions directly in hardware. Users can efficiently manage data locality by taking data directly from the source and keeping intermediate results tightly coupled to the compute, all on the FPGA itself. In many cases, running computations on an Intel FPGA is faster, consumes less power, and delivers lower latency and higher throughput than using a CPU. Intel FPGAs can also be reprogrammed, even after they are installed, enabling hardware reuse and reconfiguration.

Intel FPGAs offer advantages across three dimensions: performance, hardware flexibility, and workload flexibility.

Power Efficiency

Intel FPGAs provide high power efficiency for running AI workloads such as the AlexNet and GoogLeNet algorithms, reducing overall power consumption and total cost of ownership.

Deployment Flexibility

An FPGA enables you to run in “offload” mode, moving data to and from a CPU, or “in-line”, where data is processed directly on the FPGA before passing to the host processor.

Precision

FPGAs support any precision and data type, from 64-bit floating point down to integer and binary, allowing you to customise your implementation to your exact needs.
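
To make this concrete, the short sketch below shows symmetric linear quantisation of FP32 weights to INT8, the kind of reduced-precision representation an FPGA datapath can exploit. This is plain NumPy for illustration, not an Intel API; the tensor shape and the single-scale-factor scheme are assumptions chosen for clarity.

```python
import numpy as np

def quantise_int8(weights):
    """Symmetric linear quantisation: map FP32 values onto the INT8 range."""
    scale = np.abs(weights).max() / 127.0             # largest magnitude -> +/-127
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    """Recover approximate FP32 values from the INT8 codes."""
    return q.astype(np.float32) * scale

# Hypothetical 64x64 weight matrix standing in for one layer of a CNN
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantise_int8(w)
print("max abs error:", np.abs(w - dequantise(q, scale)).max())
```

Because an INT8 multiply-accumulate consumes far fewer FPGA resources than an FP32 one, a design built around such a representation can pack more parallel compute into the same device and power budget.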

Speed

FPGAs are fast. When response time matters, Intel FPGAs provide excellent speed and deterministic low latency, supporting faster decision-making and offering a better customer experience.

I/O Types

FPGA transceivers allow a direct interface to almost any data source and interface standard, including cameras, storage devices, and networks.

Future Algorithms

Because FPGAs are reprogrammable, they are not only useful for today’s AI workloads and algorithms, but are also adaptable to the AI architectures of the future.

Throughput

Intel FPGAs can increase the throughput of your system, allowing you to do more with less.

Power Envelope

Besides being power efficient, Intel FPGA designs are power scalable, so you can design your workload to use only as much power as it needs.

Multi-Functionality

Intel FPGAs can combine AI algorithms with other important functions in your system on a single chip, resulting in low power, low deterministic latency, high throughput, and a low total cost of ownership.

FPGAs are fundamentally about system performance. By controlling the data path, Intel FPGAs accelerate the compute and connectivity required to collect and process the massive quantities of information around us. In addition to being used for compute offload, FPGAs can also receive data directly and process it inline without it passing through the host system. This frees the processor to manage other system events and delivers higher real-time system performance.

Real time is key. AI often relies on real-time processing to draw instantaneous conclusions and respond accurately. Imagine a self-driving car waiting for feedback after another car brakes hard or a deer leaps from the bushes. Immediacy has been a challenge given the amount of data involved, and lag can mean the difference between responding to an event and missing it entirely.

Introducing the Movidius Neural Compute Stick

The Movidius Neural Compute Stick (NCS) is a tiny, fanless deep learning device that you can use to learn AI programming at the edge. It is powered by the same low-power Movidius Vision Processing Unit (VPU) found in millions of smart security cameras, gesture-controlled autonomous drones, and industrial machine vision equipment. The convenient USB stick form factor makes it easier for developers to create, optimise and deploy advanced computer vision intelligence across a range of devices at the edge.

The USB form factor attaches easily to existing hosts and prototyping platforms, while the VPU inside provides machine learning via a low-power deep learning inference engine. You start with a trained, Caffe-based feed-forward convolutional neural network (CNN), or choose one of the example pre-trained networks. Then, using the Neural Compute Toolkit, you can profile the network and compile a tuned version ready for embedded deployment through the Neural Compute Platform API; a minimal code sketch follows the feature list below.

• Supports CNN profiling, prototyping, and tuning workflows
• All data and power provided over a single USB Type-A port
• Real-time, on-device inference – cloud connectivity not required
• Run multiple devices on the same platform to scale performance
• Quickly deploy existing CNN models or your own uniquely trained networks
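
To give a flavour of this workflow, here is a minimal sketch using the NCSDK’s Python API (the mvnc module). It assumes the Caffe model has already been compiled into a graph file with the toolkit’s mvNCCompile tool; the file name, input size, and random placeholder input are illustrative assumptions, not prescribed values.

```python
import numpy as np
from mvnc import mvncapi as mvnc

# Find and open the first attached Neural Compute Stick
devices = mvnc.EnumerateDevices()
if not devices:
    raise RuntimeError("No Movidius NCS devices found")
device = mvnc.Device(devices[0])
device.OpenDevice()

# Load a graph previously produced by mvNCCompile (file name is an assumption)
with open("graph", "rb") as f:
    graph = device.AllocateGraph(f.read())

# Placeholder 224x224 RGB input; real code would load and normalise an image
img = np.random.rand(224, 224, 3).astype(np.float16)

graph.LoadTensor(img, "user object")   # queue the input for inference on the VPU
output, _ = graph.GetResult()          # blocks until the result is returned
print("top class:", int(np.argmax(output)))

graph.DeallocateGraph()
device.CloseDevice()
```

Note that inference runs entirely on the stick: the host only supplies the input tensor and reads back the result, which is what makes cloud-free, on-device deployment possible.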

Scan and Intel - Your Complete AI Solution

As an Intel Platinum partner and a specialist in deep learning and AI, the Scan Business team can offer advice on the full range of Intel AI solutions, proof-of-concept trials, and any data scientist support you require.

All Purpose

Intel Xeon Processor Scalable Family

Scalable performance for the widest variety of AI and other datacenter workloads, including breakthrough deep learning training and inference.

Highly-Parallel

Intel Xeon Phi Processor (Knights Mill)

Scalable performance optimised for even faster deep learning training and select highly-parallel datacenter workloads.

Flexible Acceleration

Intel FPGA

Scalable real-time acceleration for deep learning inference with higher efficiency, across a wide range of workloads and configurations.

Deep Learning

Crest Family

Scalable acceleration with the best performance for intensive deep learning training and inference, period.