As the types and uses of AI have evolved, two distinct types of inferencing have emerged. The first type is large scale, where significant GPU resource is still required to deliver the requested outcome. For example, LLMs such as ChatGPT route requests back to a datacentre full of servers processing many actions simultaneously. This type of inferencing relies very much on the same hardware used to train the model - so to make the best choice, we advise reading our AI Training Hardware Buyers Guide, where model parameters and memory sizes are discussed in greater detail.

The second type of inferencing is much more focused, where low-power embedded GPU modules are sufficient to deliver the desired outcome. Examples include the brain of an autonomous vehicle or a robot, where a scaled down version of the AI model is installed within the device in order to control how it behaves and reacts to external inputs and information.

This guide is focused on this type of inferencing, its major use cases, and the embedded GPU systems that support it.

AI edge inferencing hardware

Use Cases

Explore use cases that require rapid real-time inferencing at the edge by clicking the tabs below.

Robotics

Robotics is undergoing a revolution, moving beyond the era of specialist machines to generalist robots. This shift replaces single-purpose, fixed-function robots with adaptable machines trained to perform diverse tasks across varied environments. Inspired by human cognition, these adaptable robots combine fast, reactive responses with high-level reasoning and planning, enabling more efficient learning and adaptation.

Humanoid robot hardware and software architecture

Building a typical humanoid robot requires four essential layers:


Hardware Abstraction

Integrates all key sensing and actuation modalities, enabling the robot to perceive its environment and interact physically with the world.


Real-Time Control Framework

Manages precise, low-latency control of the robot's movement, where minimising latency is absolutely critical for safe and responsive operation.
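As an illustration of why low latency matters here, the sketch below (plain Python, not any NVIDIA framework) shows the fixed-rate loop pattern such a control layer implements - each control step must finish within its period, and overruns count as missed deadlines:

```python
import time

def run_control_loop(step_fn, period_s=0.002, iterations=100):
    """Run step_fn at a fixed period (here 500 Hz) and count missed deadlines."""
    missed = 0
    next_deadline = time.perf_counter() + period_s
    for _ in range(iterations):
        step_fn()                         # read sensors, compute commands, actuate
        now = time.perf_counter()
        if now > next_deadline:           # the step overran its time budget
            missed += 1
            next_deadline = now + period_s
        else:                             # wait out the remainder of the period
            time.sleep(next_deadline - now)
            next_deadline += period_s
    return missed

# A trivial step comfortably meets a 2 ms budget on a typical machine.
misses = run_control_loop(lambda: None)
```

A real controller would replace the lambda with sensor reads and torque commands, and would treat any missed-deadline count above zero as a fault condition.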


Perception and Planning

Equips the robot with environmental understanding, grasp and motion planning, locomotion, object recognition, and localisation—allowing effective interaction with the surrounding world.


High-Level Reasoning

Powers advanced functions such as scene understanding, complex task planning, and natural language interaction, where longer processing times are acceptable to support deeper reasoning and adaptability.

NVIDIA AI Software Stack

To deliver a seamless cloud-to-edge experience, embedded GPUs run the NVIDIA AI software stack for physical AI applications, including NVIDIA Isaac for robotics, NVIDIA Metropolis for visual agentic AI, and NVIDIA Holoscan for sensor processing. The resulting model is then deployed for inference on the embedded GPU installed within the robot.
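As a rough sketch of what running a model on the embedded GPU involves, the example below shows the generic preprocess / infer / postprocess shape of an edge inference pipeline. The "model" here is a deliberately trivial stand-in - on real hardware this step would be a TensorRT or ONNX engine executing on the Jetson module:

```python
def preprocess(raw_frame):
    # e.g. normalise 8-bit pixel values into the [0, 1] range the model expects
    return [value / 255.0 for value in raw_frame]

def postprocess(scores, labels):
    # pick the highest-scoring class
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]

def stub_model(inputs):
    # illustrative stand-in: scores the frame by its mean intensity
    mean = sum(inputs) / len(inputs)
    return [1.0 - mean, mean]             # [score for "dark", score for "bright"]

frame = [250, 240, 255, 245]              # a bright 2x2 "image"
label = postprocess(stub_model(preprocess(frame)), ["dark", "bright"])
# label == "bright"
```

The same three-stage shape holds whether the model classifies camera frames, segments organs, or plans a grasp; only the engine in the middle changes.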

NVIDIA Isaac GR00T components

Smart Cities

Cities around the world are using AI and digital twins to reimagine how their most valuable physical assets and spaces are managed. NVIDIA embedded GPUs at the edge process real-time data from a host of cameras and sensors, feeding deep learning-powered video analytics. Combined, these help to increase operational efficiency and safety across a broad range of spaces - from city streets and airports to event centres, shops and factory floors.


Smart Cities

AI brings innovative ways to build sustainable cities, keep infrastructure in top shape, and enhance public spaces like roadways for residents and communities. The transformation begins by turning data from countless sensors and IoT devices into crucial decisions with vision AI.


Smart Airports

Handling millions of passengers annually, airports need to quickly and accurately manage incidents to minimise disruptions. AI-powered video analytics turn surveillance cameras into a source of actionable insights, ensuring smooth operations and a better passenger experience.


Smart Campus

Corporate buildings and educational campuses can benefit from vision AI, which offers proactive safety through continuous monitoring without dedicated staff, and rapid response to issues that could otherwise go unnoticed and unreported.


Smart Venues

Entertainment venues are designed for enjoyment, whether for sports, concerts, or other events, but they also need to ensure safety and efficiency. IoT sensors and cameras provide real-time responses to any concerns, maintaining a secure and enjoyable environment.


Smart Retail

Use AI to reduce losses, speed up checkout processes, prevent stockouts, and gain insights into customer behaviour for better merchandising. Camera and sensor data provide analytics that enhance decision-making, streamline operations, and boost efficiency.


Smart Manufacturing

In manufacturing, the automation and monitoring of assets, systems, and environments are crucial. Companies are leveraging AI and IoT sensors to gain real-time insights, leading to a safer and more efficient workplace.

The NVIDIA Omniverse Blueprint for smart city AI provides the complete software stack needed to accelerate the development and testing of AI agents in physically accurate digital twins of cities. It includes:


NVIDIA Omniverse

Builds physically accurate digital twins and runs simulations at city scale.


NVIDIA Cosmos

Generates synthetic data at scale for post-training AI models.


NVIDIA NeMo

Curates high-quality data and uses that data to train and fine-tune vision language models (VLMs) and LLMs.


NVIDIA Metropolis

Builds and deploys video analytics AI agents for video search and summarisation (VSS).

The blueprint workflow comprises three key steps. First, developers create a SimReady digital twin of locations and facilities using aerial, satellite or map data with Omniverse and Cosmos. Second, they can train and fine-tune AI models, such as computer vision models and VLMs, using NVIDIA TAO to improve accuracy for vision AI use cases. Finally, real-time AI agents powered by these customised models are deployed to alert, summarise and query camera and sensor data using the Metropolis VSS. Embedded GPUs feature widely through the physical edge AI systems needed to gather critical data.

Healthcare

The healthcare industry is deploying AI at the edge, providing high-performance computing in small, power-efficient devices for real-time medical image analysis, robotic surgery guidance, digital pathology, patient monitoring, and accelerated genomic sequencing. These platforms enable devices to process data locally, enhancing speed, reliability and security, and enabling faster clinical decisions by bringing AI power directly to the point of care.

AI powered medical imaging

Embedded GPUs are able to deliver a range of healthcare technologies, including:


Medical Image Analysis

NVIDIA MONAI helps to process medical images such as X-rays, CT scans, and ultrasounds in record time to detect anomalies, improve image quality, and track changes over time, supporting faster and more accurate diagnoses.


Digital Surgery

Embedded systems and NVIDIA Holoscan can guide surgeons in operating rooms, providing real-time AI assistance for tasks such as tool tracking, organ segmentation, and enhanced visual clarity on streaming video.


Digital Pathology

AI-powered microscopes can help pathologists identify subtle abnormalities, improve diagnostic accuracy, and streamline workflows for analysing tissue samples.


Patient Monitoring & Safety

Bedside patient monitoring systems built with Clara Guardian help process sensor data, detect falls, track patient movement, and generate alerts for early intervention - improving safety and providing real-time data analytics.
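As a deliberately simplified illustration of the kind of signal processing such monitoring involves, the heuristic below flags a fall when an acceleration spike is followed by near-stillness. Real systems built with Clara Guardian use trained models, not fixed thresholds like these:

```python
import math

def detect_fall(samples, spike_g=2.5, still_g=0.3):
    """samples: list of (ax, ay, az) accelerometer readings in g."""
    magnitudes = [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]
    for i, m in enumerate(magnitudes[:-1]):
        # an impact spike, after which all readings sit near 1 g (gravity only)
        if m > spike_g and all(abs(later - 1.0) < still_g for later in magnitudes[i + 1:]):
            return True
    return False

fall_trace = [(0, 0, 1.0), (2.1, 1.5, 1.4), (0, 0, 1.02), (0, 0, 0.98)]   # impact, then still
walking_trace = [(0.1, 0.2, 1.0), (0.2, 0.1, 1.1), (0.1, 0.3, 0.9)]       # no spike

# detect_fall(fall_trace) -> True; detect_fall(walking_trace) -> False
```

In production, the alert generated from such a detection would feed the early-intervention workflow described above.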

Embedded GPU systems connected to cloud-based GPUs drive scalability and AI deployment, with NVIDIA Triton Inference Server helping to scale AI models for medical imaging and ensuring rapid, consistent diagnostics across multiple locations.

Medical devices with AI

Introducing NVIDIA Jetson

NVIDIA is the leading provider of edge AI and robotics platforms, offering powerful, compact Jetson GPU-accelerated modules and the JetPack software development kit (SDK). Jetson hardware is available as developer kits or modules for system integration. JetPack provides pre-built software services to fast-track sophisticated edge AI applications, including robotics, generative AI and computer vision.

JetPack supports all Jetson modules, delivering real-time sensor processing, visual AI, and advanced robotics features in a unified ecosystem that is seamlessly compatible with NVIDIA DGX hardware and software stacks, and the NVIDIA Omniverse platform for simulation and development of digital twins.

The latest Jetson modules bring more power to NVIDIA's three-computer solution for building AI-powered robots: first, a DGX appliance to train the AI model that will be deployed on the robot; second, the Omniverse platform to simulate how the robot will move and react in the real world; and third, a Jetson module running the model on the robot.

NVIDIA Jetson software stack

The NVIDIA Jetson ecosystem offers a comprehensive range of products and services, including AI software, development tools, and hardware solutions such as servers, edge appliances, and industrial PCs from certified partners such as Scan. These solutions support industries ranging from robotics and manufacturing to retail, transportation, and healthcare, with commercial and ruggedised options also available.

Embedded AI Hardware

NVIDIA Jetson modules span a wide range of performance levels and price points, making them suitable for a broad variety of autonomous applications. The two main series are Thor and Orin, although the older Xavier, TX2 and Nano models are still available for legacy projects.

NVIDIA Jetson Thor


NVIDIA Jetson Thor is available as either an AGX developer kit or a choice of two GPU modules - the T5000 and the T4000 - which require system integration. Thor is NVIDIA's flagship platform for physical AI, providing powerful compute for generative reasoning and multimodal, multi-sensor processing, so robots no longer need to be reprogrammed for each new job. Thor can be integrated into next-generation robots to accelerate foundation models, offering the flexibility to handle challenges such as object manipulation, navigation and following complex instructions.

Architecture

Jetson Thor is a SoC (System on Chip), comprising a Blackwell GPU with 5th gen Tensor cores and an Arm CPU, each sharing a unified memory pool. The AGX Thor Developer Kit and T5000 module have the same specs, with the T4000 module consuming less power at the cost of lower performance. However, the power consumption of all three variants can be configured to meet your project requirements.

NVIDIA Jetson Thor
| | Jetson AGX Thor Developer Kit / Jetson T5000 | Jetson T4000 |
| --- | --- | --- |
| AI Performance (FP4) | 2,070 TOPS | 1,200 TOPS |
| GPU | NVIDIA Blackwell, 2,560 CUDA cores, 96 5th gen Tensor cores | NVIDIA Blackwell, 1,536 CUDA cores, 64 5th gen Tensor cores |
| GPU Max Frequency | 1.57GHz | 1.57GHz |
| CPU | 14-core Arm Neoverse-V3AE | 12-core Arm Neoverse-V3AE |
| Memory | 128GB LPDDR5X | 64GB LPDDR5X |
| Networking | 4x 25GbE | 3x 25GbE |

With its Multi-Instance GPU (MIG) technology and suite of accelerators, Thor can handle real-time video data streaming and AI inference, making it ideal for building AI agents performing video search and summarisation (VSS) tasks at the edge. Thor modules also support a wide range of generative AI models - including VLA (Vision Language Action), LLMs (Large Language Models) and VLMs (Vision-Language Models), delivering seamless cloud-to-edge integration.

Relative Performance & Capability

Jetson Thor is the most powerful inferencing module. It delivers over 7.5x higher AI compute than Jetson Orin, with 3.5x better energy efficiency.

| Module | AI Performance (FP4) | Memory | Power | Dimensions |
| --- | --- | --- | --- | --- |
| Jetson Thor T5000 | 2,070 TOPS | 128GB | 40-130W | 243x112mm |
| Jetson Thor T4000 | 1,200 TOPS | 64GB | 40-70W | 100x87mm |
| Jetson AGX Orin 64GB | 275 TOPS | 64GB | 15-60W | 100x87mm |
| Jetson AGX Orin 64GB Industrial | 248 TOPS | 64GB | 15-75W | 100x87mm |
| Jetson AGX Orin 32GB | 200 TOPS | 32GB | 15-40W | 100x87mm |
| Jetson Orin NX 16GB | 157 TOPS | 16GB | 10-40W | 69.6x45mm |
| Jetson Orin NX 8GB | 117 TOPS | 8GB | 10-40W | 69.6x45mm |
| Jetson Orin Nano 8GB | 67 TOPS | 8GB | 7-25W | 69.6x45mm |
| Jetson Orin Nano 4GB | 34 TOPS | 4GB | 7-25W | 69.6x45mm |
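Those headline ratios can be checked against the figures in the table above, comparing the Thor T5000 with the Jetson AGX Orin 64GB at their maximum rated power (real-world efficiency depends on workload and the configured power mode):

```python
thor_tops, thor_max_w = 2070, 130      # Jetson Thor T5000, from the table above
orin_tops, orin_max_w = 275, 60        # Jetson AGX Orin 64GB, from the table above

compute_ratio = thor_tops / orin_tops                                   # ~7.5x
efficiency_ratio = (thor_tops / thor_max_w) / (orin_tops / orin_max_w)  # ~3.5x
```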

NVIDIA Jetson Orin


NVIDIA Jetson Orin is available either as AGX developer kits or as a choice of seven GPU modules that require system integration. The compact Jetson AGX Orin Developer Kit offers maximum performance but can also emulate any of the Jetson Orin modules, while the Jetson Orin Nano Super Developer Kit is smaller and includes a reference carrier board compatible with all Orin NX and Orin Nano modules. All Orin modules are provided with a powerful software stack featuring pre-trained AI models, reference AI workflows and vertical application frameworks, accelerating end-to-end development for generative AI, as well as edge AI and robotics applications.

Architecture

Jetson Orin is a SoC (System on Chip), comprising an Ampere GPU with 3rd gen Tensor cores and an Arm CPU, each sharing a unified memory pool. With nine models to choose from, Orin is available in a wide variety of performance, power consumption and budget levels.

NVIDIA Jetson Orin
| Model | AI Performance | GPU | GPU Max Frequency | CPU | Memory | Networking |
| --- | --- | --- | --- | --- | --- | --- |
| Jetson AGX Orin Developer Kit | 275 TOPS | NVIDIA Ampere, 2,048 CUDA cores, 64 3rd gen Tensor cores | 1.3GHz | 12-core Arm Cortex A78AE v8.2 | 64GB LPDDR5 | 10GbE |
| Jetson AGX Orin 64GB | 275 TOPS | NVIDIA Ampere, 2,048 CUDA cores, 64 3rd gen Tensor cores | 1.3GHz | 12-core Arm Cortex A78AE v8.2 | 64GB LPDDR5 | 10GbE |
| Jetson AGX Orin Industrial | 248 TOPS | NVIDIA Ampere, 2,048 CUDA cores, 64 3rd gen Tensor cores | 1.2GHz | 12-core Arm Cortex A78AE v8.2 | 64GB LPDDR5 | 10GbE |
| Jetson AGX Orin 32GB | 200 TOPS | NVIDIA Ampere, 1,792 CUDA cores, 56 3rd gen Tensor cores | 0.9GHz | 8-core Arm Cortex A78AE v8.2 | 32GB LPDDR5 | 10GbE |
| Jetson Orin NX 16GB | 157 TOPS | NVIDIA Ampere, 1,024 CUDA cores, 32 3rd gen Tensor cores | 1.17GHz | 8-core Arm Cortex A78AE v8.2 | 16GB LPDDR5 | 1GbE |
| Jetson Orin NX 8GB | 117 TOPS | NVIDIA Ampere, 1,024 CUDA cores, 32 3rd gen Tensor cores | 1.17GHz | 6-core Arm Cortex A78AE v8.2 | 8GB LPDDR5 | 1GbE |
| Jetson Orin Nano Super Developer Kit | 67 TOPS | NVIDIA Ampere, 1,024 CUDA cores, 32 3rd gen Tensor cores | 1GHz | 6-core Arm Cortex A78AE v8.2 | 8GB LPDDR5 | 1GbE |
| Jetson Orin Nano 8GB | 67 TOPS | NVIDIA Ampere, 1,024 CUDA cores, 32 3rd gen Tensor cores | 1GHz | 6-core Arm Cortex A78AE v8.2 | 8GB LPDDR5 | 1GbE |
| Jetson Orin Nano 4GB | 34 TOPS | NVIDIA Ampere, 512 CUDA cores, 16 3rd gen Tensor cores | 1GHz | 6-core Arm Cortex A78AE v8.2 | 4GB LPDDR5 | 1GbE |

Relative Performance & Capability

Jetson Orin modules occupy the high-end to mid-range, offering great flexibility and versatility, especially with the developer kits.

| Module | AI Performance (FP4) | Memory | Power | Dimensions |
| --- | --- | --- | --- | --- |
| Jetson Thor T5000 | 2,070 TOPS | 128GB | 40-130W | 243x112mm |
| Jetson Thor T4000 | 1,200 TOPS | 64GB | 40-70W | 100x87mm |
| Jetson AGX Orin 64GB | 275 TOPS | 64GB | 15-60W | 100x87mm |
| Jetson AGX Orin 64GB Industrial | 248 TOPS | 64GB | 15-75W | 100x87mm |
| Jetson AGX Orin 32GB | 200 TOPS | 32GB | 15-40W | 100x87mm |
| Jetson Orin NX 16GB | 157 TOPS | 16GB | 10-40W | 69.6x45mm |
| Jetson Orin NX 8GB | 117 TOPS | 8GB | 10-40W | 69.6x45mm |
| Jetson Orin Nano 8GB | 67 TOPS | 8GB | 7-25W | 69.6x45mm |
| Jetson Orin Nano 4GB | 34 TOPS | 4GB | 7-25W | 69.6x45mm |

Ready to Buy?

Click the links below to view the range of AI inferencing solutions. If you still have questions on how to select the perfect configuration, don't hesitate to contact one of our friendly advisors on 01204 474210 or at [email protected].

Shop NVIDIA Jetson Solutions
View NVIDIA Jetson Developer Kits & Modules
NVIDIA Jetson hardware

Need AI Training Hardware?

Looking to train AI models for deployment on edge devices? Check out our AI Training Hardware Buyers Guide to find the right datacentre GPUs and systems for your AI development needs.

View AI Training Guide
NVIDIA AI software stack

Frequently Asked Questions (FAQ)

Here are some common questions and answers to help you find the information you need.

What is inferencing?

Inferencing is the final stage of an AI project, where a trained model is presented with unseen data or prompts to complete the task it was designed for.

What is an inference in AI?

An inference in AI is the output or end result of an AI model, the nature of which is determined by the data the model was trained on.

What is an inference system?

An inference system is where the trained model is presented with unseen data and asked to respond. Common examples include chatbots, which infer their responses based on previously analysed conversations, or image generators, which create new images based on previously analysed art.

What are the two types of inferencing?

The first type of inferencing is large scale, where significant GPU resource is required to deliver the requested outcome. For example, LLMs such as ChatGPT route requests back to datacentres full of servers processing many actions simultaneously.

The second type of inferencing is much more focused, where low-power embedded GPU modules are sufficient to deliver the desired outcome. Examples include the brain of an autonomous vehicle or a robot, where a scaled down version of the AI model is installed within the device in order to control how it behaves and reacts to external inputs and information.