For the high-performance computing (HPC) market, Scan 3XS offers a range of GPU servers, professionally built to the very highest standards and based on NVIDIA’s ground-breaking Tesla GPU accelerator cards.
For any organisation looking for outstanding, no-compromise performance, an NVIDIA Tesla-based server will meet your requirements.
NVIDIA’s Tesla is an enterprise-class GPGPU (General-Purpose GPU), designed to handle the intense computational workloads typical of scientific and industrial fields such as Deep Learning, Oil and Gas, Rendering and Material Sciences.
The deeply parallel computations required by these and many other industries are an excellent match for the large number of stream processor cores in NVIDIA GPUs, which deliver performance up to 10x greater than what is possible from conventional CPU-based servers.
Whereas CPUs generally consist of a few cores that execute tasks very quickly in sequence, NVIDIA GPUs consist of thousands of smaller, simpler cores that execute tasks simultaneously. By way of analogy, instead of creating a picture one stroke at a time, a GPU can create the same picture in a single brush stroke.
NVIDIA Tesla GPUs power some of the world’s most powerful supercomputers, such as the Titan at Oak Ridge National Laboratory.
CUDA is a programming platform based on the widely familiar C and C++ languages, optimised to unlock the full power of the parallel architecture of NVIDIA’s Tesla GPUs.
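As an illustrative sketch of the execution model CUDA exposes, the following pure-Python code mimics how each GPU thread uses its block and thread indices to pick out the single element it works on. The function names here are invented for illustration only; real CUDA code would be written in C/C++ with `__global__` kernels launched over a grid of thread blocks.

```python
# Pure-Python emulation of the CUDA grid/block/thread indexing scheme.
# On a real GPU every (block, thread) pair would run concurrently on its
# own core; here we simply loop over them serially.

def launch_kernel(kernel, grid_dim, block_dim, *args):
    # Serially emulate what the GPU does in parallel.
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)

def square_kernel(block_idx, thread_idx, block_dim, data, out):
    i = block_idx * block_dim + thread_idx   # global thread index
    if i < len(data):                        # guard against over-run
        out[i] = data[i] * data[i]

data = list(range(10))
out = [0] * 10
# 3 blocks of 4 threads = 12 threads, enough to cover 10 elements.
launch_kernel(square_kernel, 3, 4, data, out)
print(out)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The guard on the global index is the standard CUDA idiom for when the thread count does not divide the data size exactly.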
There are now over 400 GPU-accelerated applications available for CUDA in the high-performance computing (HPC) space, with more arriving all the time, along with a community of over 300,000 developers.
CFD is the study of fluid flows. It uses numerical analysis and algorithms to solve and analyse problems involving the interaction of liquids and gases with surfaces. Using the Lattice Boltzmann method of performing fluid simulations, the original Tesla C870 processor was able to offer speedups of several orders of magnitude compared with the CPUs of the time.
[Chart: Lattice Boltzmann throughput in million lattice updates per second (MLUPs) for the Tesla 8-Series GPU versus the NEC SX6+ (565 MHz), Intel Itanium 2 (1.4 GHz) and Intel Xeon (3.4 GHz)]
The following Scan 3XS GPU servers have been optimised for CFD calculations.
In 2009, NVIDIA ported Weta’s PantaRay engine to a CUDA-based Tesla GPU to help create the lush world of Pandora for James Cameron’s ground-breaking movie Avatar, which ushered in the modern 3D movie era. The NVIDIA Tesla S1070 GPU-based server used was some 25 times faster than a conventional CPU-based server.
The graph below shows the rendering performance you can expect to see in 3ds Max when rendering using the NVIDIA Iray plugin on various CPUs and GPUs. The scores are shown as relative to a single Intel Xeon E5 CPU.
[Chart: relative Iray rendering performance for various NVIDIA GPUs versus dual Xeon E5 CPUs and a single Xeon CPU]
The following Scan 3XS GPU servers have been optimised for 3D rendering.
More recently, benchmarks have shown NVIDIA’s Tesla K80 GPU to be two to five times faster in key HPC scientific applications, compared to an Intel Xeon Phi processor. Popular biopolymer molecular dynamics simulation packages such as AMBER, NAMD, GROMACS and LAMMPS are all optimised for NVIDIA Tesla GPUs.
[Chart: Tesla K80 GPU versus Xeon Phi 7120 performance in Monte Carlo RNG (double-precision) and Binomial Options (single-precision) benchmarks]
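As a hedged sketch of the kind of workload behind the "Monte Carlo" and "Binomial Options" benchmark labels, the snippet below prices a European call option by Monte Carlo simulation. All inputs (spot, strike, rate, volatility) are made-up example figures. Each simulated path is independent of every other, so a GPU can evaluate millions of them concurrently.

```python
# Monte Carlo pricing of a European call option (illustrative toy).
import math
import random

def mc_european_call(spot, strike, rate, vol, t, n_paths, seed=42):
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol * vol) * t
    diffusion = vol * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):      # each path is independent -> parallel
        z = rng.gauss(0.0, 1.0)
        s_t = spot * math.exp(drift + diffusion * z)
        total += max(s_t - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * t) * total / n_paths

price = mc_european_call(spot=100, strike=100, rate=0.05, vol=0.2,
                         t=1.0, n_paths=50_000)
print(round(price, 2))  # close to the Black-Scholes value of ~10.45
```

The inner loop is "embarrassingly parallel": a GPU version launches one thread per path and reduces the payoffs at the end, which is why these benchmarks favour many-core hardware so strongly.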
The following Scan 3XS GPU servers have been optimised for HPC.
Deep Learning, a fast-growing branch of machine learning and artificial intelligence, uses computers to process the huge amounts of information increasingly becoming available to a variety of industries. The idea is to enable computers to learn in a way that mimics how the human brain learns.
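A toy illustration of "learning" in this sense: below, a single artificial neuron adjusts its weight and bias by gradient descent to fit the line y = 2x + 1 from examples. Deep-learning frameworks do the same thing for millions of parameters at once, which is where GPU parallelism pays off; this sketch uses an invented example function and is not taken from any particular framework.

```python
# A single neuron learning y = 2x + 1 by stochastic gradient descent.
samples = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]
w, b = 0.0, 0.0   # start knowing nothing
lr = 0.01         # learning rate

for _ in range(2000):
    for x, y in samples:
        pred = w * x + b
        err = pred - y
        # Gradients of the squared error with respect to w and b.
        w -= lr * err * x
        b -= lr * err

print(round(w, 3), round(b, 3))  # converges toward w=2.0, b=1.0
```

Training a deep network repeats this update across many layers and vast datasets, and the bulk of the arithmetic is matrix multiplication, a naturally data-parallel operation.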
Over the past three years NVIDIA’s ever-advancing GPUs have delivered significant performance gains in Caffe, a deep learning framework developed by the Berkeley Vision and Learning Center.
[Chart: Caffe training performance for the Tesla M40 with cuDNN 4 (2015), M40 with cuDNN 3 (2015) and K40 with cuDNN 1 (2014)]
Using computers to perform analytics on ‘big data’ is boosting research and delivering a competitive advantage in areas such as Manufacturing, Defence, Life Sciences and Automotive, to name but a few.
The following Scan 3XS GPU servers have been optimised for Deep Learning.
The Tesla K40 has 2880 CUDA cores, a clock speed of 745MHz and 12GB of GDDR5 memory. This helps it deliver up to 4.29 teraFLOPS of single-precision throughput, and its passive heat sink makes it suitable for denser clusters.
The K80 features two GK210 GPUs on one board, totalling 4992 CUDA cores, with a base core clock of 560MHz and 24GB of GDDR5 memory, offering a peak of 8.74 teraFLOPS of single-precision performance.
[Chart: Tesla K80 GPU versus Tesla K40 GPU performance]
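The quoted throughput figures can be sanity-checked with a common back-of-envelope formula: peak single-precision FLOPS is roughly cores x clock x 2, since one fused multiply-add counts as two floating-point operations. The K40 figure matches its 745MHz clock directly; the K80's quoted 8.74 teraFLOPS corresponds to its GPU Boost clock, assumed here to be about 875MHz (the boost figure is not stated above), rather than its 560MHz base clock.

```python
# Back-of-envelope peak single-precision throughput:
# cores x clock x 2 ops per cycle (one fused multiply-add).

def peak_sp_tflops(cuda_cores, clock_mhz):
    return cuda_cores * clock_mhz * 1e6 * 2 / 1e12

k40 = peak_sp_tflops(2880, 745)   # base clock
k80 = peak_sp_tflops(4992, 875)   # assumed GPU Boost clock
print(round(k40, 2), round(k80, 2))  # 4.29 8.74
```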
Based on the Maxwell architecture, the M4 is ideally suited to meet the needs of the hyperscale data centre, where the number of servers needs to be increased on the fly in response to sudden increases in demand. This is usually done to support big data and cloud service environments.
The M4 is also ideal for executing the various machine learning tasks associated with Deep Learning.
The M4 is a compact, low-powered card that’s optimised for use in dense clusters of up to eight GPUs. It offers 1024 CUDA cores, 4 GB of GDDR5 memory, and a base clock of 872 MHz.
The M40 is the fastest solution at training neural networks for Deep Learning and other systems. It has 3072 CUDA cores, 948 MHz base clock, and 12GB of GDDR5 memory.
GPU Server with 4x Tesla M40
Take a look at our range of HPC-ready NVIDIA Tesla servers here.
Thanks to the power of the latest NVIDIA hardware many enterprises are now able to run fully virtualised desktops for their users. Virtualisation brings the power of a data centre to even the thinnest of clients.
It also enables IT departments to manage desktops on a large scale from a single console interface, making upgrades and deployments much simpler.
Thanks to NVIDIA GRID technology, high-performance graphics are now available to students, engineers, architects and doctors in a fully virtualised environment.
GRID can also be used to render and stream games from the cloud, providing users with high-quality, low latency gaming. It means console quality gaming, without a console.
The M6 is a compact, low-power solution in the MXM format, making it suitable for installation in high-density blade servers. It features 1536 CUDA cores and 8GB of GDDR5 memory and can support up to 16 vGPU users.
The Tesla M60 is a full-size card featuring two GM204 GPUs, for a total of 4096 CUDA cores and 16GB of GDDR5 memory. It can support 32 concurrent vGPU users, or 16 per GM204 GPU.
Click here to see some of our GRID-ready NVIDIA Tesla servers.
For the ultimate in performance you should consider a server based on the latest-generation Tesla P100, built on the Pascal architecture. The P100 is NVIDIA’s most advanced accelerator, designed to handle the intense computational workloads demanded by those working on artificial intelligence for self-driving cars or predicting our climate's future.
The P100 can deliver more than 21 teraFLOPS of FP16 performance, along with around 5 teraFLOPS of double-precision and 10 teraFLOPS of single-precision performance for HPC workloads.
The P100 is the first GPU of any kind to combine CoWoS (Chip-on-Wafer-on-Substrate) packaging with HBM2 memory, delivering 3x the memory performance of the NVIDIA Maxwell architecture. HBM2 supports ECC error correction to ensure the highest levels of system accuracy and reliability.
HBM (High Bandwidth Memory) delivers higher bandwidth than conventional GPU memory such as DDR4 or GDDR5, all while using less power in a substantially smaller form factor.
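The bandwidth advantage comes down to simple arithmetic: memory bandwidth is roughly bus width (in bits) times effective per-pin data rate, divided by 8. HBM2 stacks use a very wide interface at a modest per-pin rate, whereas GDDR5 uses a narrow bus clocked much faster. The bus widths and data rates below are approximate figures assumed for a P100-class HBM2 configuration and a K40-class GDDR5 configuration.

```python
# Memory bandwidth ~= bus width (bits) x data rate (Gbit/s per pin) / 8.

def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    return bus_width_bits * data_rate_gbps / 8

hbm2 = bandwidth_gbs(4096, 1.4)   # wide, slow pins: ~720 GB/s
gddr5 = bandwidth_gbs(384, 6.0)   # narrow, fast pins: ~288 GB/s
print(round(hbm2, 1), round(gddr5, 1))  # 716.8 288.0
```

Driving a wide bus at a low clock is also what lets HBM hit these figures at lower power than GDDR5.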
With the P100 NVIDIA has introduced an entirely new interconnect technology called NVLink. This is a point-to-point connection between a CPU and a GPU and also between a GPU and another GPU.
As GPUs have become faster, the PCI-E interface has increasingly become a bottleneck, so a new interconnect was required; NVLink is the solution. It enables data to move between GPUs and CPUs at five times the bandwidth of PCI Express.
NVLink-enabled Tesla P100s for servers offer a peak interconnect speed of 160GB/s, compared to 32GB/s for PCI-E.
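To put those peak figures in perspective, here is the time to move a 16GB working set (the P100's full memory) across each link, using the simple relation time = data / bandwidth:

```python
# Illustrative transfer times at the peak link speeds quoted above.

def transfer_seconds(gigabytes, link_gbs):
    return gigabytes / link_gbs

nvlink = transfer_seconds(16, 160)  # 160 GB/s NVLink
pcie = transfer_seconds(16, 32)     # 32 GB/s PCI-E x16
print(nvlink, pcie)  # 0.1 0.5
```

Real transfers will not sustain the peak rate, but the five-fold ratio between the two links holds.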
To protect your investment in existing technologies while still taking advantage of NVIDIA’s latest architecture, the P100 is also available with a PCI-E interface. A single GPU-accelerated node powered by four PCI-E-connected Tesla P100s could replace up to 32 commodity CPU nodes for many HPC applications.
Using fewer, more powerful nodes to perform the same or a greater number of tasks enables you to save up to 70 per cent in overall data centre costs.
|  | Tesla P100 for PCI-E Based Servers | Tesla P100 for NVLink Optimised Servers |
| --- | --- | --- |
| Double-Precision Performance | 4.7 teraFLOPS | 5.3 teraFLOPS |
| Single-Precision Performance | 9.3 teraFLOPS | 10.6 teraFLOPS |
| Half-Precision Performance | 18.7 teraFLOPS | 21.2 teraFLOPS |
| NVIDIA NVLink Interconnect Bandwidth | - | 160 GB/s |
| PCIe x16 Interconnect Bandwidth | 32 GB/s | 32 GB/s |
| CoWoS HBM2 Stacked Memory Capacity | 16 GB or 12 GB | 16 GB |
| CoWoS HBM2 Stacked Memory Bandwidth | 720 GB/s or 540 GB/s | 720 GB/s |
| Enhanced Programmability with Page Migration Engine | Yes | Yes |
| ECC Protection for Reliability | Yes | Yes |
| Server-Optimised for Data Centre Deployment | Yes | Yes |
Tesla GPUs are a great fit for the enterprise, not just because of their outstanding performance, but also thanks to a great ecosystem of supporting tools.
There are a number of tools for managing your Tesla GPU cluster and scheduling jobs, such as IBM Platform HPC, Bright Cluster Manager and Moab Cluster Suite, to name just a few.
To maximise uptime of your Tesla GPU server, a complete suite of enterprise-grade tools is available. The NVIDIA Data Center GPU Manager provides IT managers with the ability to implement system policies, monitor GPU health, diagnose system events, and monitor data centre throughput.
Tesla servers are available in tower, 1U, 2U, 3U or 4U rack configurations, depending on the GPU specification and usage.
Scan 3XS offers a range of reliable, high performance server solutions, fully customisable to your needs.