Data Science Workstations Buyers Guide
What is a Data Science Workstation?
Powered by the latest NVIDIA GPU Accelerators, Data Science Workstations are high performance PCs that enable data scientists to develop and debug models and create a Minimum Viable Product (MVP) with their data sets. Data Science Workstations are built using enterprise-grade hardware for maximum reliability, leading to faster business insights and deployments.
Scan is an Elite Solution Provider for NVIDIA DGX Systems, has a dedicated AI support team including data scientists, and has developed a unique range of Data Science Workstations. This page will guide you through what to consider when choosing a Data Science Workstation.
Try before you buy
Scan Data Science Workstations can be evaluated online via a Proof of Concept.
Book a Test Drive
Quadro RTX 6000
The RTX 6000 is the first GPU we recommend for a Data Science Workstation. It is based on the latest Turing architecture and has a great combination of CUDA cores, Tensor cores and a generous 24GB of memory.
Quadro RTX 8000
The next GPU up in the range, the RTX 8000, has the same number of CUDA cores and Tensor cores, but doubles the memory to 48GB, which will enable you to work with much larger datasets than on an RTX 6000.
The third GPU, the GV100, is based on the Volta architecture, and while somewhat slower at single precision (FP32) is significantly faster at double precision (FP64) and so may be a better choice for some workloads.
The GPU accelerator is the most important component in a Data Science Workstation, as it is the main driver of rapid processing and accuracy in your model development and training. We recommend enterprise-grade NVIDIA Quadro GPU accelerators because, unlike consumer graphics cards, they are designed and built for sustained use and so provide maximum reliability. The latest Turing-architecture Quadro GPUs include Tensor cores, which are specifically designed to accelerate deep learning workloads, while the NVLink bus allows the VRAM on multiple GPUs to appear as a single ultra-fast memory pool to your applications.
The following table highlights the key specifications of the three Quadro GPUs we recommend in our Data Science Workstations.
| |NVIDIA Quadro RTX 6000|NVIDIA Quadro RTX 8000|NVIDIA Quadro GV100|
|Memory|24GB GDDR6|48GB GDDR6|32GB HBM2|
We recommend and pre-install the Ubuntu 18.04 operating system plus a custom software stack built on NVIDIA CUDA-X that includes over 15 GPU-optimised libraries. Other operating systems are available on request.
The host processor, or CPU, plays an important role in a Data Science Workstation during the data preparation stage. We recommend enterprise-grade Intel Xeon processors in our systems as they support ECC Registered memory for maximum reliability. Our single-GPU workstations include a single Xeon-W CPU, which is available with up to 18 cores / 36 threads. Our dual-GPU workstations are powered by a pair of Xeon Scalable processors with a total of up to 48 cores / 96 threads.
While having sufficient VRAM on the GPU accelerator is critically important, system performance will be crippled without adequate system memory. As already mentioned, our Data Science Workstations feature ECC Registered memory. ECC stands for Error Correcting Code, and means that the memory can detect and fix data corruption on the fly. We recommend 128GB of quad-channel RAM in our single-GPU workstations and 192GB of six-channel RAM in our dual-GPU workstations, although both types of system can support more memory should this be required.
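To illustrate what "detect and fix data corruption on the fly" means, here is a minimal sketch of a Hamming(7,4) code, a textbook error-correcting code that repairs any single flipped bit in a 7-bit codeword. This is an illustration of the principle only: it is not the scheme used by ECC Registered memory modules, which protect much wider data words in hardware. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three syndrome bits spell out the 1-based position of a single flipped bit.
    error_position = s1 + 2 * s2 + 4 * s4
    if error_position:
        c[error_position - 1] ^= 1  # correct the flipped bit
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    stored = hamming74_encode([1, 0, 1, 1])
    corrupted = list(stored)
    corrupted[4] ^= 1  # simulate a single-bit memory error at position 5
    recovered, position = hamming74_decode(corrupted)
    print(recovered, position)
```

In the usage example, the corrupted codeword still decodes to the original four data bits, and the decoder reports which position it repaired.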
There’s no point in having the fastest and most powerful GPU accelerators, CPUs and system memory if they are starved of data. We recommend the latest high-performance NVMe SSDs in our Data Science Workstations, which, with a typical read speed of over 3000MB/sec, are approximately 500% faster than a SATA SSD and 1900% faster than a traditional HDD. That said, we recognise that you may need to store old projects and documents on your workstation, and HDDs are ideal for this as they are very cost-effective.
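The percentages above can be reproduced with a short calculation. The sketch below takes the 3000MB/sec NVMe figure from the text; the SATA SSD and HDD read speeds (500MB/sec and 150MB/sec) are assumptions chosen because they yield the quoted 500% and 1900% figures, and real drives will vary. The function names and the 100GB data set are hypothetical.

```python
NVME_MB_S = 3000      # typical NVMe read speed quoted in the text
SATA_SSD_MB_S = 500   # assumed SATA SSD read speed
HDD_MB_S = 150        # assumed HDD read speed


def percent_faster(fast_mb_s, slow_mb_s):
    """How much faster one drive is than another, as a percentage."""
    return (fast_mb_s / slow_mb_s - 1) * 100


def read_time_seconds(size_gb, speed_mb_s):
    """Time to read size_gb gigabytes sequentially (1GB taken as 1000MB)."""
    return size_gb * 1000 / speed_mb_s


if __name__ == "__main__":
    print(percent_faster(NVME_MB_S, SATA_SSD_MB_S))  # 500.0
    print(percent_faster(NVME_MB_S, HDD_MB_S))       # 1900.0
    for name, speed in [("NVMe", NVME_MB_S), ("SATA SSD", SATA_SSD_MB_S), ("HDD", HDD_MB_S)]:
        print(name, round(read_time_seconds(100, speed), 1), "seconds for 100GB")
```

Under these assumptions, reading a 100GB data set takes roughly half a minute from NVMe, a little over three minutes from a SATA SSD and around eleven minutes from an HDD.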
Moving data between different systems can be a time-consuming process, so to make the most of the fast data processing capabilities we pre-install 10GbE NICs in our Data Science Workstations. 10GbE has the added advantage of being compatible with twisted-pair copper CAT6/6a or CAT7 cabling with RJ45 connectors, so in most offices you won’t need to install new cabling, just a new switch. Scan partners with Intel and Mellanox, and can provide faster NICs such as 25/50/100GbE on request.
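As a rough guide to what the link speed means in practice, the sketch below estimates the best-case time to copy a data set over different Ethernet speeds. It assumes the network is the only bottleneck; the `efficiency` parameter is a hypothetical allowance for protocol overhead, and the 100GB data set is an example value rather than a measured workload.

```python
def transfer_time_seconds(size_gb, link_gbit_s, efficiency=1.0):
    """Best-case time to move size_gb gigabytes over a link of link_gbit_s gigabits/sec.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the raw link
    rate that is actually usable for payload data.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    size_gbit = size_gb * 8  # 8 bits per byte
    return size_gbit / (link_gbit_s * efficiency)


if __name__ == "__main__":
    for speed in (1, 10, 25, 50, 100):
        seconds = transfer_time_seconds(100, speed)
        print(f"{speed}GbE: {seconds:.0f} seconds for 100GB")
```

At the raw line rate, 100GB takes 800 seconds over 1GbE and 80 seconds over 10GbE; real transfers will be slower once protocol overhead and storage speed at either end are taken into account.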
Cooling and Power
GPU accelerators consume a lot of power so Scan’s Data Science Workstations are equipped with high-quality 80PLUS Gold power supplies, ensuring a reliable and efficient power source for the system. In addition, the cooling system of each workstation is optimised to ensure consistent results each and every time.
Which Data Science Workstation is right for me?
Pre-configured Data Science Workstations
We have designed and built a range of pre-configured and ready-to-ship Data Science Workstations. Each system has been optimised to provide the best possible performance in deep learning and machine learning workflows at different price points.
3XS Data Science Workstations
How fast are Scan’s Data Science Workstations for deep learning?
We have benchmarked our pre-configured Data Science Workstations in two of the most popular frameworks, TensorFlow and PyTorch, so you can compare the performance of the workstations against each other.
The results from benchmarking TensorFlow, displayed as images per second, and PyTorch, displayed as tokens per second, clearly show the massive speed advantage the dual-GPU workstations (Q280X, Q264X, Q248X) have over the single-GPU workstations (Q136X, Q120X, Q116X).
A word of caution when interpreting the results. You would expect the training rate for the workstations with RTX 8000 GPUs to be faster than for the workstations with RTX 6000 GPUs. The reason this is not the case in the graphs is that the data we’re using isn’t large enough to take advantage of the extra VRAM the RTX 8000 has compared to the RTX 6000. And because the RTX 8000 runs at a slightly slower clock speed than the RTX 6000, the end result is a slightly slower training rate with small data sets. However, if you are working with a larger data set, the RTX 8000 would be much faster than the RTX 6000 as it has double the VRAM.
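The reasoning above can be sketched as a simple capacity check: extra VRAM only helps once a training job no longer fits in the smaller card. The figures below are hypothetical (a 4000MB model footprint and 50MB of working memory per sample), the function names are illustrative, and real memory use depends on the framework, optimiser state and model architecture.

```python
RTX_6000_VRAM_MB = 24_000  # 24GB, treating 1GB as 1000MB for simplicity
RTX_8000_VRAM_MB = 48_000  # 48GB


def max_batch_size(vram_mb, model_mb, per_sample_mb):
    """Largest batch that fits once the model itself has been loaded."""
    free_mb = vram_mb - model_mb
    if free_mb <= 0:
        return 0
    return free_mb // per_sample_mb


def fits(batch_size, vram_mb, model_mb, per_sample_mb):
    """True if a batch of the given size fits in the GPU's memory."""
    return batch_size <= max_batch_size(vram_mb, model_mb, per_sample_mb)


if __name__ == "__main__":
    model_mb, per_sample_mb = 4_000, 50  # assumed example workload
    for batch in (64, 600):
        print(
            batch,
            fits(batch, RTX_6000_VRAM_MB, model_mb, per_sample_mb),
            fits(batch, RTX_8000_VRAM_MB, model_mb, per_sample_mb),
        )
```

With these example numbers, a batch of 64 fits on both cards, so the additional memory of the RTX 8000 goes unused, whereas a batch of 600 only fits on the RTX 8000.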
Browse the Range of Scan Data Science Workstations
Alternatively, if you can’t see the exact spec you’d like, our online configurator allows you to pick and choose the components, and we will build a Data Science Workstation to your requirements.
Configurable Data Science Workstations