Deep learning and AI run best on parallel processors - something only GPU-accelerated systems truly deliver - although the recommended NVIDIA GPUs are not the same throughout the deep learning process. The power and type of GPUs required for the development, training and inferencing phases of deep learning and AI workflows differ greatly, so the Scan AI team has put together a portfolio of systems tailored to each of these stages.
Furthermore, as we are an NVIDIA Elite Solution Provider, you can be sure all our configurations and solutions are tried, tested and in many cases certified by NVIDIA, so that we deliver the latest technologies and the best performance while remaining cost-effective.
Development is the stage of deep learning where you work to establish models that can be taken through to full-scale training. Whether working with deep learning libraries, frameworks or applications, you would typically use small sets of data, repeatedly making minor tweaks and changes to see whether the results match the outcomes you want. This type of work is usually not overly GPU-intensive, so a workstation is often sufficient. In many cases a single GPU may be enough, although using multiple GPUs does allow for faster discovery by developing several models alongside each other - to this end Scan AI offers 3XS dev boxes with up to six NVIDIA GPUs to help develop a minimum viable product (MVP) to get started in deep learning.
Alongside these workstations are NVIDIA Data Science Workstations, similar in style but configured with professional-grade RTX GPU cards to an NVIDIA-certified architecture. These provide enterprise-level components designed for intensive 24/7 use and are supported by certified drivers for many applications.
Because AI development projects consist of smaller, repeated workloads, they may also benefit from using underused GPU resource across a number of systems. This is made possible by Run:AI software, which effectively ‘pools’ the GPUs from multiple systems to give you a virtual centralised resource that can be allocated and segregated amongst multiple tasks or users - dynamically, as demand changes and projects evolve.
Finally, although we’ve stated that AI development requires GPU-accelerated hardware, it doesn’t necessarily have to be a physical system on your desk. Our 3XS AI in the Cloud service delivers the power of GPUs to any device, anywhere, to make your development phase as flexible and scalable as possible, offering multiple profiles that can be easily swapped between as your workloads change.
If you’re unsure what hardware requirements may best suit your development plans, then the Scan AI team is always on hand to advise and help - don’t hesitate to get in touch.
Training is the phase of deep learning where you have identified a suitable model worth investigating further from the small datasets. You would now expand the dataset to a far greater capacity to undertake the repeated cycles of training so that the model can learn and become more accurate with each iteration. This type of intensive work on larger datasets requires much more GPU resource, so server systems offering up to eight GPUs are commonplace to accomplish these tasks.
NVIDIA-certified 3XS EGX and HGX servers are the starting point for training, offering professional-grade NVIDIA GPUs in two, four or eight card configurations. They feature a whole host of Ampere-based accelerators tailored to specific workloads and are fully customisable, with a choice of Intel Xeon or AMD EPYC CPUs and a range of memory capacities. It is also possible to choose either Ethernet or InfiniBand networking at various throughput speeds. Alternatively, we can provide IBM Power CPU-based servers to complete our customisable training options.
For the highest-demand workloads we recommend the NVIDIA range of DGX appliances - starting with the DGX Station A100, which delivers datacentre performance in a standard desktop workstation form factor, followed by the DGX A100 - the best-in-class AI supercomputer. At the very top of the scale, multiple DGX units can be combined in a POD architecture to deliver huge performance, supported by AI-optimised NVMe all-flash storage. To further ensure optimal GPU utilisation across your training infrastructure, multiple GPU systems can also be pooled into a single virtualised parallel compute resource that can be allocated and segregated dynamically using Run:AI software - regardless of the type of systems and GPUs involved.
Finally, although we’ve stated that AI training requires significant GPU-accelerated hardware, it doesn’t necessarily have to be a physical system on your premises. Our 3XS AI in the Cloud service delivers the power of datacentre-scale GPUs to any device, anywhere to make your training workflows as flexible as possible, offering multiple profiles that can be easily upgraded as your workloads demand.
If you’re unsure what hardware requirements may best suit your project, then the Scan AI team is always on hand to advise and help - don’t hesitate to get in touch.
When it comes to inferencing, the type of GPU resource required may be quite different from development or training, in that a fully trained model ready for deployment in the real world doesn’t need significant power to carry out its task - whether that be on video surveillance footage, image recognition or data collection via sensors. Additionally, the inferencing device may need to be placed in a remote location where access is limited, so low power, zero maintenance and 4G / 5G connectivity are necessary. For this reason embedded GPU systems are often chosen, as they meet these criteria and can be highly customised for specific needs such as harsh environments or extreme temperatures.
Occasionally, it may be that a model needs retraining to ensure accuracy remains at the levels required, but the full power of a datacentre is not required - for these cases we have a range of ruggedised NVIDIA EGX retraining servers that address this need.
If you’re unsure what hardware requirements may best suit your inferencing deployment, then the Scan AI team is always on hand to advise and help - don’t hesitate to get in touch.
Free Proof of Concept Trial
Any of our AI hardware systems can be tested in a secure datacentre environment, guided by our team of AI experts to ensure you get the maximum benefit and insight from your trial.
As an alternative to hosting your AI hardware within your premises, the Scan AI team can arrange for your servers and other infrastructure to be hosted in a number of UK and European datacentres.