DLI – Fundamentals of Accelerated Computing with CUDA C/C++

We would like to invite you to attend an online DLI workshop

Scan invites you to join an online instructor-led workshop focusing on the Fundamentals of Accelerated Computing with CUDA C/C++.

In this course you’ll learn the fundamental tools and techniques for accelerating C/C++ applications to run on massively parallel GPUs with CUDA.

NVIDIA Deep Learning Institute Online Workshop

What is the NVIDIA Deep Learning Institute?

The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI, accelerated computing and accelerated data science. Developers, data scientists, researchers and students can get practical experience powered by GPUs in the cloud and earn a certificate of competency to support professional growth.

Register Interest

In this course you will:

  • Write code to be executed by a GPU accelerator
  • Expose and express data and instruction-level parallelism in C/C++ applications using CUDA
  • Utilise CUDA-managed memory and optimise memory migration using asynchronous prefetching
  • Leverage command-line and visual profilers to guide your work
  • Utilise concurrent streams for instruction-level parallelism
  • Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach
Agenda
Introduction
(15 Mins)
Accelerating Applications with CUDA C/C++
(120 Mins)
Learn the essential syntax and concepts to be able to write GPU-enabled C/C++ applications with CUDA:
  • Write, compile, and run GPU code.
  • Control parallel thread hierarchy.
  • Allocate and free memory for the GPU.
Break
(60 Mins)
--------
Managing Accelerated Application Memory with CUDA C/C++
(120 Mins)
Learn to use the command-line profiler and CUDA-managed memory, focusing on observation-driven application improvements and a deep understanding of managed memory behaviour:
  • Profile CUDA code with the command-line profiler.
  • Go deep on unified memory.
  • Optimise unified memory management.
Break
(15 Mins)
--------
Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++
(120 Mins)
Identify opportunities for improved memory management and instruction-level parallelism:
  • Profile CUDA code with NVIDIA Nsight Systems.
  • Use concurrent CUDA streams.
Final Review
(15 Mins)
  • Review key learnings and answer any remaining questions.
  • Complete the assessment to earn a certificate.
  • Take the workshop survey.

£75 inc VAT per person


Certification

Participants can earn certification to prove subject-matter competency and support professional growth. Certification is offered for all instructor-led workshops.

01204 474210

Contact our AI team

Call us on 01204 474210