DLI – Fundamentals of Accelerated Computing with CUDA C/C++

We would like to invite you to attend an online DLI workshop

Scan invites you to join an online instructor-led workshop focusing on the Fundamentals of Accelerated Computing with CUDA C/C++.

In this course you’ll learn the fundamental tools and techniques for accelerating C/C++ applications to run on massively parallel GPUs with CUDA.

NVIDIA Deep Learning Institute Online Workshop

What is the NVIDIA Deep Learning Institute?

The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI, accelerated computing and accelerated data science. Developers, data scientists, researchers and students can get practical experience powered by GPUs in the cloud and earn a certificate of competency to support professional growth.

In this course you will:

Write code to be executed by a GPU accelerator
Expose and express data and instruction-level parallelism in C/C++ applications using CUDA
Utilise CUDA-managed memory and optimise memory migration using asynchronous prefetching
Leverage command-line and visual profilers to guide your work
Utilise concurrent streams for instruction-level parallelism
Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach

Agenda
Introduction (15 Mins)	Meet the instructor. Create an account at courses.nvidia.com/join.
Accelerating Applications with CUDA C/C++ (120 Mins)	Learn the essential syntax and concepts to be able to write GPU-enabled C/C++ applications with CUDA: Write, compile, and run GPU code. Control parallel thread hierarchy. Allocate and free memory for the GPU.
Break (60 Mins)
Managing Accelerated Application Memory with CUDA C/C++ (120 Mins)	Learn the command-line profiler and CUDA-managed memory, focusing on observation-driven application improvements and a deep understanding of managed memory behaviour: Profile CUDA code with the command-line profiler. Explore unified memory in depth. Optimise unified memory management.
Break (15 Mins)
Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++ (120 Mins)	Identify opportunities for improved memory management and instruction-level parallelism: Profile CUDA code with NVIDIA Nsight Systems. Use concurrent CUDA streams.
Final Review (15 Mins)	Review key learnings and wrap up questions. Complete the assessment to earn a certificate. Take the workshop survey.

£75 inc VAT per person

Certification

Participants can earn certification to prove subject matter competency and support professional career growth. Certification is offered for all instructor-led workshops.

01204 474210