Duration: 8 hours
Format: Self-paced online or instructor-led
The CUDA computing platform enables CPU-only applications to be accelerated on the world's fastest massively parallel GPUs. Experience hands-on C/C++ application acceleration.
Upon completion of this workshop, you'll be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll understand an iterative style of CUDA development that will allow you to ship accelerated applications fast.
See GTC Pricing for more information.
8 hours Pre-GTC DLI Workshops
Joshua Wyatt - Content Developer, NVIDIA Deep Learning Institute, NVIDIA
At this reception, meet NVIDIA staff and other GTC alumni to get tips, especially if you're a first-timer.
2 hours Special Event
Join a random group of GTC attendees for enlightening conversations over a self-hosted dinner in great restaurants nearby. Less creepy than it sounds, this is one of the more popular programs at GTC.
Sign up in the Main Lobby.
2 hours Special Event
We will answer your questions on the design and implementation of renderers based on raytracing using CUDA, and discuss how to get the best performance out of NVIDIA hardware in your renderer.
Connect with the Experts sessions are informal opportunities to ask experts from NVIDIA and other organizations your burning questions about a specific subject.
1 Hour Connect with the Experts
Carsten Waechter - Ray Tracing Software Architect, NVIDIA
This lab focuses on maximizing your productivity when developing software for the Jetson platform. You'll experience first-hand how to manage source code on the host PC for cross-compilation and how to initiate remote debugging sessions to debug CPU C/C++ and CUDA C code. Through a comprehensive set of exercises, you'll also learn to use the CUDA Visual Profiler to optimize CUDA kernels, the Tegra System Profiler to optimize CPU code and trace multi-process, system-wide activity, and the Tegra Graphics Debugger to debug and profile 3D graphics applications. Prerequisites: Basic CUDA C and C++ coding skills.
120 Minutes Instructor-Led Lab
Sebastien Domine - VP SW Eng. Developer Tools, NVIDIA
Singularity is a container technology widely supported by HPC centers and service providers because it facilitates extreme mobility of compute via verifiable, trusted containers. This talk will give a high-level view of container computing and an introduction to Singularity, describe the Singularity Image Format (SIF), and present technical recipes and usage examples with GPUs. After attending this talk, you'll have a strong understanding of containerization and how to leverage this technology to create highly reproducible workflows.
50-minute Talk
Gregory Kurtzer - CEO, SyLabs
Learn about recent progress in accelerating Monte Carlo simulation on the GPU in applications for pricing financial instruments and managing risk. We'll focus on forward Monte Carlo simulation, which allows a natural parallelization across CUDA cores, and present a recent extension of our implementation to a broad selection of industry-standard valuation models for different asset classes, including hybrid models that can price multi-currency and multi-asset portfolios. Even as the complexity and dimensionality of valuation models grow, our benchmarks show stable GPU speedups of roughly 20x in double precision (FP64) and 30x in single precision (FP32). We'll also briefly summarize a recent research project on the more complex backward (American / least-squares) Monte Carlo simulation method, based on regression algorithms, used to price general financial instruments with optionality. The latter method relies heavily on matrix calculations and benefits from GPU-accelerated libraries: cuBLAS for linear algebra and cuSOLVER for solvers.
25-minute Talk
Serguei Issakov - Global Head of Quantitative Research and Development, Senior Vice President, Numerix
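The "natural parallelization" of forward Monte Carlo comes from path independence: every path can be simulated without reference to any other. A minimal numpy sketch (not the speakers' implementation — all model parameters here are made up) of a forward Monte Carlo price for a European call under Black-Scholes dynamics, vectorized across paths the same way a CUDA kernel would assign one path per thread:

```python
import numpy as np

def mc_european_call(s0, strike, rate, vol, maturity, n_paths, seed=0):
    """Forward Monte Carlo price of a European call. Each path is
    independent, which is what makes the forward method map naturally
    onto massively parallel hardware."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Terminal asset price under geometric Brownian motion.
    s_t = s0 * np.exp((rate - 0.5 * vol**2) * maturity
                      + vol * np.sqrt(maturity) * z)
    payoff = np.maximum(s_t - strike, 0.0)
    # Discounted average payoff over all simulated paths.
    return np.exp(-rate * maturity) * payoff.mean()

price = mc_european_call(s0=100.0, strike=100.0, rate=0.05, vol=0.2,
                         maturity=1.0, n_paths=200_000)
```

The backward (least-squares) method mentioned above adds a regression step at each exercise date, which is where the cuBLAS/cuSOLVER matrix work comes in.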
Recent advances in earth observation are opening up an exciting new area for exploration of satellite image data. We'll teach you how to analyze this new data source with deep neural networks. Focusing on emergency response, you'll learn how to apply deep neural networks for semantic segmentation of satellite imagery. We'll specifically focus on multimodal segmentation and the challenge of overcoming missing modality information at inference time. Registrants are assumed to be familiar with the fundamentals of deep neural networks.
25-minute Talk
Damian Borth - Director, German Research Center for Artificial Intelligence (DFKI)
DRIVE PX is an open platform for the autonomous driving ecosystem. It has been adopted by over 300 partners in the automotive ecosystem to develop solutions for intelligent, autonomous vehicles. This talk will outline the technical challenges facing the development of autonomous intelligent vehicles and detail how the next generation of DRIVE AI car computers, DRIVE Xavier and DRIVE Pegasus, addresses these challenges.
50-minute Talk
Srikanth Sundaram - Senior Product Manager DRIVE PX 2, NVIDIA
NVIDIA IndeX incorporates NVIDIA's hardware and software technology to enable interactive, high-quality 3D visual exploration and real-time evaluation of large computed and simulated data for a wide range of scientific fields. NVIDIA IndeX is deployed on DGX technology and can be made available as a container in the cloud, such as on AWS or NGC. With NVIDIA IndeX, scientists gain unique insights into 3D data of virtually unlimited size and complexity, and its in-situ solution lets them envision remarkable new data simulation and visualization workflows. We'll present NVIDIA IndeX's CUDA programming interface for implementing novel visualization techniques, illustrate CUDA programs that produce various high-fidelity visualizations, and demonstrate large-scale data visualization on the NVIDIA GPU Cloud based on custom visualization techniques.
25-minute Talk
Marc Nienhaus - Sr. Manager Software Engineering, NVIDIA IndeX, NVIDIA
Join us for an informative introduction to CUDA programming. The tutorial will begin with a brief overview of CUDA and data-parallelism before focusing on the GPU programming model. We will explore the fundamentals of GPU kernels, host and device responsibilities, CUDA syntax, and thread hierarchy. A programming demonstration of a simple CUDA kernel will be delivered. Printed copies of the material will be provided to all attendees for each session - collect all four!
80 Minutes Tutorial
Dan Cyca - Chief Technology Officer, Acceleware
This session will give an overview of new methods that leverage machine learning and causal inference to enable reliable, individualized decision-making. We'll present applications in areas of healthcare where real-time inference is changing the practice of medicine, and discuss the new challenges this raises for developing human-machine collaborative systems.
25-minute Talk
Suchi Saria - John C. Malone Assistant Professor, Johns Hopkins University
Attend this session to get your questions on deep learning basics and concepts answered. NVIDIA experts can help you with the fundamentals and provide guidance on how and when to apply Deep Learning and GPUs to your work. No question is too basic to ask.
Connect with the Experts sessions are informal opportunities to ask experts from NVIDIA and other organizations your burning questions about a specific subject.
1 Hour Connect with the Experts
Rajan Arora - Solution Architect, NVIDIA
Toyota Research Institute's (TRI) mission is to improve the quality of human life through advances in artificial intelligence, automated driving, and robotics. Learn more about their research and how they use AWS EC2 P3 instances, the industry's most powerful GPU instances, in combination with other AWS services to enable autonomous vehicles and robots at scale.
50-minute Talk
Chetan Kapoor - Senior Product Manager - EC2, Amazon Web Services
Explore how auditors are applying deep learning to detect "anomalous" records in large volumes of accounting data. The Association of Certified Fraud Examiners estimates in its Global Fraud Study 2016 that the typical organization loses 5% of its annual revenue to fraud. At the same time, organizations are accelerating the digitization of business processes in Enterprise Resource Planning (ERP) systems, which collect vast quantities of electronic journal entry data in general- and sub-ledger accounts at an almost atomic level. To commit fraud, perpetrators must deviate from regular system usage or posting patterns, and this deviation is weakly recorded in a very limited number of "anomalous" journal entries. To detect such entries, several deep autoencoder networks are trained using NVIDIA's DGX-1 system. Empirical evaluation on two real-world accounting datasets underscores the trained networks' effectiveness in capturing journal entries highly relevant for a detailed audit while outperforming several baseline methods.
25-minute Talk
Marco Schreyer - Researcher, German Research Center for Artificial Intelligence
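The core idea — entries that deviate from dominant posting patterns reconstruct poorly through an autoencoder — can be sketched with a closed-form linear autoencoder (the top-k principal components), standing in for the deep autoencoders the talk describes. The "journal entry" data below is synthetic and purely illustrative:

```python
import numpy as np

def fit_linear_autoencoder(x, k):
    """Optimal linear encoder/decoder pair = top-k principal components
    (a simple stand-in for a trained deep autoencoder)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:k]  # (feature mean, encoder weights of shape (k, d))

def anomaly_scores(x, mean, components):
    """Per-record reconstruction error; records that break the dominant
    posting patterns reconstruct poorly and score high."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.square(centered - recon).sum(axis=1)

rng = np.random.default_rng(1)
# Most synthetic "entries" lie near a 2-D pattern in 8 features...
normal = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 8))
# ...plus a handful of anomalous postings that break that pattern.
data = np.vstack([normal, 5.0 * rng.standard_normal((5, 8))])

mean, comps = fit_linear_autoencoder(data, k=2)
scores = anomaly_scores(data, mean, comps)
flagged = np.argsort(scores)[-5:]  # highest reconstruction error
```

In the deep version the principle is the same; only the encoder/decoder becomes a multi-layer nonlinear network trained by gradient descent.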
What is deep learning? In what fields is it useful, and how does it relate to artificial intelligence? We'll discuss deep learning and why this powerful new technology is getting so much attention, learn how deep neural networks are trained to perform tasks with superhuman accuracy, and examine the challenges organizations face in adopting this new approach. We'll also cover some of the best practices, software, hardware, and training resources that many organizations are using to overcome these challenges and deliver breakthrough results.
50-minute Talk
William Ramey - Director, Developer Programs, NVIDIA
Over the last couple of years, neural nets have enabled significant breakthroughs in computer vision, voice generation and recognition, translation, and self-driving cars. Neural nets will also be a powerful enabler for future game development. We'll give an overview of the potential of neural nets in game development, as well as an in-depth look at how neural nets can be combined with reinforcement learning for new types of game AI. We'll also show exciting new results from applying deep reinforcement learning to AAA games.
50-minute Talk
Magnus Nordin - Technical Director, Electronic Arts / SEED
Two years after release, Vulkan is a mature and full-featured low-level graphics API, with significant adoption in the developer community.
NVIDIA will present a status update on our Vulkan software stack. We'll cover the latest Vulkan developments, including extensions, software libraries, and tools, as well as best practices and lessons learned from our own work with the Vulkan API over the past year.
50-minute Talk
Nuno Raposo Subtil - Senior Software Engineer, NVIDIA
As deep learning techniques are applied to healthcare, more and more AI-based medical systems are emerging, accompanied by new heterogeneity, complexity, and security risks. In the real world, we've seen this situation constrain demand and hinder the development of AI applications in China's hospitals. First, we'll share our experience building a unified, GPU-accelerated AI engine that feeds component-based functionality into the existing workflow of clinical routine and medical imaging. Then, we'll demonstrate a pipeline that integrates different types of AI applications (detecting lung cancer, predicting childhood respiratory disease, and estimating bone age) as microservices into medical stations and CDSS, PACS, and HIS systems to support the medical decision-making of local clinicians. On this basis, we'll describe the purpose of establishing an open, unified, standardized, and legal cooperation framework to help AI participants enter the Chinese market and build a collaborative ecosystem.
25-minute Talk
Xu Chen - Director of AI Research, Winning Health
We'll present an in-car ADAS technology to detect drowsy driving. This technique can be used to alert and awaken the driver, or take corrective actions if required. We employ a CNN-based approach for this technique, which is trained on a mix of synthetic and real images. We'll cover the details of the detection system pipeline and the synthetic dataset generation. We'll also show a demonstration of the detection system in action.
25-minute Talk
Sidharth Varier - Senior System Software Engineer, NVIDIA
Detecting objects, whether they're pedestrians, bicyclists, or other vehicles, at a traffic intersection is essential to ensure efficient traffic flow and the safety of all participants. We'll present an experiment to assess training and real-time inference on an NVIDIA Tegra X1 SoC module with a suite of GigE Flea3 Point Grey cameras installed on a vehicle. The system is trained on a subset of data collected from different types of busy intersections on a university campus, with testing done on the remaining data. We'll use a deep generative model that can learn and reconstruct the traffic scene, and we'll share our CUDA optimization strategies on the Tegra X1 along with the real-time performance of the inference model.
25-minute Talk
Menna El-Shaer - Doctoral Student/Researcher, The Ohio State University
Protecting crew health is a critical concern for NASA in preparation for long-duration, deep-space missions such as a journey to Mars. Spaceflight is known to affect immune cells: splenic B-cells decrease during spaceflight and in ground-based physiological models. The key technical innovation in our work is end-to-end computation on the GPU with the GPU Data Frame (GDF), running on the DGX Station, to accelerate the integration of immunoglobulin gene segments, junctional regions, and modifications that contribute to cellular specificity and diversity. The results are applicable to understanding processes that induce immunosuppression, like cancer therapy, AIDS, and stressful environments here on earth.
25-minute Talk
Venkat Krishnamurthy - Head, Product Management, MapD Technologies
The yield curve summarizes the returns of bonds across maturities and reflects extremely complex market interactions and monetary policy. Yield curve construction models, such as the spline-fitting model, deduce the curve from a number of bond sample points and model parameters. This involves repeated experiments to choose appropriate bond samples and relies heavily on manual operation; given the volume of relevant information and the rapid growth of transaction data, the task becomes even more challenging. The literature shows that deep learning can detect and exploit interactions in the data that are invisible to existing financial economic theory. By discovering latent patterns in historical data, it can be a good supplement for choosing active samples and assessing curve quality. In financial applications, accuracy and speed are both critically important, so the GPU is applied to both the deep learning framework and yield curve construction. Intelligent, fast, and accurate, our yield curve construction framework achieves a 5x speedup over manual operation and provides a feasible path for future practice.
25-minute Talk
Joe Zhang - Project Manager, Shanghai Clearing House
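The fitting step at the heart of curve construction can be sketched in a few lines. This is not Shanghai Clearing House's model: the bond sample points below are invented, and a least-squares cubic polynomial stands in for the spline-fitting model named in the talk — but the loop of "choose samples, fit, assess curve quality" is the part that gets repeated (and that the talk proposes to guide with deep learning):

```python
import numpy as np

# Hypothetical bond sample points: (maturity in years, observed yield).
maturities = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 20, 30], dtype=float)
yields = np.array([0.015, 0.017, 0.020, 0.023, 0.025,
                   0.028, 0.030, 0.032, 0.034, 0.035])

# Least-squares cubic fit as a stand-in for spline fitting; each
# candidate sample set would be refit and scored the same way.
coeffs = np.polyfit(maturities, yields, deg=3)
curve = np.poly1d(coeffs)

# Curve-quality assessment: RMSE of fitted vs. observed yields.
rmse = np.sqrt(np.mean((curve(maturities) - yields) ** 2))
```

A real spline model fits piecewise cubics with knot constraints rather than one global polynomial, but the sample-selection and quality-scoring workflow is the same.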
In this session, you will learn how Google Cloud helps enterprises make the most out of data, and deliver customer value. We will provide an in-depth overview of the Cloud AI and Data Analytics offering that helps enterprises manage their ML lifecycle, from data ingestion to insights and prediction. We will also demonstrate some breakthrough solutions, like AutoML, that are making ML accessible to everyone.
50-minute Talk
Chris Kleban - Product Manager, GPUs on Google Cloud, Google Inc.
Explore the memory model of the GPU! This session will begin with an essential overview of the GPU architecture and thread cooperation before focusing on the different memory types available on the GPU. We will define shared, constant, and global memory and discuss the best locations to store your application data for optimized performance. Features such as shared memory configurations and the Read-Only Data Cache are introduced and optimization techniques discussed. A programming demonstration of shared and constant memory will be delivered. Printed copies of the material will be provided to all attendees for each session - collect all four!
80 Minutes Tutorial
Dan Cyca - Chief Technology Officer, Acceleware
Data fuels so much of our lives. It accelerates our conversations, our decisions, our very ideas. And in the physical world, data is already acting as an accelerant to how we take these ideas and make them real. As the things we make become increasingly connected, our world becomes increasingly computable. And anything that becomes easily computable becomes equally mutable. What does this mean for the world we live in? As we let go of our design tools and hand more control to intelligent algorithms, we'll see this reflected in the real world: the world of self-driving everything.
25-minute Talk
Radha Mistry - Story Strategist, Autodesk
We'll show how recent advances in 3D fully convolutional networks (FCNs) have made it feasible to produce dense voxel-wise predictions of volumetric images. FCNs can be trained to automatically segment 3D medical images, such as computed tomography (CT) scans, based on manually annotated anatomies like organs and vessels. The presented methods achieve competitive segmentation results in a clinical setting while avoiding the need for handcrafted features or class-specific models. We'll explain a two-stage, coarse-to-fine approach that first uses a 3D FCN based on the 3D U-Net architecture to roughly define a candidate region, which then serves as input to a second 3D FCN for fine prediction. This cascaded approach reduces the number of voxels the second FCN has to classify to around 10 percent of the original 3D medical image, allowing it to focus on more detailed segmentation of the organs and vessels. Our experiments illustrate the promise and robustness of current 3D FCN-based semantic segmentation of medical images, achieving state-of-the-art results on many datasets. Code and trained models will be made available.
25-minute Talk
Holger Roth - Assistant Professor (Research), Nagoya University
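The voxel-reduction step between the two stages is simple to picture: the coarse mask defines a bounding box (plus a safety margin), and only that sub-volume reaches the fine network. A minimal numpy sketch with a toy volume (the shapes and margin are invented for illustration, not taken from the paper):

```python
import numpy as np

def candidate_region(coarse_mask, margin=2):
    """Bounding box (with a voxel margin) around the coarse prediction;
    the second-stage network only sees this cropped sub-volume."""
    idx = np.argwhere(coarse_mask)
    lo = np.maximum(idx.min(axis=0) - margin, 0)
    hi = np.minimum(idx.max(axis=0) + margin + 1, coarse_mask.shape)
    return tuple(slice(l, h) for l, h in zip(lo, hi))

# Toy CT volume with a small "organ" predicted by the coarse stage.
volume = np.zeros((64, 64, 64), dtype=np.float32)
coarse = np.zeros(volume.shape, dtype=bool)
coarse[20:30, 25:40, 30:45] = True      # stage-1 (coarse) segmentation

region = candidate_region(coarse)
cropped = volume[region]
reduction = cropped.size / volume.size  # fraction of voxels kept
```

On this toy example the crop keeps about 2% of the voxels; the ~10% figure in the abstract depends on organ size relative to the scan.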
You've probably heard of Citrix's HDX and VMware's Blast Extreme protocols. Maybe you know about different codecs like H.264, H.265/HEVC, VP9, AV1, MJPEG, and 2DRLE. We'd like to give you some insight into which codec technology can be used in which remoting protocol and what you can expect in terms of density, image quality, and granularity when configuring these codecs. Which do you think is better: Adaptive Display V2 or full-screen H.264? YUV 4:2:0 or YUV 4:4:4 for H.264? PCoIP or Blast Extreme? Should you use NVENC or not, and what options are available in VDI and RDSH on Kepler, Maxwell, and Pascal? You probably want to ask what's recommended. As always: it depends. Join our session to learn more and discuss the pros and cons of the available technologies and how you can make the best of them for YOUR deployment.
25-minute Talk
Simon Schaber - GRID Solution Architect, NVIDIA
We'll present a deep learning system that can decide whether two people are similar. The system uses a person's global appearance, not just the face, to perform re-identification. It also provides attributes (top color, bottom color, gender, and the length of the clothes and hair). We'll describe how to train such a system with TensorFlow on a GPU cluster and how to use it in a global video analysis system running on GPU devices.
25-minute Talk
Matthieu Ospici - AI Engineer, Atos
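Re-identification systems of this kind typically compare fixed-length appearance embeddings. A minimal numpy sketch of the decision step, assuming a trained network has already produced the embeddings (the random vectors, 128-dim size, and 0.7 threshold below are hypothetical stand-ins):

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity of two appearance embeddings (e.g. CNN features over
    the full body crop, not just the face)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb_a, emb_b, threshold=0.7):
    # The threshold is a made-up value; in practice it would be tuned
    # on a validation set of matched / unmatched pairs.
    return cosine_similarity(emb_a, emb_b) >= threshold

rng = np.random.default_rng(0)
person_a = rng.standard_normal(128)                 # embedding, person A
person_a_view2 = person_a + 0.1 * rng.standard_normal(128)  # A, new view
person_b = rng.standard_normal(128)                 # different person

match = same_person(person_a, person_a_view2)
mismatch = same_person(person_a, person_b)
```

Training (e.g. with a contrastive or triplet loss, as is common for re-id) shapes the embedding space so that this simple comparison works; the attribute heads are separate classifier outputs on the same backbone.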