2018 GTC San Jose
S8373 - MVAPICH2-GDR: Pushing the Frontier of Designing MPI Libraries Enabling GPUDirect Technologies

Session Description

Learn about the latest developments in the high-performance message passing interface (MPI) over InfiniBand, iWARP, and RoCE (MVAPICH2) library that simplify the task of porting MPI applications to supercomputing clusters with NVIDIA GPUs. MVAPICH2 supports MPI communication directly from GPU device memory and optimizes it using various features offered by the CUDA toolkit, providing optimized performance on different GPU node configurations. These optimizations are integrated transparently under the standard MPI API, for better programmability. Recent advances in MVAPICH2 include designs for MPI-3 RMA using the GPUDirect RDMA framework, MPI datatype processing using CUDA kernels, support for GPUDirect Async, support for heterogeneous clusters with GPU and non-GPU nodes, and more. We use the popular Ohio State University (OSU) micro-benchmark suite and example applications to demonstrate how developers can effectively take advantage of MVAPICH2 in applications using MPI and CUDA/OpenACC. We also provide guidance on issues such as processor affinity to the GPU and network, which can significantly affect the performance of MPI applications that use MVAPICH2.
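The central idea, passing GPU device pointers straight to MPI calls, is easiest to see in code. Below is a minimal, hypothetical sketch of a two-rank transfer on a CUDA-aware MVAPICH2(-GDR) build; the MV2_COMM_WORLD_LOCAL_RANK environment variable used for GPU binding and the MV2_USE_CUDA=1 runtime flag follow MVAPICH2 conventions, but names and defaults should be checked against your installation's documentation.

```c
/* Sketch: CUDA-aware point-to-point MPI from GPU device memory.
 * Assumes a CUDA-aware MVAPICH2 build, typically run with MV2_USE_CUDA=1.
 * Build (paths may vary):  mpicc gpu_pingpong.c -lcudart -o gpu_pingpong */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    if (nranks < 2) {            /* this sketch needs exactly two ranks */
        MPI_Finalize();
        return 1;
    }

    /* Bind each local rank to a GPU before any CUDA allocation; the
     * MV2_COMM_WORLD_LOCAL_RANK variable is an MVAPICH2 convention,
     * with a fallback to the global rank purely for illustration. */
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    const char *lr = getenv("MV2_COMM_WORLD_LOCAL_RANK");
    int local_rank = lr ? atoi(lr) : rank;
    if (ndev > 0)
        cudaSetDevice(local_rank % ndev);

    const int n = 1 << 20;       /* 1M floats (4 MB) in device memory */
    float *dbuf;
    cudaMalloc((void **)&dbuf, n * sizeof(float));

    /* With a CUDA-aware MPI, the device pointer goes directly into the
     * MPI call; the library stages the data or uses GPUDirect RDMA
     * internally, so no explicit cudaMemcpy to a host buffer is needed. */
    if (rank == 0) {
        cudaMemset(dbuf, 0, n * sizeof(float));
        MPI_Send(dbuf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(dbuf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d floats into GPU memory\n", n);
    }

    cudaFree(dbuf);
    MPI_Finalize();
    return 0;
}
```

The same device-to-device path can be exercised without writing any code via the OSU micro-benchmarks mentioned above, for example `mpirun -np 2 ./osu_latency D D`, where the `D` arguments select device-memory buffers in CUDA-enabled builds of the suite.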


Additional Information
Topics: Supercomputing, Tools and Libraries
Industry Segments: Cloud Services, General, Software
Level: Intermediate technical
Session Type: Talk
Duration: 50 minutes