GTC 2018 Silicon Valley

S8654 - State-of-the-Art Large Scale Language Modeling in 12 Hours With a Single GPU

Session Speakers
Session Description

For sequence learning tasks that utilize recurrent neural networks, scale is both the key to accuracy and the bane of speed. We'll take existing state-of-the-art language modeling techniques and speed them up by orders of magnitude without losing accuracy. The tactics include injecting flexibility into NVIDIA's black-box cuDNN LSTM; replacing the LSTM with the more parallelizable and customizable Quasi-Recurrent Neural Network (QRNN); reducing the softmax bottleneck using the adaptive softmax; and investigating individual function efficiency on the GPU using the NVIDIA Visual Profiler. The end result is a general and scalable language model framework that can achieve state-of-the-art quality on the WikiText-103 dataset (103 million words) in under 12 hours using a single NVIDIA Volta V100. The resulting PyTorch codebase is open source for experimentation and extension.
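To illustrate why the QRNN mentioned above parallelizes better than an LSTM, here is a minimal NumPy sketch of its "fo-pooling" recurrence (Bradbury et al.). In a full QRNN the gate activations z, f, and o come from causal 1-D convolutions over the input sequence, which run fully in parallel across time; only the cheap elementwise recurrence below remains sequential. The shapes and names are illustrative, not the talk's actual code.

```python
import numpy as np

def qrnn_fo_pool(z, f, o, c0=None):
    """Apply QRNN fo-pooling over time (illustrative sketch).

    z, f, o: (T, d) candidate, forget-gate, and output-gate activations,
             assumed precomputed in parallel by convolutions.
    Returns (T, d) hidden states h_t = o_t * c_t, where
    c_t = f_t * c_{t-1} + (1 - f_t) * z_t.
    """
    T, d = z.shape
    c = np.zeros(d) if c0 is None else c0       # initial cell state
    h = np.empty_like(z)
    for t in range(T):                          # only O(T) elementwise work
        c = f[t] * c + (1.0 - f[t]) * z[t]      # gated running cell state
        h[t] = o[t] * c                         # apply output gate
    return h

# Tiny usage example: with the forget gate fully open (f = 0),
# the cell copies the candidate z at every step.
z = np.ones((3, 2))
h = qrnn_fo_pool(z, np.zeros_like(z), np.ones_like(z))
```

Unlike an LSTM step, this loop contains no matrix multiplications, so the sequential portion is tiny and the heavy compute stays in batched, parallel convolutions.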


Additional Information
Performance Optimization, Deep Learning and AI Frameworks
Higher Education / Research
Intermediate technical
Talk
50 minutes
Session Schedule