Deep learning (DL) creates advances following a virtuous recipe: model architecture search, creation of large training datasets, and scaling of computation. Baidu Research's Silicon Valley AI Lab develops state-of-the-art conversational user interfaces following this DL recipe. We research new model architectures and features for speech recognition (Deep Speech 3), speech generation (Deep Voice 3), and natural language processing. To deploy these models in impactful products, we want a deep understanding of how the recipe's components interact to drive accuracy improvements. Through large-scale empirical studies, we find intriguing results about how deep learning is likely to scale: as training set size increases, DL generalization error and required model size follow particular power-law relationships. For a fixed dataset size, as model size grows, training time remains roughly constant -- larger models require fewer steps to converge to the same accuracy. These scaling relationships have significant implications for DL research, practice, and systems. They can assist in model debugging, setting accuracy targets, and making decisions about dataset growth and future computing system design.
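The power-law claim above implies that generalization error is linear in data size on a log-log plot, so its exponent can be recovered by a simple linear fit in log space. The sketch below illustrates this with synthetic data; the exponent value (-0.35) and the helper name `fit_power_law` are illustrative assumptions, not figures from our studies.

```python
import numpy as np

def fit_power_law(train_sizes, errors):
    """Fit error = alpha * size**beta via linear regression in log-log space.

    Returns (alpha, beta): the estimated prefactor and power-law exponent.
    """
    slope, intercept = np.polyfit(np.log(train_sizes), np.log(errors), 1)
    return np.exp(intercept), slope

# Synthetic example: error decays as a power law of training set size.
# alpha and beta here are made-up illustrative values.
sizes = np.array([1e3, 1e4, 1e5, 1e6, 1e7])
errors = 10.0 * sizes ** -0.35

alpha_hat, beta_hat = fit_power_law(sizes, errors)
# On noiseless power-law data, the fit recovers the exponent exactly
# (up to floating-point error): beta_hat ~= -0.35.
```

In practice one would fit measured validation errors (with noise) rather than an exact curve, and the recovered exponent characterizes how much additional data is needed for a target accuracy.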