Lightning 1.1 introduces Sharded Training: train deep learning models on multiple GPUs while saving over 50% on memory, with no performance loss and no code changes required!

Image by author

In a recent collaboration between Facebook AI’s FairScale team and PyTorch Lightning, we’re bringing you a 50% memory reduction across all your models. Our goal at PyTorch Lightning is to make recent advancements in the field accessible to all researchers, especially when it comes to performance optimizations. Together with the FairScale team, we’re excited to introduce our beta of Sharded Training with PyTorch Lightning 1.1.

Training large neural network models can be computationally expensive and memory hungry. …
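The “no code change” claim comes down to the Trainer configuration: the model and data code stay the same, and only a plugin flag switches distributed training to its sharded variant. A minimal sketch of the pattern, assuming Lightning 1.1’s `Trainer` arguments (treat the exact flag names as assumptions rather than verbatim from this post):

```python
# Hedged sketch: switching a PyTorch Lightning 1.1 run to Sharded Training.
# The LightningModule and DataModule are untouched; only Trainer args change.
trainer_kwargs = {
    "gpus": 8,
    "precision": 16,           # mixed precision pairs well with sharding
    "plugins": "ddp_sharded",  # sharded DDP instead of plain DDP
}

# With pytorch-lightning and fairscale installed, training would look like:
# import pytorch_lightning as pl
# trainer = pl.Trainer(**trainer_kwargs)
# trainer.fit(model)
```

Because the sharding lives behind a plugin, reverting to standard DDP is just a matter of dropping the `plugins` entry.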


Train state-of-the-art speech recognition, NLP and TTS models at scale with NeMo and Lightning

Image by author

NeMo (Neural Modules) is a powerful framework from NVIDIA, built to make it easy to train, build and manipulate state-of-the-art conversational AI models. NeMo models can be trained on multiple GPUs and nodes, with or without mixed precision, in just 3 lines of code. Continue reading to learn how to use NeMo and Lightning to train an end-to-end speech recognition model on multiple GPUs, and how to extend NeMo models for your own use case, such as fine-tuning strong pre-trained ASR models on Spanish audio data.

In this article we’ll highlight some of the great features within NeMo, steps to building your…
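The “3 lines” workflow can be sketched roughly as below. The model class, checkpoint name and Trainer arguments are assumptions drawn from NeMo’s ASR collection, not verbatim from the article, and the sketch is wrapped in a function so it only runs where `nemo_toolkit` and `pytorch_lightning` are installed:

```python
def train_asr(gpus: int = 4, epochs: int = 50) -> None:
    """Hedged sketch of the '3 lines': build a Trainer, load a model, fit.

    Requires nemo_toolkit and pytorch_lightning; names are assumptions.
    """
    import pytorch_lightning as pl
    import nemo.collections.asr as nemo_asr

    trainer = pl.Trainer(gpus=gpus, precision=16, max_epochs=epochs)  # 1: multi-GPU + mixed precision
    model = nemo_asr.models.EncDecCTCModel.from_pretrained(           # 2: pretrained ASR model
        model_name="QuartzNet15x5Base-En"
    )
    trainer.fit(model)                                                # 3: train / fine-tune
```

For the Spanish fine-tuning use case mentioned above, the same three lines apply; only the data config and vocabulary attached to the model would change.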

Reduce cost and horizontally scale deepspeech.pytorch using TorchElastic with Kubernetes.

End-to-End Speech To Text Models Using Deepspeech.pytorch

Deepspeech.pytorch provides training, evaluation and inference of end-to-end (E2E) speech-to-text models, in particular the highly popularised DeepSpeech2 architecture. It was developed to give users the flexibility and simplicity to scale, train and deploy their own speech recognition models whilst maintaining a minimalist design. Deepspeech.pytorch is a lightweight package for research iteration and integration that fills the gap between audio research and production.

Scale Training Horizontally Using TorchElastic

Training production E2E speech-to-text models currently requires thousands of hours of labelled transcription data. In recent cases, we see numbers exceeding 50k hours of labelled audio…
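Horizontal scaling of this kind typically goes through TorchElastic’s elastic launcher, which lets worker nodes join and leave a job via a rendezvous backend. A sketch of such a launch command follows; the node counts, etcd endpoint, job id and script name are illustrative assumptions, not taken from the article:

```shell
# Hedged sketch: launching an elastic deepspeech.pytorch training job.
# Workers rendezvous via etcd; the job tolerates between 1 and 4 nodes.
python -m torchelastic.distributed.launch \
    --nnodes=1:4 \
    --nproc_per_node=8 \
    --rdzv_backend=etcd \
    --rdzv_endpoint=etcd-service:2379 \
    --rdzv_id=deepspeech-job \
    train.py
```

On Kubernetes, each replica of the training pod would run this same command, with the etcd service providing the shared rendezvous point.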

The Allen Institute, as part of its open research efforts, released a data dump of scholarly articles as an initiative to aid efforts in tackling COVID-19. The dataset contains 51,000 articles as of the time of writing and is growing in size.

When searching the data, keyword search is likely to be effective; however, supplementing it with semantic sentence embeddings would provide valuable insight into the data, either through clustering or via a semantic search engine. Semantic search does have its issues, though, described nicely in this issue.
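At its core, the semantic search side of this reduces to ranking documents by cosine similarity between a query embedding and precomputed sentence embeddings. A toy sketch below, with made-up 3-d vectors standing in for the output of a real sentence-embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy corpus: in practice these vectors come from a sentence-embedding model.
corpus = {
    "viral transmission routes": [0.9, 0.1, 0.1],
    "hospital bed capacity":     [0.1, 0.9, 0.2],
    "vaccine trial protocols":   [0.2, 0.1, 0.9],
}

query = [0.8, 0.2, 0.1]  # embedding of e.g. "how does the virus spread?"
ranked = sorted(corpus, key=lambda doc: cosine(query, corpus[doc]), reverse=True)
print(ranked[0])  # → viral transmission routes
```

The same similarity function underpins the clustering use case mentioned above; there, pairwise similarities between article embeddings group related papers together.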

Over the past few years at Digital Reasoning, we have been developing audio analytics software that is highly effective at processing the noisy, domain-specific voice data we typically encounter within the trading operations of major banks. Within the Audio Research Team, rapid research cycles drive continual refinements to our audio technology: the faster we can iterate, the better the quality of the solutions we deliver to our customers.

Sean Narenthiran

Research Engineer at Grid AI | PyTorch Lightning
