In a recent collaboration between Facebook AI’s FairScale team and PyTorch Lightning, we’re bringing you up to a 50% memory reduction across your models. Our goal at PyTorch Lightning is to make recent advancements in the field accessible to all researchers, especially when it comes to performance optimizations. Together with the FairScale team, we’re excited to introduce the beta of Sharded Training in PyTorch Lightning 1.1.
Training large neural network models can be computationally expensive and memory-hungry. …
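The memory savings come from sharding optimizer state across workers instead of replicating it on every GPU. The following is a toy back-of-the-envelope sketch of that idea, not FairScale’s implementation; the functions and numbers are illustrative only. (In Lightning 1.1, sharded training itself is enabled through the `ddp_sharded` plugin.)

```python
# Toy illustration of optimizer-state sharding (the idea behind
# FairScale's sharded optimizer) -- NOT the real implementation.
# With plain data parallelism, every worker keeps a full copy of the
# optimizer state; with sharding, each worker keeps only its slice.

def replicated_state_bytes(num_params: int, world_size: int,
                           bytes_per_param: int = 8) -> int:
    """Adam keeps ~2 fp32 moments per parameter; every rank holds all of them."""
    return world_size * num_params * bytes_per_param

def sharded_state_bytes(num_params: int, world_size: int,
                        bytes_per_param: int = 8) -> int:
    """Each rank holds only its 1/world_size shard of the optimizer state."""
    per_rank = -(-num_params // world_size)  # ceiling division
    return world_size * per_rank * bytes_per_param

params = 100_000_000  # a 100M-parameter model
full = replicated_state_bytes(params, world_size=4)
shard = sharded_state_bytes(params, world_size=4)
print(f"replicated: {full / 1e9:.1f} GB total, sharded: {shard / 1e9:.1f} GB total")
# Per rank, optimizer-state memory drops by roughly a factor of world_size.
```

The actual reduction you see depends on the optimizer and on how much of total memory is state versus activations, which is why headline numbers are stated as "up to".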
NeMo (Neural Modules) is a powerful framework from NVIDIA, built to make building, training, and manipulating state-of-the-art conversational AI models easy. NeMo models can be trained on multiple GPUs and nodes, with or without mixed precision, in just three lines of code. Continue reading to learn how to use NeMo and Lightning to train an end-to-end speech recognition model on multiple GPUs, and how you can extend NeMo models for your own use case, such as fine-tuning strong pre-trained ASR models on Spanish audio data.
In this article, we’ll highlight some of the great features within NeMo, steps to building your…
Reduce cost and horizontally scale deepspeech.pytorch using TorchElastic with Kubernetes.
Deepspeech.pytorch provides training, evaluation, and inference for end-to-end (E2E) speech-to-text models, in particular the popular DeepSpeech2 architecture. It was developed to give users the flexibility and simplicity to scale, train, and deploy their own speech recognition models while maintaining a minimalist design. Deepspeech.pytorch is a lightweight package for research iterations and integrations that fills the gap between audio research and production.
Training production E2E speech-to-text models currently requires thousands of hours of labelled transcription data. In recent cases, we see numbers exceeding 50k hours of labelled audio…
As part of its open research efforts, the Allen Institute released a dump of scholarly articles to aid in tackling COVID-19. This dataset contains 51,000 articles as of the time of writing and is growing in size.
When searching the data, keyword search is likely to be effective; however, supplementing it with semantic sentence embeddings would provide valuable insight into the data, either through clustering or a semantic search engine. Semantic search does have its issues, though, described nicely in this issue.
Over the past few years at Digital Reasoning we have been developing audio analytics software to be highly effective at processing the noisy, domain-specific voice data that we typically encounter within the trading operations of major banks. Within the Audio Research Team, rapid research cycles drive continual refinements to our audio technology. The faster we can iterate, the better the quality of the solutions we deliver to our customers.
Research Engineer at Grid AI | PyTorch Lightning