In the previous post, Margaret introduced the project, covering the scenarios where it could be useful along with its broader technical objectives.
In this post, we will provide more details about the models we used, the major steps in converting them to TensorFlow Lite (TFLite), and the benchmarking results of the models. You can follow along with the materials listed here.
Our interviewee today is Dan, a third-year Ph.D. student in Computer Science at UC Berkeley. Dan is primarily interested in Machine Learning Safety. His notable works include the GELU activation function, which is used in BERT and GPT. His work on robustness and uncertainty includes proposing a baseline for detecting anomalies with deep neural networks, creating robustness benchmarks for natural adversarial examples and image corruptions, and more.
We have Vincent Sitzmann for today’s interview! Vincent is a Postdoctoral Researcher at MIT’s CSAIL, having just completed his Ph.D. at Stanford. Vincent’s research interests lie in the area of neural scene representations: the way neural networks learn to represent information about our world. One of Vincent’s works that stirred the Deep Learning community is Implicit Neural Representations with Periodic Activation Functions, also referred to as SIREN. SIREN’s results speak for themselves. Vincent developed a Colab Notebook that is more than enough to get us started with SIREN. …
For today’s interview, we have Alexander M. Rush with us. Alexander is currently an Associate Professor at Cornell University, where his research group studies several areas of NLP such as text generation and document-level understanding. They also work on open-source projects such as OpenNMT.
Alexander’s research has been groundbreaking, particularly in the area of text generation; one of his papers, A Neural Attention Model for Abstractive Sentence Summarization, is an absolute favorite of mine. Alexander is also with Hugging Face 🤗, helping the company develop state-of-the-art NLP tooling. …
This article is the successor to my previous article on Quantization. In this article, we’re going to go over the mechanics of model pruning in the context of deep learning. Model pruning is the art of discarding the weights that contribute little to a model’s performance. Carefully pruned networks compress well, which often makes them suitable for on-device deployment scenarios.
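To make the idea concrete, here is a minimal NumPy sketch of magnitude-based pruning (my own illustration of the general technique, not the exact workflow the article walks through): weights whose absolute value falls below a percentile threshold are zeroed out, and the surviving sparsity can then be exploited by compression tooling.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`."""
    # Threshold below which weights are considered insignificant.
    threshold = np.percentile(np.abs(weights), sparsity * 100)
    # Keep only weights at or above the threshold.
    mask = np.abs(weights) >= threshold
    return weights * mask

rng = np.random.default_rng(42)
w = rng.normal(size=(4, 4))
pruned = magnitude_prune(w, sparsity=0.5)
print(f"sparsity achieved: {np.mean(pruned == 0):.2f}")
```

In practice you would prune gradually during training (so the network can recover), rather than in one shot as above.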
The content of the article is structured into the following sections:
I am pleased to have Niki Parmar for today’s interview. Niki is a Senior Research Scientist at Google Brain, where she works on self-attention and its extensions to different applications in both language and vision. Her interests lie in generative models, 2D and 3D vision, and self-supervised learning.
Niki has co-authored a number of impactful research papers in the domain, including the seminal paper on Transformers, Attention Is All You Need. To know more about her research and interests, you can follow her on Google Scholar and LinkedIn.
Sayak: Hi Niki! Thank you for…
I first came across the term “Google Developers Expert” (GDE) back in 2018 during DevFest Kolkata, where Pawan Kumar, a GDE in Flutter, gave a talk. After going through the details of the program, I became immensely interested in joining it. Folks have different reasons for joining the program; here are the ones that genuinely inspired me:
I am pleased to have Colin Raffel for today’s interview. Colin is currently working as a Research Scientist at Google. His research interests broadly lie in areas like learning with limited labeled data and transfer learning, especially in an NLP context. Colin is also one of the first authors of the seminal T5 paper, Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.
Among Colin’s other works, my favorites are the semi-supervised learning papers FixMatch and MixMatch. If you are interested in knowing more about an extension of the MixMatch work, be sure to check out this ICLR…
Interact with the dashboard of results here.
State-of-the-art machine learning models are often bulky, which makes them inefficient to deploy in resource-constrained environments like mobile phones, Raspberry Pis, and microcontrollers. Even if you think you can get around this problem by hosting your model on the Cloud and using an API to serve results, consider constrained environments where internet bandwidth is not always high, or where data must not leave a particular device.
We need a set of tools that make the transition to on-device machine learning seamless. In this…
Using mixed precision in tf.keras models to speed up model training time.
In this article, we are going to see how to incorporate mixed precision (MP) training into your tf.keras training workflows. Mixed precision training was proposed by NVIDIA in this paper. It has allowed us to train large neural networks significantly faster with little to no loss in network performance. Here’s what we are going to cover:
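One piece of intuition worth previewing is loss scaling, a core ingredient of mixed precision training. The toy NumPy sketch below (my own illustration, not the tf.keras API) shows the underlying numerical problem: small gradient values underflow to zero in float16, but scaling them up before the float16 cast and dividing the scale back out in float32 preserves them.

```python
import numpy as np

grad = np.float32(1e-8)            # a tiny gradient value from backprop

# Naive cast: 1e-8 is below float16's smallest representable value,
# so the gradient silently becomes zero.
naive = np.float16(grad)

# Loss scaling: multiply by a large factor before casting to float16,
# then divide the scale back out in float32 to recover the value.
scale = np.float32(2 ** 20)
scaled_fp16 = np.float16(grad * scale)
recovered = np.float32(scaled_fp16) / scale

print(naive)       # underflows to 0.0
print(recovered)   # close to the original 1e-8
```

In real mixed precision training the framework keeps a float32 master copy of the weights and handles the scaling automatically; this sketch only shows why that machinery exists.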