In the previous post, Margaret introduced the project, covering the scenarios where it could be useful along with its technical objectives.

In this post, we provide more details about the models we used, the key steps in converting them to TensorFlow Lite (TFLite), and the benchmarking results of the models. You can follow along with the materials listed here.

The complete code for this project is available in this GitHub repository. If you’d like to jump right to the Android implementation part of the project, please refer here.

Model Conversion

The project includes two types of models…

Our interviewee today is Dan, a third-year Ph.D. student in Computer Science at UC Berkeley. Dan is primarily interested in Machine Learning Safety. His notable works include the GELU activation function, which is used in BERT and GPT. His work in robustness and uncertainty includes proposing the baseline for detecting anomalies with deep neural networks; creating robustness benchmarks for natural adversarial examples and image corruptions; and so on.

Dan has interned at DeepMind, where he conducted research on robustness and uncertainty under Balaji Lakshminarayanan. During this internship, he developed AugMix. He also co-organized the Workshop…

We have Vincent Sitzmann for today’s interview! Vincent is a Postdoctoral Researcher at MIT’s CSAIL and he just completed his Ph.D. at Stanford. Vincent’s research interests lie in the area of neural scene representations, that is, the way neural networks learn to represent information about our world. One of Vincent’s works that stirred the Deep Learning community is Implicit Neural Representations with Periodic Activations, also referred to as SIREN. The results of SIREN speak for its efficacy. Vincent developed a Colab Notebook that is more than enough to get us started with SIREN. …

For today’s interview, we have Alexander M. Rush with us. Alexander is currently an Associate Professor at Cornell University, where his research group studies several areas of NLP such as text generation, document-level understanding, and so on. They also work on open-source projects such as OpenNMT.

Alexander’s research has been groundbreaking, particularly in the area of text generation, and one of his papers is an absolute favorite of mine: A Neural Attention Model for Abstractive Sentence Summarization. Alexander is also with Hugging Face 🤗, helping the company develop SoTA stuff in NLP. …

Machine Learning in Production

This article discusses pruning techniques in the context of deep learning.

This article is the successor to my previous article on Quantization. In this article, we’re going to go over the mechanics of model pruning in the context of deep learning. Model pruning is the art of discarding the weights that do not contribute significantly to a model’s performance. Carefully pruned networks compress better, and they often become suitable for on-device deployment scenarios.

The content of the article is structured into the following sections:

  • The notion of “Non-Significance” in Functions and Neural Networks
  • Pruning a Trained Neural Network
  • Code Snippets and Performance Comparisons between Different Models
  • Modern Pruning Techniques
  • Final…
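To make the idea of discarding non-significant weights concrete, here is a minimal sketch of magnitude-based pruning in plain Python. It assumes that "non-significant" means "smallest absolute value", which is one common criterion and not necessarily the one used in the article; the function name `magnitude_prune` and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0.0.

    Assumption: a weight's importance is approximated by its absolute value.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to discard
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Hypothetical weights of a tiny layer.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The zeroed entries are what make the pruned network compress well: long runs of zeros are cheap to store.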

I am pleased to have Niki Parmar for today’s interview. Niki is a Senior Research Scientist at Google Brain, where she is involved in research related to self-attention and extending it to different applications in both language and vision. Her interests lie in generative models, 2D and 3D vision, and self-supervised learning.

Niki has co-authored a number of impactful research papers in the domain, including the seminal paper on Transformers: Attention Is All You Need. To learn more about her research and her interests, you can follow her on Google Scholar and LinkedIn.

An interview with Niki Parmar, Senior Research Scientist at Google Brain

Sayak: Hi Niki! Thank you for…

I am a GDE

Highlights about the experiences I gathered throughout the GDE program in one year.

TensorFlow Roadshow Bengaluru 2019 (Source: Unknown)

I first came across the term “Google Developers Expert” (GDE) back in 2018 during DevFest Kolkata, where Pawan Kumar, a GDE in Flutter, gave a talk. After going through the details of the program, I became keenly interested in joining it. Folks have different reasons to join the program, and here are the ones that genuinely inspired me:

  • Learning and sharing is definitely one of the defining characteristics of my life. It’s a value that got instilled in me during my stint at TCS. …

I am pleased to have Colin Raffel for today’s interview. Colin is currently working as a Research Scientist at Google. Colin’s research interests broadly lie in areas like learning with limited labeled data and transfer learning, especially in an NLP context. Colin is also one of the first authors of the seminal paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5).

Among Colin’s other works, my favorite ones are on semi-supervised learning — FixMatch and MixMatch. If you are interested in knowing more about an extension of the MixMatch work, be sure to check out this ICLR…

Machine Learning in Production

Model optimization strategies and quantization techniques to help deploy machine learning models in resource-constrained environments.

Interact with the dashboard of results here.

State-of-the-art machine learning models are often bulky, which makes them inefficient to deploy in resource-constrained environments like mobile phones, Raspberry Pis, microcontrollers, and so on. Even if you think you might get around this problem by hosting your model on the Cloud and using an API to serve results, think of constrained environments where internet bandwidth might not always be high, or where data must not leave a particular device.

We need a set of tools that make the transition to on-device machine learning seamless. In this…
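As a rough illustration of what quantization does to a model’s weights, here is a minimal sketch of affine (scale and zero-point) quantization to 8-bit integers in plain Python. It is a simplified, per-list scheme written for illustration only, not the TensorFlow Lite implementation; the helper names `quantize` and `dequantize` and the sample values are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one scale and one zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical float32 weights.
values = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(values)
restored = dequantize(q, scale, zero_point)
print(q)
print(restored)
```

Each weight now fits in one byte instead of four, at the cost of a reconstruction error of at most about one quantization step (`scale`).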

Machine Learning in Production

Learn how to incorporate mixed-precision training for tf.keras models to speed up model training time.

Explore an interactive dashboard of the experiments conducted for the article.

In this article, we are going to see how to incorporate mixed precision (MP) training in your tf.keras training workflows. Mixed precision training was proposed by NVIDIA in this paper. It has allowed us to train large neural networks significantly faster with little to no decrease in the performance of the networks. Here’s what we are going to cover:

  • Several options to incorporate mixed-precision training for tf.keras models
  • Things to remember while performing mixed-precision training
  • Hands-on examples of these options
  • Use Weights and Biases (W&B) to compare the…
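One thing to keep in mind with float16 is numerical underflow, which is why mixed-precision training relies on loss scaling. The sketch below illustrates the effect using only the Python standard library rather than tf.keras; the gradient value and the fixed loss scale are hypothetical numbers chosen for the demonstration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


grad = 1e-8          # hypothetical small gradient, fine in float32
loss_scale = 1024.0  # hypothetical fixed loss scale

naive = to_fp16(grad)                # too small for float16: becomes 0.0
scaled = to_fp16(grad * loss_scale)  # scaled gradient survives the cast
recovered = scaled / loss_scale      # unscale in higher precision

print(naive)
print(recovered)
```

Without scaling, the gradient is lost entirely; with scaling, it is recovered to within a small rounding error.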

Sayak Paul

At PyImageSearch | Netflix Nerd
