An interview with Patrick Hall, Principal Scientist at bnh.ai and Advisor to H2O.ai

We have Patrick Hall for today’s interview. Patrick has worn many hats — be it as a maker, as a product director or as an Adjunct Professor. Currently, he is wearing three such hats. He is serving bnh.ai (a law firm which he co-founded) as a Principal Scientist, he is with H2O.ai serving as an advisor for their responsible AI efforts, and still teaching at George Washington University.

Patrick is one of the most prominent figures out there when it comes to Machine Learning Interpretability (MLI) efforts. He has authored workshop and journal papers on responsible AI, an e-book on related subjects, and has given a number of interesting presentations as well. A lot of his efforts for driving MLI can be found here on his LinkedIn profile. Additionally, he maintains several GitHub repositories to share resources for a number of different topics related to MLI. (They’re all open-source repositories and he happily welcomes PRs there.)

Sayak: Hi Patrick! Thank you for doing this interview. It’s a pleasure to have you here today.

Patrick: I’m happy to be here! Hope everyone is staying well!

Sayak: Maybe you could start by introducing yourself — what is your current job and what are your responsibilities over there?

Patrick:

  • I recently founded bnh.ai. It’s a new boutique law firm that specializes in helping organizations detect, avoid, and respond to liabilities (i.e., compliance, litigation, reputational) caused by machine learning (ML) and artificial intelligence (AI). I provide leadership on the technical side of the house at bnh.ai. My partner is the legal lead.

Sayak: That was nicely bulleted and detailed. Thanks, Patrick. You have driven so many efforts around MLI. Would you like to mention the primary motivation behind them?

Patrick:

  • My motivations were commercial at first. Coming from SAS and seeing how ML projects actually worked and did not work, all over the world, I learned that explainability and deployment were the key factors in a commercial ML project’s success or failure.

Sayak: Your transitions have been very methodical and practically grounded. When you were starting in the field of MLI, what kind of challenges did you face? How did you overcome them?

Patrick:

  • The math was shaky in XAI at first. Many of the techniques were approximate or inconsistent. I think one of our earliest breakthroughs was to realize you would have less inconsistency and better surrogate models if you used constrained models, and then later we learned that complex ML models can really be interpretable themselves. (Thank you, Professor Cynthia Rudin.) Shapley values also came later and those have been transformative because they are accurate and consistent. (Thank you, Dr. Scott Lundberg.)
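Editor's aside: one way to read "accurate" here is the local accuracy property, where the Shapley contributions for a single prediction sum exactly to the difference between that prediction and a baseline prediction. The sketch below is a brute-force illustration of that idea under stated assumptions, not Patrick's workflow and not the SHAP library's API. The `toy_model` function, the input row, and the all-zeros baseline are hypothetical, and "removing" a feature is approximated by swapping in its baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x`, relative to a `baseline` row.

    A feature that is "absent" from a coalition takes its baseline value.
    Cost grows exponentially with the number of features, so this is only
    practical for a handful of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(row):
    # Hypothetical model: one additive term plus one interaction term
    return 2.0 * row[0] + row[1] * row[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

# Local accuracy: contributions add up to f(x) - f(baseline)
gap = toy_model(x) - toy_model(baseline)
print(phi, sum(phi), gap)
```

In this toy case the interaction term is split evenly between the two features that take part in it, and the contributions sum to the gap between the prediction and the baseline prediction.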

Sayak: I never thought of the compliance part the way you mentioned. I would now like to ask a very basic question regarding your MLI workflow. After you have trained your ML model on a given dataset, is there any general framework that you follow to incorporate MLI in there?

Patrick:

  • Well, it’s crucial to consider interpretability throughout the entire workflow (see Information 11(3)), not just after a model is trained.

Sayak: I never thought about the last point you mentioned. The points you mentioned are very concerning indeed. If you were to list some resources that an MLI beginner cannot afford to miss, what would those be?

Patrick:

Sayak: That was a very humble gesture, Patrick. I would simply refer to the Awesome Machine Learning Interpretability meta-list you already mentioned. What would be the state of MLI in the next five years? Which particular areas of MLI would draw the most interest from the community?

Patrick:

  • ML, like aviation, nuclear power, or other powerful commercial technologies that came before it, will likely become more regulated. Data scientists will need help with compliance and incident response for ML, and we want to be there to help with bnh.ai. U.S. states, the federal government, and many nations are already enacting or proposing AI guidance, e.g., Canada, Germany, the Netherlands, Singapore, the U.K., and the U.S. (Trump Administration, DoD, and FDA).

Sayak: I think this would definitely work as a checklist for the folks interested in pursuing MLI more. Debugging neural networks is something that excites me a lot actually. Being a practitioner, one thing that I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?

Patrick:

  • I read about a new thing, then I try to apply it through code, writing, or presentations, then talk to people about those, and I repeat. I really do learn a lot from different communities, through both online and real-life discussions.

Sayak: Haha, that is a fun fact to know (FORTRAN, really!). If you ask me, I am also more of a reader; I prefer learning by reading books cover to cover. I also like reading through an implementation when I struggle to understand a concept. Finally, any advice for beginners?

Patrick:

  • Don’t trust the internet too much. If you have customers, listen to them. If you have mentors, listen to them. At SAS, I had lots of customers and mentors, and they told a very different, and more true, story about data mining and ML than did Twitter, Quora, Kaggle, or Medium. At that time, ML discussions on the web were absolutely dominated by deep learning on benchmark datasets like MNIST, CIFAR, and ImageNet. Almost nothing could have been less interesting to my mentors and customers at that time.

Fifty Years of Data Science

50 Years of Test (Un)fairness: Lessons for Machine Learning

Statistical Modeling: The Two Cultures

A Very Short History of Data Science

  • Treat Kaggle as a learning platform, not a religion. Kaggle is a game. Real-world ML is not. And if you treat real-world ML like a game, you could do more harm than good.

Sayak: Thank you so much, Patrick, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.

Patrick: I hope it’s helpful too because I’ve learned so much from the data science community over the years. Thank you Sayak for your patience and interest … and for your PRs!

Calling `model.fit()` at PyImageSearch | Netflix Nerd | Personal site: https://sayak.dev/
