
by Patrick O’Shaughnessy

My guest this week is one of my best and oldest friends, Jeremiah Lowin. Jeremiah has had a fascinating career, starting with advanced work in statistics before moving into the risk management field in the hedge fund world. Through his career he has studied data, risk, statistics, and machine learning—the last of which is the topic of our conversation today.

He has now left the world of finance to found a company called Prefect, a framework for building data infrastructure. Prefect was inspired by observing the friction between data scientists and data engineers, and it addresses those problems with a functional API for defining and executing data workflows. These problems, while wonky, are ones I can relate to from working in quantitative investing, and others out there who suffer from them will be nodding their heads. In full and fair disclosure, both my family and I are investors in Jeremiah's business.

You won’t have to worry about that potential conflict of interest in today’s conversation, though, because our focus is on the deployment of machine learning technologies in the realm of investing. What I love about talking to Jeremiah is that he is an optimist and a skeptic. He loves working with new statistical learning technologies, but often thinks they are overhyped or entirely unsuited to the tasks they are being used for. We get into some deep detail on how tests are set up, the importance of data, and how the minimization of error is a guiding light in machine learning and perhaps all of human learning, too. Let’s dive in.
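The episode's theme of error minimization as a guiding light can be made concrete with a tiny sketch of gradient descent, the optimization routine discussed near the end of the show. This is an illustrative toy, not anything from the episode: we fit a single slope `w` so that `w * x` approximates `y` by repeatedly stepping against the gradient of the squared error.

```python
# Toy gradient descent: fit a slope w so that w * x ≈ y
# by stepping downhill on the mean squared error.

def fit_slope(xs, ys, lr=0.01, steps=1000):
    w = 0.0  # initial guess
    n = len(xs)
    for _ in range(steps):
        # gradient of (1/n) * sum((w*x - y)^2) with respect to w
        grad = (2 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= lr * grad  # step against the gradient
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true slope of 2
w = fit_slope(xs, ys)
```

Each step shrinks the prediction error a little; repeated over many steps, `w` converges toward the slope that minimizes it, which is the same loop, at much larger scale, that trains modern machine learning models.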


Show Notes

2:06 – (First Question) – What do people need to think about when considering using machine learning tools?

3:19 – Types of problems that AI is perfect for

6:09 – Walking through an actual test and understanding the terminology

11:52 – Data in training: training set, test set, validation set

13:55 – The difference between machine learning and classical academic finance modelling

16:09 – What will the future of investing look like using these technologies

19:53 – The concept of stationarity

21:31 – Why you shouldn’t take label formation in tests for granted

24:12 – Ability for a model to shrug

26:13 – Hyperparameter tuning

28:16 – Categories of types of models

30:49 – The idea of a nearest-neighbor or k-means algorithm

34:48 – Trees as the ultimate utility player in this landscape

38:00 – Features and data sets as the driver of edge in machine learning

40:12 – Key considerations when working through time series

42:05 – Pitfalls he has seen when folks try to build predictive market investing models

44:36 – Getting started

46:29 – Looking back at his career, what are some of the frontier vs settled applications of machine learning he has implemented

49:49 – Does interpretability matter in all of this?

52:31 – How gradient descent fits into this whole picture
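The training/test/validation discussion around 11:52 can be illustrated with a minimal sketch of how data is typically partitioned before fitting a model. All names and proportions here are assumptions for illustration, not details from the episode:

```python
# Illustrative train/validation/test split (not from the episode).
import random

def split_data(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows, then split them into train, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed for reproducibility
    n = len(rows)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

train, val, test = split_data(range(100))
# train: fit the model; val: tune hyperparameters;
# test: held out for a single, final estimate of out-of-sample error.
```

The key discipline, and a theme of the conversation, is that the test set is touched only once; reusing it to tune the model quietly turns it into another training set.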


