How AI Happens

Vanguard Principal & Head of the Center for Analytics & Insights, Jing Wang

Episode Summary

Jing Wang, Principal and Head of the Center for Analytics and Insights at Vanguard, shares her fascinating journey from high-energy physics to leading AI initiatives that are reshaping investor outcomes for the better.

Episode Notes

Jing explains how Vanguard uses machine learning and reinforcement learning to deliver personalized "nudges," helping investors make smarter financial decisions. Jing dives into the importance of aligning AI efforts with Vanguard’s mission and discusses generative AI’s potential for boosting employee productivity while improving customer experiences. She also reveals how generative AI is poised to play a key role in transforming the company's future, all while maintaining strict data privacy standards.

Key Points From This Episode:

Quotes:

“We make sure all our AI work is aligned with [Vanguard’s] four pillars to deliver business impact.” — Jing Wang [0:08:56]

“We found those simple nudges have tremendous power in terms of guiding the investors to adopt the right things. And this year, we started to use a machine learning model to actually personalize those nudges.” — Jing Wang [0:19:39]

“Ultimately, we see that generative AI could help us to build more differentiated products. We want to train language models [to have] much more of a Vanguard mindset.” — Jing Wang [0:29:22]

Links Mentioned in Today’s Episode:


Jing Wang on LinkedIn

Vanguard
Fermilab
How AI Happens

Sama

Episode Transcription

Jing Wang: So we found those simple nudges have tremendous power in terms of guiding the investors to adopt the right things. And then this year we started to use a machine learning model to actually personalize those nudges.

 

Rob Stevenson: Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson, and we're about to learn How AI Happens. Okay, welcome back to the podcast, all of you wonderful AI practitioning darlings out there in podcast land. It's me, Rob, and I have a wonderful guest for you today. I'm really excited to speak with her, and really excited for you to learn from her as well. She's had a ton of roles in our space. Currently she serves as the Principal and Head of the Center for Analytics and Insights at Vanguard: Jing Wang. Welcome to the podcast. How are you today?

 

Jing Wang: Very good, thanks, Rob. It's a pleasure to be on your show.

 

Rob Stevenson: Pleasure to have you here, and I'm always pleased to have guests on who have done a considerable amount of research on the academic side. It's a unique thing about this industry that people tend to have both sorts of experience. And in your case, you did your postdoc work at Fermilab, which is exciting to me because I grew up near Fermilab, and that was a place where I got to take a lot of field trips. So I wonder if we crossed paths, maybe, when there were a lot of tiny children causing a ruckus and interrupting your research back in the day.

 

Jing Wang: That's wonderful to hear. So, yes, Fermilab is this unique place, right? It's cordoned off, with miles and miles of flatland, and the accelerators are under the ground, and we have bison roaming on the surface. And I remember once a year the cafeteria would serve bison burgers. It's an interesting place to be.

 

Rob Stevenson: Yeah, it's a huge campus. And then, yeah, like you say, it's deceptive because it's all underground. Would you mind sharing a little bit about the kind of research you did there?

 

Jing Wang: Yeah, absolutely. So I received my PhD in theoretical high-energy physics. Essentially what that means is I studied the smallest constituents of the world. At that time, when I was a PhD student and a postdoc researcher, we spent a lot of time studying string theory, which was the leading candidate for a unified theory behind all the particles we observe in the high-energy space. So we built a lot of mathematical models to try to explain what that unified theory could look like. And then we tried to extrapolate those mathematical theories down to the current world, the lower-energy space as we call it, trying to understand: if you cool the universe down from the high-energy state to a lower-energy state, what could you observe? What kind of particles would you observe, and what kind of interactions might they exhibit? Then we use accelerators and colliders to try to detect those tiny signals that might show up. So that's the kind of work I did as a physics researcher.

 

Rob Stevenson: Do you ever miss that sort of work?

 

Jing Wang: I do. What I miss is the sort of protected space for really sinking down, reading and learning and thinking about maybe longer-term ideas, really sketching out what the ideas could be founded upon and how they could evolve. But on the other hand, I think one thing that drew me out of the physics world into the real world is that I realized I love problem solving. I love solving complex problems. And in the real world, there are plenty of very complex problems, maybe not as profound, but sometimes of higher impact on the real world. After I left physics, I got to do many of those kinds of things, and I really enjoy it.

 

Rob Stevenson: It seems like when one gets sufficiently advanced in the hard sciences, as you did, this sort of career in analytics and data and AI is a really good fit. I'd be curious to know your opinion on why that is. Is it because at a certain point it's all math? Or why do you think there are so many really advanced scientists in the space, even if they're not working in the hard sciences necessarily?

 

Jing Wang: Well, I have to say that my career path from physics to what I do today is not straight. I wouldn't say I'm a data scientist by training. What happened is I left physics, went back to business school, and was a strategy consultant for a period of time. There I continued building my problem-solving muscle, but in the business domain. And then I joined Vanguard, and almost accidentally I became an AI practitioner. Several years ago, Vanguard decided that we really needed to invest in the data analytics domain, especially advanced analytics, to enable us to step into this digital environment where there's so much data generated by our investors; we need to deeply understand that data and leverage those advanced methodologies to create solutions that allow us to personally interact with them. So that's when I stepped in, first as a strategist, helping the company think about, okay, what is this about, and building a business case around why we should invest in this domain. And then I started to lead a team to actually do this. So I would say my biggest contribution to this practice is as a super product owner, if you will, for AI. My training and my background allowed me to very easily grasp why we need to do this work, what the power of data science is, what the power of AI is, and allowed me to explain that to the company, to help them understand the possibilities that AI brings and connect the dots between the business language and mindset and, hey, how do we leverage AI in our business solution development? I also think that because I understand the process so well, like, you know, it's a scientific process of continuous experimentation with data, always going back to say, can we try something new? Can we try this new data, try a new methodology? It's a different way of working. A lot of AI teams fail because the team was not protected enough to be able to do that, to deliver a good solution. Understanding that allowed us to build the right team and give my data scientist team the right environment for them to use the right approach to do their work.

 

Rob Stevenson: This is part of the reason I was excited to speak with you, Jing, because this notion of really explaining the why is so crucial, but it's also a skill that is maybe removed and different from the skills a lot of the technical folks in this space have, right? When you get really good at building and training and tweaking models, doing that work internally to explain to stakeholders and to, as you said, protect the AI team, that's a different skill set, but it's so, so important. And so I was hoping we could speak about that a little bit. I would love it if you could share a little bit more about what you mean when you say you set out to protect the AI team.

 

Jing Wang: Yeah. So coming back to your first question, around the skill of storytelling, right, the storytelling ability. That's how I think about our AI practice, and we always have a strategy. For me, having a strategy for our practice is really important; it's the starting point. At the highest level, we connect our AI strategy to Vanguard's mission. Vanguard is a very mission-driven company. We have a purpose that everybody can recite: we take a stand for our investors and we give them the best chance for investment success. And how do we do that? The strategy is also very simple; it summarizes to four things. We deliver the best-performing mutual funds and ETFs. We provide the best advice, services, and guidance to allow investors to do the right thing. We provide a world-class client experience, so people want to interact with us and can interact with us with ease. And finally, we leverage our employees, what we call crew; enabled crew are at the core of everything we do. So those are the four pillars, and we make sure all our AI work is aligned with those four pillars to deliver business impact. So when it comes to storytelling, I would say the first thing we try to do is explain how this AI work enables the company to accomplish those goals and ultimately deliver our promise to our investors. We tie back in there, and then we sit down with the business to jointly develop well-defined use cases, the questions, the problems we want to solve. Before we jump in to do a POC and build a model for you, what exactly is the problem you want to solve? Here comes the protecting-my-data-science part. In my early days, people didn't understand what data science could deliver, and why we would even want this process that seemingly takes a very long time. The joke is, you know, you spend several weeks to gather a very large data set, and you vanish, and six months later you give me a model, and clearly that's not how we do things. So there was this sense of, hey, is this really needed? People would come to us and say, we just need you to give us a very large data set engineered in these ways. And I have to say, no, that's not what we do. What we want to do is really think about what problems you are trying to solve, why you want to solve them, and what kind of outcomes you want to deliver to your business and to your customers, and together we'll craft the solution that's needed to do that.

 

Rob Stevenson: I think everyone, no matter what function they're in, can relate to having to say to a coworker, that's not really what we do over here. And I was struck that you delineate, like, we are not data engineers. Could you speak a bit more about that? What is the difference, and how do you make sure people know that internally?

 

Jing Wang: Yes. So the way we define data engineering is really pulling a lot of raw data sets, cleaning and integrating them, making a very model-ready data set, and doing some exploratory data analysis so that, based on this understanding of the data, we are ready to model. That is very much the beginning of the model development life cycle, and the skills it requires are absolutely important; we have brilliant data engineers, and that's critical to model development. But what really is the power of data science is what comes next: really thinking about, this is the outcome I want to deliver. Let's say it's personalization. I need to deliver a recommendation for all our investors on the next best services and products they should adopt. Then, with the model-ready data set, we try different methodologies, we try to predict, and we measure which methodology gives us the best performance and meets the business needs, reflects the business environment, reflects the individual behaviors we observe in the data set, and can be tested and evaluated. That whole process, which comes after the data set, is what ultimately delivers the results you're looking for. So there's a difference between giving me a data set that I'm going to cut and analyze to drive my decision-making, versus going all the way to delivering a very personalized recommendation: for any given investor, out of the millions of investors we have, at any given time, what is the next best services offer we want to put in front of them so that they will want to engage with us? So there's a difference between the data set we deliver, which is the output of the data engineering effort, and the solution we deliver out of the process of creating an AI solution.

 

Rob Stevenson: I'm glad you brought up the personalization piece of it, because Vanguard is doing some really interesting things. I think you refer to it internally as nudging: nudging investors to make more fruitful investing decisions, essentially. And I think when one hears that, you might think, oh, well, you log into a new product and there's a little pop-up that's like, have you tried this feature, which is really common for any sort of software. But it's a lot more sophisticated at Vanguard, shall we say. So I was hoping we could begin there. How would you differentiate it from the standard sort of here's-how-to-use-our-product walkthroughs that we've all seen? How is Vanguard a lot more sophisticated?

 

Jing Wang: Yeah. So when we think about nudging, obviously it is a very powerful concept in behavioral finance, one that is extremely interesting to us. Ultimately, we believe in very low-cost, broadly diversified products, mutual funds and ETFs; that's how Vanguard started its journey, and we produce great fund products. But at the same time, we realized that investor behavior has a huge impact on investment outcomes. There's a lot we can do, not in the product space, but much more on the behavior-guidance front, and nudging is a great approach, a tool that's available for us to guide them on that journey. So we started this experiment, if you will, first in our 401(k) space. Retirement plans are one of the major vehicles that US workers use to save for retirement, and we realized there are a lot of actions that employees can take in their 401(k) plans to maximize their investment outcomes. For example, the savings rate: when people sign up, typically they just keep the default savings rate, which is very low, around 4%, far from sufficient, and then they set it and forget about it year after year. So we want to nudge them to think about a higher savings rate, so that as they get promotions and increase their salary, they lift their savings rate as well, which ultimately leads to better outcomes. And things like the employer match: a lot of companies offer a match, which is meant to encourage employees to put more money away and maximize the matching from the company, and we see that the adoption rate for the match is actually not very high. So there are many things an employee can do in a retirement plan to achieve better outcomes. We developed a machine learning model that looks at successful adoptions of those behaviors by participants in the past, and using that model, we can predict which of those offers we want to put in front of a participant so that they are more likely to adopt them. By adopting those offers, as we call them, they will be on a path toward better outcomes. The business has a menu of 12 to 15 offers available to participants in any given retirement plan, and they leverage our model to personalize which offer they communicate to a participant to drive those adoptions. Vanguard does not really benefit from those for the most part; it is to help the participant achieve better behavior. Then, more recently, in 2021 and 2022, we started to look at our individual investor space, and we realized similar things. Just to give you an example, we realized billions in cash was sitting in our individual investors' accounts. Some of that cash is part of a well-designed allocation, which is perfect. But in many situations, people have cash sitting in the investment account because they forgot about it. They opened an account, very excited, brought money into this new account, and then, faced with making the decision to invest, they struggled. On the same night of opening the account, they felt like, oh, I need to think about it, and then they put it off. Life happens, and they forgot.
So we saw large amounts of cash parked in this form, which really does not generate the kind of returns we would like the investors to benefit from. We started to say, hey, there are many of those behaviors that are hurting investors' ultimate investment success; how can we help them adopt the right behaviors? So we experimented with different nudges, and there we applied a pretty scientific approach. We have behavioral scientists on my team, and they conducted interviews and brought in behavioral theory to understand the potential barriers stopping investors from making those right decisions. Choice paralysis, which we talked about, is one of them, and there are other reasons, well studied by behavioral science, that stop people from making those decisions. Using those analyses and insights, we design the right nudges, we experiment with them, and we roll them out to the right participant at the right place so that they can adopt the right behavior. Some could be as simple as a checklist: when you open up a Vanguard account, you see a checklist that says, here is the five-step process, you are on step number three, and don't forget, if you don't complete step number five, which is to invest your money into different products, you will not truly benefit from the investment. Something as simple as that works for a lot of people. Or a reminder: if they put it off today, we remind them a week later to say, hey, don't forget, you still have this money; you haven't finished that process yet. So we found those simple nudges have tremendous power in terms of guiding investors to adopt the right things. And this year we started to use a machine learning model to actually personalize those nudges, because not everybody benefits from the same nudge at the same time through the same channel. So we started to use a very large-scale reinforcement-learning-based model to personalize the nudges.
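To make the propensity-model step concrete, here is a minimal sketch of a next-best-offer selector of the kind Jing describes: one classifier per offer, trained on historical adoptions, with the highest-scoring eligible offer surfaced. The offer names, features, and scikit-learn models are illustrative assumptions, not Vanguard's implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical offer menu; a real plan might expose 12 to 15 of these.
OFFERS = ["raise_savings_rate", "capture_employer_match", "invest_idle_cash"]

def train_propensity_models(X, adoptions):
    """Fit one adoption-propensity model per offer.

    X: participant features (age, tenure, balance, current savings rate...).
    adoptions: offer name -> 0/1 labels for historical adoption.
    """
    return {offer: LogisticRegression(max_iter=1000).fit(X, y)
            for offer, y in adoptions.items()}

def next_best_offer(models, x, eligible):
    # Score only the offers this participant's plan actually makes available,
    # then surface the one with the highest predicted adoption probability.
    scores = {o: models[o].predict_proba(x.reshape(1, -1))[0, 1]
              for o in eligible}
    return max(scores, key=scores.get)

# Usage with synthetic data standing in for historical participant records:
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
labels = {o: (rng.random(500) < 0.3).astype(int) for o in OFFERS}
models = train_propensity_models(X, labels)
print(next_best_offer(models, X[0], eligible=OFFERS))
```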

 

Rob Stevenson: So it's not even so much like put your money in this fund versus this fund. It's more of just the adoption. It's like, okay, you need to get signed up. And if you go through these things, here's like the minimum effective dose of, you know, question mark, question mark, question mark, retirement success.

 

Jing Wang: And we have to be very careful, because, you know, investment management is a very highly regulated industry. That means if investors have not signed up to have Vanguard manage their assets, we cannot provide such guidance, because that would be advice; suggesting that people invest in certain specific products is a form of advice, and we cannot do that for investors who have not signed up to be our advised clients. So it's a fine line we want to walk. We want to give investors the tools to allow them to think for themselves, but without giving them specific recommendations or suggestions for products and funds. That's why we give them a lot of educational materials. In the past, we published a tremendous amount of education, thinking that if people read up on those things, they could make the right decisions. So now we're trying to put that thought leadership into bite-sized pieces, in language they better understand, and really serve it up when they need to see it, at the right moment of the process, when they are in the mindset of making that decision.

 

Rob Stevenson: So it sounds like it's not merely here are the steps one can slash should take for a better investment outcome. But when you speak of the personalization, right, you are speaking about hitting people in the right channel at the right time, et cetera. Could you speak a bit about how reinforcement learning was helpful in that approach?

 

Jing Wang: Yes. It's interesting: our next-best-action personalization and recommendation framework is also evolving. We have built many next-best-action models based on propensity models for particular products and services, and for those nudges we have propensity models too. But we came to realize that those individual domains start to conflict with each other. We cannot, for example, touch a single investor many, many times: one propensity model suggests we should touch this client for this offer, at the same time we want to touch them for another offer, and at the same time they also have a nudge that needs to be displayed. So which one is best? We don't want to confuse our investors. So we started to leverage this decision engine concept: bring all those signals together and use a reinforcement learning algorithm on top of them to manage the multiple objectives we want to optimize and decide what is the best offer to put in front of a client, the one they have the highest likelihood to engage with. The reinforcement learning algorithm is used to understand: when we put the signal in front of the client, will they interact? If they click and open the engagement, will they take the first step down that journey? Whether it's a save-more kind of nudge where they go down the first path, or starting to read up on certain things as the entry point of that particular journey. The reinforcement learning is always running those experiments, putting options in front of a client and seeing how the client engages. Every data point, every engagement comes back to the model, so the model learns this client's preferences, and the next time the client interacts, we learn from that experience and present the offer they are most likely to engage with. Over the long run, we want to bring more objectives into this decision engine, because we want to triage between client experience objectives versus client outcome objectives versus business growth objectives. There are multiple objectives, and we want to ultimately connect them to investment outcomes over the longer term, which is a very hard problem to solve: connecting your investment outcomes five or ten years down the road to your engagement today is a hard problem. So we're working on that too.
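A minimal sketch of the decision-engine idea Jing outlines: a contextual bandit that picks one offer or nudge per contact and folds the observed engagement back into the model. This uses Thompson sampling over per-offer Bayesian linear reward models; the feature layout and reward definition are assumptions for illustration, not Vanguard's actual system, which also balances multiple objectives.

```python
import numpy as np

class LinearThompsonBandit:
    """One Bayesian linear reward model per offer (arm)."""

    def __init__(self, n_offers, n_features, noise=1.0):
        self.A = [np.eye(n_features) for _ in range(n_offers)]    # precisions
        self.b = [np.zeros(n_features) for _ in range(n_offers)]  # moments
        self.noise = noise

    def choose(self, context):
        # Sample a plausible weight vector per offer and pick the best,
        # which naturally balances exploration and exploitation.
        scores = []
        for A, b in zip(self.A, self.b):
            cov = np.linalg.inv(A)
            theta = np.random.multivariate_normal(cov @ b, self.noise * cov)
            scores.append(context @ theta)
        return int(np.argmax(scores))

    def update(self, offer, context, reward):
        # Fold the observed engagement (e.g., clicked, took the first step
        # of the journey) back into that offer's posterior.
        self.A[offer] += np.outer(context, context)
        self.b[offer] += reward * context

# Usage: the context could encode tenure, balances, channel, recent activity.
bandit = LinearThompsonBandit(n_offers=15, n_features=8)
ctx = np.random.rand(8)                # stand-in for one investor's features
offer = bandit.choose(ctx)
bandit.update(offer, ctx, reward=1.0)  # 1.0 if the investor engaged
```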

 

Rob Stevenson: Yeah, yeah, certainly. And you mentioned just a moment ago the various kinds of outcomes that Vanguard is interested in helping its customers attain. In addition to that, Vanguard is blessed with this huge trove of data. So I'm curious, how did you decide this was the best use of that data? When you consider the remit of your center and Vanguard's larger intentions and goals around AI writ large, why was this the answer?

 

Jing Wang: Well, for me, it all starts with the business problem. It starts with why: why do we need to work on this problem? Is this the most important problem for the business to solve? And is this the best problem for AI to lean into; does AI bring a unique edge to this problem? Is that true? Then we take on the problem, and then it's a discovery process. I leave it to my data scientist team to discover how to solve the problem and build the best solution, and what data we need to bring together. So we have solutions that have been running for several years, and they have many different versions. When we first started to build, the first version of a solution could be very simple: we only take the transaction data, for example, client behavior data in the transaction space. In the next version, we start to experiment with the web behavior data. In the version after that, we start to look at sequential data, the historical data of how people navigated the website. And in the next version, now, with gen AI, we can really bring in the conversation data: directly using the conversations and transcriptions between the client and Vanguard to tease out, from the client's perspective, exactly what they are trying to get, what they are trying to solve, what effort they have made, and a lot of rich data embedded in the conversations that we don't traditionally capture. So the data space is always expanding, but the problem we want to solve has to be the first step, and then we can always enrich the data we use to solve the problem.
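One way to tease intent out of conversation data, as Jing describes, is to score each transcript against a set of candidate intents and append those scores to the next-best-action feature vector. The sketch below uses an off-the-shelf zero-shot classifier from Hugging Face; the pipeline, model choice, and intent labels are illustrative assumptions, not Vanguard's stack.

```python
from transformers import pipeline

# Hypothetical intent labels a team might track.
INTENTS = ["invest idle cash", "increase savings rate", "rollover question",
           "fee question", "website navigation trouble"]

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

def intent_features(transcript):
    """Turn one client-call transcript into per-intent scores that can be
    appended to a next-best-action model's feature vector."""
    out = classifier(transcript, candidate_labels=INTENTS, multi_label=True)
    return dict(zip(out["labels"], out["scores"]))

# Usage with a toy transcript:
print(intent_features(
    "I opened the account last month but I still haven't picked any funds, "
    "and I'm not sure where to start on the website."))
```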

 

Rob Stevenson: Yeah, yeah, certainly you said the magic word there Jing, which is generative.

 

Jing Wang: Yes.

 

Rob Stevenson: I'm honor-bound as a podcaster in the AI space in the year of our Lord 2024 to ask about generative AI and generative use cases. So I'm hoping you could share a bit about that. What are Vanguard's plans for deploying generative AI?

 

Jing Wang: Yes. So we have been working pretty actively in this space ever since ChatGPT burst onto center stage. But even before that, I had a research team that had been experimenting with smaller generative models and large language models. What we see are a few very large areas of opportunity. One is internal productivity. Generative AI offers a tool that allows us to apply AI much more broadly when we think about employee productivity improvement. So we're working on many use cases to allow our marketers to be more efficient and creative, we're helping our coders to be more productive, and we're automating a lot of processes using gen AI solutions so that people can spend less time on administrative work and focus on their true value-add activity. And we are working to provide our frontline service crew with those tools so that they can really focus on talking to the client rather than searching for answers in our internal content library. So there's a massive opportunity to improve our internal employee productivity. As a next step, we also want to use generative AI to serve our clients. How can we make our digital experience more intuitive and easier to navigate? How do we understand our clients' intent better? And, as I mentioned, how do we take those client insights directly out of their interactions with us, in unstructured data form, so that we can enrich our next-best-action models, make them much more predictive, much more attuned to the struggles clients have, and much more responsive to their needs? And ultimately, we see that generative AI could help us to build more differentiated products. We want to train language models that have much more of a Vanguard mindset. How do we infuse a large language model with a deep understanding of Vanguard's investment philosophy and Vanguard's advice philosophy, of our products and services, and of how we interact with the client, in a way that is uniquely Vanguard? That's our longer-term goal.

 

Rob Stevenson: Yeah. That feels like the voice of the company, in a way. That feels like something every company is going to be able to do a little bit differently.

 

Jing Wang: Right, right.

 

Rob Stevenson: And it's like, okay, why engage with this product? Well, because I understand that it's different than speaking to, you know, a chatbot from another company. Or I know that this is trained specifically on this data; it's not trained on Reddit comments, you know, or GitHub forks. I can trust that this is actually coming from a more technical place.

 

Jing Wang: Right? That's absolutely right. We have already done some experiments with instruct fine-tuned open-source models on Vanguard-specific content, starting with public content. And we see the model actually does demonstrate a much deeper understanding of Vanguard's voice, as you said. That has been useful for certain of our use cases, especially some where we have a very high degree of data sensitivity: we don't want our data to leave our premises in any shape or form, so we leverage those internally trained large language models to provide that kind of service.
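For readers curious what instruct fine-tuning an open-source model on in-house content can look like in practice, here is a minimal sketch using the Hugging Face stack. The base model, file path, and prompt format are placeholder assumptions, not details Jing shared; in practice, teams often layer parameter-efficient methods such as LoRA on top of this.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "mistralai/Mistral-7B-v0.1"  # hypothetical open-source base

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Hypothetical JSONL of {"instruction": ..., "response": ...} pairs built
# from public thought-leadership and website content.
raw = load_dataset("json", data_files="instructions.jsonl")["train"]

def to_text(ex):
    # Render each pair into a single instruction-tuning prompt string.
    return {"text": f"### Instruction:\n{ex['instruction']}\n\n"
                    f"### Response:\n{ex['response']}"}

def tokenize(ex):
    return tokenizer(ex["text"], truncation=True, max_length=1024)

dataset = (raw.map(to_text)
              .map(tokenize, remove_columns=["instruction", "response", "text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # weights never leave the local environment
```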

 

Rob Stevenson: So we mentioned a moment ago that Vanguard has tons of data, and because the financial sector is very rigid and you want to be very protective of data, you have some unique problems around training, for example, a chatbot. Do you have a sense of what the minimum amount of data would be for a meaningful generative experience? Like, if someone listening wasn't blessed with the gold mine of data that Vanguard is sitting on, what do you think is enough?

 

Jing Wang: I think it comes down to the quality of the data set. It's still a learning journey for us, I would say, to infuse the Vanguard voice into a large language model. We used the public data that Vanguard has, which is, I would say, not a very large amount of data. Of course, you couple that with the standard data sets for conversation, et cetera, a lot of which are externally available. But the unique Vanguard data set, really composed of Vanguard publications in the thought leadership space and the website data across all our publicly available content, is not a very large data set. So you're shifting the work: from just handing over a very large raw data set and letting the large language model figure out what to do with it, to the business and us having to do a lot of work to make sure we build a really robust, clean data set. So there's a trade-off.

 

Rob Stevenson: Okay, that's helpful, thank you. Jing, we are creeping up on optimal podcast length here, but before I let you go: I know that you have been working in the financial sector for a long time, and I can see why you're very good at your job. I suspect, though, that that white-coat scientific researcher is still in there somewhere. So I'm curious, do you still stay abreast of what's happening in the field of physics, and where we are with string theory these days?

 

Jing Wang: I have to say I spend more time reading up on the latest and greatest in AI than on what's happening in the physics space. And it's more just catching the news on what's going on, talking to my old friends, and getting a glimpse of what new problems they're working on, unfortunately.

 

Rob Stevenson: You could share one of the newest problems we're facing, when we get into the, you know, the quantum physics realm.

 

Jing Wang: Yeah, I think from talking to my friends, I feel like when I was a researcher, string theory almost seemed to be a given. It's like, you know, 11-dimensional supersymmetric string theory is the fundamental theory, and all we need to do is predict what it extrapolates to in the real world and look for those signals. But subsequently, the Large Hadron Collider and the other colliders have not been able to find the signals that were predicted, and that has raised a lot of questions around what the fundamental theory is. I think it's still out there, waiting to be solved. So for me, it's a quite interesting development. But that's science.

 

Rob Stevenson: Yeah, yeah, it's just a progression of exploded wrong ideas.

 

Jing Wang: Right, right, exactly. But you learn from that, and then hopefully we see other signals that we cannot explain, and that becomes a hint for a new theory.

 

Rob Stevenson: That's right. Jing, this has been really lovely having you on. So thank you for coming and sharing your expertise and even a little bit about physics. I enjoyed that too. It sounds like you're doing really exciting work over there, so I appreciate the time and you sharing what you're working on. It's been great today.

 

Jing Wang: Thanks, Rob. It's been a really interesting conversation.

 

Rob Stevenson: How AI Happens is brought to you by Sama. Sama's agile data labeling and model evaluation solutions help enterprise companies maximize their return on investment for generative AI, LLM, and computer vision models across retail, finance, automotive, and many other industries. For more information, head to sama.com.