How AI Happens

Credit Karma VP Engineering Vishnu Ram

Episode Summary

With close to nine years of experience at Credit Karma, Vishnu has been instrumental in building the company's data science operation from the ground up. He discusses the challenges of alleviating technical debt, the importance of setting up a data culture, and the process of adopting new platforms and frameworks such as TensorFlow.

Episode Notes

Vishnu provides valuable advice for data scientists who want to help create high-quality data that can be used effectively to impact business outcomes. Tune in to gain insights from Vishnu's extensive experience in engineering leadership and data technologies.

Key Points From This Episode:


“One of the things that we always care about [at Credit Karma] is making sure that when you are recommending any financial products in front of the users, we provide them with a sense of certainty.” — Vishnu Ram [0:05:59]

“One of the big things that we had to do, pretty much right off the bat, was make sure that our data scientists were able to get access to the data at scale — and be able to build the models in time so that the model maps to the future and performs well for the future.” — Vishnu Ram [0:08:00]

“Whenever we want to introduce new platforms or frameworks, both the teams that own that framework as well as the teams that are going to use that framework or platform would work together to build it up from scratch.” — Vishnu Ram [0:15:11]

“If your consumers have done their own research, it’s a no-brainer to start including them because they’re going to help you see around the corner and make sure you're making the right decisions at the right time.” — Vishnu Ram [0:16:43]

Links Mentioned in Today’s Episode:

Vishnu Ram

Credit Karma


TFX: A TensorFlow-Based Production-Scale Machine Learning Platform [19:15] 

How AI Happens


Episode Transcription

Vishnu Ram  0:00  

So one of the big things that we had to do pretty much right off the bat was making sure that our data scientists were able to get access to the data at scale, the scale at which the company was operating at, and be able to build the models in time so that the model maps to the future and performs well for the future, which is what we all care about. Like, we don't care how well our models perform to history; we want our models to perform well for the future.


Rob Stevenson  0:30  

Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson, and we're about to learn how AI happens. Here with me today on How AI Happens is the Vice President of Engineering over at Credit Karma, Vishnu Ram. Vishnu, welcome to the podcast. How are you today?


Vishnu Ram  1:06  

Glad to be here. It's a great day outside and glad to be recording this with you, Rob.


Rob Stevenson  1:10  

Where exactly are you based?


Vishnu Ram  1:12  

I'm based out of the California Bay Area.


Rob Stevenson  1:14  

Oh, lovely. So the fog hasn't quite rolled in yet. Sounds like it's sunny and lovely there. Yep. Wonderful. Well, great day for a podcast, as it turns out. I'm so excited to have you on, Vishnu. There's lots for us to speak about. But before we get into your experience building data teams, alleviating technical debt, setting up data culture, all that great stuff, let's just get to know you a little bit. Would you mind sharing with me and the folks out there in podcast land a little bit about your background and how you came to be in this current role at Credit Karma?


Vishnu Ram  1:42  

Yeah, so I've been at Credit Karma for close to nine years now. Prior to Credit Karma, I was in India, doing a bunch of startups in early-stage CTO roles. Thankfully, a couple of them are unicorns at this point in time. That's where I really cut my teeth in engineering leadership and started to play around with early data technologies outside of the cloud, and even just getting started with the cloud. Then once I moved into Credit Karma was when I started being able to build teams to play around with a lot more data, building products which impact more than 120 million members at Credit Karma. So it's been a long journey. And prior to my work experience, I definitely had some amount of exposure to AI in the form of fuzzy logic and neural networks in my undergrad and Master's. And while I had an opportunity to go deeper into neural networks, I decided not to at that point in time, which I kind of regret. But it's fine. It's worked out okay for me.


Rob Stevenson  2:46  

What made you want to turn away from neural networks at that moment?


Vishnu Ram  2:49  

I think there was a very clear recognition that the impact that you could have out in the industry, if you went deep into neural networks, was just not there. And I'm talking about a timeframe when we didn't have the data capabilities or the compute capabilities that we all just completely take for granted today.


Rob Stevenson  3:09  

And would you say you regret it because we've since caught up to the technical limits of the time?


Vishnu Ram  3:16  

We have definitely caught up to the technical limits of the time. But I think I would have had an opportunity to be in much earlier from a research perspective, and to learn more of the foundational aspects much earlier than I eventually did here at Credit Karma.


Rob Stevenson  3:35  

Surely there's an opportunity to inject some of that into the proceedings over at Credit Karma now with the scale that you're at, right?


Vishnu Ram  3:41  

Well, definitely. I think that's been part of my journey over here over the last seven, eight years now. The opportunities that have come my way, and come our company's way, I think we've used really well. From an AI perspective, we've always had ambitious things that we wanted to do for our members, and I think we've been able to do that with AI.


Rob Stevenson  4:02  

Now, do you wish this had been injected earlier? Or do you think you need to reach a certain scale with the team and the business before it was appropriate?


Vishnu Ram  4:10  

I think right from the point in time that I joined the company, we definitely had opportunities. But to be able to inject AI, to be able to do it at scale, to be able to do the cutting-edge work that we do today, we needed to go through a journey. Part of the journey was getting our data organized, making sure that we were able to use the right technologies, making sure that we were on the cloud. A lot of those pieces we had to figure out along the way: making sure that we had the right teams, the right skill sets in the right areas, just getting our organization ready, getting our technology stack ready. So there were a bunch of different things that we needed to do before we were able to get to AI at scale.


Rob Stevenson  4:53  

I'm glad you mentioned that, because that's part of why I was excited to chat with you, Vishnu: you have been with Credit Karma nine years, which is an eternity in startup years, and because of that you've really seen the organization grow in a meaningful way. Now, at this point I'm sure the folks out there listening are very aware of what Credit Karma does. Nine years ago, maybe not so much the case. I would love for you to walk us through the journey of building Credit Karma into the data science operation it currently is. Can we maybe start with what things looked like when you began at Credit Karma? How did you take stock of the operation and figure out how to make sense of the data plan there?


Vishnu Ram  5:33  

Yeah, I think when I joined Credit Karma, eight or nine years back, it was already operating at scale; we had 20 to 25 million users at that point in time. What that meant was, we were already delivering a very, very valuable service for all of these users. That's why we kept growing, and we have more than 120 million users today. So the basic footprint in terms of product-market fit at scale was already there. The other big thing, from a data perspective, was that our founding team and our leadership team were all very data-centric; they always wanted to use data to make decisions. There were models in place which we were using to deliver certain experiences to our members. One of the things that we've always cared about is making sure that when we are recommending any financial products in front of users, we provide them with a sense of certainty. I can go back to my own college experience, where I came from India to this country, and we were applying for credit cards willy-nilly. Like, I get a t-shirt here, I get a t-shirt there, okay, show me where to sign, and I was applying for credit cards. At that point in time, I did not have the background to understand that, hey, I could be hurting my credit score, I could be hurting my credit report in a bad way. I just did that, and it took me a few months to realize, and then some lessons from my seniors telling me, hey, don't do it this way, do it that way. I didn't have Credit Karma at my fingertips when I came to the country, which is something that a lot of new folks starting to deal with their credit get access to today. And right from the beginning, this is something that is very central for Credit Karma: providing certainty. And we had models in place which were helping our members do that.
But to be able to do that at scale, where we kept adding more products and we kept adding more users, we needed to get our data to work better. There was a point in time when, for our data science teams to build a model, they would have to spend a month of weekends bringing the data together into a training data set, which they could then use to build a model. In today's time, that sounds like a complete joke, because we know that our data changes every week. If your data changes every week, and you're taking a month to put together a training data set for a model, that's just not going to work; your model is already out of date by the time you have your training data set together, forget building a model, testing it, making it ready, and ramping it as an experiment and all of that. So one of the big things that we had to do pretty much right off the bat was making sure that our data scientists were able to get access to the data at scale, the scale at which the company was operating at, and be able to build the models in time so that the model maps to the future and performs well for the future, which is what we all care about. Like, we don't care how well our models perform to history; we want our models to perform well for the future. That was probably one of the biggest things that we had to take care of. And to do that, we had to make a lot of investments in our data engineering capabilities. One of the pivotal investments that we had to make from a technology perspective was to go and start using Google BigQuery at that point in time. Once we went to Google BigQuery, Google provided us a way to scale to our requirements in terms of how we are pulling together data into a training data set. Maybe I'll stop there; I can keep adding more based on questions that you have, but maybe I'll stop there.
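Vishnu's freshness point can be made concrete: if the data changes weekly, the training window has to be a parameter of the extraction, not a month of manual assembly. Below is a minimal, hypothetical sketch of building a windowed training-set query; the table and column names are invented for illustration and are not Credit Karma's actual schema.

```python
from datetime import date, timedelta

def build_training_query(table: str, feature_cols: list[str], label_col: str,
                         window_days: int, as_of: date) -> str:
    """Build SQL that pulls only the most recent window of events, so a
    retrained model sees data close to what it will face in production."""
    start = as_of - timedelta(days=window_days)
    cols = ", ".join(feature_cols + [label_col])
    return (
        f"SELECT {cols} FROM `{table}` "
        f"WHERE event_date BETWEEN '{start.isoformat()}' AND '{as_of.isoformat()}'"
    )

# Hypothetical table and columns; the 7-day window mirrors
# "our data changes every week" from the conversation above.
query = build_training_query(
    table="project.dataset.member_events",
    feature_cols=["credit_score", "utilization"],
    label_col="accepted_offer",
    window_days=7,
    as_of=date(2023, 1, 8),
)
```

In a BigQuery setup, a string like this would be handed to the client library (for example `google.cloud.bigquery.Client.query`); the point is only that the freshness window becomes a first-class knob rather than weeks of hand assembly.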


Rob Stevenson  9:29  

Sure. Well, it seems to me obvious that your data professionals should have access to the data upon which the company is making decisions, right? That seems pretty straightforward. Is that not often the case?


Vishnu Ram  9:42  

You would find, surprisingly, that in a lot of organizations, as they keep scaling, this is kind of an afterthought. I think in today's world it might be different, but to a large extent, you want to first build your product to be able to deliver an experience for your users; there is a lot of focus on that. Then you want to make sure that your product is reliable and scalable. A lot of the things with respect to data and modeling come a little bit after that. I know things keep changing, but it's hard to do that right up front, because then you're adding initial friction, or cost, to launching your product and getting it to product-market fit. So for that reason, more often than not, as companies are scaling, their investments in data, their investments in data engineering and data science, kind of lag the investments that they make in building products, bringing them in front of users, and making them reliable.


Rob Stevenson  10:43  

I'd love to hear you opine a little bit on the nature of technical debt, because that's sort of what you're alluding to: when there's not access to data at this scale, okay, now the models won't be accurate, they won't ship in time, et cetera. Can one foresee coming technical debt, or is it the kind of thing where you have to go to the mountaintop yourself, feel the friction, and then you can build in processes to alleviate it moving forward?


Vishnu Ram  11:05  

Yeah. So I think one of the easiest ways to get a sense of technical debt is to take one of Google's products, or AWS's products, or Azure's products, and then say you want to try to build it in-house yourself. Then you will get a really good sense of the technical debt, because a lot of these products have included in them years of learnings, years of best practices, that get incorporated into the product. So when you start trying to recreate it yourself, what you're going to see is that you are capturing maybe only 10% of the overall features that they have built. The rest of the 90% you have not built. Now, when you have not built the rest of the 90%, what happens is you need to start compensating with people or processes. And when you start compensating with people or processes, you could probably do it for a certain amount of time, for a certain number of people, for a certain number of users that you're supporting. But once the users start scaling, once the amount of time that you've been dealing with it increases, once your own team size starts growing, is when you will start seeing the pinch, when you will start feeling, hey, what is the 90% that I need to take care of? So you could make a decision to say, I know what I'm doing, and if I know what I'm doing, then I can say, hey, I'll capture the right 10% at the right time. But you're making a bet on yourself and your team that you will capture the right 10% at the right time, to be able to make sure that your technology platform can keep scaling with the needs of your business.


Rob Stevenson  12:38  

When you say compensating with people or processes, how is that different than normal business growth that would be expected and or desired?


Vishnu Ram  12:49  

Yeah, I think you're always going to have to do it. I mean, I put it in a way where it's like you should not do it; that's not what I really mean. What I mean is that you want to do it, but you want to do it carefully, in a thoughtful manner. Are you deliberating, and are you making that a thoughtful piece of your decision-making process in terms of prioritizing where you're investing and when you're investing? Or are you just letting it flow? I think that's the way I would put it: you want to take on technical debt at the right points in time, in a way that allows you to continue to make the right prioritization decisions at the right time.


Rob Stevenson  13:28  

OK, so fast-forwarding a little bit now, nine years on, how would you describe the data culture of Credit Karma? If it's okay to use a term like data culture.


Vishnu Ram  13:38  

Yeah, so I think, like I mentioned earlier, a lot of companies and organizations have issues in just getting support from leadership when it comes to making the right investments at the right time, as far as data and AI are concerned. One of the benefits that I've had, and my teams have had, is that we've always had consistent support from our founding team to go after the right things at the right time. The next part of it, from a data culture perspective, is looking at some of the aspects of data, like data engineering and data science, and understanding the nuances which make them different from software engineering, and also understanding that the state of the art in terms of tools and processes available in software engineering is very different from that in data engineering and data science. So once you know the gaps in the market, know the gaps in tooling, know the gaps in processes, then you will know how to wait, actually, to some extent, for some of those processes and tooling to catch up. In our case, I can probably bring up the adoption of TensorFlow as an example. Our adoption of TensorFlow started when TensorFlow didn't exist: we were trying to jury-rig our own machine learning framework, patterns, and practices on top of Spark internally, and then we were trying to get our data scientists to start using it. Once TensorFlow started coming out, we started following how it was being developed by Google, and then we just decided to go and make a big bet on it. What happened at that point in time was that our data scientists felt really comfortable, because, one, it was backed by Google; second, it was open source; and third, we were able to adopt a platform like that in close collaboration between data engineering and data science.
That, I would say, is a big part of our culture: whenever we want to introduce new platforms or frameworks, both the teams that own that framework and the teams that are going to use that framework or platform work together to build it up from scratch. And we've seen a lot of success with that, not just in data, but also in other areas, as far as the engineering organization is concerned.


Rob Stevenson  16:00  

Is the risk of not doing it that way a process that is not useful or meaningful, or shelfware? What is the risk of not having this sort of multi-team approach?


Vishnu Ram  16:12  

It could be all of the above, right? I think there are obvious pros and cons to doing it. One con to what I suggested is that you're going to have to go a little slower at the beginning. So there have been instances where we wanted to operate at speed, where we would just go and build quickly, but with a very clear thought process that, hey, the way we are building, this actually has the potential of becoming shelfware. So we would create natural points at which we want to involve folks from the other teams to come in and start working with us on it. I think we've had more successes in rolling out some of these platforms and frameworks within the data world because we worked along with our consumers early, because the consumers, in some situations, actually know more than you do in terms of what they want, and they have been doing their own research. If your consumers have done their own research, it's a no-brainer to start including them, because they're going to help you see around the corner and make sure that you're making the right decisions at the right time.


Rob Stevenson  17:18  

I'd love to hear more about the process of adopting TensorFlow. I'd just love to know what it looks like, Vishnu, when, for example, you and your team are whiteboarding: okay, how do we actually inject this in a meaningful way? Who are the stakeholders? Who do we need to connect together? What does this look like internally when you start plugging something like this in?


Vishnu Ram  17:37  

Yeah, I think, especially with respect to TensorFlow, the way we went about it was, we were having conversations with our data science teams and trying to understand some of the challenges that they were facing. And it was very clear that the way we were going about developing our own infrastructure, and the way we were trying to get data science to start using that infrastructure, was just not working. There were a lot of challenges. We had made a bet on Scala. We said, hey, our data scientists are only going to write a small amount of code, so it doesn't matter whether they write it in Scala or Python; let's just get them to use what we are building, but let's get them to write Scala code. Then once we started the process, we started getting feedback from the team (and this is a team which had a lot of R background; it's not like they were predominantly Python, especially in the timeframe when we started rolling it out). Once we got a lot of this feedback, after a few months of working with our data science team, we realized that this was not the answer; we needed to go back to the drawing board and start figuring out a different answer. Around this time, some of us had already been following how TensorFlow was growing and how Google was developing it. It was still very, very early. I have a bunch of TensorFlow Dev Summit t-shirts lying around that my daughters just keep wearing nowadays; I don't get to wear them at all now. Back then, TensorFlow was doing dev summits in person, and it was hard to get into those dev summits, but some of us were able to get in very early on. And we realized where the TensorFlow team was going to take this. We realized that there were problems we were discussing internally that Google was already solving for us. And given that we were already on BigQuery...
And we wanted to start taking a big bet on Google Cloud as well, for everything that we do from an infrastructure perspective, so it was easy for us to say, hey, let's just go and start using TensorFlow. Around that time, Google had also published the TensorFlow Extended (TFX) paper. They had not open-sourced everything about TensorFlow Extended, but they published the paper, and I have a very clear memory of this: it came out over a weekend or something like that, I read it over the weekend, and then over that weekend I sent it to the core working team. We had just created a core working team involving both data engineers and data scientists, and I sent it to the team and said, hey, this is what we want to build. This is what we've been talking about. Google has published this; now let's start learning from how TensorFlow Extended works, look at what is out there, look at what we need to build, and then start building it. But one of the earliest things that we also did, which is key to call out at this point, was to understand what our initial use cases of the new framework would be. We already had in mind how we were going to use this new TensorFlow-based framework to go and deliver value for the business and our members. That was a pretty big deal, because we had to figure out how we were going to train models, and we had to figure out how these models were going to get scored in production. Once we had a little bit of a sense of confidence that, hey, we're going to be able to train these models well on top of TensorFlow, then we needed to scramble and get our production serving ready. And I'm not talking about offline scoring at all; I'm talking about online scoring of models in real time for our users. Those are the use cases that we wanted to go and deliver.
So then I had to go and get my production model serving team incorporated very, very quickly as part of the working team. The working team slowly starts expanding, and when the production model serving team gets involved, they have to figure out, hey, I have requirements from our platform engineering team to be able to support it. And then you start making sure that you're sharing your thought process, you're sharing the big win that you're going to achieve for the business. That's why it was important for us to learn the use cases early; then you are able to get the right amount of support from the rest of the business to be able to go after something like this. And we did that in a very, very short amount of time, I would say, for the kind of changes that we were making. At the time, we promised a couple of quarters, and we were able to deliver in time. It ended up being a big business win for the company; we did a lot of internal sessions on how we were able to get this from scratch to running in production. It's probably something that I'm just going to remember for the rest of my life, how we went about doing it, and I'm just really grateful for the team that I get to work with when we do these kinds of projects together.
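The offline-train / online-score split Vishnu describes can be sketched in miniature. The toy "model" below (a per-feature mean) stands in for a real trained artifact such as a TensorFlow model, and every name in it is invented; the shape to notice is that training produces an artifact once, offline, while scoring happens per request, in real time.

```python
def train(rows):
    """'Training' stand-in: fit per-feature means from a batch of historical
    rows. The returned dict plays the role of a serialized model artifact
    that a serving team would load into the online scoring service."""
    n = len(rows)
    return {k: sum(r[k] for r in rows) / n for k in rows[0]}

def score_online(model, request_features):
    """Online scoring: one request in, one score out, at request time.
    Here the 'score' is just the summed deviation from the training means."""
    return sum(request_features[k] - model[k] for k in model)

# Offline: train once on historical data (hypothetical feature names).
model = train([{"credit_score": 0.2, "utilization": 0.4},
               {"credit_score": 0.6, "utilization": 0.8}])

# Online: score a single incoming request in real time.
score = score_online(model, {"credit_score": 0.9, "utilization": 0.6})
```

A real version of this split would put `train` inside a pipeline (for example TFX-style components) and `score_online` behind a low-latency service, but the contract between them, a trained artifact plus per-request scoring, is the same.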


Rob Stevenson  22:32  

So in this case, mapping the need for the new model to a business use case, it sounds like, was almost a little bit of internal PR. It was probably clear to you and your team and some of the more technical folks that this was important, but this is a business that exists to make money and serve customers, serve users, right? You want to put things in terms of how they help the whole business. So was that merely internal PR, just to help get buy-in? Or did it help with the actual implementation, technically speaking?


Vishnu Ram  23:04  

So I think the internal education, when you are moving away from logistic regression, random forests, and other classical models into neural networks, is really critical. You are letting go of some aspects of what you get from those models, but you're going to gain a lot more in terms of business performance. And if you have a leadership team, if you have a lot of people aware of how data works, then you're not trying to start from scratch; they are already educated. So at that point in time, if you need to add a little bit more in terms of education, it's on you to do it well. I think that's important. You can't let them stay where they are; you need them to also get up to speed in terms of what you're trying to do and where you expect this to go. You need to have a sense of it, and taking a little bit of effort to go and do that has always paid dividends for us. Some of this is also based on learnings from previous projects where we did not do it: we could see at the end, when we did a retro, that if only we had done this education process a little bit better, we would have had a better chance of success, or we would have had a much smoother ramp-up. More often than not, the technical delivery lands; that's what we have seen, because we have strong engineers and teams working on these projects. But you need to make sure that, after the technical delivery, you're able to ramp it up to scale so that it meets the needs of the business. That's where, more often than not, the education helps us get more support from other teams, is the way I think about it. Do you want that support or not? Sometimes you think you don't need that support, but most times you end up needing it, and that's why that initial little bit of effort on the education just goes a long way.


Rob Stevenson  25:01  

Yeah, definitely. And it's so interesting: this is quite unrelated to your previous experience working in data science, to your previous university coursework in neural networks, right? This is leadership, this is management. This is navigating inside of a business to accomplish goals that help everyone. And I'm calling this out because, for folks out there who may be individual contributors right now but want to get into a role like yours, into a manager, director, or VP kind of role, it's really important that they get this stuff right. As you just called out, you had the technical talent in spades to execute on something like this, but that's not all it takes to get a huge product like this across the finish line. Could you speak a little bit more on that shift from having the technical talent as an individual contributor into management, where you're now thinking of more organization-wide challenges?


Vishnu Ram  25:54  

Yeah, so one of the things, coming from my background: like I talked about, I started out as an early-stage engineering leader back in India. When I was doing that, I did not have any other engineering peers that I needed to work with; I could just talk to the founding team, make the right decision, and then, with just a small team, we would move fast and get things done. In that case, the communication piece is just with the founding team, with the engineers, and maybe a couple of people in product or marketing, and then you're done. Once your organization starts scaling to 30, 40, 50, 100, 200, 500, 1,000 people is when you need to figure out how to start improving the communication aspects of how you build and deploy and scale things. If you're not able to do it, more often than not you will find different teams working at cross purposes. It's not that they are not trying to do things which are good for the business; everyone operates with good intent. It's not like teams and engineers, or data scientists, or marketers don't operate with good intent. Everyone is operating with good intent, but they just don't know: am I rowing in this direction, or am I rowing backward, or sideways? Which direction are you rowing? Where are we heading? I think that becomes a big problem. And it's important for everyone to understand that communication gets harder and harder and harder. When communication gets harder, it becomes most important for you to assume good intent, to share more, to just communicate more. You might feel that the other team or the other person doesn't really need to learn about it, but the fact that you took the time to share it with them will create the right culture, because then they also want to say, oh yeah, I get information from Team X about what they are doing; I know that they could be interested in information about what I am doing.
So let me share that also. Once you have the process of communicating out, and then receiving, and then listening, you're going to be able to make the right course corrections much earlier. Otherwise, you are going to make those course corrections only when, say, two boats have crashed into each other, right? And when two boats have crashed into each other, guess what, it's really, really hard to assume good intent. At that point in time, you're going to think that the other team was really trying to actively sabotage your plans. But that is not the case at all; the other team was also trying to do its job, with good intent, to take the business to a better place. But it's hard to see that when you're crashing into each other at all times. So that's where I think some of the communication aspects really come into play.


Rob Stevenson  28:47  

Yeah, that makes sense. That's tremendous advice, Vishnu. And I'm going to ask you for a little bit more advice before I let you go. For the folks out there who are working on data teams within their own companies, data scientists, what advice would you give them to make sure that they are participating or helping create a data culture resulting in clean, usable, high quality data?


Vishnu Ram  29:09  

I mean, the biggest thing that I personally learned, and I learned this very, very early on: in one of the companies I worked at, I was building a network management system product, and one of my CTOs told me, you have to go and sit next to the users of the product. Sit next to the users, just watch what they're doing, just talk to them, just get comfortable talking to them. And I think that's one message that I'm never going to forget. In all our roles, it's really hard to do that all the time; you have to find equivalent proxies for doing it. The way I talk to my teams about that is: think about what you're building, whether it's a model or a system or a service. Think about the problem that you are solving. When that problem gets solved, imagine how that will impact the end user, or the team, or the company. Just make sure that you're able to imagine that. If you're able to imagine that, guess what, you're going to find better ways of solving the problems, or you will find better problems to solve, which is much, much more powerful. And it doesn't matter where you are in your career, it doesn't matter what title you have, it doesn't matter which function you are in. Being able to do that well just gives you superpowers.


Rob Stevenson  30:37  

That is great advice. I don't think we're going to find a better bookend for this episode than getting superpowers. So at this point, Vishnu, I would just say thank you so much for joining the show. This has been a fantastic conversation; I really have loved speaking with you today.


Vishnu Ram  30:50  

Thank you. Thanks a lot, Rob.


Rob Stevenson  30:53  

How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, medtech, robotics, and agriculture. For more information, head to