How AI Happens

DataRobot's Global AI Ethicist Haniyeh Mahmoudian, Ph.D

Episode Summary

Joining How AI Happens today is Dr. Haniyeh Mahmoudian, Global AI Ethicist at DataRobot. Dr. Mahmoudian specializes in providing technical and educational guidance on the development of responsible AI. She holds a Ph.D. in Astronomy and Astrophysics from Bonn University and was named an AI Ethics leader by Forbes. She is also a member of the National AI Advisory Committee (NAIAC), which advises the President on ethical AI development and use.

Episode Notes

In our conversation, we learn about her professional journey and how it led to her working at DataRobot, what she realized was missing from the DataRobot platform, and what she did to fill the gap. We discuss the problem of bias in AI models, approaches to mitigating bias, and why incorporating ethics into AI development is essential. We also delve into the different perspectives on ethical AI, the elements of trust, what ethical “guard rails” are, and the governance side of AI.

Key Points From This Episode:

Tweetables:

“When we talk about ‘guard rails’ sometimes you can think of the best practice type of ‘guard rails’ in data science but we should also expand it to the governance and ethics side of it.” — @HaniyehMah [0:11:03]

“Ethics should be included as part of [trust] to truly be able to think about trusting a system.” — @HaniyehMah [0:13:15]

“[I think of] ethics as a sub-category but in a broader term of trust within a system.” — @HaniyehMah [0:14:32]

“So depending on the [user] persona, we would need to think about what kind of [system] features we would have.” — @HaniyehMah [0:17:25]

Links Mentioned in Today’s Episode:

Haniyeh Mahmoudian on LinkedIn

Haniyeh Mahmoudian on Twitter

DataRobot

National AI Advisory Committee

How AI Happens

Sama

Episode Transcription

Haniyeh Mahmoudian  0:00  

So when you talk about guardrails, you can think of that best practice type of guardrails in data science. But we should also expand it to this type of governance and also the ethical side of it.

 

Rob Stevenson  0:15  

Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson, and we're about to learn how AI happens. Joining me today on the podcast is a fantastically interesting professional. She has had roles across loads of organizations on the data science team as well as in academia. Currently, she serves as an adviser on the National AI Advisory Committee under the US Department of Commerce, and serves as the Global AI Ethicist over at DataRobot. Dr. Haniyeh Mahmoudian is here with me. Haniyeh, welcome to the podcast. How are you?

 

Haniyeh Mahmoudian  1:08  

I'm fine. I'm actually quite interested in having this discussion with you.

 

Rob Stevenson  1:13  

I'm so thrilled to have you on, just because I don't know where to begin. Like, we can talk about the Department of Commerce, we can go back to your data science, there's just a million things to talk about. I guess, since I kind of breezed over your background, would you mind sharing a little bit about where you were then, where you are now, and how you came to your current role over at DataRobot?

 

Haniyeh Mahmoudian  1:30  

Absolutely. So when I finished my PhD, I wanted to start working in industry. And since my background was in image processing and working with data, the transition to data science was kind of the most natural thing for me. And I was fortunate enough to be able to work on different verticals, whether it was the customer relations side or fraud detection, both on the financial side as well as on the retail side. And one of the projects that I was involved with was in human capital management, and more specifically in the hiring and retention area. And as you can imagine, this is a very sensitive topic that requires a lot of attention. And that's how I got interested in AI bias and, more broadly, the ethics and responsible AI side of it. And when I joined DataRobot around four years ago, one of the things that I saw missing in the platform for the users was the ability to test their models for bias and potentially mitigate it. And that's how I was connected to the team that was working in this area, so we would be able to actually build those features and put them out there for the users to be able to test their systems before deployment, and also the possibility to monitor them over time.

 

Rob Stevenson  3:02  

What was your PhD in?

 

Haniyeh Mahmoudian  3:04  

It was in astronomy, completely different, but it kind of played a part for me. Working with the Hubble Space Telescope was very foundational for me to be able to use that knowledge and skill of working with data and bring it into my data science experience.

 

Rob Stevenson  3:21  

This is weird, I have to call this out. This is my second recording today for this podcast, and my last one was also with a former astronomer who worked on the Hubble telescope, who was a data scientist and now works in AI. Do you know Kirk Borne? No? Okay, maybe you guys should meet, because you have very similar backgrounds. It's so peculiar, especially in this space. I feel like I meet folks with lots of different backgrounds, usually computer science, math, you know, but it's exciting to speak to astronomers. I took an astronomy course in college; it was my favorite science course I ever took. I'm sure we were speaking very different languages, though, my studies versus your PhD. So what was your thesis all about?

 

Haniyeh Mahmoudian  3:59  

Mine was around strong gravitational lensing, and more specifically, using data to figure out the expansion speed of the universe.

 

Rob Stevenson  4:11  

Okay, TL;DR: what's the expansion speed of the universe, Haniyeh?

 

Haniyeh Mahmoudian  4:16  

Well, my work showed that it was around 70. I'm way out of date with most of it. We have a very specific unit for that which, honestly, after 10 years, I barely remember. But we pretty much agreed with the research that came before us. So we didn't find, you know, a new number, let's say. It was mostly confirming the previous work, but I don't know if the new research has changed anything on that or not.

 

Rob Stevenson  4:52  

Yes, sorry. It's probably not fair for me to be asking you about 10-year-old research, but is it like light years per second? What's going on? Help me conceptualize what that speed even refers to.

 

Haniyeh Mahmoudian  5:03  

It is something similar to that, but at a different scale, because you're also looking at the distance that is going to expand. So that would also be included when you talk about the expansion rate.
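For context on the figure of "around 70" mentioned above: it refers to the Hubble constant, which relates how fast a distant galaxy appears to recede to how far away it is, so the unit combines a speed with a distance scale rather than being a plain speed.

```latex
% Hubble's law: recession speed is proportional to distance.
% H_0 of about 70 means a galaxy one megaparsec away recedes at roughly 70 km/s.
v = H_0 \, d, \qquad H_0 \approx 70~\mathrm{km\,s^{-1}\,Mpc^{-1}}
```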

 

Rob Stevenson  5:16  

Can I ask you an astronomy question that's always puzzled me? The universe is expanding constantly; what is it expanding into?

 

Haniyeh Mahmoudian  5:26  

Well, there are different scenarios, and one of them is that it will expand forever. Another is that it keeps expanding, but at a much lower rate. And there is another one where it expands to a point and then starts collapsing. What we see right now, at least, and again I'm quite out of date, but at least the evidence from observations showed that we are more on the expanding side, and there's no strong evidence for the expanding-and-then-collapsing type of universe.

 

Rob Stevenson  6:02  

Got it. So, I mean, with my finite three-dimensional lizard brain, it's probably not worthwhile to think there's something on the other side of this. Like, pop astronomers will say, imagine this balloon inflating, right? That's probably not useful, because that balloon is inflating inside of a room, and there's not a room around the universe, right?

 

Haniyeh Mahmoudian  6:22  

Yeah. So pretty much the whole thing is expanding. It's hard to explain, because we tend to think about ourselves within a kind of bigger space. But this is a different thing: we are the space. When we say it's expanding, it's the space itself that's expanding.

 

Rob Stevenson  6:43  

We are the space. Wow, that should be the title of your thesis. Thanks for being a good sport and fielding my astronomy questions; I know this has not been the main focus of your work. But we should move on to that stuff now. I would just love to hear a little bit more about your journey in data science. As you said, your background lends itself very well to that, and then at a certain point you pivoted; now you are focused more on the ethical side, the trustworthy side of AI. Could you tell me about that inflection point for you? At what point did you decide, okay, I don't want to work on the pure data science side, and you're going to focus more on building trustworthy AI?

 

Haniyeh Mahmoudian  7:19  

So it was around the time that, as I mentioned, I saw that, for example, bias and fairness tests were missing in our platform. And, you know, I joined a team whose sole focus was around trust, and bias and fairness is one aspect of it. You know, when you're thinking about ethics, you want to think about, you know, the privacy side of it that you mentioned. Are we considering privacy when we are building a model? What can we provide to our users to help them in that area better? Sometimes you might accidentally upload data that has PII or information that is sensitive, so you want to have ways to give a warning: oh, we think that you might have some columns in your data, some features, that might be PII, so please go and check it out. So this is, for example, one aspect that you can think of as trust. And that was very interesting to me, because in a lot of situations, when we see the headlines, it's not that people were really putting effort into doing something wrong, but rather it's just mistakes, or not having a proper process in place to make sure that the system is working and behaving as intended. So part of what I wanted to accomplish as part of being a member of that team, and bringing to the product, is those guardrails that would help users to really make sure that they can trust the system and the tool that they are making.
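A minimal sketch, in Python, of the kind of PII warning described above. The column-name hints, the email pattern, and the threshold are illustrative assumptions, not DataRobot's actual implementation.

```python
import pandas as pd

# Hypothetical, simplified PII check: flag columns whose names or sampled
# values look like personal information, so the user can review them
# before modeling.
PII_NAME_HINTS = ("email", "phone", "ssn", "address", "name", "dob")
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"

def flag_possible_pii(df: pd.DataFrame, sample_size: int = 100) -> list:
    flagged = []
    for col in df.columns:
        # Heuristic 1: suspicious column name
        if any(hint in col.lower() for hint in PII_NAME_HINTS):
            flagged.append(col)
            continue
        # Heuristic 2: a majority of sampled values look like email addresses
        sample = df[col].dropna().astype(str).head(sample_size)
        if len(sample) and sample.str.fullmatch(EMAIL_PATTERN).mean() > 0.5:
            flagged.append(col)
    return flagged

# Usage: warn before training, as described in the conversation.
# for col in flag_possible_pii(pd.read_csv("training_data.csv")):
#     print(f"Warning: column '{col}' may contain PII -- please review it.")
```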

 

Rob Stevenson  9:12  

Could you give an example of some of those guardrails?

 

Haniyeh Mahmoudian  9:15  

Absolutely. So, you know, we mentioned the bias and fairness side of it, right? The other side, when we are thinking about trust, is that the performance of the model is one aspect of it. How should we be thinking about performance? Do you want to look at the performance as one aspect, or do you want to consider, for example, the speed of it? Because imagine you're in finance: if you're not able to give a response when someone is using their card at a point of sale, if it's not fast, there is a problem. So you want to trust that the model is able to give that response in the timely fashion that's required by your business. So you can think of that as one element of trust, things like accuracy and performance. Also, what I mentioned around metrics or tests that you can run to make sure you have high-quality data, you can bucket into the performance side of it. The other aspect of these guardrails could be around operations. Do you have any guardrails in place when you put your model in production? If your model is not performing at the same level as when you were training it, if the performance has decreased significantly, you may want to take it out of production and put a new model there. So have guardrails around that to notify the responsible people to get involved. So these are all different guardrails that we can think of. From the governance perspective: who can access the data, who can access the models in production to be able to replace them, what the approval workflow should be. And as we talked about, another side of it is ethics, which would include the bias and fairness side of it, the privacy side of it, and also the explainability aspect of it. So when you talk about guardrails, sometimes you can think of that best practice type of guardrails in data science. But we should also expand it to this type of governance and also the ethical side of it.
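A minimal sketch of the operational guardrail described above: flag a deployed model whose live accuracy has dropped well below its training baseline. The function name, threshold, and alerting behavior are assumptions for illustration, not DataRobot's API.

```python
from sklearn.metrics import accuracy_score

def check_performance_guardrail(y_true, y_pred, baseline_accuracy,
                                max_relative_drop=0.10):
    """Return True if the deployed model should be flagged for review."""
    live_accuracy = accuracy_score(y_true, y_pred)
    degraded = live_accuracy < baseline_accuracy * (1 - max_relative_drop)
    if degraded:
        # In a real system this would notify the responsible owners and
        # kick off the approval workflow for swapping in a new model.
        print(f"ALERT: live accuracy {live_accuracy:.3f} is more than "
              f"{max_relative_drop:.0%} below the training baseline "
              f"{baseline_accuracy:.3f}.")
    return degraded
```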

 

Rob Stevenson  11:28  

It's an interesting sort of juggling act you have to do, because the ethics and the trustworthiness are related, but they're different, right? Like, you could conceivably build trust without being ethical, right? I'm thinking, for example, of when you get a pop-up on a website that says your data is important, or your privacy is important to us, and maybe they link to a blog post that tells you about, you know, what they're doing to treat your data in a meaningful, responsible way. But I have no assurances of that. I just have, like, a bit of marketing copywriting telling me that's the case. And so, conceivably, some people might see that and be reassured and think that's trustworthy, but it's no guarantee that you're actually building trustworthy technology, right? Also, the expectation of trust from a user is very different from the process of developing a technology internally that they may never see. How do you conceptualize that juggling act between what users need and expect versus what ethical technology development means?

 

Haniyeh Mahmoudian  12:26  

So actually, I have a slightly different take on what you said. I would say ethics is part of trust, in the sense that if the system is not built ethically, you're not really going to trust it. You know, in the example that you mentioned, you might end up going to a blog post that makes some, you know, good points, but that's the end of it. And you may continue with that product, but still, in the back of your head, you're like, I don't know, am I being discriminated against? Is my data being used without my consent? Have they violated my privacy? Exactly to your point: I don't know, right? So you may continue with a product, but you still have these questions. And to me, that already shows that you're not 100% trusting the system. So for me, when I'm thinking about trust, trust by itself is a very broad umbrella, but ethics should be included as part of it to truly be able to think about trusting a system. And, you know, an extreme example of that is: think about everything we would need from a system to be able to give permission for that system to make a decision about our lives. To me, that's the ultimate way of trusting a system. Imagine that system is an AI system that's making a decision about my health, right? I truly need to trust that system, so I need to know exactly what's been going on there. And if that system is not ethical, if I'm not sure that that system is not biased, then I can't really, you know, give my permission for that system to make any decision about, you know, my healthcare. So in that regard, there is a slight difference, because I need to have that confidence from the ethics perspective. It may have very good performance, it may have good guardrails around operations, but if I'm not sure that they built it ethically, I don't know if the system would work for me, so I won't be able to truly trust and use that system. So for me, that's kind of the difference between the two: thinking about ethics as a subcategory within the broader term of trust in a system.

 

Rob Stevenson  14:50  

Okay, that makes sense. Once you have some operating definitions here, I suppose then you can start to actually build the guardrails, and I'm really curious how that happens. Like, can we kind of get into that? I would love to just know, what is the whiteboarding session where you're like, okay, where do we start injecting features or information or what have you to engender trust? How do we figure out what people need to trust us? And then how do we deliver it to them?

 

Haniyeh Mahmoudian  15:15  

So, exactly to the point that you made earlier, trust means a different thing to different people. If I put my technical data scientist hat on, for me, trust is probably about, you know, making sure that the model is performing as I want it to by accuracy, depending on what kind of accuracy metrics I'm interested in. If bias and fairness are relevant, what mathematical function do I need to use in order to make sure the model is behaving well and not exhibiting any discriminatory behavior? So for me, these are the ways that I evaluate the model; it builds trust for me by seeing those metrics, seeing those numbers. However, if I'm a senior executive, I don't necessarily look at those technical terms and technical evaluations. The focus for me is around risks and benefits. What are the risks of using this AI tool or building this model? And what are the benefits for my business? And if there are risks, how am I going to mitigate them, how am I going to manage those risks? So maybe the focus for this type of persona is more around the governance piece of it: do I have processes in place, do I have a robust framework to address the risks that I may have? You know, from the IT perspective, that's a whole different set of features and concerns that they may have, so they will look at trust in a different way. Then we also have the end users: what do they need in order to be able to trust the system, whether they're going to use it or whether they would accept the system being applied to them? The end user probably needs more around the explainability side of it: how is the system making a decision about me? Maybe it's more around the disclosure side of it: what kind of information should we share with the end user to make sure that they understand what's happening, how the system is working? And maybe there's a different type of governance there; maybe we should give them an option to opt out from AI making a decision about them and be able to interact with an actual human as part of the process. So depending on the persona, we would need to think about what kind of features we would have. So part of the work that we do is, you know, purely technical. It would be building those features, different ways to, for example, evaluate a system for fairness, because as you know, there are different ways to define fairness, so depending on the context and use case, you would choose one metric over the other. So how can we accommodate technical users like data scientists to be able to use the platform and get the information that they need, but also have different types of guardrails that would address some of the questions from, say, the executive side? If they have concerns around discrimination in the use case, what the executive wants to see is a system that has ways to identify whether you have bias in the system, but also the possibility to mitigate those biases. So they want to see the processes, not necessarily a specific technique to use, but having those processes in place and the type of information that would be relevant to them. So we want to try to address that: having ways for the senior executive to interact with the system, see the processes, see the documentation at a very high level of what's happening, here is the result, here's the report of how the system was built and how it's performing.
From the technical side, you know, the data scientists would have not only metrics to measure the system for bias, but also, if they identify that the system is biased, ways to mitigate it. So we would have those types of technical features for them as well. And from the perspective of production, we make sure we have features that would notify us moving forward about whether the system is still fair: maybe at this point in time the system is fair, but six months from now it might end up being biased, right? So we want to be notified; we want to monitor that system. These are all part of different personas. Monitoring would probably fall more on the IT operations side of the personas, so we want to address their needs as well, also creating reports that they would be able to review and take action on through alarms, notifications, however that might look. So, depending on what kind of persona we look at, we try to think about what they need to see, what's important to them, and tailor those features for those personas.
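To illustrate the "different ways to define fairness" mentioned above, here is a minimal sketch of two common group-fairness metrics, demographic parity difference and the disparate impact ratio. The function names, sample data, and thresholds are assumptions for illustration, not a specific DataRobot feature.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def disparate_impact_ratio(y_pred, group, privileged):
    """Smallest ratio of an unprivileged group's positive rate to the
    privileged group's rate; the 'four-fifths rule' flags values below 0.8."""
    priv_rate = y_pred[group == privileged].mean()
    others = [g for g in np.unique(group) if g != privileged]
    return min(y_pred[group == g].mean() / priv_rate for g in others)

# Usage on hypothetical predictions from a hiring model (illustrative data):
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])          # 1 = favorable outcome
group = np.array(["m", "m", "m", "m", "f", "f", "f", "f"])
print(demographic_parity_difference(y_pred, group))   # 0.5
print(disparate_impact_ratio(y_pred, group, "m"))     # 0.25 / 0.75 = 0.33
```

Which metric is appropriate depends on the context and use case, which is exactly the point made in the conversation: the platform needs to let a data scientist pick the definition that fits, while an executive mainly needs to see that a process for identifying and mitigating bias exists.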

 

Rob Stevenson  20:33  

It's such an interesting, multidisciplinary problem. Because when you speak about personas, you're speaking my language a little bit as a marketer; marketing teams and sales teams operate on these buyer and user persona documents, and they're all the people who wind up in deals and who wind up using your product, right? And they get at some of the stuff you're talking about: what are their frustrations, what do they care about? And then you just take it to a more technical end, which is, how do they evaluate risk, right? What are they worried about will happen to their technology if they don't implement bias-free things, or don't implement some kind of bias detection and resolution? And there's also just this element of psychology too, when you're thinking about, okay, how does someone with this title and these responsibilities and these incentives evaluate their own technology's ethics, right? That cannot have come up in your astronomy PhD studies; it seems like a very, very new skill set.

 

Haniyeh Mahmoudian  21:32  

Yes. As I mentioned, as part of, you know, working on the use case that was around HR, I started reading more about these topics. And part of it is actually talking to the customers that we have, right? Because they have their own challenges, and they also have their own solutions. So usually, when we have these discussions with our customers, we go in with the purpose of actually exchanging knowledge. There's not necessarily a right or wrong answer. Obviously, the wrong answer is, you know, building a model that is discriminatory, being fine with it, and putting it in production, which I think everyone agrees is the wrong thing to do. But, you know, when we have these dialogues, the main purpose for everyone is to actually exchange our knowledge, to see what we've heard from other customers, what worked, what didn't work, because this way we would be able to actually improve the frameworks that we have. Because, you know, the technology is advancing; two years from now, all the use cases that we might be working on in AI might have actually changed, so the frameworks that we have may not really apply to those use cases. So we need to be thinking about it in a dynamic way, always trying to improve it, always trying to implement something new that would address the needs of those use cases that are, you know, becoming more and more popular. So it's important to take that into consideration, and part of our work is also to sit down with our customers and have that dialogue with them.

 

Rob Stevenson  23:18  

So I guess we haven't really drawn a circle around what exactly DataRobot does. Could you maybe give me the elevator pitch for the company? Because I think that'll lead into what I'd like to talk to you about next.

 

Haniyeh Mahmoudian  23:28  

Absolutely. So DataRobot is a platform; we provide a platform for users to be able to build their models in a more automated fashion. So you have your data, you have a use case and a model in mind; you can bring that data into DataRobot and build models faster than doing it manually, because most of the things are automated. And you can also use DataRobot to deploy those models and monitor them over time. So you have the ML experimentation side of it, which is usually what data scientists do, but you also have the ML production capability, which is usually what the operations team is responsible for. So we try to provide the platform to address the whole AI lifecycle, as you may call it. So for us, it's more about providing the software, providing all the features and tools that the practitioners would need, but we also have a dedicated team of experts that supports our customers on their AI journey.

 

Rob Stevenson  24:35  

Okay. So thank you for allowing that, because it's an important distinction in your role. There are sort of AI ethics czars being installed at companies just to make sure that the use of their technology, or the building of their own technology, is done in a mindful way, which is your role, right? Like, you are involved in making sure that DataRobot's own in-house work is following some of these guidelines, but you're also building the technology for other people to use. Like, maybe that AI ethics czar would be one of your personas, for example. So you are doing this in-house as well as building the products for others. That's why I wanted you to kind of outline that.

 

Haniyeh Mahmoudian  25:12  

Absolutely. You know, as I mentioned, I have conversations, for example, with compliance teams, or even with ethics teams at other companies, because they might have questions of, okay, how should I be addressing that if I want to use DataRobot? Then we will come in and have that dialogue. And sometimes these teams are new at those companies; they are thinking about bringing trust and ethics into their own organization, but they don't know where to start. So that's where we come in and have that dialogue with them, to get a better understanding of what their approach is, what they're looking at, the type of use cases that they're thinking of. So it's hard to pin it down: part of my role is, as you said, exactly on the product side, thinking about how ethics can be woven into our product, but it's also having those types of conversations and enablement sessions with our customers around the AI ethics side of it.

 

Rob Stevenson  26:12  

Right, so you are involved in the product development at DataRobot, but then you're also the person who goes in to your potential customers and speaks with them, and you're doing that role for DataRobot itself too, right? So you have two jobs; you deserve a raise is what I'm getting at. But when you're in those conversations with users or potential customers, and you're trying to understand what they need to trust the technology and to develop their own technology in a mindful way, how do you elicit that information from them? Because it feels like they might not necessarily know what they need to trust something.

 

Haniyeh Mahmoudian  26:46  

So one of the things that we do is we actually introduce our own trust framework, what we consider the DataRobot framework for trust, which we kind of touched on: trust with regards to performance, trust with regards to operations, and trust with regards to ethics. And then we start a dialogue with our customers or with our prospects: what part resonated with you, what parts are relevant to your organization? And also, you know, they know their organizations better than us, right? So which part do they feel has a gap, which part do they feel they can improve on their side? Or maybe they have a different opinion on the framework. We would want to have that dialogue with them if they view it in a different way. Absolutely, let's have that dialogue, because part of that discussion is, as I said, an exchange of knowledge: thinking about, if there is a difference, maybe it's actually a good one, maybe we should implement that, maybe we should incorporate it into our framework, or maybe they can incorporate what we had into their framework. So when we talk about our framework, that opens the door for the discussion, and whether they are in agreement or think something can be added to it, there's still a conversation that goes on. And in my experience, what I've seen is that people are quite eager to discuss this topic, because it's something that they truly believe they need to have in their organization, whether it's first changing their culture to have that mindset, or they already have that culture in place and they just need a process to implement it. They're always eager to discuss that and to think about new ways that they can incorporate frameworks or principles into their workflow.

 

Rob Stevenson  28:49  

That is great to hear, that there's existing enthusiasm, that you're not out there having to convince people that they should do the right thing. But some companies are further along with it than others, even if there is the understanding that this is important. Say someone is working at a company and maybe they don't have a compliance or oversight team; they're not putting in technology and guardrails. Maybe they know they should, but it just has never been prioritized. What can that person do to begin this process, besides going to datarobot.com and clicking Request a Demo? What are the questions they can ask in their organization, and how can they start pointing it towards a more responsible tech development process?

 

Haniyeh Mahmoudian  29:29  

The first thing would be education, being able to provide education. And, you know, we talked about different personas; it's similar for the education piece as well, because what a data scientist needs in order to familiarize themselves with, let's say, trust and ethics might be a little bit more on the technical side, versus a non-technical person, maybe a senior executive or a person on the operations side. So having that education tailored to the needs and responsibilities of the individuals would be very helpful for them to further understand the needs. Because if we talk about ethics from the data scientist's standpoint, other people may not really be able to understand what's happening, or they can't really connect to it because it's not part of their responsibility, right? So they can't really understand how that would affect them, or what they need to change in their roles. So it's very important to make sure that the education is relevant to people's responsibilities and roles, so they would be able to understand what needs to be changed in their position, or, you know, maybe everything is exactly as it should be. At least they would have that knowledge and education. And this can also help with changing the culture, because in many situations, you know, when you think about trust and ethics, sometimes people might get a little bit defensive: I'm doing everything ethically, I haven't done anything wrong. But having that education can actually help, you know, reduce those tensions and that defensiveness. It acknowledges that everyone wants to be building AI responsibly; it's just that we may need to have some guardrails, may need to have some processes in place, that we may not have had in the past. So it's kind of changing that type of mindset through education.

 

Rob Stevenson  31:34  

Yeah, makes sense. That's great advice, and at the end of an episode full of great advice, at this point I'll just say thank you so much for being on the show with me and walking me through your experience and your work over there at DataRobot. I've loved chatting with you today.

 

Haniyeh Mahmoudian  31:46  

Absolutely Same here.

 

Rob Stevenson  31:50  

How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, medtech, robotics, and agriculture. For more information, head to sama.com.