How AI Happens

StoneX Group Director of Data Science Elettra Damaggio

Episode Summary

To help us paint a clearer picture of the inner workings of data science, we are joined today by Elettra Damaggio, the Director of Data Science at StoneX Group Inc., a business that connects other companies and people to various markets. Elettra’s foundations in ML are in neural networks and when she joined StoneX, she was thrown into the deep end of data science.

Episode Notes

After describing the work done at StoneX and her role at the organization, Elettra explains what drew her to neural networks, defines data science, describes how she overcame the challenges of learning something new on the job, breaks down what a data scientist needs to succeed, and shares her thoughts on why many still don't fully understand the industry. Our guest also tells us how she identifies an inadequate data set, the recent innovations that are under construction at StoneX, how to ensure that your AI and ML models are compliant, and the importance of understanding AI as a mere tool to help you solve a problem.

Key Points From This Episode:

Quotes:

“The best thing that you can have as a data scientist to be set up for success is to have a decent data warehouse.” — Elettra Damaggio [0:09:17]

“I am very much an introverted person. With age, I learned how to talk to people, but that wasn’t [always] the case.” — Elettra Damaggio [0:12:38]

“In reality, the hard part is to get to the data set – and the way you get to that data set is by being curious about the business you’re working with.” — Elettra Damaggio [0:13:58]

“[First], you need to have an idea of what is doable, what is not doable, [and] more importantly, what might solve the problem that [the client may] have, and then you can have a conversation with them.” — Elettra Damaggio [0:19:58]

“AI and ML is not the goal; it’s the tool. The goal is solving the problem.” — Elettra Damaggio [0:28:28]

Links Mentioned in Today’s Episode:

Elettra Damaggio on LinkedIn

StoneX Group

How AI Happens

Sama

Episode Transcription

Elettra Damaggio  0:00  

So don't do that. Don't go there and ask, 'What's your problem?' You need to know at least what their problems might be. You're just going there to confirm what you already know, because you cannot really ask them to understand the technical side of what you do.

 

Rob Stevenson  0:23  

Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson, and we're about to learn how AI happens. Hello again, and welcome back to How AI Happens, all of you wonderful machine learning and AI-building folks. Wherever you are coming from, however you're coming to the show, thanks for being here. I'm your host, Rob Stevenson, and I have a wonderful guest for you today. She is the Director of Data Science over at StoneX Group, Elettra Damaggio. Elettra, welcome to the podcast. How are you today?

 

Elettra Damaggio  1:11  

Thank you. Well, thank you so much. Thank you for having me. I'm pretty well, thank you.

 

Rob Stevenson  1:16  

I'm really pleased to have you, and I do apologize for that ugly American rendition of your beautiful Italian name. I can't roll my Rs as much as, you know, your nonna might, for example.

 

Elettra Damaggio  1:28  

That's absolutely fine. No problem, no problem at all. I'd say even in Italian, sometimes they have a hard time pronouncing my name. They put in like a C, 'Electra,' or whatever. They do all sorts of things. So it's okay.

 

Rob Stevenson  1:42  

Okay, so that makes me feel a little better that it's not merely my being a Yankee Doodle Dandy; it's maybe challenging for other folks too. But in any case, here you are, and I'm glad to have you. And we have lots to speak about today. First, though, I would love it if you would share with me a little bit about StoneX Group, what the company does, and then your role and how you got there.

 

Elettra Damaggio  2:01  

Yeah, of course. So, StoneX Group, as our CEO and everyone at the company likes to refer to us, is a company that connects other companies and people to the markets. StoneX has its roots in FCStone; we actually had our 100th birthday this year. We started with commodities, futures, and options. So we now have not just trading services and payment services for companies, but also retail brands such as FOREX.com and City Index. My journey at StoneX actually started by joining the retail part of it, which was GAIN Capital at the time, in 2019. So I was working with the head of analytics at the time, as principal analyst on City Index and FOREX.com. And then we were acquired by INTL FCStone, and in July 2020 the whole group rebranded as StoneX. So that's it in a nutshell.

 

Rob Stevenson  3:12  

Got it.

 

Elettra Damaggio  3:13  

So that's how I was brought into the StoneX world.

 

Rob Stevenson  3:17  

I suppose this was back in your academic days, when you were working on early neural networks. And I'm curious how you got from neural networks to focusing on a career in data science. But first, what was the state of neural networks at the time you were studying them?

 

Elettra Damaggio  3:33  

Well, the original neural networks, to be honest, until recently were pretty much the ones developed in the 70s. The perceptron, or the deep neural network as we know it so far, excluding the more recent transformers for generative AI, well, that concept was developed in the 70s, even earlier than that, if you want. So what I was studying at university was all of that: the whole history of the perceptron and neural networks, different types of neural networks, Boltzmann machines, whatever, all things that were designed in the 60s, 70s, and 80s, I would say.

 

Rob Stevenson  4:13  

What drew you to neural networks?

 

Elettra Damaggio  4:15  

So my major when I started my Master of Science was actually AI. Neural networks are part of it, but definitely not, you know, the only part of it. Coming from my bachelor's in computer science, I was mostly drawn to databases, and the AI major had a lot of data-driven exams and courses, and I figured, I mean, why not?

 

Rob Stevenson  4:45  

As good a reason as any. Now, StoneX is a 100-year-old company, as you say, and I don't want to beat up your employer, but here I go. You know, when I speak to people from older companies, it's common that they don't have the technical infrastructure that lots of younger companies do. Super common. And it's not merely technical infrastructure in your case, but the ability to perform data science and ML on top of it, which is, you know, even more advanced. So was that the case when you came into StoneX Group? How mature was the technical organization, and what was your role on the data science side of the house?

 

Elettra Damaggio  5:20  

When I joined, there was no data science at all. My manager, when he interviewed me, actually said, 'I didn't have a role for you, but I saw your CV and I want to hire you, so I made up a role for you.' When I joined the company, he asked me to please start some data science, in whatever shape or form, in this company. So when I joined, there were pros and cons. At GAIN Capital, the company was actually a very much younger part of the group; we're talking 90s, so not young as in 2010, but 90s. But most of all, the trading platforms, all of that, were a digital service. The company was born and grew as a digital company. The good thing about that is that we didn't have, like, files and stuff flowing around; everything was data streams feeding databases. It was already kind of streamlined, since the business was a digital business. So that was really good. That laid a very strong foundation from a data perspective. And even though I had to develop stuff on my laptop, in Excel, Jupyter notebooks, and this Microsoft SQL data warehouse, the data was in a very good state, and we could do a lot of things. So that is GAIN Capital. That said, on the other side, when we were acquired by INTL FCStone, we had a look at the wholesale side, the older brother, if you want, and that was definitely in a much more complicated state. Although I have to say the company, in the last year, I guess, actually in the last five years, has been heavily investing in data and data modernization. So we're now moving to a cloud stack. It's getting there, it's getting there. I have to say I've seen much worse; retail banks and those types of legacy financial services are definitely in worse shape than we are.

 

Rob Stevenson  7:37  

I'm obsessed with this moment where your then manager was like, do some data science. And I mean, it doesn't exactly sound like you were set up for success. But where did you start?  

 

Elettra Damaggio  7:49  

So, as I said, the data set me up for success. I think the best thing that you can have as a data scientist to be set up for success is to have a decent data warehouse, data lake, lakehouse, whatever you want to call it. If you have decent data, you can be set up for success. To know what you can do with it, you need to understand the business. So I spent my first month as a normal commercial analyst, if you want, like a normal data analyst, and would just hand over reports to the CMO; we were working under the CMO, the chief marketing officer, at the time. That allowed me to understand what we were delivering for the business, what the pain points were, the things they were trying to resolve from a marketing perspective, from a commercial perspective. One of the first projects that I did was pretty basic, if you want, but it's still heavily used by the company, which is a customer segmentation that we apply at application submission. Being a trading service, to perform trading, to use our platform, you need to open an account with us, and being a financial service, a trading service, you need to go through some questions, KYC, know-your-customer questions. If you've ever opened even a current account, you probably know what that is. Those are all the questions that ask you what your income is, all of these things, credit checks, and all of that. By using that data, we were able to define an internal segmentation that would allow us, at an early stage, to understand the potential of the customer and the quality of the customer as well. That had implications for marketing investment, but also for KYC operations in terms of case management, so prioritizing people that were more on the good-quality side. That is still one of the projects I'm known by; I am the person that did that. Even now, when we have end-of-year reviews and stuff, it's still a tag that I have on me. And that was my first project.
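
To make the segmentation idea concrete, here is a minimal Python sketch of clustering application-time KYC features into quality tiers. It is only an illustration under assumed inputs; the column names, values, and cluster count are hypothetical placeholders, not StoneX's actual model or schema.

    # Minimal sketch of application-time customer segmentation.
    # The KYC-style columns and values below are hypothetical placeholders.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    applications = pd.DataFrame({
        "stated_income": [45_000, 120_000, 30_000, 250_000, 80_000, 60_000],
        "trading_experience_years": [0, 5, 1, 10, 3, 2],
        "initial_deposit": [200, 5_000, 100, 20_000, 1_500, 800],
    })

    # Scale features so no single field dominates the distance metric.
    X = StandardScaler().fit_transform(applications)

    # Fit a small number of segments; in practice the number of clusters would be
    # validated against downstream outcomes (marketing ROI, KYC case priority).
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
    applications["segment"] = kmeans.fit_predict(X)

    # Inspect the average profile per segment to label quality tiers.
    print(applications.groupby("segment").mean())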

 

Rob Stevenson  10:05  

You know, the more I speak with folks in your position, the more they keep telling me how important the non-technical stuff has been for their role, right? Like, you have all this fantastic technical background and know-how that most people can't match or understand, and yet to really succeed, I'm hearing about all this multidisciplinary need to navigate the business, do internal PR, cozy up to internal stakeholders, that sort of thing, which I imagine is frustrating for someone with an engineering background. To be like, okay, you really want to succeed? You need to do all these things that maybe you weren't trained in as an engineer. So for you, who were some of those folks you spoke to when you were getting to really know the business? Who were the stakeholders and people you needed to learn from?

 

Elettra Damaggio  10:48  

So I learned a lot from the data. I went through it, and I went through the reports. Of course, my manager gave me a sort of onboarding run-through and told me, okay, this is how the business makes money, these are, you know, our costs, this is the process. And then I went through all of these things. And, I'd say, I am very much an introverted person. I mean, with age, I learned how to talk to people, but that wasn't always the case. So I don't look for conversation, especially when I start a job; I keep very much to myself and look at all the things I can go through. So I was curious, like, curious about the business, how things work, and the data led me to that. One thing that I noticed hiring data scientists, and I don't want to be mean, but I'm repeating this because I've been hearing it from other people in the sector and I actually agree with it, is that they expect problems to come in a Kaggle form. You probably know what Kaggle is, the platform. A lot of data scientists I have been hiring and talking to and working with sometimes really, really think data science is like, you know, a Kaggle competition: you've been handed a data set, you have to extract the features from it, and you're going to solve the problem. This is not how it is. This is nice, this is a puzzle, a nice game, but this is not how reality is. In reality, the hard part is to get to the data set that in Kaggle was just handed over to you. And the way you get to that data set is by being curious about the business you're working with. Personally, I am no trader at all. I am a very risk-averse person; I've never traded in my life. But I was hired in that place, and I had to understand: who are these people? I was curious. I mean, I was like, who are these people that trade? Why would you do that? And I went through all the stats, all the things, to see, okay, those are the people that are making money for this business, what they look like, what type of information I can gather from that. And, of course, this might seem like a long time that you're not productive, because you spend a lot of time looking at things, but you're actually building the reality of that business in your mind. And by having that, that's already, I'd say, 70% of the work. If you have this done decently, you have 70%, because once you have that, you have the questions that you can ask the people that work with that stuff. As a data scientist, you cannot go to a business stakeholder, for example an acquisition manager, or a vice president of marketing acquisition, or a commercial leader, saying, 'What data science model do you need?' They would have no idea, right? I mean, what's data science to start with? They think it's sort of like magic that happens. And that is actually a very dangerous question to ask, because they might ask you things that are absolutely impossible to do with data science. So don't do that. Don't go there and ask, 'What's your problem?' You need to know at least what their problems might be. You're just going there to confirm what you already know, because you cannot really ask them to understand the technical side of what you do. You need to say, you know, 'This is what I can give you. What is your top priority?' And they would tell you, 'Yeah, that's my top priority.'
So this is very much how I approached it. My manager was very helpful in giving me guidelines on who to talk to. I would go to him and say, 'I found this stuff, is this the right path or the wrong path, who do I need to talk to to go through these things?' and he would advise me where to go. In that sense, he set me up for success by giving me the right guidance, but I had to give something to him: I had to bring my technical expertise. That is what it means to bring technical expertise: you understand where you can solve the problem with that expertise, then you propose these potential solutions to people and you work out what the next best move is. You cannot expect just to stay still and have problems handed over to you. That is the difference, I believe.

 

Rob Stevenson  16:02  

And is that just because most people outside of the technology, or the business, don't understand what data science is? Like, you gave an example that they might ask you to do something that's impossible with data science. So in this case, you're sort of figuring out problems and then asking them if you can solve them for them, essentially?

 

Elettra Damaggio  16:19  

Data science is really tricky, right? Well, using machine learning to solve issues is very tricky. You have many, many different problems, many, many different ways to solve the same problem. And some of these things might not be achievable, because you don't have enough data, or you don't have the right data, or the data that you have is not good quality. So you first need that gut feeling that you have as a data scientist when you see the data and go, 'Oh, this is not going to work.' That is not something a business stakeholder might have. Because data science and machine learning problems are not straightforward, definitely not straightforward, you need to experiment a lot, you need to do a lot of research, and you don't want people to waste their time by setting up expectations and then basically saying, 'I cannot do that.' It's good to first do a sort of wide-spectrum analysis yourself, and then have a sort of aware conversation with the business stakeholder on what can and cannot be done. If you go there and ask for a brief in a blank, unfiltered way, you may end up spending a lot of time in exploration and experimentation and go back to them saying there's nothing, and that is all time wasted. So you need to help them help themselves.

 

Rob Stevenson  17:47  

Yes, you know, your domain best. And you have to be more prescriptive in what you're offering, right? Because if you just leave it up to someone who doesn't know your domain, they're gonna ask for something irrelevant.  

 

Elettra Damaggio  17:58  

Yeah, exactly. I mean, it's a bit like if you have work at home that needs to be done, but you are no plumber, and you're like, 'I want to renew my boiler.' And of course your plumber is going to say, 'You can do this, you can do that,' and, looking at the house, is going to say, 'You definitely cannot do that.' At that point you have a conversation, right? You say, 'Oh, but I would like to achieve that,' and they would say, 'Well, no, I don't think you can achieve that, because of A, B, C.' So you have this conversation before your plumber starts, you know, tearing down walls and doing things. And it's exactly the same thing. You need to have an idea of what is doable, what is not doable, and, more importantly, what might solve the problem that they have. And then you can have a conversation with them, not just, you know, a one-way thing where you go, and then you disappear for months, and then, 'You know what? You cannot do it.' Yeah, I think that's the trickiest part of data science in businesses.

 

Rob Stevenson  19:00  

When you shared that, there's that moment where you're looking at the data and your gut tells you you don't have what you need. It's not merely a gut instinct; it's trained based on, you know, all of your experience. But what is standing out to you, or I guess not standing out to you, that would make you feel like this dataset is not going to give you what you need?

 

Elettra Damaggio  19:19  

Well, there are many things, and it depends on the type of problem, and most of all it depends on how much experience you have with that data specifically. Of course, a data scientist that has just started to work with the data at the company might not have that gut feeling, but after a while you get into it. So, for example, sometimes you have a limited amount of features by design; you know that at that point in the process you're going to have only so much information, and that says a lot about setting up the baseline right. At some point, for example, I had to work on a model that was estimating the lifetime value of a customer after only one week from account opening. And we had a model that could classify some classes at 85% overall, and I'm going to use the word accuracy, although it was not exactly accuracy, but let's stick with that. And then we had other classes that could be predicted at 65%. At some point we accepted that, because there was only so much you could know about a customer that joined the business a week ago. But you have a conversation, you understand: why do you want to use it, why do you need it, what exactly do you want to know? And the baseline might just work for that. Because what they wanted to know was, 'I need to know the ones that are really not going to do anything for me,' and for that the model was pretty good, at least at assessing the ones that were, well, I don't want to use the word I'm thinking, but let's say absolutely useless for the business. For that purpose, the model was good enough. Sometimes the model is good enough. Setting up the baseline is a very important part of that conversation, because talking about 65% accuracy feels like something you absolutely don't want to have; it's only a little bit better than flipping a coin. But if you go into the probabilities, if you slice and dice, maybe there's something usable there anyway. So it's better than nothing. Those are things that you work out after a while: you look at that data, and you just know that you won't be able to do better than that.
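
As a loose illustration of the baseline conversation Elettra describes, the sketch below compares a classifier's per-class performance against a naive majority-class baseline. The data is synthetic and the labels are placeholders, not the real lifetime-value model.

    # Sketch: compare a customer-value classifier against a naive baseline,
    # looking at per-class performance rather than a single accuracy number.
    # The features and labels are synthetic placeholders, not real customer data.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.dummy import DummyClassifier
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import classification_report

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2_000, 5))  # e.g. first-week activity features
    # 1 = "likely valuable", 0 = "unlikely to do anything for the business"
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.5, size=2_000) > 0.8).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
    model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

    # Per-class precision/recall makes the 85%-vs-65% style trade-off visible.
    print("Baseline:\n", classification_report(y_te, baseline.predict(X_te), zero_division=0))
    print("Model:\n", classification_report(y_te, model.predict(X_te), zero_division=0))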

 

Rob Stevenson  21:47  

Got it. So you begin at what was then GAIN Capital and is now StoneX, and you're tasked to do some data science. Fast forward a few years, though, and you've put some more serious models into production, I imagine. So what has been the more recent stuff you've been working on?

 

Elettra Damaggio  22:04  

Let me think. Well, the latest deliverable has been our first gen AI applications. We're still in the release process. It's a chat assistant, actually multiple chat assistants. I mean, there's a POC; as a sort of first release, we released one that was a multitasking chatbot, but now that we're thinking of scaling up, we are definitely going to split that multitasking chatbot into multiple, more specialized chatbots. And that was really interesting. We use OpenAI LLMs in Azure; we started with GPT-3.5, but I think we ended up using GPT-4 Turbo, especially for querying structured data. As for the type of service this chatbot provides, well, there are a couple of modules that take our customer support information and are supposed to be a client-facing chatbot that assists people, but of course we need some fences and guardrails and things, because, as a chatbot, well, no one can give you assistance in trading, as in give you advice on trading. So that is not something the chatbot does; it just kind of explains to you how the platform works, how the account works, what you need to do to, I don't know, withdraw or fund your account. So that's one. Then we had an internal supporting chatbot for our operations, and that is another function. And last but not least, we also experimented with structured data, so a chatbot that is able to answer analyst questions like, 'Oh, I need to know what was the revenue for this market last week.' So we had the chatbot querying our databases, and it also uses a wiki that we maintain that defines all the KPIs in the business, which gives you that type of explanation and interaction, which is very useful for a lot of stakeholders, and then it has the capability to go into the databases, or sometimes not even, you know, our BI reports. We have so many of them; I would just want to know this KPI and that's it. So this is definitely one of the latest releases that we have.
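
The analyst-facing, structured-data assistant she describes could be sketched roughly as follows: look up the KPI's definition in a wiki, run a vetted query, and have the model phrase the answer. The table, KPI wiki entries, prompt, and model choice here are assumptions for illustration only; the production system runs on Azure OpenAI and is certainly more involved.

    # Sketch of an analyst-facing KPI assistant: look up the KPI definition in a
    # wiki, run a vetted SQL query for the numbers, then let the LLM phrase the
    # answer at temperature 0. All names, queries, and prompts are hypothetical.
    import sqlite3
    from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

    KPI_WIKI = {
        "weekly_revenue": {
            "definition": "Sum of realized revenue per market over the last 7 days.",
            "query": "SELECT market, SUM(revenue) FROM trades "
                     "WHERE trade_date >= DATE('now', '-7 day') GROUP BY market;",
        }
    }

    def answer_kpi_question(question: str, kpi: str, db_path: str) -> str:
        entry = KPI_WIKI[kpi]
        with sqlite3.connect(db_path) as conn:
            rows = conn.execute(entry["query"]).fetchall()

        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4-turbo",
            temperature=0,  # keep the answer tied to the retrieved numbers
            messages=[
                {"role": "system",
                 "content": "Answer only from the KPI definition and query results "
                            "provided. Do not give trading advice."},
                {"role": "user",
                 "content": f"Question: {question}\n"
                            f"KPI definition: {entry['definition']}\n"
                            f"Query results: {rows}"},
            ],
        )
        return response.choices[0].message.content

    # Example (hypothetical database file and schema):
    # print(answer_kpi_question("What was last week's revenue per market?",
    #                           "weekly_revenue", "analytics.db"))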

 

Rob Stevenson  24:32  

The generative copilot use case is popping up in lots of places, and I am intrigued by the guardrails you must put up in this case. It's the compliance guardrails, where it's like, okay, we operate in this industry, there are rules that are specific and unique to our industry, and the chatbot doesn't know to go and ask legal, 'Hey, am I allowed to tell a customer about this?' You know, it just has to be baked in. I'm guessing it's not enough to instruct the chatbot to say, 'I am not a financial advisor,' which is like the magic incantation you're allowed to wave your hands and say, and it absolves you of all culpability when you then go on to give financial advice, right? But what did that look like when you were trying to train the chatbot to be compliant?

 

Elettra Damaggio  25:18  

Well, first of all, of course, we put your good old disclaimer every time you open the chatbot. This is not yet in production, by the way; the external-facing one will go through a lot of beta testing. So of course you have your big old disclaimer, and the temperature of the large language model is set very low, so that the chatbot doesn't, you know, take any creative inspiration and say just anything. Honestly, what we found in our experiments is that if you basically have the chatbot referencing specific documents and specific content, it won't go beyond a summary of what it's pointing at. So we're not talking about, like, open-ended GPT-4; it's much more limited to the content that you give to it. And with the temperature set very low, I'd say that is good enough, honestly, to ensure that the model is not going to drift from what is already said in the documentation that you give to it. That's what we observed.
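
A rough sketch of the 'reference specific documents, keep the temperature low' pattern for the client-facing assistant might look like the following. The help articles, prompt wording, and refusal behaviour are hypothetical, not the production guardrails.

    # Sketch of a document-grounded, low-temperature support assistant with a
    # simple "no trading advice" guardrail. Articles and prompts are hypothetical.
    from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

    SUPPORT_ARTICLES = {
        "fund": "To fund your account, go to Account > Deposit and choose a payment method.",
        "withdraw": "Withdrawals are requested from Account > Withdraw and take 1-3 business days.",
    }

    DISCLAIMER = "I can explain how the platform works, but I cannot give trading advice."

    def support_answer(question: str) -> str:
        # Naive retrieval: keep only articles whose topic keyword appears in the question.
        context = "\n".join(text for topic, text in SUPPORT_ARTICLES.items()
                            if topic in question.lower())
        if not context:
            return DISCLAIMER + " I couldn't find a relevant help article for that."

        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4-turbo",
            temperature=0,  # low temperature: stay close to the provided articles
            messages=[
                {"role": "system",
                 "content": "Answer only using the help articles provided. Never give "
                            "trading or investment advice; if asked, decline."},
                {"role": "user",
                 "content": f"Help articles:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return f"{DISCLAIMER}\n{response.choices[0].message.content}"

    # Example:
    # print(support_answer("How do I withdraw money from my account?"))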

 

Rob Stevenson  26:39  

Gotcha. Well, I could stand to hear more about that, but we are creeping up on optimal podcast length here, Elettra. Before I let you go, though, I wanted to ask if you could share some advice for folks who are forging a career in the AI, ML, and data science space. What advice would you give them to make meaningful contributions and continue to climb the career ladder?

 

Elettra Damaggio  26:59  

So my personal advice is always to remember that AI and ML are not the goal; they're the tool, and the goal is solving the problem. So you need to find good problems for yourself. And sometimes, actually most times, AI and ML are just one of the tools to solve the problem. You need to know the problem. Maybe you need to learn a little bit of Java or other programming tools to understand the problem, because you're acting like a translator from one code to another. Maybe you need to learn multiple things. Maybe you need to learn a little bit of business administration, maybe you need to learn a bit of logistics and how a logistics model works, for example, like knapsack problems or portfolio optimization or something like that. So, depending on the type of problems that you need to solve, for which AI and machine learning are tools, don't forget to learn about the business problem that you're solving. That is personally my advice.

 

Rob Stevenson  28:04  

That's great advice at the end of an episode full of great advice. Elettra, this has been a delight. Thank you so much for being here and recording with me today. I've loved chatting with you.

 

Elettra Damaggio  28:12  

Thank you very much for having me.

 

Rob Stevenson  28:16  

How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, medtech, robotics, and agriculture. For more information, head to Sama.com.