How AI Happens

Lead Full Stack AI Engineer Becks Simpson

Episode Summary

Joining us today is the lead full stack AI engineer at Rogo, as well as the lead guitarist for Des Confitures, Becks Simpson. Becks studied mechatronics in Australia and has a background in robotics. She moved into AI at a medical imaging startup before coming to Montreal to do a research project at the Montreal Institute for Learning Algorithms.

Episode Notes

Tune in to hear more about Becks’ role as a lead full stack AI engineer at Rogo, how they determine what should and should not be added into the product when it comes to deep learning, the types of questions you should be asking along the investigation-to-product roadmap for AI and machine learning products, and so much more!

Key Points From This Episode:

Tweetables:

“People think that [AI] can do more than what it can and it has only been the last few years where we realized that actually, there’s a lot of work to put it in production successfully, there’s a lot of catastrophic ways it can fail, there are a lot of considerations that need to be put in.” — Becks Simpson [0:11:39]

“Make sure that if you ever want to put any kind of machine learning or AI or something into a product, have people who can look at a roadmap for doing that and who can evaluate whether it even makes sense from an ROI business standpoint, and then work with the teams.” — Becks Simpson [0:12:55]

“I think for the people who are in academia, a lot of them are doing it to push the needle, and to push the state of the art, and to build things that we didn’t have before and to see if they can answer questions that we couldn’t answer before. Having said that, there’s not always a link back to a practical use case.” — Becks Simpson [0:20:25]

“Academia will always produce really interesting things and then it’s industry that will look at whether or not they can be used for practical problems.” — Becks Simpson [0:21:59]

Links Mentioned in Today’s Episode:

Becks Simpson 

Rogo

Des Confitures  

Montreal Institute for Learning Algorithms

Sama

Episode Transcription

[0:00:04.5] RS: Welcome to How AI Happens, a podcast where experts explain their work at the cutting-edge of artificial intelligence. You’ll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they’re facing along the way. I’m your host, Rob Stevenson, and we’re about to learn How AI Happens.

[0:00:32.1] RS: Here with me today on How AI Happens is the lead full stack AI engineer over at Rogo as well as the lead guitarist for Des Confitures, Becks Simpson. Becks, welcome to the podcast, how are you today?

[0:00:43.3] BS: I am good, very great to be here.

[0:00:46.1] RS: I’m thrilled to have you. I had to throw in the Des Confitures shoutout; you’re in a cover band whose name is all manner of puns, and I love that your band is made up of machine learning engineers and other various serious academics. That’s the case, right? You’re all like, former coworkers.

[0:01:01.9] BS: Yeah, exactly, we all worked together previously at Imagia and yeah, we love the fact that in addition to our very technical day jobs, we also love to play music, and we were able to actually bring together a band and play a few gigs, so yeah, super good.

[0:01:17.5] RS: Get you a coworker who can do both, right? Who can argue with you about what’s in Looker and then get on stage and shred.

[0:01:23.5] BS: Yup, exactly.

[0:01:25.3] RS: What’s some of the music you play? You mostly do covers, right?

[0:01:27.8] BS: Yeah, so we do – I should remember the set list because I always have to introduce the songs as the chief banter officer. We play stuff like Bon Jovi’s ‘Livin’ on a Prayer’, ‘Sweet Dreams’, a kind of mashup of the Marilyn Manson version crossed with the Eurythmics version, and ‘Zombie’ by The Cranberries is another big one. What else?

We play a lot of like, 90s punk rock chick stuff as well because our lead singer is a lady, so it matches her voice. So there’s Joan Jett, Hole’s ‘Celebrity Skin’ is another one, and ‘Just a Girl’ by No Doubt. Yes, nailed it. I was like, “Oh, who sings that song again?” Yeah, so lots of good stuff like that.

[0:02:08.8] RS: Amazing, the kind of stuff that the crowd gets into as soon as they hear the opening notes, right?

[0:02:13.0] BS: Yeah, exactly. So we have a lot of good videos of people singing along so it’s really good, yeah. We aim to like pump up the crowd.

[0:02:20.8] RS: I love it. I definitely could speak with you about that for the next 45 minutes but I think I’m meant to discuss AI at some point here. Can we learn a little bit about you, Becks? Could you share your background and how you wound up getting to your role at Rogo?

[0:02:34.9] BS: Yeah, definitely. So I guess originally, my background was in robotics. I studied a mechatronics degree back in Australia and have since moved to Montreal, but I guess at that time, there weren’t as many robotics jobs and the degree was very software oriented. So I ended up moving more into software positions, and my first gig in AI, I guess, was at a medical imaging startup.

So looking at using the new deep learning models that were coming out for detecting prostate cancer and classifying it from MRI and histopathology. So I did that for a couple of years, we built that startup back in Brisbane, and then at the end of 2018, I took a sabbatical, which brought me to Montreal, where I am now. That was to do a research project at the Montreal Institute for Learning Algorithms with the famous Yoshua Bengio, and we were looking at new methods to prevent models from looking at spurious features when learning.

So after that, as part of the little sabbatical, I went to China for about a month, advising some of the startups at the HAX accelerator in Shenzhen, which was pretty fun, you know, advising them on their data strategy. Some of them were building stuff that would collect a lot of data, and then looking at some proof-of-concept ML algorithms that they could use. In the meantime, I decided from my stint in Montreal that I would move here, even though it was winter.

So I did that, got my visa, came over here, and recently got my permanent residency. I was at Imagia, also one of the medical AI startups, for a couple of years as well. I started as an ML developer and then moved to be the team lead of an ML development team, building tools for the researchers, which was really fun, and yeah, I’ve since moved to Rogo, where I am now, doing full stack AI, so it’s also super interesting.

[0:04:20.9] RS: What made you decide to lend your expertise to Rogo?

[0:04:25.1] BS: So what we were working on previously at Imagia was kind of a thing to let people pull in their data, query it how they wanted to, and then run some machine learning on it, but it was very much for the medical space.

Then on the tooling side for the researchers, what we built was basically this configurable system where you could just write a YAML file and it would run the pieces that you needed. So it would transform the data that you had into what you needed and run it through the algorithm that you specified, whether it was some deep learning model or some traditional ML thing. So when I spoke to Rogo, I saw that they were doing that kind of thing but for everyone.

So it was democratizing that kind of data access, that ML access, for everyone, and for me, that was an amazing use case, super interesting, and something that I definitely wanted to lend my expertise to and get on board with. Because you see, there’s a lot of data out there, there are a lot of insights that are locked away in unstructured information that people either need intense expertise to be able to access or, you know, they just can’t, and so a platform like Rogo is definitely giving people the power to learn from the stuff that they have, so yeah.
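For readers who want to picture the kind of configurable, YAML-driven tooling Becks describes, here is a minimal sketch of the pattern. It is not Imagia’s or Rogo’s actual system; the config keys, the registries, and the scikit-learn components are all illustrative assumptions.

```python
# A minimal sketch of a config-driven experiment runner in the spirit of the
# YAML tooling described above. Not the actual system: the config keys, the
# registries, and the scikit-learn components are illustrative assumptions.
# Requires: pip install pyyaml scikit-learn
import yaml
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

CONFIG = """
transform: standardize
model: logistic_regression
params:
  max_iter: 200
"""

TRANSFORMS = {"standardize": StandardScaler}          # data-prep steps
MODELS = {"logistic_regression": LogisticRegression}  # algorithms to run

cfg = yaml.safe_load(CONFIG)
pipeline = make_pipeline(
    TRANSFORMS[cfg["transform"]](),
    MODELS[cfg["model"]](**cfg.get("params", {})),
)
print(pipeline)  # swap the YAML, not the code, to run a different experiment
```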

[0:05:33.6] RS: Got it. So is the idea to unlock or be able to make sense of unstructured or unorganized data, that kind of thing?

[0:05:43.7] BS: Yeah, so basically, it’s pulling in a bunch of different data sources of interest, and we’re working on letting users connect their own data sources, whether it be databases, like SQL, NoSQL, or APIs, that kind of thing, and basically surfacing that or exposing that through a natural language interface. So then people don’t need to know how to code, they don’t need to know SQL, they don’t need to know the intricacies of what the data looks like underneath to actually be able to get information out of it.
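As a toy illustration of the pattern she describes (emphatically not Rogo’s implementation), the simplest possible natural language interface maps a recognized question onto a prewritten SQL query; the question, table, and SQL below are all hypothetical.

```python
# Toy illustration of a natural language interface over data: a recognized
# question maps to a prewritten SQL query, so the user never writes SQL.
# Not Rogo's implementation; the question, table, and SQL are hypothetical.
TEMPLATES = {
    "how many orders last month": (
        "SELECT COUNT(*) FROM orders "
        "WHERE created_at >= date('now', 'start of month', '-1 month') "
        "AND created_at < date('now', 'start of month')"
    ),
}

def answer(question: str) -> str:
    """Return SQL for a recognized question, or a fallback comment."""
    return TEMPLATES.get(question.lower().strip(), "-- no matching template")

print(answer("How many orders last month"))
```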

[0:06:11.2] RS: Got it, and how would you characterize your role? Are you kind of like an AI Swiss Army knife, as I understand it?

[0:06:18.2] BS: Yeah, so at the moment, a lot of it is obviously building features because we’re still in beta, so building the features that customers actually want, whether it involves frontend work or backend work, but then also looking at where to put machine learning or deep learning into the product, making sure we’re collecting data correctly for those use cases, and yes, starting to build that kind of strategy.

[0:06:40.6] RS: So it sounds like you're considering a couple of different techniques to add into the product, for deep learning specifically. When you have this conversation about whether it’s right to add, what does that conversation typically sound like?

[0:06:54.7] BS: Yeah, it’s a very good question. Initially, we need to look at, for a particular use case, making sure the use case is well defined, but also making sure we know what our performance is with the methods that we already have, and making sure that we understand what the return on investment will be, particularly ensuring that this is really a pain point for customers.

Because, you know, as an early startup, basically, your early customers are your most important people. So if maybe we find something annoying or we think that something could be better, but actually we look at what they’re doing and it’s not that bad for them, then there’s not really any point trying to put a bunch of time and effort and resources into developing something that can help there.

[0:07:31.5] RS: The reason I ask is because I fear that often, particularly with these like, hype technologies, it’s sort of a top-down thing that, you know, a CEO or chief data officer or something hears about. I don’t know, I’m being pejorative here maybe, but they hear about it on the golf course and go, “You know what? We should do some machine learning,” and you sort of have to give them a dose of reality, like, we could, but it’s not just this panacea you can put in any product to excite your investors and solve all of your company problems.

So when that happens, when maybe you have to give a higher-up a reality check about the technical needs, have you had that experience in your career? Maybe not just at Rogo but anywhere, where it’s like, “Look, I know that this is a sexy technology, but it may not be right for us.”

[0:08:15.0] BS: Yeah, so definitely, I would say probably the startup back in Brisbane was a little bit like that. Like I said, it takes time to understand the use case and the problem space that you're trying to solve and avoid the whole AI-solution-looking-for-a-problem thing, and the best thing you can do is come with data about why it might be a good idea or why it might not be a good idea.

I like to think of it kind of like a product manager that will come with data, not just about what value this could actually bring, or lack of value if it was undertaken, but also looking at what extra work is required to even get this thing off the ground. Like, do you have the data, have you been collecting it, is it in some kind of usable state, or are you going to require a certain number of hours from a data engineer or data engineering team? Do you even have that expertise in your organization? Once you’ve done that, do you know what your current performance is on the task? Is it a new task that the company hasn’t attempted before, so now you need a benchmark to look at how well you could actually do?

So I read a quite good article recently, looking at how well they could predict the brand of specific grocery items based on just the description.

The first thing they tried was a rule-based model. So like, you don’t want to just throw that thing through some kind of BERT transformer or whatever to see how good it is before knowing what’s the easiest thing that you could do that will give you an idea of whether this is even possible.

Because if you try and do something like that, maybe no human has ever done it before, or humans are notoriously bad at it, or, just looking at the data, it’s not clear what features are in there that can be useful. There’s a whole bunch of data points that you can bring together to put that argument forward, and I think that’s often what’s needed. Because if you can show that, “Okay, we want to do AI to predict…” I don’t know, what products people are going to buy, or to recommend products to a person when they come to our webpage.

But actually, we can show, either through the data we’ve collected on how people are buying things or the data we know from other sites that are doing similar things, that it’s only going to increase purchasing by like, 2%. And the amount of effort and resources and time that we’re going to spend from the data engineering team, or the software people who are going to have to productionize it, the resources of actually putting a model in production and serving it, and also the algorithm development time, whether you’re making a new thing or pulling something off the shelf or having to do feature engineering with a statistical method, that is going to cost XYZ, and it means that the break-even point is going to be in like, 10 years. Then it’s probably going to be like, “Oh, yeah, okay. Well, let’s not do that.” So it’s a lot about constructing that argument.
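To make those two checks concrete, here is a minimal sketch of a rule-based baseline for the brand-prediction example and the back-of-envelope break-even arithmetic. The keyword rules and every figure are assumptions for illustration, not numbers from the article she mentions.

```python
# Minimal sketch of the two checks above. The keyword rules and every figure
# are illustrative assumptions, not numbers from the article mentioned.

def rule_based_brand(description: str) -> str:
    """Dumbest-possible baseline: predict a brand from keywords in the text."""
    keywords = {"coca": "Coca-Cola", "heinz": "Heinz", "kellogg": "Kellogg's"}
    text = description.lower()
    for key, brand in keywords.items():
        if key in text:
            return brand
    return "unknown"

examples = [
    ("Coca-Cola 12-pack box", "Coca-Cola"),
    ("Heinz tomato ketchup", "Heinz"),
    ("Dr Pepper 2L bottle", "Dr Pepper"),  # no rule covers this one: a miss
    ("box", "unknown"),                    # ambiguous items stay unknowable
]
accuracy = sum(rule_based_brand(d) == b for d, b in examples) / len(examples)
print(f"rule-based baseline accuracy: {accuracy:.0%}")  # 75% here

# Back-of-envelope ROI: a 2% purchasing lift against an assumed build cost.
monthly_revenue = 100_000  # assumed purchase revenue per month
lift = 0.02                # the 2% uplift from the example above
build_cost = 240_000       # assumed data eng + modeling + productionization
months = build_cost / (monthly_revenue * lift)
print(f"break-even after ~{months:.0f} months")  # 120 months, i.e. ~10 years
```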

[0:10:47.6] RS: Right, and that’s almost a nontechnical argument. Like, that is making a business case, or the opposite of a business case in the example you gave, and what I’m learning speaking to all the folks on this podcast is all of the hats that people in this role need to wear. It’s not enough to be really good at developing an algorithm or being able to process and run advanced ML approaches on big sets of data.

I’m learning more and more that there are just all these nontechnical hats you need to wear. Is that because the field is relatively new and misunderstood on a business level, or why do you think it’s fallen upon AI and ML experts to also figure out nontechnical ways to contribute at their companies?

[0:11:25.9] BS: Yeah, so that’s a super interesting question and I’d say it probably is because it’s quite a new – I mean, it’s new-ish. Like, I’d say it was probably 2016 when stuff started coming out, and as well, there’s this hype around it. So people think that it can do more than what it can, and it has only been the last few years where we realized that actually, you know, there’s a lot of work to put it in production successfully, there’s a lot of catastrophic ways it can fail, there are a load of considerations that need to be put in. So I think it could be that gap between what a product manager does now and what they used to do and what they know in terms of software, because that’s been more tried and true for a while. And so that’s why often, it ends up, you need to lean on the technical experts in AI to tell you, “Well, we can’t really do that,” or, “If we were to even tell whether or not we could do that, we need XYZ first.”

So yeah, and I’ve seen a few, you know, AI product manager type roles and things coming out, which is interesting. Oddly enough, I always see them go in the direction of product managers being taught how to do AI, and never AI people being taught to do product management, which I think is interesting, so yeah.

[0:12:33.7] RS: Yeah, is that a role you would want on your team? 

[0:12:36.5] BS: I would say that is probably something that I’ll end up doing as I progress in the role and we build a bigger team, that kind of thing. It is also something I have kind of done along my career path, looking at what niche things we’re trying to solve and whether or not it makes sense from a customer point of view and from a technical point of view.

So yeah, but definitely, I think, making sure that if you ever want to put any kind of machine learning or AI or something into a product, have people who can look at a roadmap for doing that and who can evaluate whether it even makes sense from an ROI business standpoint, and then work with the teams. Because there are levels of escalation that you’ll go through, from your benchmarks and baselines, to doing more sophisticated models, to maybe in the end putting something deep learning based in there, and understanding that there is an interplay of research and development that happens with that. You know, it takes people who can understand that.

[0:13:27.6] RS: Yeah, and that sort of domain education, I think, is necessary for someone at any function. Even in my role, like, I can’t assume that a CEO really understands marketing or really understands what is accomplishable given the current team and what is going to actually move the needle, and so it’s managing up, in a way. I think no matter where your career takes you, that is an important skill.

[0:13:50.4] BS: Yeah, one hundred percent. 

[0:13:52.2] RS: Also, you mentioned that you really need to bring data to the conversation, and what is interesting is, in other functions, for example marketing, when you say that, you are talking about specific metrics and insights, but it sounds like part of it for you also is that you’re bringing data about data, right? You are speaking about a dataset at large as opposed to an insight you gleaned from it, is that right?

[0:14:12.8] BS: Yeah, it’s a combination of both. So part of the data that you will bring to support it, evidence is another way to put it, without being too meta, definitely some of that evidence is about the data that you have. So like, how much do you have? Do you even have stuff that represents the problem that you are trying to solve? Do you need that much, depending on the approaches that you might take in the beginning? What’s the data quality like? Is there a bunch missing? What’s the label quality? That’s also another thing that’s important, and how you’re going to deal with things like class imbalance, all that sort of stuff.

And yeah, if you find that someone is saying, “Oh yeah, we should do this use case,” and you look through all the databases of things that you’re collecting on the company side and you’re like, “Well, there is no model out there that will predict what we’re trying to predict, and we don’t have nearly enough data, or we’ve not collected anything.” Then, if you really want to do this and you can see that if we can do this, our ROI would be like a million percent or whatever, the first step is to collect that kind of data and, you know, set the schemas around what information you want, all that kind of stuff. But yeah, it’s definitely always part of the conversation when you are trying to do some machine learning or AI thing.
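Several of those data questions (how much data, how much is missing, how balanced the labels are) can be answered in a few lines before any modeling starts. A sketch with pandas, using a stand-in dataset:

```python
# Quick data audit answering a few of the questions above: how much data,
# how much is missing, and how balanced the labels are.
import pandas as pd

# Stand-in dataset; in practice this would be pd.read_csv(...) or a DB query.
df = pd.DataFrame({
    "description": ["Coca-Cola 12-pack", None, "Heinz ketchup", "box"],
    "label":       ["Coca-Cola", "Heinz", "Heinz", "unknown"],
})

print(f"rows: {len(df)}")                             # do we have enough data?
print(df.isna().mean().sort_values(ascending=False))  # per-column missingness
print(df["label"].value_counts(normalize=True))       # label / class balance
```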

[0:15:19.6] RS: Yeah, you rattled off a bunch of really excellent questions that someone needs to ask, whether you are trying to make a business case or not. I think even if you are just trying to be effective in your role, whatever that looks like, you have to have the answers to those questions about your dataset, and it’s interesting that before you can even really hammer on it, you need to do that.

It’s sort of like when you go to clean a room in your house, you have to make it more messy before you can make it clean, you know? Like, “Oh, I have been cleaning this for half an hour and it is a disaster because I was pulling things out and moving things around,” and only then will it actually start to get clean.

[0:15:48.3] BS: Yeah, exactly. 

[0:15:50.9] RS: So where does that lead you? Once you’re answering those questions, is that about just like which technologies we should put in, which hires we need to make? Where do the answers to really understanding your dataset take you? 

[0:16:01.6] BS: Yeah, so I guess once you’ve understood, well, it is more than just the dataset, but once you can say, “Okay, well, my dataset looks not too bad,” or, “Because of XYZ reason, we can use some, I don’t know, off-the-shelf model and fine-tune it, and therefore we don’t need like a million data points. We only need a few hundred thousand or something, and we’ve got that, that’s fine,” then it’s a question of evaluating, on the data side, what things you need to do to actually get that into a shape where you can start experimenting, and what your capabilities are for doing ML experimentation. Are you going to use some platform that lets you run a bunch of things? Are you going to build some internal tooling that will let people do stuff? Are you just going to run it all in notebooks? What expertise do you have on the teams to do those sorts of things?

Then yeah, whether or not you need to hire new people, or whether some people already know stuff and you can just sort of up-skill some others. And then also breaking down, I guess, the investigation-to-product roadmap. If you need to do benchmarking, you know you will do that first to see, “Okay, how well can we actually perform with some boring thing? What would we do next if we wanted to improve it, and how will we know that that thing is better?”

Once we’ve reached that, is there a next step? Is there a next step? Is there a next step? And then milestones for those kinds of things, and also how you’ll know whether or not it worked, so that at any point in that roadmap, you can just go, “Okay, well, there is no point going to these next steps because we have pretty much identified here that this is not going to work.”
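One lightweight way to encode that investigation-to-product roadmap is as a sequence of stages with a go/no-go rule, so the work stops as soon as a step fails to earn the next one. A sketch with invented stage names and accuracies:

```python
# Illustrative investigation-to-product roadmap: escalate from a baseline to
# more sophisticated models only while each stage clearly beats the last.
# Stage names and accuracies are invented for illustration.
stages = [
    ("rule-based baseline", 0.65),
    ("logistic regression", 0.71),
    ("fine-tuned transformer", 0.72),
]
min_gain = 0.03  # a stage must add at least 3 points to justify the next one

best = 0.0
for name, accuracy in stages:
    if best > 0 and accuracy - best < min_gain:
        print(f"stop at {name!r}: +{accuracy - best:.2f} is not worth it")
        break
    print(f"{name!r} reached {accuracy:.2f}, keep going")
    best = accuracy
```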

[0:17:23.3] RS: When you think back to your own education and development in the space, how relevant was that to this process you’re talking about? Like the investigation-to-product roadmap, for example, is that something that is normal to work on and understand when you are just getting expertise in academia for AI and machine learning? It feels like it’s not.

[0:17:42.5] BS: No, not at all. Definitely not, no-no-no. So the interesting thing, and the thing that academia brings to it, is that you can clearly see that there’s not really a process like that. You will have a list of things that you’ll kind of want to do, but one of the things that I like about industry is that, if you are at the right place, there are very strong software quality checks in place, so that you know that you can run some experiments and trust the results that you are getting, and that you can have a series of experiments that you want to run that will tell you a specific answer about whether or not your approach is performing.

But yeah, having those software quality checks means that you can do a bunch of things and be sure that, “Okay, I tried this, I tried this, and I tried this, and I tried this. We tried these 60 things and all of them showed the performance was like 65% accurate. Therefore, we have a fairly strong conviction that we’re not going to get better than that,” whereas often I found in academia, it was very much not like that, and so you don’t learn that process at all from it. You glean it because you have to experience it and you’re like, “Wow, I wish this was a better process,” and then you go into industry and you’re just like, “Okay, well, software developers actually have these sorts of processes,” you know, where they have tests to make sure that things didn’t break when you changed some stuff. They have code versioning, so you can push new changes and roll things back in case something is broken.

So to answer your question, no. A lot of that knowledge of that process came from working on stuff, seeing what didn’t work, seeing people trying to do some particular project for the longest time and not getting anywhere, partially because the tools they were using and the software they were trying to build didn’t have some of those quality checks, and so it was unclear whether it was not working because that is just not a problem that will have a solution.

Sometimes people want to learn things the data just can’t tell them, and medical imaging is a huge one. You know, sometimes they’re like, “Oh, can we predict whether or not this cancer will turn into something in this time?” and sometimes, you just can’t get that from the imaging. It is like predicting the brand from a string of the product description. If the product description just says “box”, I don’t know. Is that a box of like, Coca-Cola, or is it, you know? You can’t know. So yeah, having good quality in what you’re building can help you really answer those questions and say, “Look, we know for sure that this is not a question we can answer with this data.”
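The software quality checks she credits industry with can start as small as a couple of regression tests pinned around the experiment pipeline. A pytest-style sketch, where predict is a hypothetical stand-in for the project’s real inference function:

```python
# Pytest-style regression tests of the sort described above, so a refactor
# can't silently change experimental results. `predict` is a hypothetical
# stand-in for the project's real inference function.
import numpy as np

def predict(x: np.ndarray) -> np.ndarray:
    """Placeholder model; replace with the real pipeline under test."""
    return (x.sum(axis=1) > 0).astype(int)

def test_predictions_are_deterministic():
    x = np.ones((4, 16), dtype=np.float32)
    assert np.array_equal(predict(x), predict(x))

def test_known_inputs_keep_known_outputs():
    x = np.zeros((2, 16), dtype=np.float32)
    assert predict(x).tolist() == [0, 0]  # pinned expectation catches breakage
```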

[0:20:03.2] RS: You know, the space is really unique in how common it is for someone to have one foot in the industry and one foot in academia. Do you think this is kind of why? Because in academia you don’t have the considerations of how to navigate organizational needs or roadblocks, you just have the pure study. Is that why people do both? 

[0:20:23.6] BS: Yeah, so that’s a good question. I think for the people who are in academia, a lot of them are doing it, well, what I saw is, to sort of push the needle, and to push the state of the art, and to build things that we didn’t have before, and to see if they can answer questions that we couldn’t answer before. Having said that, there’s not always a link back to some practical use case.

Some people don’t care, some people are like, “Yeah, that’s cool. Who knows, someone could figure out a use case for this, that’s fine.” My favorite example is imaginary numbers in mathematics. Like, when they first came out with those, people were like, “Wow, cool, yeah, whatever,” and then electrical engineers were like, “Oh wait, we can do all these tasks with this.” I think that’s really cool.

So I see it as always a kind of spectrum, but there always need to be people who will sit on sort of both sides, because otherwise, there is no bridge for that, and that can also happen a lot in industry as well, like in particular companies. They’ll have someone who is at the really pure end of the research, just trying to dig down into first principles and develop these models from scratch and completely new methods that people hadn’t thought of before.

All the way through to the applied researchers, who’ll be like, “Yeah, that’s cool, but it doesn’t work for medical imaging, it doesn’t work for object detection, it doesn’t work for natural image processing,” and if it does, then there’s someone who will be like, “Okay, cool, you got it to work,” like maybe in a notebook or on a small dataset. Now, we have all this data that we need to be bringing in, we have data in production that we need to check actually matches the distribution, stuff like that. So that is where your ML developers and things come in. And then there are your software people who will bundle it all together with all of the infrastructure and pieces around it, so yeah. So I think that is kind of where that bridge comes from, because, you know, academia will always produce really interesting things and then it is industry that will look at whether or not they can be used for practical problems.

[0:22:07.3] RS: Yeah, that makes sense. Before I let you go, Becks, as we creep up on optimal podcast length here, I would just love to hear your thoughts on an area of AI and machine learning that you are truly excited about, whether it is at Rogo or related to your previous experience or not, if there is like, some paper or some application that just really gets you excited and you think has a ton of opportunity.

[0:22:29.3] BS: Yes, so I touched on it a little bit earlier, but the whole idea of off-the-shelf models for different things, I think that in itself is pretty amazing, having seen, you know – I sort of started out in 2016 when TensorFlow was just becoming a thing and everyone was like, “Oh, what is this?” and now to see that there’s a whole suite of computer vision models that you can pull and use for different purposes. Now there are a bunch of NLP ones as well, so like, Hugging Face and also TensorFlow. But yeah, just seeing that kind of thing democratized so that people can not just use those models but probe their performance, see what kind of data they were trained on, all of that good stuff, I find that really exciting, because it means that any company can start to incorporate those into the things that they are doing. But it also means that, because it is open source, we can do more investigation into how those models were trained, that kind of thing, to make sure that they’re good and they’re doing what we need them to do. So yeah, that to me is super exciting.
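As a concrete taste of how little code the off-the-shelf route takes today, here is a sketch using the Hugging Face transformers library she mentions; the pipeline simply downloads whatever default pretrained sentiment model the library ships, and the example text is arbitrary.

```python
# Off-the-shelf model reuse with the Hugging Face transformers library
# mentioned above (pip install transformers). The pipeline downloads the
# library's default pretrained sentiment model; the text is arbitrary.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Off-the-shelf models make prototyping dramatically faster."))
```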

[0:23:22.6] RS: Yeah, it is, and I do think it will lead to an explosion in the types of careers in the space, right? I guess it has never been the raison d’être of the AI expert to just write models from scratch, you know? I mean, for some of them, I guess, but when this becomes commoditized, it is more accessible. It is more about where you take it rather than where you got it, right?

[0:23:41.1] BS: Yeah, exactly, and I would be really, really interested to see, and I am not sure if there is already this, so I might be speaking too soon, maybe there is, but reinforcement learning models as a thing as well, you know? Because I guess robotic process automation is becoming a big deal in many companies, where they have a bunch of processes that a human can do but really, should they? It is a waste of time and it is probably something that a well-taught machine could do, so having those kinds of models pre-trained and off the shelf for particular tasks would be pretty cool.

[0:24:09.2] RS: Got it. Well Becks, this has been a fascinating conversation. Thank you so much for joining me and sharing your experience and expertise and I will make sure to link to your awesome cover band in the show notes. 

[0:24:19.9] BS: Yes, awesome. Thanks so much, Rob, it was great to be here. 

[0:24:25.1] RS: How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, medtech, robotics, and agriculture. For more information, head to sama.com.