How AI Happens

dbt Labs Co-Founder Drew Banin

Episode Summary

Drew details the emergence of cloud data warehouses and the rapid adoption that followed, unpacks the practical uses of LLMs, and demystifies some of their reasoning-based limitations. He also sheds light on vector embeddings, their transformative potential, and what’s next for this dynamic space.

Episode Notes



Key Points From This Episode:

Quotes:

“Our observation was [that] there needs to be some sort of way to prepare and curate data sets inside of a cloud data warehouse. And there was nothing out there that could do that on [Amazon] Redshift, so we set out to build it.” — Drew Banin [0:02:18]

“One of the things we're thinking a ton about today is how AI and the semantic layer intersect.” — Drew Banin [0:08:49]

“I don't fundamentally think that LLMs are reasoning in the way that human beings reason.” — Drew Banin [0:15:36]

“My belief is that prompt engineering will – become less important – over time for most use cases. I just think that there are enough people that are not well versed in this skill that the people building LLMs will work really hard to solve that problem.” — Drew Banin [0:23:06]

Links Mentioned in Today’s Episode:
 

Understanding the Limitations of Mathematical Reasoning in Large Language Models

Drew Banin on LinkedIn

dbt Labs

How AI Happens

Sama

 

Episode Transcription

Drew Banin: The claim made by these researchers is that LLMs fundamentally cannot reason. And it ends up being pretty interesting because it frames for me, like, what are LLMs for? What should we do with these things?

 

Rob Stevenson: Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence.

 

Rob Stevenson: You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they face along the way. I'm your host, Rob Stevenson, and we're about to learn How AI Happens.

 

Rob Stevenson: Okay. Hello, hello, one more time to all of you wonderful AI practitioners out there in podcast land. Welcome back to How AI Happens. I'm your host, Rob Stevenson. And would you believe me if I told you I have an awesome guest lined up for you today? You better believe it, because he is the co-founder of dbt Labs and has a ton of experience in our space, in software engineering, and other places. Drew Banin, the co-founder of dbt Labs. Welcome to the podcast. How are you today?

 

Drew Banin: Hey, I'm doing well. Happy to be here. Thanks for having me on.

 

Rob Stevenson: Happy to have you. You know, I enjoy meeting people in your position who are founders of companies because usually it means you suffered through some problem and then figured that the best way to resolve it was to fix it yourself and start a company. I'm hoping that's the case with you. Is that true?

 

Drew Banin: I think that's almost exactly right. That's exactly what happened. I can tell you about it if you want.

 

Rob Stevenson: Oh, I'd love it if you told me about it.

 

Drew Banin: So, yeah, my co-founders and I all worked at a BI company called RJMetrics back in 2014, '15, '16, and RJMetrics was sort of the big up-and-coming all-in-one BI tool. So they did data extraction, kind of a version of data warehousing, BI and analytics, charting, email dashboards, stuff like that. And in 2016, we also saw the launch of, or really the beginning of adoption of, the cloud data warehouse with Amazon Redshift. So it was this confluence of kind of two worlds. It was the best-in-class previous-generation BI tool coming up against this sort of emergent new paradigm, new way of doing analytics with cloud data warehouses. And we kind of had front-row seats to what analytics would become with the shift to the cloud. So dbt was born out of that sort of environment, really trying to take the way that work was done in legacy BI tools and bringing that into the cloud data warehouse. And really our observation was there needs to be some sort of way to prepare and curate data sets inside of a cloud data warehouse. And there was nothing out there that could do that on Redshift, so we set out to build it.

 

Rob Stevenson: So Redshift was then like a canary-in-the-coal-mine moment where you're like, okay, this is going to be totally disruptive to how BI has been done thus far.

 

Drew Banin: Yeah, I think so. At RJMetrics, we were running these gigantic batch Hadoop jobs in 2016, and then we sort of looked at Redshift and it was, oh, you can run a SQL query and it runs 100 times faster. So okay, there's a lot of nuance that we can get into on where Hadoop shines and where cloud data warehouses shine. But at least in that moment, things that were previously very hard started to become very easy and very accessible, because if you knew SQL, you could do it yourself.

 

Rob Stevenson: You're not the first person to tell me recently that they were, like, hearkening back to the days when Hadoop was the data management framework du jour. Is it passé now? Is there still a place for Hadoop, or do we kind of look wistfully on those days and that approach?

 

Drew Banin: I mean, I'll tell you, I signed up for Bluesky recently. We're on there. The hashtag is databs, which is, I think, kind of funny. Recently, like the past couple of weeks, my feed has been all DuckDB content, and DuckDB is sort of the polar opposite of Hadoop in that it's super ergonomic and it works great with small to medium-sized datasets. There are kind of questions about what you do with bigger datasets. And Hadoop, on the other side of the spectrum, is, well, it's great for really, really, really big datasets, and for anything smaller than that, there's just a ton of complexity and overhead that might make it not worth it.

 

Rob Stevenson: Yeah, okay, that makes sense. I guess the answer to, like, is there a place for it, is that anything that's sufficiently enterprise is going to stick around, just because if it was built on that, then those companies aren't that agile. So they may be stuck with Hadoop whether they like it or not.

 

Drew Banin: Yeah, I think that's real. And I mean, there are real companies out there that have such meaningful scale of data that it continues to be a good approach, or running your own Spark cluster even continues to be the right approach, just in terms of price and performance and scalability. 99% of companies don't have that type of data scale. And so I would argue that running your own Spark cluster, or surely, like, running Hadoop, is not the best way to do it for most companies. But there are certainly companies we talk to that couldn't easily port their workloads over to a cloud data warehouse in an economical way. That's a minority. Yeah.

 

Rob Stevenson: Right, Right. So you explained a little bit about the problem you were experiencing that led you to found DBT Labs. I'm curious what you started hearing from the folks you were speaking to. Customers, prospective customers, did they get it right away? Like, was there a thirst for this? What was that experience like?

 

Drew Banin: Yeah, so it's funny, and let me give you a little bit more backstory. So one of my co-founders, Tristan, who's our CEO, is a seasoned analytics industry veteran. When I started the company with him, I had graduated from college two weeks prior, something like that. So I had a lot of youthful energy, but not a whole lot of wisdom, especially not in the data space. So my perspective on how all this shook out was pretty interesting, because we just started building dbt and people found it and fell in love with it. And so to me, it was like the first real project that I ever worked on. I thought, oh my gosh, every open source project must be just like this. You put up a repo on GitHub, you stand up a Slack channel, and then hundreds of people join and want as much of it as you can give them. So I understand that's not usually how it goes, but for us, I think we hit this nerve where there were a lot of talented, capable people that didn't have tools that served them well in 2016. And dbt was this way for them to be empowered and autonomous and get their jobs done better than before. As we say, solve hard problems exactly once with dbt. And so I just think there was nothing quite like that in terms of ergonomics and DX, but also, like, capability and power back in 2016.

 

Rob Stevenson: Yeah, it's so funny that your first experience was like, wow, starting a company rules. This is so easy. People just immediately want to give you money. It's not always like that, like you said, but I'm pleased for you that that was your first dalliance with entrepreneurship. And that's exciting. But it's not merely this DX approach, this data handling. It's a lot more sophisticated than that. We're going to get into it. Maybe a good way to do so is this paper that Apple released recently. By the time this goes out, maybe it'll be a few weeks old, but it's topical because they basically released this paper highlighting the limits of reasoning in LLMs. And I know that you have a semantic layer on top of the data, and you were working with LLMs to help your users query their own data. So I was hoping you could kind of reflect on that study and whether that aligns with your own experience, and whether you agree with sort of their conclusions about how limited LLM reasoning is at present.

 

Drew Banin: Yeah, it was a fascinating article, and it's actually one that I printed out. I don't know if folks listening can see, but I have it in my hands, heavily annotated. I know. And I apologize to the environment for doing so. So, you know, dbt started out, you write SQL statements, we build tables, and that was the dbt experience. Okay, plus a lot of other stuff that we probably won't get to talk about today. But that was pretty much the dbt experience for years. But then, maybe 2022-ish, we started to get really enamored by this idea of a semantic layer. And it was also, like, I'm going to say, pretty buzzy in the industry. Like, we weren't the only people thinking about this. And semantic layers are an old concept that's kind of coming back. So our big thought process there was, dbt does a great job of governing your business logic for the tables that you create in your data warehouse. But there's all sorts of business logic that doesn't actually get manifested as a table, or at least, like, shouldn't be manifested as a table. It's like time series metrics or specific answers to very precise business questions. So the idea behind a semantic layer is, like, you're almost virtualizing tables. You're, like, virtualizing this data set of: there's a metric called annual recurring revenue, and it's calculated this way. And you can look at it by these five dimensions, for plan tier, geo, etc. Or at the grain of daily, weekly, monthly, quarterly, annually, you know, all time, whatever. So you, like, virtualize all these combinations of the different dimensions, we would say, for the metric. And that's kind of the idea behind this semantic layer. It's this point of consumption to actually, like, okay, go a layer beyond a table of data, but, like, give me the metric value for this very important business KPI. So, okay, that's the big idea behind the semantic layer. And one of the things we're thinking a ton about today is how AI and the semantic layer intersect. So our thinking here is LLMs serve as really good natural language interfaces into whatever. Like, they can, maybe we'll get into this, but I'm going to air-quote, they can "understand" questions and do stuff with them. And so to us, we think that a really good approach is using an LLM as this natural language interface and querying the well-defined metrics and measures and dimensions represented in the dbt semantic layer. So you asked a question about this paper, and I'm finally there. This paper is really interesting because the claim made by these researchers is that LLMs fundamentally cannot reason, they're not good at reasoning, they show some examples, we can talk about the details. And it ends up being pretty interesting because it frames for me, like, what are LLMs for? What should we do with these things? And I think it emboldens my point of view that, like, they're great for building natural language interfaces. They are bad as the, I'm going to call it, like, decision engines behind actually, like, operating in a business.
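
To make that "virtualized metric" idea a little more concrete, here is a rough sketch with made-up names, not dbt's actual metric spec: a single metric definition plus its allowed dimensions and grains implies a whole family of queries that never have to be materialized as individual tables.

    from itertools import product

    # Hypothetical metric definition, loosely in the spirit of a semantic layer.
    # Illustrative only -- this is not dbt's real configuration format.
    metric = {
        "name": "annual_recurring_revenue",
        "expression": "SUM(mrr) * 12",
        "dimensions": ["plan_tier", "geo"],
        "grains": ["day", "week", "month", "quarter", "year", "all_time"],
    }

    # Each (dimension, grain) pair is a "virtual table" the layer can answer
    # on demand, instead of a physical table someone has to build and maintain.
    for dim, grain in product(metric["dimensions"], metric["grains"]):
        print(f"{metric['name']} by {dim} at {grain} grain")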

 

Rob Stevenson: Well, so that is the gap between it being an agent, having autonomy, right, being able to make decisions, versus it being Microsoft Clippy. The copilot approach, the assistant approach, versus the coworker approach. So are we just simply not there yet? Is this like a gut check to the hype about LLMs, or is it a criticism of LLMs that will persist?

 

Drew Banin: So it's a great question, and I'll tell you, I'm prepared to make bold statements on this podcast, accepting the fact that I'm probably going to be wrong. But I think that might be more interesting than saying, like, I don't know, we'll see what happens in 18 months, because, like, great. Yeah, here's my belief. I don't think we should have any reason to believe that LLMs are inherently good at doing the types of reasoning tasks that are laid out in this article. So here's an example question that these researchers posed to some of the state-of-the-art large language models, and they were trying to gauge, did it get the answer right or not? Okay: To make a call from a hotel room, you must pay 60 cents for each minute of your call. After 10 minutes, the price drops to 50 cents per minute. After 25 minutes from the start of the call, the price drops even more to 30 cents per minute. If your total bill is more than $10, you get a 25% discount. How much would a 60-minute call cost?
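
For reference, the arithmetic the question is asking for works out like this, assuming the 25% discount applies to the whole bill; a quick sketch:

    def call_cost_dollars(minutes: int) -> float:
        """Cost of a call under the question's tiered pricing, computed in cents."""
        first = min(minutes, 10) * 60                  # first 10 minutes at 60 cents
        middle = max(min(minutes, 25) - 10, 0) * 50    # minutes 11-25 at 50 cents
        rest = max(minutes - 25, 0) * 30               # minutes 26 onward at 30 cents
        cents = first + middle + rest
        if cents > 1000:                               # bills over $10 get a 25% discount
            cents *= 0.75
        return cents / 100

    print(call_cost_dollars(60))  # 6.00 + 7.50 + 10.50 = 24.00, discounted to 18.0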

 

Rob Stevenson: That's just like an SAT question.

 

Drew Banin: It's like an SAT question. So there's a lot of interesting stuff in this article about the different types of questions they frame and Apple's own evaluation framework that they use to evaluate these LLMs.

 

Rob Stevenson: Wait, so did it answer the SAT question correctly? Did we get a 1600 score? I guess it's 2400 now.

 

Drew Banin: Well, I'll tell you what I see here. GPT-4o. I'm trying to parse it live. Maybe this is a bad idea. But I would say, like, it turns out GPT-4o actually does, like, a really, really good job. Something like a less recent model, not so old, but less, like, state-of-the-art, is like 10% off of this baseline.

 

Rob Stevenson: So were you looking at different results from that question just there?

 

Drew Banin: Well, there's two things to say. One is, if you want to get really into this paper, basically they found that LLMs pattern match the exact question, like the structure of the question, the stock framework. And so they would change words, like instead of Drew bought 10 apples, they would say Rob bought 10 apples. And then the LLM, at least these older ones, would get it wrong because they pattern matched, like, the specific name of the person from this question.

 

Rob Stevenson: It was like, oh, interesting. There's no calculation happening. I guess, yeah, I should have figured that there's no calculation. It's like they think that Drew or Rob is a variable.

 

Drew Banin: Yeah. So when they sub it out, they find that some of these models do, this is what I was saying, 9.2% worse, Mistral 7B. But GPT-4o is about on par whether you use the actual name or a different name, or change the numbers, or whatever. So that's kind of part one. Part two is, I couldn't actually find the breakdown for accuracy, but it's like.

 

Rob Stevenson: 90% accurate or something useful to call up because we speak about its ability to reason and inherent reasoning is knowing that it doesn't matter whether it's Rob or Drew buying apples.

 

Drew Banin: Sure. And maybe one interesting call-out is they'll also include extraneous information, like: Drew bought 10 apples. The apples are $0.40 apiece. Some of the apples were really small. How much did Drew spend on apples? And so it'll pattern match. Like, oh, if the apples are small, then discount them from the total, you know, four apples were really tiny. I don't know. It's extraneous.
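
That's essentially the templating trick the paper leans on: keep the structure of a word problem fixed, vary the names and numbers (or bolt on irrelevant clauses), and check whether accuracy holds. A toy sketch of that kind of harness, where ask_model is only a placeholder for whatever LLM call you would plug in:

    import random

    TEMPLATE = "{name} bought {n} apples at ${price:.2f} apiece. How much did {name} spend?"

    def make_variant():
        # Vary the surface details; the underlying arithmetic stays the same.
        name = random.choice(["Drew", "Rob", "Sofia", "Wei"])
        n = random.randint(3, 20)
        price = random.choice([0.25, 0.40, 0.75])
        question = TEMPLATE.format(name=name, n=n, price=price)
        answer = round(n * price, 2)  # ground truth is trivially computable
        return question, answer

    def accuracy(ask_model, trials=100):
        """ask_model is a stand-in for a call to whatever model is being tested."""
        correct = sum(
            ask_model(question) == expected
            for question, expected in (make_variant() for _ in range(trials))
        )
        return correct / trials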

 

Rob Stevenson: It's so interesting, because in this case, where there's a right answer, where there's math involved, there's no margin for error. Right. There are plenty of other reasons to use generative AI and use LLMs to get you started on something that is more qualitative, and you can edit it, blah, blah, blah. But this is not one of them. This sort of calculation is not one of them.

 

Drew Banin: No. And if there's a link here, it's just like sometimes SAT questions are pretty confusing and they may.

 

Rob Stevenson: Deliberately.

 

Drew Banin: Yeah, yeah, deliberately. In the real world, especially if we think about, like, analytics and AI, a lot of the questions that you get from stakeholders are confusing, not deliberately. And then it's this question of, like, is the job of the LLM to answer the question, or to say, like, hey, what the hell are you talking about?

 

Rob Stevenson: Yeah, yeah, that's this other, I think, criticism of LLMs at present: they won't tell you they don't know, and they will not ask clarifying questions. You can prompt it to ask clarifying questions, which I've started doing, and I found that actually really helpful. Now you're more in a dialogue, now it is more like a copilot. And I have found the results have improved. And my first approach with that, with ChatGPT, was just like, evaluate this prompt and give me some recommendations for improving it. And so now I'm just asking it how to ask it, which is a layer I don't think the average user, not to pat myself on the back, can be expected to do. Right. Like, that is not how we interact with search and with text and with research right now.

 

Drew Banin: Totally. And my belief is that there's the model itself, and then there's also the application around the model. And I think that this might be, like, a big debate that people much smarter than I am engage in frequently. But just my naive belief is that we should consider the application parts of it too, and maybe, like, an interpretation of your prompt. And maybe, I mean, o1 attempts to do this, I think, where it sort of evaluates its response and, like, thinks a little bit, and the whole chain-of-thought thing. So anyway, I just think that when you interact with OpenAI, you're interacting with an application, not solely the model. And there's a lot you could do in the application layer to help direct what actually gets sent to the model and how its responses are interpreted.

 

Rob Stevenson: Definitely. You know, I'm wondering if the problem with LLMs is not the first L. If we can constrain the search space and constrain the attempt, do you believe reasoning would improve?

 

Drew Banin: Well, I don't fundamentally think that LLMs are reasoning in the way that human beings reason. And maybe what I would say is, like, language does kind of encode reason in a sense. Maybe not so precisely, but ever since I got into the space of LLMs and vector embeddings, I think a lot about the structure of sentences and what is being conveyed by different words, in ways that I don't really think so much about when I'm talking. But there's just so much, like, information laden in the words that we use and how we communicate. So anyway, I think LLMs, I mean, they've proven that they're able to get the right answer in pretty surprising ways for things that are kind of, you know, next token predictors. But I don't fundamentally think that they go through a reasoning process. I think that the least surprising next token in response to a question, the least surprising next token, is often the right answer. Like, "the first president of the United States is," okay, if you answer George Washington, that's both the least surprising answer and also, like, the correct one. I think you can kind of extrapolate that example to math problems to some extent.

 

Rob Stevenson: Do you mean like Occam's razor? You mean like trying to constrain the output to, something obvious?

 

Drew Banin: Well, sorry, I more meant like to the extent that we think LLMs can reason and answer questions like the correct answer to a question is frequently the most likely response, the most likely set.

 

Rob Stevenson: Of words which doesn't require like a lot of independent reasoning to arrive at. Is that what you're saying?

 

Drew Banin: Well, I mean, this is why the Apple paper is so interesting. When they throw SAT-style math questions at them, it turns out, like, they're not calculators, they're not great at doing math, they're predicting tokens. And so to kind of get back to your original question, I do think that finding ways to constrain this sort of question-and-answer space really helps these models do a good job, especially when paired with an application. To take our semantic layer example, if I were to ask an LLM, how has revenue changed quarter over quarter? Well, we've constrained the space of possible answers, because the LLM's job is to translate that question to some sort of well-defined, indexed metric. So just to keep it simple, say there's three metrics defined, and it's ARR, new users, and churn rate or something. Well, it's going to pick one of those three answers, and the ARR metric is obviously the right one. Now, in practice, companies have hundreds and hundreds of metrics and lots of dimensions that they can slice and dice by, but that's still significantly more constrained than the total universe of possible metrics that could be looked at, whether they're valuable or helpful or not. So yeah, constraining the search space is absolutely helpful.

 

Rob Stevenson: Yeah. Even if it's just like, okay, all of the data you're trained on, so much of it is just not relevant. And, I feel like, going back to the Hadoop approach to data management, there was this period of time where it was just like, more is better. I'm feeling a little bit of a pullback there. Do you agree?

 

Drew Banin: Gosh, I don't know. Do I agree? I think LLMs are so decently good at so many things that it's easy to want to treat them like a black box. And I think that in practice, if we break down the different types of questions or tasks that we ask them to do, we'll find that they're better at some and worse at others. So if it's an answer-any-question-that-I-have genie, then, like, okay, as much data as you could feed it is going to help you get better answers. But if it is a, you know, super constrained customer support chatbot, just say as an example, you probably don't want a lot of extraneous things coming out of this model. You probably want it, like, laser-focused on your business. So it's a question I have, and I'm sure people much deeper in the space than me know the answer. But maybe that has to do with training data. Maybe it has to do with fine-tuning. Maybe it has to do with how you assemble an application around this with RAG techniques and the actual interface that you provide to users. Maybe it's a combination of more than one of those things. But not every task is created equal. And I think there are different characteristics you want out of an LLM for different types of these tasks.

 

Rob Stevenson: Yeah, definitely. And this is the difference between what your average user needs a response to in their daily life versus what an enterprise company will pay for. Like, okay, in the example of the customer support chatbot, it needs to be very, very specific and constrained and have all of these guardrails and you don't have access to this data. And I can give you this, and it's much more straightforward versus the LLM that might be better at what can I cook with the ingredients in my pantry? And you might get some silly answers like put the peanut butter in the chili. I don't know. But yeah, to your point, it's like, okay, the use case, I guess, dictates.

 

Drew Banin: The amount of data yeah, and there's a specificity thing. And I also feel like I'm more dialed into this now that I've been playing with LLMs. Do I need an answer that's 80% ish of the way there or do I know exactly what I want and get exactly a certain output? So if you're writing documentation for a new feature, there's thousands of ways to write good, valid, helpful documentation and you could pick any one of them and it would be fine if you are to use your example, you want to figure out what's made for dinner, many, many different good options. Maybe you don't care about which one you want to switch more to. Like the image generation side of LLMs. In these models, if you want just a generic picture of somebody on a spaceship, at least for me there are millions of pictures that would satisfy that curiosity. But if I know exactly the picture that I want in my head, trying to get an LLM to build exactly that picture is, you know, excruciatingly painful and challenging. So to me it's like how precise do you need to be in your answer? And I think this coming back to the Apple paper about reasoning, there's like one precise answer to these questions. So it sort of highlights where LLMs are great and very helpful and maybe places where they're not so well suited today.

 

Rob Stevenson: Yeah, it feels like it comes down to, again, the use case. And on this show, I'm mostly speaking with folks who are building, just like, the newer version of SaaS and who are just like, okay, I need to sell this into the enterprise. And so that is the use case. That's what people will pay for, more than a consumer-facing product. And so that does require folks to constrain the search space. So in the realm of the practical here, what do you think folks can do? What can developers listening to this do to kind of constrain the search space to ensure there's better output, provided that they have a very specific use case?

 

Drew Banin: Sure. Well, a lot of it comes down to, I'm going to say, like, program design, just being thoughtful upfront with how you want to use an LLM. So a practice that I've used that's pretty helpful is if you're dealing with natural language. So just take the case of somebody's going to ask a data question in natural language and you want to give them back a line chart. Well, step one for us is interpret the question and decide if it's a question we can answer or not. So if the person asks something that has nothing to do with data in the data warehouse, the right answer is, hey, sorry, I don't think that's a question I can answer for you here. Maybe you could ask about ARR or churn rate instead.

 

Rob Stevenson: Right, right.

 

Drew Banin: Okay. And then maybe step two is figure out specifically which metrics they're asking about. You can use an LLM for that. There are other techniques that you can kind of blend together with LLMs. Like, it's a search problem to some extent, and there are a lot of great search techniques that aren't powered by LLMs that are still useful. So, okay, now you can start to make LLMs a part of this overall application experience. But you're not just throwing text at an LLM and hoping it gives you the right answer. That's not a recipe for success in the enterprise.
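
A minimal sketch of that two-step shape, with entirely hypothetical names, where the model is only ever asked to choose among metrics the application has already defined. Here llm stands in for any function that sends a prompt and returns a short string; this is not dbt's API, just an illustration of the pattern:

    KNOWN_METRICS = {
        "arr": "Annual recurring revenue",
        "new_users": "New users per period",
        "churn_rate": "Customer churn rate",
    }

    def answer_data_question(question: str, llm) -> str:
        # Step 1: decide whether this is a question we can answer at all.
        in_scope = llm(
            f"Does this question relate to any of these metrics: {list(KNOWN_METRICS)}? "
            f"Answer yes or no.\n\nQuestion: {question}"
        )
        if in_scope.strip().lower() != "yes":
            return "Sorry, I can't answer that here. Maybe ask about ARR or churn rate."

        # Step 2: map the question onto exactly one well-defined metric.
        choice = llm(
            f"Pick the single best metric key from {list(KNOWN_METRICS)} for this "
            f"question. Answer with the key only.\n\nQuestion: {question}"
        ).strip().lower()
        if choice not in KNOWN_METRICS:
            return "Sorry, I couldn't match that to a metric I know about."

        # From here the query runs against governed definitions, not free-form SQL.
        return f"Querying the semantic layer for metric: {choice}"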

 

Rob Stevenson: Right, right. So the idea would be to guide the user a little more. There's also this balance between, okay, systems improving prompt engineering versus users better understanding the domain context they're operating in, like getting better at Googling, for example. I suppose we'll see both, but in the short term, folks will need to index on the former.

 

Drew Banin: I think that's right. My belief is that prompt engineering will probably become less important rather than more important over time for most use cases. I just think that there are enough people that are not well versed in the skill that the people building LLMs will work really hard to solve that problem themselves, rather than federate it out to every single user. I imagine it'll still be useful, especially if you're really talented and thoughtful. But I just remember, 18 months ago there were entire websites about prompt engineering, and it was, like, the hottest job of the 21st century for, like, three weeks. And I just don't think that's exactly the case anymore.

 

Rob Stevenson: Yeah, can you speak a bit more about that? So you think companies are better served educating folks on how to do their own prompt engineering, or people will naturally do that?

 

Drew Banin: I more mean, like, the models and the applications around the models will improve such that prompt engineering will be a less important skill in a person's toolkit. The thing that will remain important is knowing what you're trying to do. And I think that's the sort of demystification and education that we have to do for our customer base. Like, fundamentally, if you ask for the wrong thing, you will not get the right answer. I think that some people today don't believe that that's the case. It's just, if you ask for the wrong thing, you're not going to get the right answer. That's always going to be true. Yeah.

 

Rob Stevenson: You said a moment ago that, like, if you treat this like a genie that you can ask anything, a crystal ball, then, you know, you will probably be disappointed. And maybe there is a case for that. But this need to be specific, I'm reminded of just, like, what it takes to build a report in Looker or in Salesforce. Like, you really need to know what you're doing and you have to be very specific, or it'll give you something. Maybe you'll just break it and it'll give you an error message, but it will give you something, and it may not be what you wanted. This is the same problem, 100%.

 

Drew Banin: And I think that the most obvious thing you can do with an LLM is build a natural language interface into whatever. Like, here's our chatbot. And I think the more intricate and challenging, but also more useful, version of this is really thoughtful user experiences that maybe use, I'm going to say, LLM magic behind the scenes to do things that users historically didn't expect from their applications. So maybe you're in a tool like Looker and you're pointing and clicking, and it says, hey, you're looking at nonsense. Like, you clicked buttons and there is a query and you are seeing results, but that's not a useful result for you. Maybe that's not the perfect interface, but weaving it into the experience, I've seen, is much more powerful than giving someone a chatbot.

 

Rob Stevenson: Yeah, yeah, definitely. Well, again, because it's like they don't know what it does. Well, Drew, we are creeping up on optimal podcast length here. Before I let you go, I was hoping you could share with us a little bit about, when you are monitoring this space, whether it's for dbt or for your own curiosity, what are kind of the areas that really excite you and sort of stoke that tech curiosity inside you?

 

Drew Banin: Yeah, I've found myself really interested in vector embeddings in particular. I think they got really hot, and maybe we're more at the plateau phase of the hype cycle, but in a way that is really deeply valuable and well understood. I think a lot of the problems that we solve in the enterprise, from what I see, like, we collectively, are search problems or recommendations problems. There are certainly a lot of opportunities for RAG coupled with large language models, and vector embeddings are not a silver bullet, but they're these sort of mechanical, deterministic things in a space that's very stochastic and hard to reason about, you know, like AI overall. So I'm really intrigued by vector embeddings. I try to read a lot of the articles that pass through Hacker News or the LocalLLaMA subreddit or whatever. Beyond that, I think just, like, playing around with this tech is really important. I find myself surprised constantly if it's been a month or two since I poked around with Mistral or some of the models I don't use as frequently at work, just seeing what they can do and staying abreast of all these updates. Like, this space is still moving very, very quickly.
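
The appeal he's describing is that embeddings reduce search and recommendation to deterministic vector math. A small sketch with made-up vectors (in practice an embedding model would produce them for each document and for the query), showing the retrieval step that typically sits behind RAG:

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pretend these came from an embedding model; any real system would embed
    # the documents and the query with the same model.
    docs = {
        "quarterly revenue dashboard": np.array([0.9, 0.1, 0.2]),
        "employee onboarding guide":   np.array([0.1, 0.8, 0.3]),
        "churn rate definition":       np.array([0.7, 0.2, 0.6]),
    }
    query = np.array([0.8, 0.1, 0.5])  # e.g. "how is churn calculated?"

    # Rank documents by similarity -- the deterministic retrieval step behind RAG.
    ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
    print(ranked)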

 

Rob Stevenson: Yes, constantly. Cool. Well, yeah, that is some fun stuff for people to Google, to read up on. And I appreciate that advice of just getting your hands dirty and playing around with this stuff, because that level of curiosity, I feel like there is no substitute for that. Like, there is no amount of Apple research papers you can read that matches the quality of that experience. So that is a good bit of advice. And Drew, it's been really fun, man, chopping it up about LLMs and the use cases that make sense or don't make sense. So at this point I would just say, thanks for being here, thanks for sharing your experience. I loved chatting with you today.

 

Drew Banin: Thanks for having me on, Rob.

 

Rob Stevenson: How AI Happens is brought to you by Sama. Sama's agile data labeling and model evaluation solutions help enterprise companies maximize the return on investment for generative AI, LLM, and computer vision models across retail, finance, automotive, and many other industries. For more information, head to sama.com.