In this episode, Colin explains how AgriSynth is “getting humans out of the way” with its closed-loop control system and offers some insight into the sheer volume of data required to train its AI models.
EXAMPLE: AgriSynth Synthetic Data-- Weeds as Seen By AI
Data is the backbone of agricultural innovation when it comes to increasing yields, reducing pests, and improving overall efficiency, but generating high-quality real-world data is an expensive and time-consuming process. Today, we are joined by Colin Herbert, the CEO and Founder of AgriSynth, to find out how the advent of synthetic data will ultimately transform the industry for the better. AgriSynth is revolutionizing how AI can be trained for agricultural solutions using synthetic imagery. He also gives us an overview of his non-linear career journey (from engineering to medical school to agriculture, then through clinical trials and back to agriculture with a detour in Deep Learning), shares the fascinating origin story of AgriSynth, and more.
Quotes:
“The complexity of biological images and agricultural images is way beyond driverless cars and most other applications [of AI].” — Colin Herbert [0:06:45]
“It’s parameter rich to represent the rules of growth of a plant.” — Colin Herbert [0:09:21]
“We know exactly where the edge cases are – we know the distribution of every parameter in that dataset, so we can design the dataset exactly how we want it and generate imagery accordingly. We could never collect such imagery in the real world.” — Colin Herbert [0:10:33]
“Ultimately, the way we look at an image is not the way AI looks at an image.” — Colin Herbert [0:21:11]
“It may not be a real-world image that we’re looking at, but it will be data from the real world. There is a crucial difference.” — Colin Herbert [0:32:01]
Colin Herbert 00:00
And with every round of training, it learns a bit more, learns of the weaknesses, and gets more data. So we could start off with 5,000 or 10,000 images and, leaving it running for a day or so, we can end up with 100,000 images.
Rob Stevenson 00:19
Welcome to How AI Happens, a podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson, and we're about to learn how AI happens. Here with me today on How AI Happens is a guest whose company is doing some really fascinating things in our space. He's the CEO and founder of AgriSynth, Colin Herbert. Colin, welcome to the show. How are you today?
Colin Herbert 00:59
I'm very well, thank you very much.
Rob Stevenson 01:01
Really excited to have you. Lots to go into. Before we do that, I always love to hear a little bit about the founding story, a little bit about what led you to a point in your career slash life where you're like, you know what, I'm going to found this company, and it's going to be around this particular use case in this particular industry. So I would love to hear a little bit about the story behind AgriSynth.
Colin Herbert 01:21
Okay, sure. Well, it wasn't planned, for sure. I guess I've always been interested in biology. That's always been my thing. My father had his own engineering business, and that never interested me. What did interest me was looking at trees, plants, human beings, animals, and working out what was happening inside. I applied to medical school and got two offers, back in the day when it was really difficult, and missed one of the grades. So anyway, I stayed in biology and, as part of the first degree, got involved in agriculture. And I was amazed, really: when you look at the development of agrichemicals, which was the area that I initially went into, it was almost identical to the development of medicine in pharmaceuticals. It's 99% the same industry, with tremendous overlap. So I got into agriculture and did that for many years, working with BASF, Syngenta, and Novartis in different parts of the world, and then started my own business and went into lots of other industries, but always with a science and a biology connection. And I ended up working with Roche in Switzerland, looking at clinical trials. It was during that period that I spoke to one of the developers we were using in India, in Mumbai. We were looking at mammograms for breast cancer, and looking at the beginnings of what was then AI, early machine learning. And as we finished one call, he made a comment: "It's a pity we can't draw them ourselves." I didn't think much of it, but it did register. I knew that it had registered, but that was about five years ago. And then, about three years ago, I happened to be watching an episode of Star Wars, which sounds a bit nerdish. But it was with my son, who was 22. It just happened to come on TV, and we started watching it. And we were talking about the way the spacecraft didn't need to bank when they turned, because there wasn't any air. And we talked about CGI.
As soon as we mentioned CGI, I remembered what the guy in India had said, and because of my interest in agriculture, I had kept abreast of developments with robotics and things like that in agriculture. And it just came together. I thought, yeah, why don't we do that? Because I knew people were struggling in agriculture with AI and machine learning. It's not the kind of area where you can just go out and click more images, because we've got seasonality, we've got growth stages. So if you want more imagery, you have to wait a year. And, you know, people have been trying for years to nail certain datasets, for example. So yeah, that's how it came together: Star Wars.
Rob Stevenson 04:13
So it sounds like the data set in agriculture has lagged compared to other areas. You know, for example, self-driving cars can rely on the closed-circuit cameras at intersections to harvest data, and roads are well documented. But the growth stages of particular kinds of plants, at particular times of the year, in particular places, maybe less so?
Colin Herbert 04:35
Yeah, we don't have, even today, a lot of good-quality imagery data sets. And the problem is, if you take one crop in one year, you will have the crop doing certain things. It will grow at a certain rate. It'll be a certain variety, which may look slightly different from another variety. And if you have imagery at certain growth stages, and certain growth stages have a certain spectrum of weeds, for example, you can bet your life that when you come to look at that whole data set, you're going to say, well, I could do with a later growth stage in the wheat, or I could do with another species of weed, or another growth stage of weed, or another variety of wheat. And the questions go on, and you collect that next year, because you don't have the opportunity to do it again that season. Things can grow very quickly. And that can go on year after year. So it's very problematic, never mind the annotation problems. It's very, very problematic to collect what I would call a homogeneous, well-rounded data set for a particular area.
Rob Stevenson 05:44
This is somewhat the challenge in lots of different verticals within companies trying to deploy AI, right? It's this problem of enough data, or enough of the right data, and various solutions abound. What did you come up with? Or how are you trying to fill these gaps?
Colin Herbert 05:58
Essentially, there are two problems. One is collecting real-world imagery. The other is annotating, or labeling, it. The problem is that the images in agriculture tend to be very complex. You mentioned self-driving cars: you have a road, you have white lines, you have a bus, you have a car. The number of objects is relatively limited, and the occlusion of those objects, even in a complex image, is fairly well understood. A crop of wheat is like a lawn, if you can relate to that. It's a one-species, grass-type crop with weeds and disease and pests and slugs, and old tin cans and leaves off trees. It's a very, very complex scene, and it always will be, with different soil types and different clods of soil that break down. And the occlusion of leaves, the overlapping of leaves in particular, is extensive. So the complexity of biological images, agricultural images, is way beyond driverless cars and most other applications. I mean, medical applications, looking back at mammograms, seem really simplistic now, but agriculture is complex.
Rob Stevenson 07:17
Interesting. Yeah, I guess that makes sense when you put it that way. Of course a natural environment would be more complex than a human-constructed system, right? Which is what a road and what cars represent.
Colin Herbert 07:28
Yeah, absolutely. And what we did, well, we thought, okay, we need to build synthetic imagery. It's as simple as that. That's the solution. We did a lot of research, myself and two others, and found out that there had been some attempts in agriculture, but very few and very simplistic: people cutting out images of apples and laying them on the ground under the tree, and things like that. So it was very simplistic, and we thought, it's not going to be simple, we need a much more sophisticated approach. So we started building what we refer to as a maths-physics engine to procedurally generate images of any plant and any other object at any growth stage. We've actually got three of them, I'll clarify that. But what these engines do is essentially this: we give the plant an age, a lifetime of zero to 100. Zero is when it germinates. 100 is when it either dies or is mature, ready for harvest. And we tell the software the rules of how leaves come out of that plant: how they emerge, how they branch, their thickness. And for every single part of a leaf, for example, which could be 150 millimeters long, there are probably 20 to 30 parameters along that leaf: the width, the curl, the type of folding, the relation of the leaf, and lots of other things. And for every one of those parameters, we have a distribution of data. So we don't just put values in, we put a distribution of values, and that may be skewed, it may be normal, it may be a t-distribution, whatever. So it's incredibly complex inputs. And one of the problems we're having, which we'll probably come back to later, is how to talk to this thing. You know, how to set all those parameters. It's really difficult. So it's parameter rich, to represent the rules of growth of a plant. And then, depending on the species, we vary that, because some species grow rapidly at the beginning, have different leaf shapes, and things like that.
So we've got three engines. One is for broadleaf weeds, you know, common weeds, dandelions, things that we're all familiar with. The second engine is for grass-type plants, like wheat and grass weeds. And the third one is actually for grain, wheat grain, because we have a customer and a project going there. So these engines kind of map the lifecycle of a plant. We can create a species and inject that species into a scene image. It may be 10 crop plants at random, or some other distribution. And then we inject species of weeds, or disease on the leaves of the crop. There are lots of different things we can do. But all the time, we're controlling the scope and the variance of that dataset. And that's the key thing. So we know exactly where the edge cases are and how far we go. We know the distribution of every parameter in that data set, so we can design the dataset exactly how we want it and generate imagery accordingly. We could never collect such imagery in the real world. We wouldn't get anywhere near it. So that's the essence of what we do.
Rob Stevenson 11:14
When you are building this dataset, when you are setting these parameters, I'm curious what that code, I guess, looks like. It's not natural language. It's not like, oh, the leaf looks kind of like this. Are you saying, oh, there's this much distance between this part of the plant and this part of the plant? What is even the input?
Colin Herbert 11:31
Yeah, it's got quite complex. It's bespoke software, written in C#, and essentially the parameters are dimensional. So the leaf is this wide at this point, it curls in this particular way in, like, vector space, it has this particular curl to it at this point, and it twists anticlockwise. All leaves of wheat all across the world twist anticlockwise, interesting fact. So all of these things are dimensional: degrees, millimeters, fractions of millimeters. But it's never one value. It's a distribution of values. So there are two questions. What is the scope of that distribution, the maximum and minimum values? And is the distribution skewed towards a certain value, a high value or a low value? So all the thousands of parameters have a distribution of data going in.
Rob Stevenson 12:28
Okay, so the parameter is a characteristic of a plant. And then the value is a distribution represented numerically.
Colin Herbert 12:35
Yeah, absolutely. And one plant will have many, many, many parameters to describe its leaves, the way the plant branches, the way it holds itself, the rate of growth, and things like that. So a species is defined by thousands of parameters.
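The idea of a parameter as a distribution, with a scope (minimum and maximum) and a skew, can be sketched in a few lines of Python. This is only an illustration: the engine described in the episode is bespoke C# software, and the class name, the parameter names, the numbers, and the choice of a scaled Beta distribution below are all assumptions rather than anything stated by the guest.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamDistribution:
    """A dimensional parameter described by its scope and skew (hypothetical)."""
    minimum: float       # lower bound of the scope, e.g. millimetres or degrees
    maximum: float       # upper bound of the scope
    alpha: float = 2.0   # Beta shape; alpha > beta skews toward the maximum
    beta: float = 2.0    # Beta shape; beta > alpha skews toward the minimum

    def sample(self, rng: random.Random) -> float:
        # Draw from Beta(alpha, beta) on [0, 1], then rescale to the scope.
        unit = rng.betavariate(self.alpha, self.beta)
        return self.minimum + unit * (self.maximum - self.minimum)


# Illustrative values only; none of these numbers come from the episode.
WHEAT_LEAF = {
    "width_mm": ParamDistribution(6.0, 14.0),
    "curl_deg": ParamDistribution(0.0, 40.0, alpha=2.0, beta=5.0),
    "twist_deg": ParamDistribution(0.0, 360.0, alpha=5.0, beta=2.0),
}


def sample_leaf(spec, seed):
    """Draw one concrete leaf from a dictionary of parameter distributions."""
    rng = random.Random(seed)
    return {name: dist.sample(rng) for name, dist in spec.items()}


leaf = sample_leaf(WHEAT_LEAF, seed=1)
print(leaf)
```

Because every value is drawn from a declared distribution, the scope and variance of a generated dataset are known in advance, which is the property the conversation keeps returning to.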
Rob Stevenson 12:52
Surely, yes. So what is this synthetic data? It's a very human-centric question, but what does the synthetic data look like?
Colin Herbert 13:02
Now that leads on to a really interesting area, which we will probably spend some time on. Initially we were trying, and we still are at the moment, to generate complex scene images: multiple plants, occluded, disease on leaves, weeds in the background, things like that. And we wanted those to replace the images that we're taking with a camera in the real world. So it's a straight substitution. But the more we learn, the more we realize that we're doing that because we've got a human in the loop. We've got me sitting there saying, yeah, I want to see this scene, this data set, and then we can train the AI. But actually, you can join the generation and the AI together, get out of the way as a human, and then talk about different types of data. So we've submitted a patent for what we call a closed-loop train-test system. If you imagine, on the left hand, you've got this engine that can create images on demand, according to rules and within a data set of known scope and variance. And then you train, then you validate, and you test on real-world images. Through various methods you learn what's working and where the weaknesses are. And based on that, you send a request automatically back to the engine and say, I need more data in this particular area, which is what happens in real life. The AI engineer is always asking for more data, annotated data. So we can close this loop, get out as a human, and leave it running overnight. And the parameters essentially will bubble up and provide more data in the weak areas and will optimize a particular AI model for a certain purpose. It's not quite as simple as that, but that's the patent, and that's what we have running in AWS at the moment. So as humans, we've kind of learned that we're the ones who want to see the image, not the AI. The AI can feed on other things. So let's just get out of the way as a human and find the best way of the engine talking to an AI model.
Rob Stevenson 15:23
So in that scenario, the engine is noticing when there is a gap, like, this is where it would require more data, right? That's a common human request, as you point out. It can see the gap, where it's weak in a certain parameter: okay, I need more data to fulfill that parameter. Then it makes a request, and then generates synthetic data to fulfill the data for that parameter. Yep, absolutely. That's fantastic.
Colin Herbert 15:49
And we leave it running, and with every round of training, it learns a bit more, learns of the weaknesses, and gets more data. So we can start off with 5,000 or 10,000 images, which are figures that are barely heard of in agricultural AI. We don't normally have that number of images. And leaving it running for a day or so, we can end up with 100,000 images. But they're all in a controlled data set. It's not just throwing another bunch of images in, which is what you do in the real world. It's controlling it, saying, I want another 1,000 images looking at the leaf margin of a particular species, but build that into the overall data set variance and scope. So we're balancing that data set all the while.
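The loop described here (generate, train, validate, find the weak area, request more data there, repeat) can be sketched as a toy simulation. Everything in it is a stand-in: the `evaluate` function replaces real training and validation with a made-up formula in which accuracy rises with image count, and the class names, difficulty values, and batch size are hypothetical. It shows only the control flow, not AgriSynth's patented system.

```python
def evaluate(counts, difficulty):
    """Stand-in for train + validate: accuracy rises with image count."""
    return {k: counts[k] / (counts[k] + difficulty[k]) for k in counts}


def closed_loop(counts, difficulty, target=0.95, batch=1000, max_rounds=200):
    """Keep requesting images for the weakest class until all reach the target."""
    counts = dict(counts)  # do not mutate the caller's dataset
    for _ in range(max_rounds):
        scores = evaluate(counts, difficulty)
        weakest = min(scores, key=scores.get)
        if scores[weakest] >= target:
            break  # every class meets the target; stop generating
        counts[weakest] += batch  # ask the engine for more data where the model is weak
    return counts, evaluate(counts, difficulty)


# Hypothetical starting dataset and per-class difficulty.
start = {"wheat": 2000, "grass_weed": 2000, "broadleaf_weed": 1000}
difficulty = {"wheat": 50, "grass_weed": 900, "broadleaf_weed": 100}

final_counts, final_scores = closed_loop(start, difficulty)
print(final_counts)
print(final_scores)
```

In this toy run the hard class ends up with far more images than the easy ones, which mirrors the point that the final shape of the dataset is decided by where the model struggles rather than by a human up front.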
Rob Stevenson 16:40
And so it's generative, but it's generative within a very specific, even though there are a lot of them, finite data set, basically.
Colin Herbert 16:47
Yeah, although we don't always know where that data set's going. You know, we start off, and then if we leave the closed loop running, it may end up at, oh, we didn't realize we wanted 20,000 images of that species, because it's quite difficult. The AI model is finding it quite difficult. So we don't quite know where it's going to end up. But that's determined by the model and its success and failure at recognition in validation, at real-world recognition.
Rob Stevenson 17:13
And it works.
Colin Herbert 17:15
It works. I mean, I'm not going to say it's perfect by any means, and we're learning so much. The beauty of the system is that because we know how we constructed every image, we know where every pixel is. So we can annotate an image at pixel level. We can't do that as humans with complex images. You know, if you imagine a scene in your garden, if you've left it for a while, and you take a photo, and you give a number of people colored pens and say, can you color these in, it will be a mess. We can't do it. But we can account for every pixel, every occluded leaf. Every single pixel belongs to that leaf and not that leaf. Now, as a human looking at the image, we can't see it. But from an AI perspective, we can describe everything at a pixel level.
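Why a generator gets pixel-level labels for free can be shown with a minimal sketch. Here rectangles stand in for leaves (a deliberate simplification), and the function names and coordinates are hypothetical. Shapes are painted back to front, so each pixel records the topmost object and occlusion is resolved exactly, with no human annotator involved.

```python
def render_labels(width, height, leaves):
    """Paint leaves back to front; each pixel keeps the id of the topmost leaf."""
    labels = [[0] * width for _ in range(height)]  # 0 = background (soil)
    for leaf_id, (x0, y0, x1, y1) in leaves:       # later leaves occlude earlier ones
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                labels[y][x] = leaf_id
    return labels


def visible_pixels(labels):
    """Count how many pixels of each object remain visible after occlusion."""
    counts = {}
    for row in labels:
        for leaf_id in row:
            counts[leaf_id] = counts.get(leaf_id, 0) + 1
    return counts


# Two overlapping "leaves" on an 8 x 6 canvas; leaf 2 lies on top of leaf 1.
leaves = [(1, (0, 1, 5, 4)), (2, (3, 2, 8, 6))]
labels = render_labels(8, 6, leaves)
counts = visible_pixels(labels)
print(counts)  # {0: 17, 1: 11, 2: 20}
```

The label grid is an instance segmentation mask by construction: leaf 1 covers 15 pixels but only 11 are visible, because 4 are hidden under leaf 2.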
Rob Stevenson 18:02
So if I were to look at some of the synthetic data that it generates, I'm guessing it wouldn't look like something that belongs on the cover of National Geographic.
Colin Herbert 18:11
So as a human, we'd look aesthetically at an image, and we'd say, oh, that's a nice green, and, you know, it's a nice curve, and that's a nice horizon, a tree, the birds singing, and everything else. But from an AI perspective, we need specific things. At a pixel level, we can put bounding boxes in, and we can put bounding boxes in 3D space, for example. We can put key points at the base of the leaf, the midpoint, the end, or any position along it. We can describe vectors on the plant and how it bends. And ultimately, we can have instance segmentation. That's really where we're headed: to get rid of any noise in the training data and make every pixel in the training data relevant. So that's really what we're aiming for. To answer your question on what the images look like, we started off looking for photorealism. And we had the annotation, so we could describe any type of annotation for an image: key points, vectors, instance segmentation, bounding boxes, whatever, depending on what AI model we're using. And we learned that we didn't need that photorealism. That's what we thought we wanted. But increasingly, we give the annotation data and an image that's sufficient for that annotation data. So if we describe, for example, the margin of a leaf, it may be a serrated margin, like a saw tooth. We could just provide the key points and say the leaf is green. We don't need to have the texture of the leaf. Now, we do have the texture of the leaf. But in a lot of cases where there are a lot of morphological features, you know, the way a plant looks, we don't have to worry about texture and color. We only have to worry about that if a plant doesn't have anything exciting in terms of its structure and appearance. So increasingly, the images look simplistic. They don't look beautiful. They certainly don't look photorealistic. They look technical, if anything, depending on the species.
Rob Stevenson 20:18
Can you send me an example, so I can put it in the show notes? I would love for people to see this. And it's important because, historically, data has been collected for human consumption. That is to say, it is meant to be pleasing or understandable or comprehensible to human eyeballs. Now, machines see differently than humans. Even in the example of the CCTV footage: is it aesthetically beautiful? No, it's just a very grainy high-angle shot of a road. But it's easy for us to figure out what's going on. And so we have been training learners to process information as a human would, and it's not a human being. So with this generation of synthetic data, you're making data that is purpose-built for a machine. It feels like an important shift.
Colin Herbert 21:06
I think it is. I mean, it's a journey we've taken, and we've not finished by any means. Ultimately, the way we look at an image is not the way AI looks at an image. And I'm using AI as a collective term. We look at an image in its entirety, aesthetically, holistically, and think, oh, that's a nice image. AI essentially starts at the top left and works along every pixel cluster. It approaches it like that. It has no holistic view of that image. It's looking for pixel clusters, which are features. That's its language, that's what it's looking for, and that's totally, totally different from a human being. So we're increasingly trying to feed it what it enjoys, and trying to get this annoying human aesthetic out of the way, and kind of stand back and join the two parts together. It's a learning process. Let's take an example: insects. And I'm going to generalize like mad, but insects see the world differently. If we take, you know, the humble bee, it sees very strongly in the ultraviolet. We don't see in the ultraviolet. It sees blue and green; it doesn't see the red end of the spectrum. So the way it sees the world is completely different. And that's really an example of how the way AI looks at an image, or at a representation of what's in the real world, can be completely different. We can use multispectral images and feed them into AI as well. And then we've completely lost the human because, you know, we just don't see things like that. So there are lots of opportunities, and it is really about trying to get the human out of the way, in a nice way, because the way we see things isn't the right way for AI.
Rob Stevenson 22:58
Right, right. It's such an important deviation, I think, because AI, in a very simple definition, is this attempt to replicate human cognition. I think along the way we've learned that's a very limited view of processing information. Once you agree that, okay, we are finite creatures with finite processing power inside of our brains, in these meat sacks suspended by bone that ambulate around by hurling themselves through gravity, and have this finite amount of tissue by which they can process and, you know, solve problems, well, machines don't have any of those trappings. So why would you hamstring them into the same processing approach that you as a human have? So it seems like this is a really important thing, to be like, look, AI doesn't need to replicate human cognition. It's its own sort of life form in that way. It's its own approach to processing information. And the more we stop thinking about it in strictly human terms, the more powerful the tech will become, surely.
Colin Herbert 23:55
Absolutely. I mean, you can stand at a crossroads, if you imagine, and say, one way, we can mimic what we do with real-world imagery. That's on the left. We can take that journey. We've done that, but we never get 100%. We never get the high 90s. We're always short in some way, mainly because we don't have the right data set, and we don't have the annotation. And the other journey is saying, forget about the human. What are we trying to solve? What is the AI? And how do we bridge that gap? And sometimes I think eventually we won't have an image involved. It may be looking at the crop, say, for certain diseases. Pre-disease symptoms can be seen with infrared, near infrared. So if we take an infrared image and feed that to the AI directly, or not the image but key points from that camera, we take out the image for two reasons. One, because we won't see an image anyway. The camera won't generate an image; it'll generate relevant data. And secondly, the human's out of the loop, so we don't need to see it. So it's really fascinating, and I don't know how far it will go. But I have a suspicion that what we'll be doing in the future won't involve images as we see them now. They won't be recognizable.
Rob Stevenson 25:12
The criticism often leveled at synthetic data is: how can data not captured, but rather generated, yield accurate real-world output? How are you measuring the performance and making sure that, while the data is synthetic, the output aligns with reality?
Colin Herbert 25:32
Ultimately, testing on real-world images. If the objective is to recognize a specific grass weed for targeting with a robot with a laser, for example, and we have a number of robots around the world that use lasers to kill weeds, then we have to be able to target that very clearly and accurately. So that's the ultimate test. And we can see very quickly whether we're killing the crop rather than a weed. So there's a very simple test at the end, and that's real-world imagery. Before then, we do a number of things. One area that's often cited is bias, introducing bias and overfitting in certain areas. We have to be careful of that. And we can soon spot that; we've learned to see when that's happening. Then we can make sure that when we do get to touch the real-world imagery, we're pretty sure we know what results are going to come out. So there's a lot of testing on the journey, through the validation, and then ultimately a test on the real world.
Rob Stevenson 26:34
Right, because it's performing upon real-world images. As for the output of this tech, which I guess we haven't really covered, we've talked a lot about the behind the scenes. What is the actual product? What is it that we're hoping to perform?
Colin Herbert 26:45
There are a number of verticals. If we want to use robotics in agriculture, robots have been around for about eight years in agriculture globally. There are a lot of issues with adoption, a lot of trust issues with farmers. Farmers don't take up the latest tech quickly; it's well known that they adopt slowly. So that's one area: a robot, about the size of a dining table, going through a crop, looking down with a camera, identifying weeds in order to kill them. Rather than using chemicals, they use mechanical means, or very low amounts of chemical, or lasers. Another area, though, is research and development, something I used to do many years ago. We have trials, rather like clinical trials, in a field. You divide part of the field up into little plots, and you put different treatments on, and then a skilled agronomist goes along and assesses the effectiveness, the efficacy, of those different treatments. This one has a lot of disease, so it's not performing very well. This one has very little disease on certain leaves, so that treatment is performing better. So research and development for varieties of crops, for agrichemicals, and a lot of other uses is a huge market, because that R&D trials approach is the same all over the globe. There are billions of plots out there. So research and development is another one. Vertical farming is more of a conveyor, where the camera is still and the crop is moving past, and it's looking for any blemishes, perhaps using near infrared to look for blemishes on the leaf before a human would see them, back to what we were saying, and being able to either treat them, or take the leaf off, or tag that plant as a second-grade product. The other one that we're involved with, in North America, is grain. If you imagine grain in a trailer that's output from a combine harvester, wheat, for example, the quality of that grain dictates the price, its value. And you can get certain things like green kernels, diseased kernels, and broken kernels.
There are lots and lots of different characteristics that can reduce the quality and the price of that grain. And these are assessed by skilled humans. It takes about five, six, seven years to learn how to grade wheat. This is done on farm, taking a sample. It's done at ports before it goes onto a boat. It's done at silos when lorries arrive. And it's quite clumsy. We're looking at a method, and it's going quite well, where we use AI up against a glass where the grain is passing. We flash an image and then use AI to actually identify these characteristics. So we have a continuous output of the quality of that grain. And this has the potential to change the way grain is marketed globally. So robots, vertical farming, research and development, grain: there are a number of areas where AI is starting to be used.
Rob Stevenson 29:59
Gotcha. So those are just some of the applications. I might have asked you that a half hour ago; we maybe should have started the interview with that. But that's okay, we got there eventually. I'm glad you detailed those. Because when we think about training versus performance, the performance is on real-world imagery, real-world data collection, things happening out, you know, in the material, corporeal world. And that is important, because I'm pleased by the parlance: training versus performance. If you think of an athlete or a musician, it's similar, in that the training is synthetic. It's not like the performance. The performance in an arena is very different from getting ready for the performance. At a football practice, for example, there are all these drills, and you identify the weak spots, like, here's the parameter: I'm weak on the right, I'm weak at tackling, so we're going to focus just on that, in the same way your engine does. And so: "Oh no, the data is synthetic, how can it produce a real-world output?" Well, training is always synthetic. It's always a simulation of a real-world event you're trying to prepare for. So I just wanted to connect that dot, because it feels like the criticism of something synthetic as somehow lesser does not hold water.
Colin Herbert 31:13
I agree. It's an order of magnitude, in a way, in that we're totally synthetic, as opposed to a degree of syntheticness, if that's a word. And also, again, there's the way we look at the real-world image. We tend at the moment to use machine learning cameras to capture RGB images, you know, normal images. But increasingly we will perhaps have multiple AI models being used in sequence, and have different cameras. So we might do an initial screen with one model on black and white, taking color out, and we might blur the images to take texture out. So we're just looking for these big morphological shapes, to very quickly say, that's that weed, that's that, that's that. And then we add on another layer of AI model to determine objects on leaves, for example. It may not be a real-world image that we're ultimately looking at, but it will be data from the real world. There is a crucial difference.
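The first-stage screen mentioned here, taking color out and blurring to remove texture so that only coarse shape remains, might look something like the sketch below. It is an assumption-laden illustration in plain Python: the function names are hypothetical, the grayscale conversion uses the common ITU-R BT.601 luma weights, and a 3 x 3 box blur stands in for whatever smoothing a real pipeline would use.

```python
def to_gray(rgb):
    """Drop color: convert rows of (r, g, b) tuples to luma values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def box_blur(gray):
    """Drop fine texture: replace each pixel with the mean of its 3 x 3 neighborhood."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp the window at the image border so edge pixels use fewer neighbors.
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(window) / len(window)
    return out


def coarse_view(rgb):
    """Input for a hypothetical first-stage model that only needs coarse shape."""
    return box_blur(to_gray(rgb))


# A tiny high-contrast "texture": alternating dark and bright green pixels.
texture = [[(0, 255, 0) if (x + y) % 2 else (0, 40, 0) for x in range(4)] for y in range(4)]
smooth = coarse_view(texture)
```

A second model could then run on the full-resolution color data only for the regions the first stage flags, which is one way to read the "multiple AI models in sequence" idea.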
Rob Stevenson 32:11
Yep, definitely. Well, Colin, we are rapidly approaching optimal podcast length here. I could definitely stand to hear more, but for now I'll just say this has been so fascinating. I'm going to make sure we put an example of some of the synthetic data you're generating in the show notes, just so people can conceptualize it a little bit. Maybe some of the people out there have already seen something like that. But I think it's worthwhile to be like, this is what we should be feeding machines, not something that looks photorealistic and aesthetically pleasing to us humans. So we'll put that in the show notes. And Colin, this has been fascinating. It's amazing work you're doing over there. Congrats on your success so far. I'm going to be following this very closely, because it really, really is a new and interesting approach.
Colin Herbert 32:49
That's great. Thank you for the invitation. It's been enjoyable. Thanks.
Rob Stevenson 32:54
How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, med tech, robotics, and agriculture. For more information, head to sama.com.