Computer vision is a form of artificial intelligence that can help manage data, explains Tackle AI co-founder and Chief Technology Officer Sergio Suarez Jr., in this episode of “The Buzz” Podcast.
Currently, the financial services industry’s primary use case for the technology is mortgages, where banks are using it to assess risk. Computer vision can sort through pages of documentation to pull out key information, Suarez explains.
“[Banks have] got to look for a bunch of data points that will help them make the determination whether it’s good or bad,” Suarez tells Bank Automation News. “We’re very good at looking through them and pulling out all the things we are looking for, such as … what’s the interest rate? What’s the amount of this loan? [Has the consumer] been late paying? How many times were they late?”
Underlying computer vision is deep learning, which uses repetition and iteration to train bots over time to recognize complex images, Suarez explains.
Learn more about what computer vision is, how it’s evolving and its use with robotic process automation in this emerging technology episode of “The Buzz.”
Subscribe to The Buzz Podcast on iTunes, Spotify or Google Podcasts, or download the episode.
The following is a transcript generated by AI technology that has been lightly edited but still contains errors.
Good day and welcome to The Buzz, a Bank Automation News podcast. I’m BAN Editor Loraine Lawson. Recently I spoke with Sergio Suarez Jr., co-founder and CTO of Tackle AI, which specializes in computer vision, a type of artificial intelligence that allows computers to learn from visual images. I asked Mr. Suarez to explain how computer vision works and its use with robotic process automation in financial services.
Sergio Suarez Jr.
So computer vision is a form of deep learning, one of the machine learning and deep learning kinds of tasks within AI. It’s really a way of having AI look at objects or documents and tell us things about them. An example of where we would use computer vision is identifying dogs in a picture, right? If you’ve ever used the photos app on your iPhone, you’ll notice that you can type things like “chair,” and the next thing you know, it’s filtering all of the images that have a chair. You can also filter by yourself, so facial recognition is a type of computer vision. But we’ve taken that much further, and we’ve started to use it for things like reading documents or identifying logos. Whenever you’re trying to look at a document and figure out what it is, a logo is a really good place to start. If I see a Walgreens logo, or “Walgreens Pharmacy,” then I probably know, hey, this is probably going to be a Walgreens pharmacy document. So it’s been a really, really helpful tool, not only in the way that we’re able to identify objects and people, but also in the way that we’ve been able to automate paperwork.
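To make the photo-search idea concrete, here is a minimal sketch of searching a photo library by label. It is not how Apple’s Photos app actually works: real systems use deep neural networks to turn each image into a feature vector, while the three-number vectors and labels below are invented for illustration. Only the matching step, comparing an image’s features against labeled reference features, is shown.

```python
import math

# Toy sketch: label photos by comparing feature vectors against labeled
# reference vectors with cosine similarity. Real systems derive the features
# from a deep neural network; these vectors and labels are made up.
REFERENCES = {
    "chair": [0.9, 0.1, 0.2],
    "dog":   [0.1, 0.9, 0.3],
    "cat":   [0.2, 0.8, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def label_photo(features):
    """Return the reference label most similar to this photo's features."""
    return max(REFERENCES, key=lambda name: cosine(features, REFERENCES[name]))

def search_photos(photos, query):
    """Filter a photo library down to images whose best label matches the query."""
    return [name for name, feats in photos.items() if label_photo(feats) == query]
```

Typing “chair” into the search box then amounts to `search_photos(library, "chair")`, which returns only the images whose features sit closest to the chair reference.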
Loraine Lawson
Okay, so can you give me some examples of how it might assist with robotic process automation and other forms of automation?
Sergio Suarez Jr.
Sure. So it’s a tool that helps you with automation. I think with RPA, it’s very focused on using regular OCR and then zoning documents. So it’ll say, I’ve seen this document before, I know where all the information that I want is, and it draws (x1, y1) and (x2, y2) coordinates and just grabs whatever information is in there. Whereas we’ve started to use more deep learning, and especially computer vision, to say, I think I know what type of document this is. An example would be: this looks like an MRI, because this is what an MRI looks like, right? Or this is a CT scan, because this is what a CT scan looks like. So even without having to read the document at all, because we can visually see that it’s a CT scan, you already know what you’re looking at. So it really helps you narrow things down very, very quickly.
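The coordinate-zoning approach Suarez contrasts against can be sketched in a few lines: OCR hands back each word with a bounding box, and a hand-built template for one exact layout says which rectangle each field lives in. The field names and coordinates below are invented for illustration; the point is that the template breaks as soon as the layout changes.

```python
# Sketch of template-based ("zoned") extraction, the legacy RPA approach:
# every OCR word arrives with a bounding box, and a template for this exact
# document layout maps field names to fixed rectangles.

def words_in_zone(words, zone):
    """Return the text of every OCR word whose center falls inside the zone."""
    x1, y1, x2, y2 = zone
    hits = []
    for text, (wx1, wy1, wx2, wy2) in words:
        cx, cy = (wx1 + wx2) / 2, (wy1 + wy2) / 2
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            hits.append(text)
    return " ".join(hits)

# One template per known layout: field name -> (x1, y1, x2, y2).
# These names and numbers are hypothetical.
INVOICE_TEMPLATE = {
    "invoice_number": (400, 40, 600, 80),
    "total_due":      (400, 700, 600, 740),
}

def extract(words, template):
    """Pull every templated field out of one page of OCR output."""
    return {field: words_in_zone(words, zone) for field, zone in template.items()}
```

A deep-learning pipeline replaces the fixed rectangles with a model that first recognizes what kind of document it is looking at, so no per-layout template is needed.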
Loraine Lawson
So how would that help with bot deployment, or how a bot might function in, say, financial services?
Sergio Suarez Jr.
So in financial services, let’s say you’re processing invoices, for example; that’s something we do at Tackle. Being able to identify the logo of the company that sent you the invoice really helps you narrow down who it belongs to, so you don’t have to use OCR to read it and make sure that it exactly says FedEx on there or something. Also, a lot of times these are really poor-quality images that you’re getting: someone sends you an invoice and somebody crumpled it up, and now it’s really difficult to see the letters, but logos still look very distinct and they still help you identify the document. And there are things just computer vision can do. If, as a person, I look at a document, we kind of know what’s important right away, you know: the letters that are bigger, the things that are bold, the numbers that are underlined. Computer vision is really good at getting rid of garbage and saying, hey, these are the important things. So we’ve been using that a lot when looking at invoices and bank statements and things like that.
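The intuition that a logo survives a bad scan better than small text can be sketched with toy template matching: compare a possibly noisy binary patch against stored logo bitmaps and keep the best overlap. Production systems use learned features rather than raw pixel overlap, and the 5x5 “logos” below are invented, but the robustness point carries over: flipping a few pixels barely moves the score.

```python
# Toy sketch of why logos survive bad scans: match a (possibly noisy) binary
# patch against stored logo bitmaps and keep the best pixel-overlap score.
# The 5x5 bitmaps are invented stand-ins for real logo artwork.
LOGOS = {
    "fedex": [
        "#####",
        "#....",
        "###..",
        "#....",
        "#....",
    ],
    "walgreens": [
        "#...#",
        "#...#",
        "#.#.#",
        "##.##",
        "#...#",
    ],
}

def overlap(patch, logo):
    """Fraction of pixels that agree between the patch and the logo bitmap."""
    total = agree = 0
    for prow, lrow in zip(patch, logo):
        for p, l in zip(prow, lrow):
            total += 1
            agree += (p == l)
    return agree / total

def identify_logo(patch):
    """Return the best-matching logo name and its overlap score."""
    best = max(LOGOS, key=lambda name: overlap(patch, LOGOS[name]))
    return best, overlap(patch, LOGOS[best])
```

Even with a couple of pixels corrupted by the crumpled scan, the degraded patch still scores far higher against its own logo than against any other, which is exactly why logo identification is a cheap first routing step.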
Loraine Lawson
And how do you couple it with AI? What are some of the use cases there?
Sergio Suarez Jr.
Yeah, so computer vision is a form of AI. It’s a form of deep learning. At Tackle, we’re very big on mixing lots of different strategies. So we like computer vision a lot as a really good first pass at eliminating noise, which I’m a really big fan of. And then we move on to more typical, or legacy, kinds of machine learning tactics, and even some rules-based engines. For example, if I see that something is an MRI, or the computer vision helps me identify that, then certain clients have certain rules for what they want out of an MRI. You know, some of them want the name and the medical record number, et cetera. So you also need those engines that can say, hey, once I know what I’m looking at, here are all of your rules. So we’re really big on that, too.
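The rules-engine step Suarez describes, applied after computer vision has classified the document, is essentially a lookup from client and document type to the fields that client wants. A minimal sketch, with client names, document types and field names all invented for illustration:

```python
# Sketch of a per-client rules engine: once the document type is known,
# look up which fields this client wants extracted. All names are invented.
RULES = {
    ("clinic_a", "mri"): ["patient_name", "medical_record_number"],
    ("clinic_b", "mri"): ["patient_name", "scan_date", "referring_physician"],
    ("lender_x", "bank_statement"): ["account_number", "overdraft_count"],
}

def fields_to_extract(client, doc_type):
    """Look up which fields this client wants from this document type."""
    return RULES.get((client, doc_type), [])

def process(client, doc_type, extracted_values):
    """Keep only the fields this client's rules ask for."""
    wanted = fields_to_extract(client, doc_type)
    return {f: extracted_values.get(f) for f in wanted}
```

The design point is the separation of concerns: the deep learning model answers “what am I looking at,” and a plain, auditable table answers “what does this client want from it.”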
Loraine Lawson
Okay. Yeah, you did say it was AI, a form of deep learning. So I guess I was thinking of more traditional AI sorts of applications. But what are some of the trends that we should look out for in the coming year with computer vision and with this approach to document processing?
Sergio Suarez Jr.
Yeah, so for the longest time, everything has been very much about this particular document. RPA is very good at that: I know this exact document. What we’re getting with deep learning models is the concept of what a document is. An example would be, let’s say in legal: based on what I’m reading, this is a hearing document. And natural language processing is another set of techniques that we use for a lot of this as well, in conjunction with computer vision: can we make sense of what’s being written out? So instead of knowing exactly how a hearing document has to be structured, we can make sense of what it’s saying. You know, this is probably a hearing, or this is a document that’s telling you that you have to go to court, or something like that. And mixing all of these different strategies has made that a lot easier for us, really mimicking what a human is doing when we look at a document.
Loraine Lawson
Do you work with a lot of banking clients, or fintechs?
Sergio Suarez Jr.
So we’re big right now with mortgages, in reading mortgage documents. A lot of times people don’t realize that those, after a few years, are 4,000- or 5,000-page documents. When you get a mortgage from Chase Bank or somebody, you’re actually paying Chase Bank, but an investor or a mortgage servicing company will buy those loans pretty quickly. And they now have to assess: is this a good mortgage to buy? They’ve got to look for a bunch of data points that will help them make the determination whether it’s good or bad. And we’re very good at looking through them and pulling out all the things they’re looking for, such as, you know: What’s the interest rate? What’s the amount of this loan? Have they been late paying? How many times were they late? Does it have a garden? Little things like that, that they’re looking for, because, hey, if a home has a garden, people care about their home more; they’re more likely to pay their mortgage. So that little data point is a very good one for mortgage servicing companies. For a very long time, humans would have to go through and find these, and now we’re using AI to find them.
Loraine Lawson
Have you seen any innovative uses of your product, or computer vision in general, in the fintech or financial space? Something that you feel was unusual or stood out?
Sergio Suarez Jr.
I think the thing with mortgages is pretty awesome. I think we’ve really showcased a whole bunch of different techniques there, which is really cool, and I think that will continue to go deeper and deeper. Another one, too, was analyzing bank statements. Because, again, every bank has a different bank statement, and depending on what kind of a checking account you have, it’s laid out differently. And we’ve been really good at saying, here’s all the information about the bank statements, and lenders need that to be able to analyze whether to give you credit or not. You know: Did you overdraw? How often have you overdrawn? And how quickly did you get the money back? There’s a whole bunch of stuff like that, that right now, or for a very long time, it was just humans having to go analyze manually. Whereas now we can just give them the information as quick little data points: they overdrew three times in the last four years, they did X, Y and Z. Now they have their algorithms and their analytics that they can run, and there’s really almost no human in the loop anymore with analyzing this stuff.
Loraine Lawson
I wondered, for a while the big, big thing was OCR, wasn’t it? Where does computer vision leave it?
Sergio Suarez Jr.
So, OCR to us: we still use OCR in a lot of the things that we do. A lot of times what people don’t understand is, if you just OCR a document, you still don’t know anything about it, right? If I gave you even a Word document, which is already kind of structured, you still don’t know what the name is, what the address is, who the person is. It’s just text now, and you still need something to cut through all of that. I will say that computer vision has made OCR become more and more obsolete. It’s just better at picking up letters and numbers, especially when it’s really weird signs or things that are not completely straight. OCR very much likes straight lines, you know, very orderly, whereas computer vision can look at anything and figure out what it is. You can take a picture outside and it’ll pick up what that restaurant outside is called. OCR is just not going to be able to do that; OCR is looking for documents.
Loraine Lawson
That raises a question for me, actually. Can you explain a little bit about the technology behind it? Like, what is computer vision doing differently that allows it to work that way?
Sergio Suarez Jr.
Right, yeah. So, you know, that’s a very loaded question, because it has to do with fundamentally how deep learning works. And when you tell people how deep learning works, it scares people, because the fact of the matter is, we mostly don’t know. We get the math behind it, and we get how it’s happening, but in reality, you’re training bots, right? You have two things, and you show it: here’s a bee, and here’s a cat, right? And you have this one little bot that you write, and you tell it, hey, by the way, that’s the cat. And so then this bot writes 1,000 bots, and you say, hey, go choose the cat. And about 50% of them are going to choose right and 50% of them are going to choose wrong. Then you delete the 50% that got it wrong, and you replicate the ones that got it right, and you do that billions of times, until for some reason you end up with something that just knows that that’s a cat. That has to do with a lot of how neural networks work, and you know, you could break it down, but we really can’t give you a specific answer as to why this thing now knows that that’s a cat. We’re just mimicking the way we learn. Instantly, I can tell you the difference between, you know, a Ford car and a GMC. I don’t know why I know; I just know that I know it. And it’s the same sort of deal that happens with computer vision. As long as we give it enough training, and we give it enough examples, it will figure it out. But why does it know? Yeah, that’s a much longer conversation.
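The delete-and-replicate loop Suarez describes is, in spirit, an evolutionary algorithm. Here is a toy version in which each “bot” is a single threshold on one made-up feature (feature above 0.5 means “cat”): every generation, the worse-performing half of the population is deleted and the better half is replicated with slight mutation. Real computer vision trains neural networks with gradient descent, so this captures only the natural-selection intuition, not the actual algorithm, and all numbers are invented.

```python
import random

# Toy version of "delete the bots that got it wrong, replicate the ones that
# got it right." Each bot is one threshold on a single made-up feature;
# feature > 0.5 means "cat". This is the selection intuition only, not how
# computer vision models are actually trained.
random.seed(0)

EXAMPLES = [(x, x > 0.5) for x in (random.random() for _ in range(200))]

def accuracy(threshold):
    """Fraction of all examples a bot with this threshold gets right."""
    return sum((x > threshold) == is_cat for x, is_cat in EXAMPLES) / len(EXAMPLES)

def fitness(threshold, sample):
    """Score a bot on a small random batch of examples."""
    return sum((x > threshold) == is_cat for x, is_cat in sample)

population = [random.random() for _ in range(100)]  # 100 random bots
for generation in range(20):
    sample = random.sample(EXAMPLES, 20)
    ranked = sorted(population, key=lambda t: fitness(t, sample), reverse=True)
    survivors = ranked[:50]                       # delete the worst half
    clones = [min(1.0, max(0.0, random.choice(survivors) + random.gauss(0, 0.02)))
              for _ in range(50)]                 # replicate with slight mutation
    population = survivors + clones

best = max(population, key=accuracy)
```

After a few generations the surviving thresholds cluster near the true boundary at 0.5, even though no bot was ever told where that boundary is; it emerges from selection pressure alone.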
Loraine Lawson
It sounds like natural selection for bots.
Sergio Suarez Jr.
It’s exactly what it is. It’s exactly that: natural selection for robots. Yes.
Loraine Lawson
Yeah, I can see why that would frighten people. But fortunately, it’s pretty technical, so hopefully it will be …
Sergio Suarez Jr.
Great. That’s awesome.
Yeah, I think right now computer vision is going to start getting into way more parts of our life. It’s been sneaking in with these little things: if you’ve ever done an image search, simply on Google, that’s a form of computer vision. But it’s starting to get a lot more complex. And as our GPUs and our processing power keep getting faster and a lot more robust, we’re able to process this stuff much, much faster. So we’re going to see insane progress, because a lot of times now we don’t start models from scratch. We take a model that was made maybe three years ago, and we’re like, hey, we can do 100 times more with this model now. So very rarely are we starting from zero like we used to do many years ago. And I think we’re going to see that more and more, especially as GPUs continue to get faster and let us do some pretty good stuff. So yeah, it’ll be fun.
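The “not starting from zero” idea Suarez closes on is transfer learning: keep an existing model’s learned feature extractor frozen and fit only a small new head for the new task. A pure-Python toy sketch of the concept, in which the “pretrained” extractor, the perceptron-style head and the bright-vs-dark data are all invented stand-ins:

```python
# Toy sketch of transfer learning: reuse a frozen "pretrained" feature
# extractor as-is and fit only a tiny new head on top. The extractor,
# head and data here are invented stand-ins for a real pretrained network.
def pretrained_features(pixels):
    """Stand-in for a frozen pretrained network: raw pixels -> 2 features."""
    return [sum(pixels) / len(pixels), max(pixels)]

def train_head(data, epochs=200, lr=0.1):
    """Fit a tiny perceptron-style head on the frozen features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for pixels, label in data:
            f = pretrained_features(pixels)
            pred = 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0
            err = label - pred          # perceptron update rule
            w = [wi + lr * err * fi for wi, fi in zip(w, f)]
            b += lr * err
    return w, b

def predict(w, b, pixels):
    f = pretrained_features(pixels)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Tiny made-up task: classify bright (1) vs. dark (0) image patches.
DATA = [([0.9, 0.8, 0.7], 1), ([0.1, 0.2, 0.1], 0),
        ([0.8, 0.9, 0.9], 1), ([0.2, 0.1, 0.3], 0)]
w, b = train_head(DATA)
```

In practice the same shape appears in deep learning frameworks: the pretrained layers are frozen (in PyTorch, by setting their parameters’ `requires_grad` to `False`) and only a new final layer is trained, which is why a three-year-old model can be repurposed rather than rebuilt from scratch.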
Loraine Lawson
You’ve been listening to The Buzz, a Bank Automation News podcast. Thank you for your time, and be sure to visit us at bankautomationnews.com for more automation news. You can also follow us on Twitter and LinkedIn. Please don’t hesitate to rate this podcast on your podcast platform of choice.