Conversations on Applied AI

Welcome to the Conversations on Applied AI Podcast where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of Artificial Intelligence and Deep Learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry and connect with us to learn more about our organization at AppliedAI.MN. Enjoy!

Steve Hsu - Solving the Hallucination Problem

September 26, 2023 • Justin Grammens • Season 3 • Episode 19

The conversation this week is with Steve Hsu. Steve is a professor of Theoretical Physics and Computational Mathematics, Science, and Engineering at Michigan State University. He is also the founder of SuperFocus.ai, which is, for lack of a better term, enterprise GPT: its AI uses large language models, but with customer-defined memory to eliminate hallucinations.

If you are interested in learning about how AI is being applied across multiple industries, be sure to join us at a future AppliedAI Monthly meetup and help support us so we can make future Emerging Technologies North non-profit events!

Enjoy!

Your host,
Justin Grammens

Steve Hsu 0:00

If I could broadly describe what's happening in the AI space right now, there's obviously tons of hype and sky-high valuations around companies that are building foundation models. But we're pretty confident that already the foundation models that are available, both the open source ones and the proprietary ones, are kind of good enough to play a role in an architecture like ours. What's much harder, at least in terms of the nitty-gritty engineering that you have to do, is getting the AI to the point where it can satisfy the enterprise requirements.

AI Announcer 0:37

Welcome to the Conversations on Applied AI podcast, where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of artificial intelligence and deep learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry, and connect with us to learn more about our organization at AppliedAI.MN. Enjoy.

Justin Grammens 1:08

Welcome, everyone, to the Conversations on Applied AI podcast. Today we're talking with Steve Hsu. Steve is a professor of Theoretical Physics and Computational Mathematics, Science, and Engineering at Michigan State University. He is also the founder of SuperFocus.ai, which is, for lack of a better term, enterprise GPT: its AI uses large language models, but with customer-defined memory to eliminate hallucination. So I'm very excited to talk about the aspects of your project today. It's an exciting new field that I'm excited to learn a lot more about. He's been the founder of countless other startups, with interests and specialties in fundamental physics research, genomics, internet security, encryption, entrepreneurship, and, of course, artificial intelligence and machine learning. So thank you, Steve, for being on the podcast today.

Steve Hsu 1:51

It is great to be with you, Justin.

Justin Grammens 1:53

Awesome. Well, I gave a little bit of a story, I guess, on where you are today and what you're doing. But one of the first questions I usually like to ask people is: maybe you could talk about how you got to where you are today? What's been the trajectory of your career?

Steve Hsu 2:06

Well, I grew up in Ames, Iowa, which might be not that far from where you are now. Are you in Minnesota? Yep. St. Paul? Yeah. So I grew up in the Midwest, went to college out west, was educated at Caltech and Berkeley. And if you'd asked me when I was younger what I wanted to do in life, I would have said I wanted to be a theoretical physicist, like my hero, Richard Feynman. It was only later in life, really coinciding with the first internet bubble, that I got interested in entrepreneurship and tech startups. But that was now almost 25 years ago. So I've been kind of leading a dual life for the last 25 years or so, where I'm both a university professor and a founder of what have mainly been deep tech startups. So they're startups which are actually attempting to do something which is technically hard, but with an ultimate goal which is useful in the economy. This latest startup is the outgrowth of a long-term interest I've had in deep learning, neural networks, and transformer architectures, but it was really stimulated, almost a year ago now, by my co-founder, Tushar Sheth, who is a University of Michigan grad. He has degrees from both the engineering college and the law school, and he served in the Obama White House on both education and innovation policy. When he got out of government, he wanted to start a company that really would use a lot of these breakthroughs in AI and related areas to help people. So when he first came to me, he was mostly focused on building an AI tutor, something that could tutor kids. And in discussing the possibilities in that area, we kind of agreed that we had to first solve what's called the hallucination problem. The issue is basically just the way large language models are built. And we started this company, actually, before ChatGPT came out. But I've had a long acquaintance with the OpenAI founders going back many years, so I kind of knew what was coming.
And I knew we had to solve this hallucination problem if you want to do something serious, like build an AI tutor. The company at the moment is mainly focused on, as you said, enterprise GPT, so applications of large language models in the enterprise. For that you also need to solve hallucination. So what you really could say is that our company is basically trying to build LLMs that don't hallucinate.

Justin Grammens 4:38

And I guess that's gotta be a hard problem.

Steve Hsu 4:41

Yeah, it's a fundamental problem, because the way that large language models are built is you take a very large neural net, with billions, perhaps even a trillion connections, and you train it to predict the next word: given a prompt with n words, it's trying to predict what the (n+1)th word will be. And it's doing this by looking at huge, you know, trillion-token training datasets, which typically tend to be scraped mostly from the internet. What it's really being trained to do is create plausible human language. So in other words, if the next word is something that fits in patterns that it has seen before on the internet, then it's happy just to give you that next word, or next series of words, as the response. But you'll notice it's not trying to really nail down, you know, exactly what is the price of that ticket from Minneapolis to New York City. It could give you a plausible answer and be perfectly happy. And so what people rapidly discovered, and we knew this already when we started the company, is that the moment you're trying to do something serious, as you move away from human in the loop, you've got to force the LLM or the AI to be much more accurate in the information that's coming back. So our whole company is based around a kind of software architecture which uses LLMs, but is designed in such a way as to only use the LLM for its language capability, and not to rely on knowledge encoded in the LLM. One way we conceptualize it is we think of a customer-defined memory that we're attaching to the LLM. It's like we're doing brain surgery. In your brain, you have a language and reasoning center, which is the LLM, and we're kind of attaching to it a memory. The memory could be just all the flight booking policies of Delta Airlines, and this huge table of all of its flights and prices and seats. That's the memory.
But we force the LLM to use that memory in order to deal with user queries.
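The contrast Steve draws here, between a model that predicts a plausible next word and a system that looks the answer up in an attached memory, can be illustrated with a toy sketch. This is not SuperFocus's implementation: the bigram counter is a deliberately tiny stand-in for next-word prediction, and the training sentences, airport codes, and fares are all made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None

# Made-up "internet text": the model only learns which prices are common.
corpus = [
    "the fare is $199",
    "the fare is $199",
    "the fare is $349",
]
model = train_bigram(corpus)

# Hypothetical customer-defined memory: the authoritative fare table.
FARES = {("MSP", "JFK"): "$412"}

plausible = predict_next(model, "is")  # most common continuation in the text
grounded = FARES[("MSP", "JFK")]       # looked up, not predicted

print("next-word guess:", plausible)
print("memory lookup:  ", grounded)
```

The next-word guess is fluent and plausible but has no connection to the actual fare; only the lookup is tied to the customer's data.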

Justin Grammens 7:05

Gotcha. Gotcha. Yeah. And so do people load all of their corpus of text into your platform? Are you guys hosted? Are you guys, like, on prem? How do people interact with your technology?

Steve Hsu 7:19

So at the moment, the pilots we're doing right now are with the AI in the cloud. Generally, our partner is providing the corpus to us beforehand, typically uploading it into the cloud. And then we're building the AI so that the AI is good at responding to queries based on that corpus.

Justin Grammens 7:45

Gotcha. And do you have to sort of handhold them through the process, or is some of the work kind of self-serve?

Steve Hsu 7:51

Right now, there's a certain amount of engineering that has to be done for each use case, which is specific to that particular customer and the domain that the AI will be operating in. As we get better and better, we'll start to see more use cases which are almost identical to use cases we've already done. And in that case, it can be plug and play, where, you know, this third travel management company comes to us with a kind of similar use case, and, oh yeah, I see your data looks a lot like the data of these other two companies we've already done. So yeah, we can just use our existing code base and just swap in your set of, you know, contracts and policies, et cetera, for theirs, and the AI will work pretty much out of the box. So we're evolving toward that, although I wouldn't say we're actually there yet.

Justin Grammens 8:42

Sure, sure. Yeah. Again, I've started a couple different startups myself, too. And you don't really know what the market really wants; you kind of build the feature when you need the feature, right? So if you're dealing with customers right now that are okay having you sort of help them work through it, then you'll find out if they're going to need to move on to the plug-and-play aspect or not, for sure.

Steve Hsu 9:02

Yeah. I mean, you know, as a startup guy, you know that there are some engineering challenges you're definitely going to have to surmount, so you can work on those right away. Then the exact nature of the product-market fit, you don't know till you actually start really doing pilots with customers, and you start to see, like, oh, they don't actually care about this feature. This feature doesn't do anything for them. They're never using it. But they care a lot about this, which you didn't think about. And so we're in that phase. And I think, if I could broadly describe what's happening in the AI space right now, there's obviously tons of hype and sky-high valuations around companies that are building foundation models. But we're pretty confident that already the foundation models that are available, both the open source ones and the proprietary ones, are kind of good enough to play a role in an architecture like ours. What's much harder, at least in terms of the nitty-gritty engineering that you have to do, is getting the AI to the point where it can satisfy the enterprise requirements. And so we're really kind of focused on nitty-gritty stuff, which is, I think, a lot less glamorous. Like, I think investors are interested in the space that we're in, we had no trouble raising money, but they're not willing to give you the billion-dollar valuations from the get-go that they're willing to give the foundation model companies.

Justin Grammens 10:25

Yeah, fair enough. One of the things, as you were talking about these LLMs: it feels like something like ChatGPT, or Bard, or some of these other ones always wants to try and give you an answer, right? It's almost like if I go on Google and I type something in, I'm always gonna get a response, whether I care for it or not. It's really hard to sort of stump Google. So it feels like with all of these LLMs, the way that they were built, you kind of have to do a little bit of prompt engineering or sort of, you know, pre-tuning to get them to shut up.

Steve Hsu 10:50

Yes, no, you're exactly right. I mean, depending on which flavor you're talking about, ours are designed so that they're very willing to say, "I don't have any relevant information in my corpus to answer your query, and therefore I can't give you an answer. Maybe you could reformulate your query." That's a perfectly acceptable response for us. We want that to happen, rather than have it just make something up.

Justin Grammens 11:17

Yeah, for sure. And again, it probably falls under the enterprise requirements that you're talking about: security, or, I guess, fear of data leaking out into a public cloud. Is that another thing that your customers are talking about? Are they pretty concerned about the data they're loading in?

Steve Hsu 11:33

To be honest, I don't think so. A company that has a pretty good tech team realizes that, you know, hosting it on AWS or Google Cloud is as secure as having it in their home environment, because it's encrypted, it's got all the protection. So I don't think it's a particularly risky thing for a company to load their data into our corpus, because we follow all the best practices for security. We may eventually encounter companies that really want to do everything on prem, but we haven't yet actually encountered anybody that really cares that much. And also, a lot of the applications that we're talking about are cases where the company is not trying to hide the data; the data isn't proprietary. So let me give you an example. We're in conversations with some very big call center companies, right? And so imagine you have all of the information that a person would need in order to decide whether they want to use Verizon for their calling and data plan. There's all kinds of complicated stuff, like, what's the rate? Is there a data charge if you're on this plan? What about roaming? Does it work in Italy? They're really pretty complicated questions, but the company, Verizon, is not trying to hide that information from anybody. In fact, what they want from our AI is something that can answer questions about their policies and rules and products effectively and correctly. So that could be assisting the call center employee, or it could ultimately, eventually, with no human in the loop, be actually answering questions from a user. So when you go onto Verizon's site, you're kind of hunting around trying to figure out, like, oh yeah, does it work in Italy? Is there a charge? Then you can just say, "Can I use this plan in Italy? And what will it cost me?" And then the AI will say, "No, that's included in your plan. You have 100% unlimited data and roaming in Italy," you know.
So in those scenarios, security is really not the issue at all. That's all public information that the company wishes it could beam into the head of a potential customer. They're not trying to hide it in any way.

Justin Grammens 13:39

Fair enough. Yeah. They're just trying to make it easily accessible through some sort of a human chat interface, more conversational. Are these companies needing to refresh their data very often? I mean, it feels like maybe an FAQ can just kind of be done and doesn't really need to be updated.

Steve Hsu 13:55

Yeah, no, this is a great point. So a lot of the pilots and things that we're involved in are cases like that. Verizon is actually not somebody that we're working with; I just used that as an example. If you go on the Verizon site, you'll see that, obviously, in order to get you to decide to buy the plan, they've got all the information someplace on their website. As long as they keep their website up to date, all the information that you might want is pretty much there. And then they either do or don't have an FAQ, but they might have an FAQ that they have written that's in a binder that their call center people are trained on, right? So that information typically exists in some form, and what we've built is an easy way to suck that into the memory of the LLM. What's interesting is that, on the other hand, you could have a case where, okay, let's suppose the AI we're building is answering complicated questions about a line of cell phones that somebody might be buying, like a Motorola cell phone or something like that. They might just have the product manual. And then you have to ask: can the AI just answer the questions from what's in the product manual? Or could we encourage Motorola, if we're going to work with them, to sit down and write an FAQ for that product? And if you think about it, their people, if they're good, could write that FAQ in, like, a day or a few days, or you could go to their existing call center and say, let's figure out what the most common questions are, and we'll write good answers to them. And then that can be in the corpus. So obviously, the more work you do in preparing the corpus for the AI, the easier it is for the AI to figure out how to answer these questions. But generally you're talking about a relatively modest amount of human effort, and then the AI can perform very well.
If you think about it, think of all the questions that people have ever asked calling the Motorola customer service line about their smartphone. Could there be more than, like, 1,000 actually different types of questions that people have asked? There's a very finite universe. For a geek like me, for a professor like me, the information content of the space is very finite. And so when you're actually building an AI, because this AI might be used by millions of customers that have that cell phone, you have the luxury of doing some preparatory work and kind of covering this whole space. What's fun about it is, you know, if you didn't have an LLM, the FAQ can be very brittle. If you didn't ask the question exactly the right way, keyword search wouldn't necessarily find that question in the FAQ and the answer. But because LLMs are very good at language, I could ask effectively the same question 50 different ways, and the LLM will understand right away that it is question number 246 that this person is asking about. And then it can give them the answer in language which is similar to what they used in asking the question. So that's the magic that we're taking advantage of, which didn't exist like three years ago. In a lot of these enterprise applications, the thing to realize is that the actual information space you're operating in is very finite; it's very limited. And you can build, like, 100% superhuman AI. So the AI is better than the average call center employee, even after they've been through training, and even after they've been on the job for a year. It's instantaneously answering; it's figuring out which of these 1,000 questions that people ask this guy is asking, and then giving you the perfect answer. Humans are not gonna be as good as the AI for a kind of narrow application like the one I just described.
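The brittleness point above can be made concrete with a small sketch. Exact lookup fails as soon as the wording changes, while a matcher that scores overlap in meaning-bearing words can still land on the right FAQ entry. To be clear about assumptions: the FAQ entries and answers are invented, and the word-overlap score is only a crude stand-in for the language understanding an LLM would provide (it would miss paraphrases that share no words at all).

```python
import re

# Hypothetical FAQ entries, keyed by question number.
FAQ = {
    245: ("Does the phone support wireless charging?",
          "Yes, wireless charging is supported."),
    246: ("Can I use my plan while roaming in Italy?",
          "Yes, roaming in Italy is included."),
}

STOPWORDS = {"a", "the", "i", "my", "in", "is", "to",
             "do", "does", "can", "will"}

def content_words(text):
    """Lowercase, split into words, and drop common filler words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS

def exact_lookup(query):
    """Brittle baseline: only matches the FAQ wording verbatim."""
    for number, (question, _answer) in FAQ.items():
        if query == question:
            return number
    return None

def match_faq(query):
    """Return (question number, Jaccard overlap) of the best-scoring entry."""
    q = content_words(query)
    best_number, best_score = None, 0.0
    for number, (question, _answer) in FAQ.items():
        f = content_words(question)
        score = len(q & f) / len(q | f) if q | f else 0.0
        if score > best_score:
            best_number, best_score = number, score
    return best_number, best_score

for query in ["Will my plan work in Italy?", "Is roaming in Italy covered?"]:
    print(query, "| exact:", exact_lookup(query), "| fuzzy:", match_faq(query)[0])
```

Both paraphrases miss under exact lookup and resolve to entry 246 under the overlap score, which is the behaviour Steve describes, achieved here by a much weaker mechanism than an LLM.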

Justin Grammens 17:38

Yeah, as you were talking, I was thinking, it's nice, because you kind of have some guardrails to stay within, right? You have this sphere of knowledge, and it's not going to really go beyond that. And I guess, you know, where you guys are different: I could use OpenAI today, their API, and I could fine-tune my model based on all this information from call centers. But I just wanted to be clear, what you guys are doing differently is that you're not hallucinating, right?

Steve Hsu 18:02

Great question. So often there is a conflation of what we do with fine-tuning a model. And so just to be clear, we may actually be doing fine-tuning of models ourselves, to make them good at the language domain. Like, say we're talking about features of cell phones or something like this. So we may do some fine-tuning, but that doesn't solve the problem. What actually solves the problem is having an attached memory, so that once the LLM figures out what it is that you're asking about, it can actually pull information from the memory, from the corpus, and then incorporate that information into its response. So involved in all this are multiple LLMs. There are some smaller LLMs which try to figure out what the human is after in the query, and then they pull from the database based on what they can figure out about what you're after. And then another part of our code writes that all into a prompt, but the prompt actually includes the answers from the memory. It's a little bit like if I ask you, "What did you get your daughter for Christmas last year?" You have to think for a second and say, like, wait, last year? Wait, was it the bunny? No, it was the pony, you know. Some other part of your brain is pulling information out of your memory which is relevant to the question I just asked. And then it's showing that to the language center in your brain, and then your language center is saying, "Oh, yes, Steve, you know what, we got a great deal on this pony. It's living in our backyard. It's a Shetland pony," you know. So our architecture is essentially kind of mimicking that process. And so there's a bunch of different steps, and there are actually multiple LLMs involved in the process.
There's even an error-checking LLM at the end, where, like, if you weren't quite paying attention to what I said, or maybe you were drunk or something, and you just started answering, there is another part of your brain which would be like, wait, is that an appropriate response? Is that what Steve was asking about? Like, maybe you're about to swear, because the pony makes a big mess, like poops in the house or something. And you're like, no, I don't want to say that, actually. So then it gets filtered; it gets removed. So we even have LLMs which are optimized for error correction, to try to figure out, like, this is not actually something you want to say, at the end. So it's a kind of complicated engineering project where you're using LLMs, but you're using a database, and you're using all kinds of different things combined to get the performance that you want.
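The steps Steve lists (work out what the user is after, pull matching records from memory, write those records into the prompt, generate, then run an error check) can be sketched as a small pipeline. Everything here is an assumption made for illustration rather than SuperFocus's actual design: the memory records are invented, word overlap stands in for the smaller retrieval LLMs, a numeric consistency check stands in for the error-checking LLM, and the language model itself is passed in as a plain function so the sketch runs without any external service.

```python
import re

# Hypothetical customer-defined memory.
MEMORY = [
    "Plan Alpha includes unlimited data and roaming in Italy.",
    "Plan Beta charges $10 per day for roaming outside the US.",
]

REFUSAL = ("I don't have any relevant information in my corpus "
           "to answer your query.")

def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, memory, min_shared=2):
    """Stand-in for the retrieval step: keep records sharing enough words."""
    q = words(query)
    return [rec for rec in memory if len(q & words(rec)) >= min_shared]

def build_prompt(query, records):
    """The prompt carries the retrieved facts, so the LLM only phrases them."""
    facts = "\n".join(f"- {rec}" for rec in records)
    return (f"Answer using ONLY these facts:\n{facts}\n\n"
            f"Question: {query}\nAnswer:")

def passes_check(response, records):
    """Stand-in for the error-checking step: every number in the response
    must also appear in the retrieved records."""
    allowed = set(re.findall(r"\d+", " ".join(records)))
    return set(re.findall(r"\d+", response)) <= allowed

def answer(query, llm, memory=MEMORY):
    records = retrieve(query, memory)
    if not records:
        return REFUSAL  # nothing relevant: decline instead of guessing
    response = llm(build_prompt(query, records))
    return response if passes_check(response, records) else REFUSAL

# Two fake "LLMs" for demonstration: one faithful, one that invents a price.
faithful = lambda prompt: "Roaming on Plan Beta costs $10 per day."
inventive = lambda prompt: "Roaming on Plan Beta costs $25 per day."

print(answer("How much is roaming on Plan Beta?", faithful))
print(answer("How much is roaming on Plan Beta?", inventive))
print(answer("What is the weather today?", faithful))
```

The design choice worth noting is that refusal is the fallback at two points: when the memory has nothing relevant, and when the generated text disagrees with what was retrieved.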

Justin Grammens 20:30

I see. Yeah, yeah. So a totally different architecture. And like you said, you're kind of incorporating multiple facets of it. Is this patentable, if you don't mind me asking? I mean, is this something that's totally unique, that no one else is doing? Are you guys maybe gonna try and file something for it?

Steve Hsu 20:44

This is a great question. I don't know how much I should say about this. But let's just say that, because we've been working on this for a long time, we were working on this prior to ChatGPT, and our engineers, you know, did do some very interesting things. So there are some filings. Generally, as you know, in the software space, patents don't carry the day for any particular company. It's not like pharma, where you can literally patent this drug, and no one other than, you know, Pfizer or something is going to be able to use this drug. Nevertheless, because we were kind of worried about this earlier than other people, we think we may have identified some features of the software architecture that really are crucial. And so, you know, maybe it is possible we could get some of our claims granted by the USPTO. And then, yeah, it would be tough for other people to build something similar to ours without infringing on our patent. So it is possible. But, you know, as a startup guy, I've never had a situation in the software space where a patent really turned out to be decisive.

Justin Grammens 21:51

Yeah, it always feels like somebody can just kind of tweak it a little bit and be like, "No, mine is unique now, it doesn't follow yours," even though it does 90% of the time, 98% of the time, you know. They're hard to enforce. And then, of course, being a startup, if somebody else comes in and starts using it, it's like, do you really have the firepower to take on a bigger corporation?

Steve Hsu 22:11

That's exactly right. Although, having been through some acquisitions, the acquiring company, which could be big, say it's Apple or whoever, is very interested in the patent portfolio, because they can prosecute that patent against other companies, right, because they have infinite resources. So it still could play a role. But I think the real barrier, like, if you were a VC or an investor, and you were talking to me and saying, "Hey, what's your barrier to entry here?", I would say that actually, this is kind of a hard engineering problem. To get the system working well is not easy. And then once you get it working well, we noticed, oh my God, this introduces a lot of latency. So now, how do you work on the latency? So it's actually a hard software engineering problem. And again, the market may ultimately be right, but currently the market is kind of thinking, ooh, foundation models are really hard, and engineering using LLMs is easy. But I actually think, from our perspective, right now it's already the case that LLMs are kind of a commodity, in the sense that we could plug Claude 2 into our architecture, we can plug in GPT-4, we can plug in Llama 2. You know, we have a lot of options in terms of what we can build with, including some open source models now that have commercial licenses. So to us, it's like, I'm not sure you want to pour a billion-dollar bet into the foundation space, because, you know, some of those companies are going to make a lot of money, but a lot of them are not. Whereas this engineering stuff that we're doing right now is pretty hard, and everybody's gonna have to do it. So say we get there and Walmart is having trouble, and we can roll out a product where, yeah, you know, for every product on the Walmart site or on the Amazon site, you can talk to our AI about it. Like, you can ask, "Wait, does this take double-A batteries or triple-A batteries?"
And the AI just says, "Oh, double-A," and it says that correctly, right? And you can say, "Oh, does this contain lactose?" "No, this does not contain lactose." You know, there's going to be a day very soon where, for literally every SKU or UPC, every product that exists, there will be an AI, and you can just talk to the AI about the product. It will be like talking to an intelligent human that has read and memorized the product manual and the nutrition label, and has read all of the top reviews of the product that are on Amazon or on, you know, Walmart or Costco or whatever. So that is buildable by people like us. And yes, maybe Amazon or Walmart will be able to build it themselves. Or maybe they won't, because there's a lot of engineering involved. And I would always bet on hungry startups over, like, insular tech teams inside of big companies to actually do this stuff.

Justin Grammens 24:55

Yeah, for sure. Good stuff. Good stuff here. I mean, what do you think the future looks like for large language models? What are we going to see in the next three to five years or so?

Steve Hsu 25:04

It's very interesting, because among the academics who study this, there's quite an interesting debate going on. Some people would say you can already see some sparks of AGI in the most advanced models like GPT-4; it can do some very astonishing things. But on the other hand, some people would say, oh, they're kind of reaching the limit of the amount of training data that they have, because they've kind of used everything, and you're not going to easily get another 10x in training data. And it was revealed recently that GPT-4 is already a mixture of experts. I don't know if you're familiar with that, but for a while it was not clear how they built GPT-4, because GPT-4 is clearly better than 3.5. And it turns out, really, it's basically about a dozen, or 10, different GPT-3.5s trained differently. There's like a traffic cop that allocates your question to the version of 3.5 that actually is better at answering your question. So in a way, they've kind of reached a kind of limit. It might seem like 4 is better than 3.5, but not because of scale. It's actually just a mixture of 3.5s. Now, with all the money being poured into building better foundation models, I do think there are gains to be had from, like, tweaking the algorithm, and slightly better architectures, and slightly more data. So I do think you're going to see advances. I think the models are going to be better in a few years than the models we currently deal with. But are they going to be enormously better? Not clear. Is that by itself going to solve the hallucination problem? I would say almost certainly no. When we were starting our company, as I said, I know some of the founders at OpenAI, and so I was having these conversations with them, and I said, "I really don't think you're going to solve hallucination by using next-word prediction as the objective function."
And they were like, "Oh, Steve, just wait till you see GPT-4," right? But then when GPT-4 came out, there's a big technical paper that comes with it, which includes results for hallucination testing. What they find is that GPT-4 hallucinates less often than 3 or 3.5, but it's nowhere near, like, 99% or 98%; it's more like 80% or 85% accurate. And in the enterprise, you can't do that. Like, if I am trying to figure out whether to buy your drill, and I need to know, you know, how many amp hours are in the battery of this electric drill, and the thing just makes up an answer because it's seen that answer, "Oh, it's four amp hours," you know, just makes something up 15% of the time, or 5% of the time, people are going to be very mad at you, right? So to cross that kind of threshold, the 99% accuracy or higher that you need for the enterprise, I don't see how you can avoid architectures like ours.
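The "traffic cop" picture of a mixture of experts mentioned above can be sketched as a gate that scores each expert for a query and dispatches to the highest-weighted one. This is only a cartoon of the idea, and it makes no claim about how GPT-4 is actually built: in real mixture-of-experts networks the gate is learned and typically routes individual tokens inside the model, whereas here the experts, keyword lists, and baseline score are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts: each is just a function that tags its answer.
EXPERTS = {
    "code": lambda q: f"[code expert] {q}",
    "math": lambda q: f"[math expert] {q}",
    "general": lambda q: f"[general expert] {q}",
}

# Hand-written keywords stand in for a learned gating network.
KEYWORDS = {
    "code": {"python", "function", "bug"},
    "math": {"integral", "prove", "sum"},
}
GENERAL_BASELINE = 0.5  # the general expert wins when nothing else matches

def gate(query):
    """Score every expert for this query and normalise with softmax."""
    tokens = set(query.lower().split())
    names = list(EXPERTS)
    scores = [
        GENERAL_BASELINE if name == "general" else len(tokens & KEYWORDS[name])
        for name in names
    ]
    return dict(zip(names, softmax(scores)))

def route(query):
    """The traffic cop: send the query to the highest-weighted expert."""
    weights = gate(query)
    best = max(weights, key=weights.get)
    return EXPERTS[best](query)

print(route("fix this python bug"))
print(route("prove this sum converges"))
print(route("hello there"))
```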

Justin Grammens 27:52

Sure, sure. And of course, the more regulated it is, and the more that people's lives are on the line, the more that data needs to be right, right? I mean, it's one thing to buy the wrong drill with the wrong amperage. But it's another thing to, you know, give the wrong dosage.

Steve Hsu 28:06

Yeah, no, absolutely. We haven't talked about medicine at all. But there are clear cases where, let's suppose I just scraped all the information from WebMD, or Mayo Clinic, or some authoritative resource for, you know, just standard questions like, "Oh, I have a headache, what should I do?" Or, "I have these symptoms, did I have a heart attack?" Those are things that, like, a triage nurse on a helpline would be doing. But I'm pretty confident we could build an AI which is as good or better, like, at least slightly superhuman compared to the population of nurses that staff these helplines, just based on that information. So if the human just starts giving information to the AI, and the AI is using these rules, not even rules, but just the natural language information that's on the site, I think it would actually do a very good job of saying, like, "Oh, I need to escalate this," or, you know, "You probably just have this, maybe you should go get some aspirin." I actually think that's within the realm of possibility for our architecture. But I wouldn't allow a pure LLM to do that, because a pure LLM, even if you fine-tune it, is very capable of hallucinating.

Justin Grammens 29:13

Yeah, for sure. So you teach at a university. So do I, actually, here in St. Paul. I'm not a full-time faculty member. I'm an adjunct. I teach classes on machine learning and artificial intelligence around the Internet of Things, a lot of sensors and stuff, which is always a blast. But one of the things that I always ask people is, what sort of advice would you have? Or what advice do you give your students, I guess, that want to get into this field when they're done? Are there any courses specifically, books, conferences, you know, full immersion?

Steve Hsu 29:41

Yeah, I would say there's sort of two buckets that I would talk about. One bucket is, do you want to understand how a neural net slash transformer architecture slash LLM actually works? You know, is that important to you or not? It doesn't have to be. You don't have to be in that category. If you are in that category, you have to learn some things like linear algebra, you know, some things about optimization, things like this. It's kind of unavoidable that you learn this particular subset of mathematics and computer science. But then once you understand those things, you can go online, and there are very, very good lectures online where they explain to you, like, oh, how does the attention mechanism work? And what's the difference between a key and query matrix? And a student, like a kid, like my son who's, you know, high school aged, if he knows the math, he can actually learn a lot just from YouTube about how these models actually work. So that's one bucket, which I would say, like, look, if you are decent at math, like most computer science majors can understand linear algebra. So if you are good at math, and you are interested in really what's under the hood, then I totally encourage you to spend some time trying to figure it out, because this is the future. These general structures, these kinds of neural nets, are going to be the way AI is done for at least the foreseeable future. And so it's worth investing some time to understand that. The other bucket is applications. So it's basically like, okay, let me just assume there is this brain in the cloud called GPT-4. There's an API, I can just submit queries to it. And at that level, you can do a lot. Just get your feet wet, and write some code, and just try to build some little applications. And just working with these models will give you a lot of intuition. Even if you're not a developer, like, say, you're on the business side.
Like, let's say you're 100% on the business side. You just need to figure out, how is this going to affect my sales operation? Do I need all these people? Do all these people I'm employing need to have an AI helping them when they're on a sales call? You know, you can't have the right intuition for how you might use this technology unless you actually spend some time with it. So I encourage everybody to do that. So I think, really, it's broad. Every university should just have a class on, you know, either intro to AI, just the basics, or maybe some more advanced things like the mathematics of AI, or, you know, you can have everything under the sun. But I would encourage everybody to take classes like that.
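For readers in the first bucket, the question about key and query matrices can be made concrete with a few lines of code. What follows is a minimal sketch of single-head scaled dot-product attention in plain Python, not anything described in the episode: the function names (`project`, `attention`) and the tiny identity matrices are illustrative assumptions. The query matrix turns each token into "what am I looking for," the key matrix turns each token into "what do I offer," and the softmax of their dot products decides how much of each value gets mixed into the output.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def project(vec, matrix):
    # Multiply a row vector (length n) by an n x m matrix.
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec)))
            for j in range(len(matrix[0]))]

def attention(embeddings, w_q, w_k, w_v):
    # Hypothetical toy version: one head, no masking, no batching.
    queries = [project(x, w_q) for x in embeddings]
    keys = [project(x, w_k) for x in embeddings]
    values = [project(x, w_v) for x in embeddings]
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Two made-up token embeddings and identity projection matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
mixed = attention(tokens, identity, identity, identity)
```

With identity matrices each token attends most strongly to itself, so the first output leans toward the first token. In a real model the three matrices are learned, which is where the linear algebra and optimization Steve mentions come in.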

Justin Grammens 32:09

It's funny. I mean, you talk about the startup scene. It's just like, I wonder if there's an AI that can help me with this. And I feel like every day I google something, it's like, oh, yeah, there's a new startup, they're gonna help you, you know, do this now, right? And it's fascinating, I guess, to see all the applications that new companies are building and trying. Now, over time, some of them aren't going to be worth their weight at all, or they're going to be problems too hard to solve, you know, beyond just some of the basic use cases. But it's a fast-moving space right now, as you would probably agree.

Steve Hsu 32:37

It is moving super fast. And it's tough to keep track of everything. One thing I would say is that if you're on Twitter or on YouTube, you know, people are constantly announcing, oh, I built this integration with OpenAI that can do this, it'll, you know, read your email and summarize it. But there's a lot more of that stuff than there is actual production-level product. Because if you actually start talking to companies, and you say, like, what are you using? Do you get any benefit other than one of your employees occasionally using ChatGPT to look something up, so that would be going directly to OpenAI, or using Copilot to write code? Other than those two things, it's actually rare to find a company that's like, oh, there is an LLM application written by some dude that I actually get value from, and it's reliable enough that I can use it in my day-to-day business. The number of cases like that is actually quite small. And so you might say, like, wow, SuperFocus, you guys have 10 AI engineers, and they're working full time building stuff. And, you know, you've raised millions of dollars in venture capital. Was that really necessary? Because I just read on Twitter that so-and-so just put that on GitHub. And I'm like, sure, yeah, go try and deploy that and see how well it works. Right? So there is a gap in perception. There is definitely tons of ideation and creativity in the space. But you always have to remember, there's a reason why you don't just use some home-cooked Salesforce database tool. You buy the thing from Salesforce, because they actually build a professional-level thing that actually works. Kids generally are not going to just whip something up that's going to satisfy the enterprise customer. So there's a gap between what people announce on Twitter and YouTube and what is reality in the business world.

Justin Grammens 34:27

100%. For sure, I agree with that. There's a lot of difference between some sort of a proof of concept, right? Yeah. And something that is actually in production and being used.

Steve Hsu 34:36

Yeah, we're the ones who will tell you all the pain of the gap between a POC and an enterprise-ready product. It's not small in this space, I would say.

Justin Grammens 34:48

Yeah, I appreciate the time. I've got one more question for you. I mean, how do you see this affecting the future of work, whether it be your tool or AI in general? Like, you know, you talk about your son, right? You know, what's it gonna be like for him when he gets out into the workforce? How do you think it's going to affect that? I think that it's hard to predict how it's going to affect it, because there are some jobs that haven't even been invented yet, right?

Steve Hsu 35:11

I guess what I will say is that I was just actually tweeting about this this morning. So it's kind of funny. There are already some studies. There's a Stanford Business School professor who did a study of call center workers. So I think they looked at 5000 call center workers, some of whom were given access to ChatGPT to help them. And these were not specialized AIs, like the ones we build. This is just general ChatGPT. And they found a roughly 15% efficiency gain for the employees that had this tool available to them. So if you think about that, it's not trivial. That's like 1/6 of somebody's salary, right, is the value that we're getting from this tool in this academic study. And so that's got to put downward pressure on the demand for labor in those segments. And we see tons of interest from call centers and customer service companies in wanting to talk to us about an AI that will help them. So I see a lot of jobs where there's going to be huge impact, where a few years from now, literally everyone working in that job category will be doing it partnered with an AI. For some of them, what they were doing today, in 2023, is no longer done by a human, because it is almost human-out-of-the-loop. It's being done by the AI. So I think there will be some categories like that, although the minority. But in the short term, definitely, even just in a few years, a lot of people are going to be working partnered with an AI to improve their efficiency. I can see many, many cases of stuff like that.

Justin Grammens 36:40

Yeah. And it's a big debate. I was actually just on a podcast earlier today where, you know, it's kind of like job loss versus job augmentation. You know, I realize there are definitely going to be some job losses. You just don't need as many people when the people you have are becoming 15% more efficient.

Steve Hsu 36:57

15% was in that study, which they did, right? They did a study, maybe the paper came out in early, like, spring of 2023, so just a few months ago, but they were using very early tools. And I would guess with what we can build, I would guess not 15%, maybe 30% more. And then at that point, it's like, well, that is definitely macroscopically affecting the number of people you need in these jobs. Oh, you need a third fewer people in your call center than you needed before. You'll notice that. Like, one of our investors in SuperFocus is a fund based in the Philippines. And they have very tight connections to the call center industry. And the government there is already very worried about this, because call center work is 8% of their GDP in the Philippines. And so they're scared like crazy, trying to figure out, how are we going to help all of our domestic companies adapt to possibly reduced demand for human capital because of AI? So they're taking it very seriously.
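The percentages quoted here are rough conversational figures, and how an efficiency gain translates into headcount depends on what is held fixed. As a back-of-the-envelope sketch, assume the call volume stays constant and that an efficiency gain of g means each worker handles (1 + g) times as many calls. Both the assumption and the function names below are illustrative, not taken from the study discussed in the episode.

```python
def headcount_reduction(efficiency_gain):
    # Fraction of staff no longer needed for the same workload,
    # assuming each remaining worker handles (1 + gain) times as much.
    return 1 - 1 / (1 + efficiency_gain)

def gain_for_reduction(reduction):
    # Inverse: the efficiency gain required to cut staff by `reduction`.
    return reduction / (1 - reduction)

for gain in (0.15, 0.30):
    cut = headcount_reduction(gain)
    print(f"{gain:.0%} gain -> about {cut:.0%} fewer workers for the same volume")

print(f"A one-third cut needs roughly a {gain_for_reduction(1 / 3):.0%} gain")
```

Under this constant-volume assumption, a 15% gain corresponds to about 13% fewer workers and a 30% gain to about 23% fewer, while a full one-third reduction would take roughly a 50% gain. If cheaper service raises the volume of calls handled, the reduction would be smaller still.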

Justin Grammens 37:59

Yeah, yeah. Again, I'd say it remains to be seen. Obviously, there are some people that are going to have to be let go. And there are others where it's going to be like, well, how can we use you in a better, more humanistic way, right? Better at the things that humans are good at.

Steve Hsu 38:12

The joke that I've heard from the Philippines side of this is like, wow, I guess the case for UBI is, like, people are just gonna see the handwriting on the wall pretty soon. Like, wow, if you want to have a humane society, you better have some UBI, because there are going to be some humans that, you know, ultimately, their skill set is surpassed by what machines can do more cheaply. And then it's like, well, if you want to be humane, all that extra productivity we got in the economy, we should give some of that to these people so that they can have nice lives, even if they've been made a little bit obsolescent by the machines.

Justin Grammens 38:50

Yeah, for sure. That was Andrew Yang, right? What's his book? Pretty good.

Steve Hsu 38:54

Yeah, I forgot the name. But the funny thing is, a few years ago, there was a huge amount of discussion of UBI, and then it kind of went away. I guess I'm thinking back to the last election, 2020. And then it kind of went away for a while. But what we're seeing now is people, like, specifically our company and companies like us, that are like, oh, you see, this little super-specialized AI was built for you, and it works better than a human for doing this class of things that you need done. At that point, the light bulb goes off again, and they're like, wait a minute, wait a minute, we need UBI, because there are a lot of people where you're just better off with the AI than with the person. And I do think that's going to be something societies are going to wrestle with.

Justin Grammens 39:35

For sure. Yeah, the book is The War on Normal People, actually. It's really good. It talks about truck drivers and, like, yeah, what happens once all these... So yeah, the first couple chapters are a little bit...

Steve Hsu 39:47

The funny thing is, like, you know, humans are still very nimble, and the LLM breakthrough didn't suddenly make the robot extremely good at, like, wielding a soldering iron or something like that. Yeah, for the manual processes in the factories, you have to define, you have to just set things up in a very rigid way so the robot can do it and be very good at it. But these LLMs are so flexible. Things that have to do with processing human language, communicating with humans, figuring out what the human wants, the LLMs are extremely good and flexible at that. So that's going to be a place where I think a lot of people are, in a way, kind of made redundant by LLM technology.

Justin Grammens 40:24

Cool. We'll see. I appreciate your time. How do people get a hold of you? Is it good to find you on Twitter, LinkedIn? What's your handle?

Steve Hsu 40:32

If you want to pilot one of our AIs, if you run, like, Delta Airlines, you can reach me. Let's see, what's the right way to do this? You can DM me on Twitter, you can find me on LinkedIn, all those places.

Justin Grammens 40:46

Yeah, we do a full transcription of everything we talked about here. And we also have liner notes in there as well. So I will make sure to have links to SuperFocus and links to you on LinkedIn and Twitter and all that sort of stuff. So make sure to get that. So, well, thanks again, Steve. I appreciate it. This was a great conversation and, yeah, your product sounds super interesting. So nothing but the best. Best of luck. I'm sure, as you guys are working more and more in this space, it feels like something that, like you said, specifically enterprises are going to need. So best of luck.

Steve Hsu 41:14

Yes, thank you for watching. It's been a pleasure.

AI Announcer 41:17

You've listened to another episode of the Conversations on Applied AI Podcast. We hope you are eager to learn more about applying artificial intelligence and deep learning within your organization. You can visit us at AppliedAI.MN to keep up to date on our events and connect with our amazing community. Please don't hesitate to reach out to Justin at AppliedAI.MN if you are interested in participating in a future episode. Thank you for listening.

Justin Grammens

Host