NOTE: I am excited to announce that all listeners of the Conversations on Applied AI Podcast are eligible to receive a 50% discount at the 2022 Applied AI Conference! Just use the discount code of "podcast" when purchasing your ticket.
The conversation this week is with Daniel Kornev. Daniel is Chief Product Officer for DeepPavlov.ai, which is the developer behind the Deep Pavlov Open Source Conversational AI stack for building voice assistants. Daniel has also served as an adviser to the Alexa prize team from the Moscow Institute of Physics and Technology. Before DeepPavlov he founded and led an AI-driven startup called Zet Universe, and was both a technical program manager at Google and a dev evangelist at Microsoft. Daniel has an MS in computer science and has done extensive research in computer vision interaction.
If you are interested in learning about how AI is being applied across multiple industries, be sure to join us at a future AppliedAI Monthly meetup and help support us so we can continue to put on future Emerging Technologies North non-profit events!
Resources and Topics Mentioned in this Episode
Daniel Kornev 0:00
For example, a user might say that they want to have pizza: "Oh no, no, I don't actually want pizza. Let's go with a burger instead. No, no, burger is fine, but I also need a couple of products." And this is where a real human might lose track and control of what's happening, and a bot will lose the thread and lose control much, much faster. In reality, we humans clarify things, right? We ask clarifying questions, like, "Justin, what is that picture behind you?" Right, and you tell me the story about it, and so on. But for a bot to clarify something, it has to be programmed to do that. And in many ways, clarifying every single thing is not so easy.
AI Announcer 0:38
Welcome to the Conversations on Applied AI podcast, where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of artificial intelligence and deep learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry, and connect with us to learn more about our organization at appliedai.mn. Enjoy.
Justin Grammens 1:09
Welcome everyone to the conversations on applied AI Podcast. Today we're talking with Daniel Kornev. Daniel is Chief Product Officer for DeepPavlov.ai, which is the developer behind the Deep Pavlov Open Source Conversational AI stack for building voice assistants. Daniel has also served as an adviser to the Alexa prize team from the Moscow Institute of Physics and Technology. Before DeepPavlov he founded and led an AI driven startup called Zet Universe, and was both a technical program manager at Google and a dev evangelist at Microsoft. Daniel has an MS in computer science and has done extensive research in computer vision interaction. Thank you, Daniel, for being on the program today.
Daniel Kornev 1:45
It's a big pleasure of mine. Thank you so much for having me, Justin.
Justin Grammens 1:49
Well, you know, we love talking with people doing interesting and fun things in the space of artificial intelligence, and I can't wait to learn more about how you got into this space and some of the awesome things you're doing at DeepPavlov. Maybe we could rewind the clock and you could talk a little bit about the trajectory of your career and how you got to where you are today.
Daniel Kornev 2:09
There are like three things that changed my career forever, which is interesting. Number one was getting a computer back in 1994, with Windows 3.11 for Workgroups. And the second thing is probably when I learned about Windows Longhorn and WinFS, which was a big dream of Bill Gates, about a decade later. That was a fascinating idea that Bill imagined, where Windows would be smart, where it would know everything about the data in your computer. So essentially, he envisioned that the computer could have a personal knowledge graph in Windows, and all apps would work on top of it. It was a fantastic idea; I was so fascinated. It sounded like everything I'd read in science fiction could become real. As some listeners might be aware, WinFS died, and then it got restarted as a project called Microsoft Semantic Engine; it was called Arena internally. And that's what sparked my interest: I wanted to build these kinds of smart systems. I didn't want to make the mistakes that were made in WinFS, and I worked at Microsoft at the time, so I had the chance to talk to those people, and I was amazed. As we discussed a little bit earlier, people worked on things like personal assistants before that term was more or less coined. Back in the second half of the 2000s, most of what we knew about these systems was things Eric Horvitz made for Office, the small Microsoft Agents that inspired Clippy and other things, and CALO, the DARPA project on notes and organizers that spun off into Siri. Those are the kinds of things that inspired me a lot to build things.
Justin Grammens 3:43
That's awesome. So you were working at Microsoft, and I think you were a program manager and dev evangelist there, kind of getting into this conversational AI, or just understanding how these smart assistants would work. Now, where did you go from there?
Daniel Kornev 3:57
Essentially, when I joined Microsoft, yeah, I was a developer evangelist. But that was just for a year and three months, something like that; I quickly changed my career to become a program manager, because that's where I could be more productive. I could coordinate multiple projects at the same time, and I could easily talk to headquarters in Redmond, Washington. And I did a lot of that kind of work, but I was also driving the idea of making Windows semantic, or contextually aware. I worked a little bit with folks from the Windows 7 team, I worked with folks from Office Labs, and we built a Think Week paper. Bill Gates didn't read it, but some executives did, like Donald Thompson, who was a CTO at Microsoft back at the time; he's a friend of mine on Facebook today. And that inspired me to build something with my own hands. So not only did I inspire people at Microsoft (we actually had a workshop at Microsoft Research back in 2010, with something like 50 senior people across the company discussing this kind of stuff), but I was also building it with my own hands. I joined Microsoft Research as an intern. After the internship, I spent about a year at Google, then I left Google and decided to finally build my own company; I couldn't be stopped. And I spent more than five years building this kind of thing. I built a system that combined data from different data sources for product managers, project managers, and other people like me, and allowed them to see that data organized by project, not by source or vendor. This was inspired by WinFS: it's like a semantic middleware between everything in the user's workspace and the user, and I also built a mobile user interface on top of it so other people could interact with it. It was a very cool endeavor for me, because I had to build data pipelines that would grab data and process it using different processors.
And that was before pipelines became a thing in data science. I also built a visual interface with touch input and other things, so it was a lot of fun. At some point, I realized that I was building a lot of this middleware and UI, but I needed to focus on the AI assistants themselves, whether they have voice or text input. So I shifted to that. At Yandex, I helped shape the Alice assistant for the smart speaker in 2018. Then I worked for one of the conversational AI startups in Boston, and later I joined DeepPavlov to work on its open-source conversational AI stack. It's been a pretty crazy adventure, put it this way.
Justin Grammens 6:18
Yeah. And so, I mean, it sounds like maybe you're a serial entrepreneur; you really like working with small companies, working on sort of the cutting-edge new technology.
Daniel Kornev 6:27
Yes and no. After the first startup, I told myself that I wasn't going to work at a startup anymore, so I joined a big company. A year later, it turned out that politics in a big company can be crazy. I've seen something like half of the teams that worked on the assistants at those companies move to different places, leave the company, and so on. So I decided that maybe small companies are faster and less political. Just to give a comparison: at Yandex, at some point I had to coordinate 20 teams across five different organizations under five different vice presidents to get things done. It was a lot of fun, but obviously, when you coordinate with so many people, you get politics involved. In a small startup, you usually don't have that kind of problem; you don't have to fight through layers of approvals to get things done. You can just ask, you know, "Hey, John, can we put this feature in the system tomorrow, or in the next couple of months?"
Justin Grammens 7:22
Yeah, exactly. Exactly. So tell me a little bit more, then, about the size of the company, of DeepPavlov.
Daniel Kornev 7:28
Sure. I think it's about thirty people, roughly 25 to thirty. It's mostly people who work on applied research and build products. And then we have a small team that works on fundamental research, on things like memory transformers. But mostly the stuff I work on is driving this open-source conversational AI stack, making things happen.
Justin Grammens 7:49
We'll include links and stuff like that in the liner notes, off to deeppavlov.ai and the open-source conversational AI stack. But maybe you could summarize it in just, you know, a sentence or two, or a paragraph or two, with regards to what you guys are doing there.
Daniel Kornev 8:03
Sure. So basically, say you want to build your own Alexa or Google Assistant. Usually what you need to do is go to a big company, go through lots of politics, and grab a lot of resources. Then you have to build everything. You'd have to build an NLP stack, which would understand what people say, extract entities, extract sentiment, and the other features of the language that are important for understanding what people are doing. Then you need to build an engine that allows you to control the conversation between the user and the system with scenarios. Then you have to apply a generative model like GPT-3 or something like that, and so on; that's one side. Then you have to add lots of different skills that allow the system to do different kinds of activities on your behalf, like calling a taxi, playing music, and so on. Then you need to build orchestration that allows you to group all of those skills together and pick the right skill for the right ask from the user. And that's just one side of the story. The other side of the story is to build fulfillment, with a semantic middleware where you need to connect the dots between what the user wants and the services that can fulfill the user's demand. So if you want to call a taxi, then you have to have a service that wraps the API around a taxi service like Uber or Lyft, or something like that. As you can imagine, it's a huge endeavor. That's why the Amazon Alexa folks still have something like a thousand open jobs on their website. The same goes for Google. The same goes for Yandex; obviously, Yandex is a smaller company, and so on. But still, the problem is, it's a huge endeavor. What we're trying to do, what we're working hard to do, is to democratize this. So we've built a mechanism to build skills using a scenario-driven approach and through generative models.
We built an orchestrator that allows you to put lots of skills into the system and then orchestrate the conversation in such a way that when the user wants something, the correct skill is invoked and everything goes well. This is available as a stack of technologies. At the bottom, we have models that allow you to extract data from user phrases, like intents, sentiment, emotions, and other things, and entities, obviously. At the second level, as I said, there's the engine, where we have agents that drive the conversations. And finally, we have DeepPavlov Dream. Dream is our platform for building multi-skill AI systems; it allows you to build your own Alexa, essentially, from all of these tools. Everything is open-sourced, everything is under Apache 2.0, so you can grab it and bring it to your company and build your own thing based on this. Long story short, that's how you can build your own Alexa without spending many millions of dollars on building everything from scratch, like Amazon and Google and Microsoft did.
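The multi-skill setup Daniel describes, where many skills each propose a candidate response and an orchestrator picks one, can be sketched in a few lines. Everything below (the skill functions, the confidence scores, the selection rule) is a hypothetical illustration, not DeepPavlov's actual API.

```python
# Illustrative sketch of multi-skill orchestration: several skills each
# propose a (response, confidence) candidate, and an orchestrator picks
# the highest-confidence one. Names and scoring are invented.

def weather_skill(utterance):
    if "weather" in utterance.lower():
        return ("It's sunny today.", 0.9)
    return ("", 0.0)

def taxi_skill(utterance):
    if "taxi" in utterance.lower():
        return ("Calling you a taxi now.", 0.9)
    return ("", 0.0)

def chitchat_skill(utterance):
    # Fallback skill: always has something to say, but with low confidence.
    return ("Interesting! Tell me more.", 0.3)

SKILLS = [weather_skill, taxi_skill, chitchat_skill]

def orchestrate(utterance):
    """Run every skill and return the highest-confidence candidate."""
    candidates = [skill(utterance) for skill in SKILLS]
    response, _ = max(candidates, key=lambda c: c[1])
    return response

print(orchestrate("What's the weather like?"))  # -> It's sunny today.
print(orchestrate("Tell me a story"))           # -> Interesting! Tell me more.
```

The real system routes through trained skill selectors rather than keyword checks, but the shape, candidates in, one winner out, is the same.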
Justin Grammens 10:43
That's awesome. Google had this thing called AIY, where they had a vision kit and a voice kit. Do you remember them putting out those things some years back?
Daniel Kornev 10:51
I think a lot about that. They also acquired API.AI, and Google created Dialogflow out of it. In many ways, Google was secondary to Alexa: when Alexa shipped, Google just had Google Now, and it was a far cry from what Alexa had. So they had to move really fast to catch up. In many ways, I think Google still has to catch up. They have fantastic people; actually, some of them came from that same company. And I really, really admire their work.
Justin Grammens 11:21
I mean, like I said, I think they put together this kit, and it basically had a little Raspberry Pi, and it was using open-source libraries, from my understanding. I got the vision kit, which was basically running OpenCV, and they just kind of packaged it up into this kit you could buy for, I don't even know what it was, maybe $99 or $100, in a little cardboard box, and you would assemble the thing. But it was built on, you know, open-source software. Again, though, it was all wrapped in Google's ecosystem. It sounds like what you guys have built here is completely open source. I'm looking at your GitHub repository; I mean, you can download it, you can get a Docker container with everything sort of fired up and ready to go, right?
Daniel Kornev 12:01
Absolutely. And a couple of real things here. I really love what you said about the Docker containers and so on, but also about the Raspberry Pi. There was a company called Snips, founded by an ex-CTO of NeXT Computer; he later worked for Apple with Steve Jobs, in France, I believe. They had a fantastic team, and they managed to build a very small system, much smaller than ours, and much simpler, obviously, because of that, but they were able to run it on a Raspberry Pi completely. Back in 2018, after I left Yandex, I actually got Snips running, with a Bluetooth soundbar and a microphone. Actually, this is the microphone I have; it's omnidirectional, so it helps you imitate an Alexa Echo device. And I managed to build a very simple prototype of what you could do with Snips. It was fantastic that it worked on the device: you had the wake word working right there. Actually, the wake word on Alexa is also processed on the device, but everything else is processed in the cloud, whereas in the case of Snips, they were able to process it locally. But obviously, the local ASR was garbage, so they enabled you to use ASR from Google. Now, in our case, everything is running in the cloud, but in our cloud; you can download it to your own computers and run it yourself. It's a bit too big right now, and we're working on building smaller distributions. Let me put it this way: it's like Linux, where you have distributions, or Windows, where you have SKUs like Home Edition, Professional Edition, and so on. You basically grab the components and pick what you want to build. Our DeepPavlov Dream has the same model: there's an assets and distributions folder, and in there, in just a set of configuration files, you specify what kind of components you want to run.
And that's essentially how you build a system from our already-available components; you obviously can also build your own components. We have workshops illustrating how you can do all those things. So it's pretty much available to the general public, and we are working hard to bring more instructions and more tutorials.
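The distribution idea Daniel describes, a named configuration that selects which components to run, might look roughly like this in miniature. The distribution and component names below are invented for illustration; the real Dream distributions are sets of Docker/config files, not Python dictionaries.

```python
# Hypothetical sketch of "distributions": a distribution is just a named
# set of components, and building a bot means selecting one and wiring up
# only those components. All names here are invented.

DISTRIBUTIONS = {
    "dream_mini": ["asr", "intent_catcher", "chitchat_skill"],
    "dream_full": ["asr", "intent_catcher", "entity_linker",
                   "chitchat_skill", "weather_skill", "tts"],
}

def build_pipeline(distribution_name):
    """Return the ordered list of components a distribution enables."""
    try:
        return DISTRIBUTIONS[distribution_name]
    except KeyError:
        raise ValueError(f"Unknown distribution: {distribution_name}")

print(build_pipeline("dream_mini"))
# -> ['asr', 'intent_catcher', 'chitchat_skill']
```

The payoff of this layout is that adding or removing a capability is a config edit, not a code change, which is exactly the Linux-distribution analogy Daniel draws.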
Justin Grammens 14:01
What's the business model? So you guys have services that you run on top of this, and you can help companies?
Daniel Kornev 14:07
So right now, we are an academic organization, and we are currently working on making some of those things available commercially, too. One of the things we're building is an entity extraction service. It extracts entities from text and links them to Wikidata. There is a service called Dandelion that provides the same functionality, and there are some other companies that do the same. What we want to do is make this available to the general public; it's already running in an internal beta, and we'll get money from selling it. The same story goes for the whole platform, where you need, you know, tenants and tenant management and a web platform where you can configure your AI system without having to go to Docker or edit JSON files and so on. That's what we're working on. We hope to get some early betas by the end of this year and make them available. Right now, this is obviously more friendly to developers, who can cook everything up the way they want and use it.
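Entity extraction plus linking, as Daniel describes it, means mapping surface forms in text to knowledge-base identifiers. A real service uses trained NER and disambiguation models; the lookup-table sketch below only illustrates the input/output shape. The Wikidata IDs shown are real, but the matching logic is a toy.

```python
# Toy illustration of entity extraction + linking: find known surface
# forms in the text and map them to Wikidata IDs. A production service
# would use trained NER and a disambiguation model instead of a table.

KNOWN_ENTITIES = {
    "moscow": "Q649",      # Wikidata ID for Moscow
    "google": "Q95",       # Wikidata ID for Google
    "microsoft": "Q2283",  # Wikidata ID for Microsoft
}

def link_entities(text):
    """Return {surface_form: wikidata_id} for every known entity found."""
    found = {}
    for token in text.lower().replace(",", " ").split():
        if token in KNOWN_ENTITIES:
            found[token] = KNOWN_ENTITIES[token]
    return found

print(link_entities("Daniel worked at Google and Microsoft"))
# -> {'google': 'Q95', 'microsoft': 'Q2283'}
```

Linking to stable IDs rather than raw strings is what lets downstream skills ask follow-up questions ("Microsoft the company, not the stock ticker") against a knowledge graph.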
Justin Grammens 15:00
Was that more of a text-based solution? Is that what you were saying? Or is it voice?
Daniel Kornev 15:04
Excellent question. So our solution internally is obviously text-based. But we have models allowing you to use NeMo models from NVIDIA, which is our partner, by the way. So we use ASR and TTS models from them. We have a custom connector that allows you to send 16-kilohertz audio files to the system; the audio is recognized, transcribed into text, and sent to the pipeline, and then on the way back, the response text is fed to TTS and the system sends you the audio of it. I showed a demo of this a couple of years ago, at GTC in fall 2020. So it's pretty straightforward.
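The voice loop he outlines, ASR in front, the text pipeline in the middle, TTS at the back, is essentially function composition. The stubs below stand in for real models (e.g. NVIDIA NeMo checkpoints); only the wiring is the point, and all function names are invented.

```python
# Sketch of a voice turn: audio in -> ASR -> text pipeline -> TTS ->
# audio out. The asr/tts stubs stand in for real speech models.

def asr(audio_16khz):
    # Stand-in for a real speech-to-text model.
    return audio_16khz["transcript"]

def text_pipeline(text):
    # Stand-in for the full text-based dialogue system.
    return f"You said: {text}"

def tts(text):
    # Stand-in for a real text-to-speech model; returns fake audio.
    return {"waveform": f"<audio of '{text}'>"}

def handle_voice_turn(audio_16khz):
    """One conversational turn: transcribe, respond, synthesize."""
    return tts(text_pipeline(asr(audio_16khz)))

out = handle_voice_turn({"transcript": "hello"})
print(out["waveform"])  # -> <audio of 'You said: hello'>
```

Keeping the core system text-based, with speech models as swappable adapters at the edges, is what lets the same stack serve both chat and voice front ends.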
Justin Grammens 15:39
Yeah. And people can just go to demo.deeppavlov.ai, I think, and see a couple of demos as well, right?
Daniel Kornev 15:46
Yeah, actually, it's dream.deeppavlov.ai. That's where you can find a fantastic, beautiful landing page, where you can talk to the chatbot, and so on. We also have one important thing: everything you say to the chatbot will be available to the general public. So we put up a lot of placeholders that say, like, don't put in credit card details and stuff like that. And this allows people to get a public dataset. That's one of the big challenges in conversational AI: while people love to chat, like in our podcasts, most of the data that people actually talk about is private. So we want to help the developers in academia and the industry by open-sourcing the conversations between our users and our chatbot on dream.deeppavlov.ai.
Justin Grammens 16:34
I did go to dream.deeppavlov.ai, and yeah, there's a little chatbot here; you can start typing in your messages. Like you say, there's a warning that pops up that says don't put any personal data in here. I will definitely put a link off to this where people can play around with it. And I was just looking at your GitHub page, so that's phenomenal. So, first of all, I guess to back up: you know, whenever I hear of conversational AI, for some reason my mind just goes to Siri and Alexa, these types of things. But I mean, it really doesn't need to be only audio-based, right? It's really just as broad-based.
Daniel Kornev 17:03
Yeah, this is a good question. The fact is, we people are multimodal. So even right now, as we're recording this audio, you folks who are listening to us (thank you so much for that) are able to just listen to us. But the thing is, this conversation is video-based. So I can see Justin, because he has a face; I can see his smile and his eyes, and so on. And that's how we interact. Even when I don't talk, I still have gestures; I still have body movements. And the thing is, that's how we process information. The other thing is that when we process information visually, we can do it asynchronously; we can read it once again, and once again, and once again. But when you build a voice-only user interface, you have to design it very carefully, because people can keep in mind just a few things. User interfaces, and the interaction between users and an AI system, have to be multimodal as well: the system has to see you, it has to be able to recognize your body movements and gestures, and see where you're looking, and see you smile, and see you sad, and recognize it and react to it.
Justin Grammens 18:01
For sure. Yeah, you know, a startup that I'm working on right now with a friend of mine actually watches you as you do online presentations. It listens to the audio to make sure that you're, you know, not having so many ums and ahs or stuttering, and to slow down your pace. So it's not just strictly audio; there is also a whole video component to it as well. What you're talking about is essentially a coach, you know, things that humans would do in a lot of cases. But when you're trying to teach a class of 100 people, it's very, very difficult for you to do that. So we've created a piece of software that does a lot of that, riding on top of a lot of existing models and existing technology. But do you see this conversational AI as replacing humans, or kind of complementing what they do? Like, where do you see a lot of these things being used today?
Daniel Kornev 18:46
Oh, that's a crazy question. In some ways, it's obviously something that can help you, like the example you just gave, where people teach a class and can get suggestions from the system. It's actually very easy to hook conversational AI into a system like that: when you as a teacher are answering questions from the audience, a small chatbot can just stay around, not exactly like Clippy, but it may do something like this. And instead of asking stupid questions while you're answering and talking to the audience, it could give you something useful; it could show you contextual information relevant to what you're answering. Right? So in this way, it obviously can help you. On the other side, obviously, there is a lot of effort trying to replace humans wherever possible. I mean, the sales process is usually not the most interesting process. Yeah, of course, there is Glengarry Glen Ross, if I'm not mistaken, that fantastic movie. Okay, I've forgotten the actor's name; I'll probably get back to it. But the thing is, most people don't like the sales process, so why not automate it? We had a customer a couple of years ago we worked with; we built a version of our platform where it would reach out to bloggers and ask them if they want to do product placement in their blog posts or on Instagram, or something like that. The whole process is automated, and only when bloggers said that they actually wanted to be involved would the system connect them to an actual manager from the company. Now, I'm not going to say I'm super proud of the process; some friends of mine say, "Hey, Daniel, maybe that's not the best way to build a conversation." But the practical business application makes a lot of sense; you do a lot of cost saving with this kind of operation. On the other side, you can do a lot of things. I don't know, some years ago I gave a TEDx talk on functional literacy and other things.
But one of the dreams that we in the industry have is a system that is not just about us asking what the weather is like outside; it can be our teacher, it can be our friend, it can be our advisor. I can give you a small example, a hopefully motivational one. When I was a kid, my parents brought me to lots of different art studios. And every time it was the same process: I was learning how to draw with a pencil and how to draw with a pen; I was learning how to draw figures, and I was learning how to draw fruits, then landscapes, then humans, and so on. And all of those times, it was fun, but it was not so engaging. And then one day I went to another group in this place, and there was a person who didn't have an education in teaching; she was an architect, a working architect. And instead of taking me through the same process, she asked me what I draw when I'm not asked to draw, what I'm drawing in my free time. And I told her, and showed her some of my drawings, these black-and-white things. And she started bringing me, to each next lesson, small things she had collected over the years as a professional. And that inspired me to draw more and more and more, and then it got good enough that it brought me a book and so on. And at some point we had exhibitions: we had an exhibition in Moscow, we had exhibitions in Los Angeles, we had exhibitions somewhere else, and my personal exhibitions, and so on. I didn't become a professional artist; I went into program management at Microsoft instead, because I wanted to build things, not just draw. But the thing is, she was looking at me, and she genuinely wanted to uncover my potential. So a good example of a system, a hopefully motivational one, would be: imagine that there is a small assistant, maybe a teddy bear or something like that, that is with a kid, and it can see that the kid is drawing. Why couldn't the teddy bear be smart enough to recommend that the kid continue drawing?
And show good examples? Why couldn't it help the kid grow? Of course, it would be ideal if everyone could have a fantastic human teacher, but sometimes it's not possible. So this is crazy thinking, but if this could be done, imagine what kind of possibilities conversational AI would open up for us.
Justin Grammens 22:58
Absolutely. Yeah, I do see that future. I have two boys; they're eight and 10 years old. And you know, they have Alexas in their rooms, and they ask them to play music, they ask questions, the alarms go off in the morning. And I have heard of a number of startups kind of embedding that in dolls, in teddy bears, and other things, because in some ways that can be the kid's closest friend, and they can interact with it in a more human way if it's smart and intelligent. And they can get a whole lot of information, and a much richer experience, I guess, with this device, using all the knowledge that's on the internet. It's fascinating.
Daniel Kornev 23:39
And the thing is, for it to be smart enough and caring enough, we can control, obviously, what we want it to say and not to say. For example, the architecture of our DeepPavlov platform is a bit straightforward here: at some point, when different skills are generating candidate responses, one of those responses has to be picked and brought back to the user. There, we have a role for annotators, that's what we call them. They're basically classifiers, or rule-based systems, depending on what you want to do. They're just checking that you're not going to say something that might hurt the feelings of the user, that you don't say words that are prohibited, that you don't say something that might spoil the conversation, something like that. Of course, it's not ideal right now, but it can do that, and we can make the user experience more friendly. And this is obviously still a far cry from what we could do. After all, there was a robot called Bishop in the movie Aliens, made by Cameron, and he was a very friendly guy. He was really fighting for a better future for the people; he was protecting them. So this could be a great future to build. I just hope we're not going to build Ash from Alien, so, yeah.
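The annotator pass Daniel describes can be sketched as a filter over candidate responses. The banned-word list and fallback below are invented for illustration; real annotators combine trained classifiers (toxicity, sentiment) with rules like these.

```python
# Sketch of the "annotator" pass: before a candidate response reaches
# the user, rule-based or model-based checks can veto it. The word list
# and fallback message here are invented.

BANNED_WORDS = {"stupid", "idiot"}

def is_safe(candidate):
    """Rule-based annotator: reject candidates containing banned words."""
    return not any(word in candidate.lower().split() for word in BANNED_WORDS)

def pick_response(candidates):
    """Return the first candidate passing all annotators, else a fallback."""
    for candidate in candidates:
        if is_safe(candidate):
            return candidate
    return "Let's talk about something else."

print(pick_response(["You are stupid", "Nice weather today!"]))
# -> Nice weather today!
```

Because the annotators sit between generation and output, a generative skill can propose anything it likes and the system still has a final, inspectable gate on what is actually said.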
Justin Grammens 24:50
I guess tools can be used in all sorts of different ways, but as long as we keep the positive side of it going, I think humans will continue to, hopefully, use it for good. I was thinking about one of the other conversational things that I remember; this maybe goes back at least seven, eight years now or so, and maybe you'd seen this, but I think it was called x.ai. It basically was a virtual assistant that would look at your calendar, and somebody could send you an email, and, I forget the names of the assistants, he and she, they basically had names, but it was flawed. There were a number of issues that I had when I was trying to schedule something with people: misinterpretations, a lot of back and forth. And I know that the person I was trying to schedule with ended up just stepping in and, a lot of times, just overriding it. Maybe that's not a good example, but I guess, thinking about it, what sort of flaws are you seeing that are still happening in this space, where maybe we need to get better?
Daniel Kornev 25:50
That's a good point. Four years ago, folks from the Nielsen Norman Group made a fantastic UX overview of what's good and what's bad in the user interfaces of AI assistants. There are a few stages. It starts with how you understand what users say, and that means it starts with ASR and so on. But then you have to interpret what the user said, and that's a different thing: one thing is to recognize the words; the second is to understand what the user wants. Then you have to do something for the user, so you usually have to fulfill the user's request; then you have to generate a response back, and then turn it into voice, and so on. But the thing is, there are problems at all levels. Start with ASR: okay, ASR is much, much better today than it was a couple of years ago, and that's pretty much true, but particularly for the English language. But when you want to understand what the user wants, there is a myriad of problems. For example, a user might say that they want to have pizza: "Oh no, no, I don't actually want pizza. Let's go with a burger instead. No, no, burger is fine, but I also need a couple of products." And this is where a real human might lose track and control of what's happening, and a bot will lose the thread and lose control much, much faster. In reality, we humans clarify things, right? We ask clarifying questions, like, "Justin, what is that picture behind you?" Right, and you tell me the story about it, and so on. But for a bot to clarify something, it has to be programmed to do that, and in many ways, clarifying every single thing is not so easy. The way we humans work is that we create abstract models of how something works, and that's how we try to operate. But a bot doesn't have the capacity or capability to build this abstract model, so it has to be programmed if you want to get things done. Obviously, it can also use GPT-3, but then you will never know when it will start faking things.
And he'll still think that that's fine. It's like ask my assistant create a reminder for something and it said, Okay, now we'll do and justice must respond to a chatbot, a chat chat system. So there are a lot of problems is interpreting what user said. And then obviously, you also have to build a pipeline of sorts of what has to be done. And here's the goals, there was a project called Facebook him, they built a fantastic AI system that was running converted limited baton in San Francisco. And the idea was that people could ask anything. And as their founder said, he left Facebook after that a few years after that, he said that in the beginning, people ask what is like the weather like outside, then they asked to show me his best Italian restaurant here. And at some point, they start saying, like, help me to organize wedding. And this thing is a way call with us to work was this behind the scenes of your people? The idea was that if a system can do things on its own, it will do what if it can not, that the people would step in, they will do the operations, and they will record what exactly they done. So that that could be automated. By the end of the experiment, which I think ended in 2017. Or something about that. They said only 30% of tasks could be actually automated. So imagine how many problems we still face in the industry. This is I think, one of the biggest problems we ever had in computer industry. Okay, I might be exaggerating a bit.
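The pipeline Daniel walks through — recognize, interpret, fulfill, respond, with a programmed clarifying question as the fallback — can be sketched as a toy in Python. Everything here is invented for illustration; it is not DeepPavlov's or any assistant's actual API:

```python
# Toy sketch of the assistant pipeline: recognize -> interpret ->
# fulfill/respond, with an explicit clarifying-question fallback,
# since a bot cannot infer what to clarify on its own.

def interpret(utterance: str) -> dict:
    """Map recognized words to an intent -- the genuinely hard step."""
    text = utterance.lower()
    if "pizza" in text:
        return {"intent": "order_food", "item": "pizza"}
    if "burger" in text:
        return {"intent": "order_food", "item": "burger"}
    return {"intent": "unknown"}

def respond(state: dict) -> str:
    """Fulfill the request or, like a human would, ask a clarifying question."""
    if state["intent"] == "unknown":
        return "Sorry, what would you like to order?"
    return f"Okay, ordering one {state['item']}."

def assistant_turn(utterance: str) -> str:
    # In a real stack, ASR output would arrive here; we start from text.
    return respond(interpret(utterance))

print(assistant_turn("No, burger is fine"))        # fulfills the request
print(assistant_turn("Also a couple of products"))  # falls back to clarifying
```

The point of the sketch is that the clarifying branch exists only because it was explicitly programmed; nothing in the pipeline produces it for free.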
Justin Grammens 29:02
Yeah, well, I mean, who would have thought that some of the early computers — which really were just calculators in some ways, right, they just automated tasks — would now be doing some very, very complex things today? Computers can use computer vision, they can drive cars, they can listen to and have some really interesting conversations back and forth. I mean, the whole contextual thing. I like to tell people it was kind of interesting: when I started using Siri, I would say, "What was the score of the Twins game?" — that's the Minnesota Twins, the baseball team here in Minnesota — and it knew the context around that. When you say "Twins game," it knew, oh, you're talking about baseball, and it would tell me the score. I thought that was pretty interesting, that it was able to understand contextually what was going on. Very, very cool. And it's just going to get better and better. But I guess, what do the next five to ten years look like for you, as a person who has been working in this field and seeing all these changes? Where do you see it going next?
Daniel Kornev 30:01
That's a good point. I liked what you said about context and the importance of context for a chatbot to understand what's going on. I think in the next few years we will see some sort of revolution in AI assistants, and the revolution will be driven by a couple of things. First, we will need some sort of semantic middleware. As folks at Microsoft used to say, middleware would be something between AI systems and reality — it can be IoT, it can be applications, it can be anything. This middleware would allow AI assistants to, practically speaking, do things for you. I can give an example from a couple of years ago. I have a huge house, like 300 square meters — I don't know how big that is in square feet, but you can imagine it's quite big. The thing is, I put like nine Alexa devices across the house and brought a lot of smart devices into the house, and it turned out to be so hard to build something reliable. Three years later, half of the things are dead, lots of things don't work right away, smart switches don't always respond — because half of my house blocks the signal, and the first floor is concrete, so I had to put in lots of repeaters and Wi-Fi segments. If we want AI systems to be able to do things for us, there has to be some sort of semantic middleware that connects them to everything, so that we don't have to connect everything individually — I just plug things in and they work. The second thing is that AI assistants have to be able to experiment. There's a fantastic competition called IGLU, from Microsoft Research. My manager, Mikhail Burtsev, a famous researcher in the conversational AI field, is a co-organizer of it — to the best of my knowledge; I might be a little bit wrong, but as far as I know he is a co-organizer. The point is that AI assistants can make mistakes; they can fail.
And to do that, they need a safe environment where they can be taught, where they can learn how to clarify things and how to do things for us. That environment has to be safe, so that when they make mistakes, it won't affect us. So imagine that we have this semantic middleware that connects everything at the bottom, and then we have a simulation of that semantic middleware on top of which you can train your AI assistants. And when we have trained them well enough to understand and interpret what the user says and map that to the semantic middleware, we can slowly and steadily bring them back to the real world and make things happen. That, I think, is what makes sense for the next five to ten years.
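The semantic-middleware-plus-simulation idea above can be sketched as a single abstract interface with a simulated implementation the assistant can safely fail against. All class and method names here are invented for the example; no real product exposes this API:

```python
# Toy sketch of "semantic middleware": one uniform layer between an
# assistant and the world, with a simulated backend where mistakes
# only touch in-memory state and never a real home.

from abc import ABC, abstractmethod

class SemanticMiddleware(ABC):
    """Uniform layer between AI assistants and devices/apps/IoT."""
    @abstractmethod
    def perform(self, action: str, target: str) -> bool: ...

class SimulatedHome(SemanticMiddleware):
    """Safe sandbox: mistakes affect only this dictionary."""
    def __init__(self):
        self.state = {"light": "off", "door": "locked"}

    def perform(self, action: str, target: str) -> bool:
        if target not in self.state:
            return False          # a mistake -- harmless in simulation
        self.state[target] = action
        return True

def assistant(command: str, world: SemanticMiddleware) -> str:
    # Crude command parsing: "turn on the light" -> ("on", "light").
    action, _, target = command.partition(" the ")
    ok = world.perform(action.replace("turn ", ""), target)
    return "done" if ok else "sorry, I can't do that yet"

sim = SimulatedHome()
print(assistant("turn on the light", sim))  # succeeds in the sandbox
print(assistant("turn on the oven", sim))   # fails safely, nothing breaks
```

Once an assistant behaves well against `SimulatedHome`, the same `assistant` function could be pointed at a real backend implementing the same interface — which is the "bring them back to the real world" step Daniel describes.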
Justin Grammens 32:37
Absolutely. I love that idea, that you can put this AI in a certain state and have it make mistakes. I mean, that's really how children learn, exactly — it's just a series of mistakes. So they need to be put in safe areas and essentially run exercises over and over to train the model. I actually had never even thought of that. I love that. So that's project IGLU, I guess? Is there a public-facing thing, or...?
Daniel Kornev 33:03
No, no, no, it's not connected to DeepPavlov directly — it's just that my boss is also a co-organizer of IGLU. I just say that IGLU is a fantastic illustration of how that could be done. They built a very small demo where you have a bot in Minecraft, and when you talk to it, it can do things for you; for example, it can build a house for you. Right now it does that through commands. But what if — and that's more or less the idea of IGLU — what if you could say "build a house," and then say "repeat after me"? What if you could say, "I'm going to build a wall; put a couple of walls here, here, and here, with a ceiling and the stairs"? What if you could explain things in natural language to the bot, and it would learn? It's a fantastic premise, and IGLU is driving this from a research perspective. What I was talking about with DeepPavlov Dream and the semantic middleware is a small experiment where we want to combine the two things together — and obviously we have to build this semantic middleware and build a simulation, so that we can experiment with this kind of thing. And they are not alone in this. Amazon has the competition they call the Alexa Prize, where a lot of chatbots participate to build a fantastic chatbot that can sustain a twenty-minute-long conversation about anything, but they have a couple of other challenges too. One was the TaskBot challenge, and there was a SimBot challenge, where you give commands in natural language to a bot, and that bot, inside your apartment, can do things for you — it can check whether the gas is on or off, or whether the windows are open or closed, something like that. And that competition is happening in a simulation as well. But you know that Amazon has Astro — it's still limited, but it's a physical body that can move around, and it has Echo inside; it's like a roaming Echo, in a way. So this is already happening anyway.
It's just that Amazon's technology is proprietary and closed, and it's very hard to build something on top of it, because you have to be strict about it: you have to be using their APIs, and you have to be using their sandbox. But generally, that's what I think is happening — and not just me; my colleagues think so as well.
Justin Grammens 35:18
Yeah. And so, I mentioned during the intro that you were an advisor to the Alexa Prize team — I'll put a link to the Alexa Prize. But tell us a little bit about that, and what it was like being an advisor.
Daniel Kornev 35:28
Crazy. So basically, when you want your social bot to talk to real people, and you want it to engage them, you have to design the chatbot with some sort of logic behind it. My advisory part covered a few things. I couldn't participate directly — I couldn't write code for the system, because I wasn't a student; it's for university students only, and I'm no longer a university student or a faculty member. Although, I mean, when I cut my hair I can really look like a student anyway.
Justin Grammens 35:56
You could pass for one, yeah.
Daniel Kornev 35:59
But the thing is, as an advisor, I was reading the dialogues and helping folks change the design of the scenarios, so that we would fix the conversation and make it better. We were also building a scenario-driven engine, and I was experimenting with this engine, building some kinds of things and showing them to the team, so that the team could figure out how those skills could be built on the system. We used the same platform, Dream — Dream was actually a product of participating in a couple of Alexa Prize competitions. Then there was a second thing: our original application to the Alexa Prize competition was where my boss Mikhail Burtsev and I coined the idea of goal-driven dialogue management, where we tried to build the system to understand the conversational goals of the user in chit-chat, detect those goals, and lead the conversation towards them. It turned out to be much harder than we originally anticipated, so we now have a small research group of four or five people working on this full time. During the Alexa Prize competition, those are some of the things we experimented with. And of course, there's reading and fixing the technical report, meeting with the team every day, and so on. That's the typical work you usually have in the group.
Justin Grammens 37:15
Yeah, that's good. And so, does this happen every year, the Alexa Prize?
Daniel Kornev 37:19
Unfortunately, not. The last time it happened, it ended back in July of 2021, and they haven't announced the next one yet. Instead, they have the SimBot challenge, and that's pretty challenging in itself. So I think the teams will go through the SimBot challenge, and they'll probably announce the next Alexa Prize maybe in November, I think — but I don't know. It's not like I have a chance to open up the curtain and know what's happening.
Justin Grammens 37:41
Sure, sure. If I was just coming out of school, a university student, I guess there are some of these challenges I can get myself involved in. What are some other ways you might suggest somebody get into the conversational AI landscape and start learning?
Daniel Kornev 37:54
The best way to learn, I believe, is by doing. There's a fantastic blog by Steven Sinofsky, who used to be president of the Windows division, called "Learning by Shipping." In many ways I agree with him: that's how you do it. So, for example, you could go to DeepPavlov Dream, grab the bot, and try to figure out how it works. You can just run it on your computer, because we have proxies — you can run it through our proxies to every single branch in the cloud, while the whole bot runs on your machine. Then you can add more components to it. We have a very beautiful domain-specific language in Python that allows people to quickly write scenario-driven skills on top of it, and that makes it very, very easy. We have actually recorded workshops and tutorials explaining how one can do that. That's how you can start. Once you get this bot up and running, it's like having Iron Man's AI system, Jarvis, in your hands. And when it's working — working in your hands — you can slowly and steadily add things to the system and see how it works. I think it's fantastic. It's like when you want to know how a car works. I don't have one, but I know from experience with my friends that the best way to understand how it works from an engineering perspective is to drive it, and then to look at the machinery — how it works, how the engine works, and so on. That's how you do it.
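DeepPavlov's actual Python DSL for skills may differ in its details, so here is a hypothetical miniature of what a "scenario-driven skill" generally looks like — a dialogue graph of states, each with a response and keyword-based transitions. All names are invented for illustration:

```python
# Hypothetical mini-DSL for a scenario-driven skill: a dict of dialogue
# states, each with a canned response and keyword transitions to other
# states. Real frameworks add conditions, slots, and NLU; the shape is
# the same.

SCENARIO = {
    "start": {"response": "Hi! Want to hear about conversational AI?",
              "next": {"yes": "pitch", "no": "bye"}},
    "pitch": {"response": "Great -- try running DeepPavlov Dream locally!",
              "next": {}},
    "bye":   {"response": "No problem, see you!",
              "next": {}},
}

def step(state: str, user_text: str) -> tuple[str, str]:
    """Move to the first state whose keyword appears in the user's text."""
    for keyword, target in SCENARIO[state]["next"].items():
        if keyword in user_text.lower():
            return target, SCENARIO[target]["response"]
    return state, SCENARIO[state]["response"]  # stay put if nothing matches

state = "start"
state, reply = step(state, "Yes, please!")
print(reply)
```

The appeal of this style, as Daniel suggests, is tight control: every branch of the conversation is an explicit, inspectable edge in the graph, which is exactly what purely generative models don't give you.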
Justin Grammens 39:10
Nice, nice. I like that, I like that for sure. Well, Daniel, this has been an awesome conversation. Is there anything else you wanted to talk about, maybe, that I didn't really touch on in this whole space of conversational AI? We haven't had too many people on talking about this, and I'm thrilled that you were able to take some time today to talk to us about it. But I want to make sure that if there were other things that maybe I missed, or we didn't discuss, you could share them.
Daniel Kornev 39:33
I think there's one thing that connects back to Scott Liebman and your earlier conversations. One of the fantastic things here is that when you want to build conversational AI, some people believe that you just have to use large generative models, and that's about it. But the thing is, it seems like the reality is neuro-symbolic: at some point, you have to use things like knowledge graphs, like rules, and so on to get things done. Sometimes you need to base your work on neural networks — for example, when you want to build a classifier for something, like for understanding intents, or sentiment, or the emotions of the user, neural networks are fantastic. But when you want tight control over how the experience works, you might want to use some sort of scenario-driven approach. The thing here is, if you want to build fantastic things in conversational AI, you should not lock yourself into just one of the directions; you should open your eyes really wide and see all the benefits the different approaches enable you to build with. As an example with knowledge graphs: when we participated in the Alexa Prize, at first we just had named entity recognition, which could recognize something as a city, or a person, and so on, in the phrases. But when we connected it to knowledge graphs, like Wikidata, we were able to pull so much structured data that it could talk about it for hours. That's just a small but fantastic example of how much power lies in combining neural networks and symbolic approaches in AI together.
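The NER-plus-knowledge-graph combination Daniel describes can be shown with a toy: a stand-in entity recognizer (rule-based here, neural in practice) feeding a symbolic lookup over Wikidata-style triples. Every name and fact below is invented for the example:

```python
# Toy neuro-symbolic combination: an entity recognizer (here a keyword
# stand-in for a neural NER model) feeding a symbolic lookup over a
# tiny in-memory triple store that mimics Wikidata-style facts.

KNOWN_ENTITIES = {"minneapolis": "CITY", "prince": "PERSON"}

TRIPLES = [
    ("minneapolis", "located_in", "Minnesota"),
    ("minneapolis", "population", "429954"),
    ("prince", "born_in", "minneapolis"),
]

def recognize(text: str) -> list[str]:
    """Stand-in for a neural NER model: find known entity mentions."""
    return [w for w in text.lower().split() if w in KNOWN_ENTITIES]

def enrich(entity: str) -> list[tuple[str, str]]:
    """Symbolic step: pull structured facts the bot can talk about."""
    return [(p, o) for s, p, o in TRIPLES if s == entity]

for ent in recognize("Tell me about Minneapolis"):
    print(ent, enrich(ent))
```

Bare NER only labels the mention; the graph lookup is what turns "Minneapolis is a CITY" into hours of material — which is the jump Daniel says his team saw in the Alexa Prize.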
Justin Grammens 41:16
That's fascinating. Yeah, there seems to be a pretty big movement around graph databases — there's a company called TigerGraph, and there are a number of people in town here that are speaking a lot about it. Just this idea of structuring data in a different way using graph technology. Are you familiar with some of those specific solutions?
Daniel Kornev 41:33
I mean, sure. When I was working on my startup, Zet Universe, I looked at, I don't know, ten, twenty, thirty examples of different mechanisms for how you can build it. At DeepPavlov we do it a couple of ways: we use Wikidata — we just grab their data dump in the HDT format — and for ourselves we built a small custom knowledge graph where we can record things like what people said, and link that to the world knowledge graph; we use Neo4j for that. There is, by the way, a fantastic team in Iran that has built an entire stack of technologies — a key-value store, a document database, and graph storage on top of it — and they're building the stack from the ground up using C++ and so on. They're fantastic at doing that, and a good example of what you can build from the ground up; it perfectly illustrates what kind of data structures you need.
Justin Grammens 42:19
Yeah, yeah. And it borders on the whole semantic web, Web 3.0 movement that seems to have been going on for twenty years now — just trying to get the Internet to really behave more... if the data is more structured, in a way that it can be processed better than parsing HTML, so much the better.
Daniel Kornev 42:38
The biggest problem here is how you build the bridge, and in which direction. The semantic web idea of a couple of decades ago was fantastic, but it was impossible to encode everything on the web the semantic web way. Yes, we have schema.org, which allows you to encode many things on the web in machine-readable formats, but the gap is probably closed differently: use NLP to extract entities. For example, we have a technology that allows you to extract triplets from free text, and that's how we can build a knowledge graph of what's going on around us. So I think it's a question of how you build the bridge.
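The triplet-extraction bridge from free text to a knowledge graph can be shown with a deliberately naive sketch. Real systems use dependency parsers or neural models; this one just splits on a tiny verb list to show the shape of the output, and everything in it is invented for the example:

```python
# Naive triplet extraction: turn simple subject-verb-object sentences
# into (subject, predicate, object) tuples -- the raw material of a
# knowledge graph built from free text.

VERBS = {"founded", "acquired", "built"}

def extract_triplets(sentence: str):
    """Return (subject, predicate, object) for a simple S-V-O sentence."""
    words = sentence.rstrip(".").split()
    for i, w in enumerate(words):
        if w.lower() in VERBS:
            return (" ".join(words[:i]), w.lower(), " ".join(words[i + 1:]))
    return None  # no known verb -> nothing extracted

graph = [extract_triplets(s) for s in
         ["Daniel founded Zet Universe.",
          "The team built a custom knowledge graph."]]
print(graph)
```

Accumulating such tuples gives exactly the triple-store shape that graph databases and Wikidata-style stores consume — NLP closing the gap that hand-encoded semantic web markup never could.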
Justin Grammens 43:16
Yeah. And I love what you said earlier: we need to take an open mind, an open approach. The way it's been done in the past maybe isn't the right way going forward, so people should start thinking about how they can apply all these different technologies and, yeah, just attack the problem differently.
Daniel Kornev 43:33
Absolutely, and that's something I totally agree with. That's how we build; that's how we evolve.
Justin Grammens 43:38
Well, Daniel, how do people reach out to you? Just look you up on Google, I guess, or find you on LinkedIn?
Daniel Kornev 43:42
Yeah, the easiest way is to just put "Daniel Kornev" into Google, and you will probably find me on LinkedIn, Facebook, Twitter, and so on. I have a Twitter handle, D-A-N-I-E-L-K-O — that was my alias at Microsoft. I still use it, like, a decade later. That's probably the easiest way to talk to me.
Justin Grammens 44:02
Gotcha. And it looks like Zet Universe is still kind of up and running. Is that true?
Daniel Kornev 44:06
After I stopped working on Zet Universe full time, I just use it as a consulting gig. So when I want to consult on building cognitive computing systems, or something like that — no matter how fancy it's called, it's still the same stuff — I use it as a way to provide services. So if you want to work on something like that, I can provide those services through my company.
Justin Grammens 44:24
Excellent, cool. Yeah, we'll put a link to that as well, if people want to reach out. But thank you, Daniel. I appreciate the time and all of the information that you gave to us and the Applied AI community. It's been a great, great conversation, and I'd love to have you on in the future to see how things are changing and evolving in this fast-moving space of conversational AI.
Daniel Kornev 44:43
Justin, it's been my big pleasure. Thank you so much for this opportunity. I've enjoyed this conversation.
AI Announcer 44:50
You've listened to another episode of the Conversations on Applied AI podcast. We hope you are eager to learn more about applying artificial intelligence and deep learning within your organization. You can visit us at appliedai.mn to keep up to date on our events and connect with our amazing community. Please don't hesitate to reach out to Justin at appliedai.mn if you are interested in participating in a future episode. Thank you for listening.