Conversations on Applied AI

Welcome to the Conversations on Applied AI Podcast where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of Artificial Intelligence and Deep Learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry and connect with us to learn more about our organization at AppliedAI.MN. Enjoy!

Pete Warden - Bringing Deep Learning to Microcontrollers with TinyML

October 11, 2022 • Justin Grammens • Season 2 • Episode 26

The conversation this week is with Pete Warden. Pete's mission is simple: organizing the physical world's information to make it universally accessible and useful. He was the founder and CTO at Jetpac, where he built a technical team and created a unique data product that analyzed the pixel data of over 140 million photos from Instagram and turned them into in-depth guides for more than 5,000 cities around the world. Jetpac was acquired by Google in 2014. From there, he was a founding member and technical lead of the TensorFlow mobile team, responsible for deep learning on mobile and embedded devices. Today, he is the CEO at Useful Sensors. He is also the co-author of the highly successful book TinyML, published by O'Reilly. Pete holds a Bachelor of Science from the University of Manchester and is working on a PhD in computer science at Stanford University.

If you are interested in learning about how AI is being applied across multiple industries, be sure to join us at a future AppliedAI Monthly meetup and help support us so we can put on future Emerging Technologies North non-profit events!

Enjoy!

Your host,
Justin Grammens

Pete Warden 0:00

A lot of the thought around TinyML is, okay, instead of something that, you know, takes about a watt for the CPU, give or take, if you could actually run the same sort of algorithms on a thousandth of the power, you've got a chance to actually run on batteries for like a year or two. You imagine being able to put all of these things out into the environment and have this ambient computing idea. One of the things I'm really interested in is, hey, can we actually have like a $5 device that sits looking at a water meter and is then able to, every three minutes or something, send that number up to the cloud?

AI Announcer 0:40

Welcome to the Conversations on Applied AI podcast, where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of artificial intelligence and deep learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry, and connect with us to learn more about our organization at AppliedAI.MN. Enjoy.

Justin Grammens 1:11

Welcome, everyone, to the Conversations on Applied AI podcast. Today we're talking with Pete Warden. Pete's mission is simple: organizing the physical world's information to make it universally accessible and useful. He was the founder and CTO at Jetpac, where he built a technical team and created a unique data product that analyzed the pixel data of over 140 million photos from Instagram and turned them into in-depth guides for more than 5,000 cities around the world. Jetpac was acquired by Google in 2014. From there, he was a founding member and technical lead of the TensorFlow mobile team, responsible for deep learning on mobile and embedded devices. Today, he is the CEO at Useful Sensors, which I'm sure we will learn a lot more about during this interview. He is also the co-author of the highly successful book TinyML, published by O'Reilly, and this is a must-read for anyone interested in machine learning at the edge. It's fabulous. Finally, Pete holds a Bachelor of Science from the University of Manchester and a Doctor of Philosophy, a PhD, in computer science from Stanford University. So Pete, I want to say thank you so much for your time. It's a thrill to have you on the program today.

Pete Warden 2:11

No, thanks so much, Justin. And just to, you know, clarify a little bit, I'm working on my PhD at Stanford. So...

Justin Grammens 2:19

Well, we know you'll finish. I am confident of that. Well, man, awesome. With all that you've accomplished, I mean, where to start? I guess maybe from the beginning. You know, we like to keep these conversations pretty casual. How did you get into computers and technology?

Pete Warden 2:34

I mean, really, at a young age, my uncle actually gave us a Sinclair ZX81, "Zed-X-81" as we said in the UK. I think it was marketed as a Timex over here. And, you know, I was just a little kid at the time, but there were two things, really. One of them was my dad spent hours typing in programs, you know, from magazines. And that really impressed me, because I was like, wow, my dad's spending a bunch of his spare time typing away at this, so it must be really interesting. And there were always bugs in those printed programs, because you'd have page upon page of text to copy. And so I always got really fascinated by, oh, how can you actually understand these programs and fix the problems that come up as you type them in?

Justin Grammens 3:26

That's awesome. Yeah, no, I remember writing BASIC when I was in like fourth grade, and my parents bringing home an old Apple II. I guess it's old now, but at the time it was new. My mom worked for the school district here in Minnesota and was able to check out the Apple II computer and bring it home, and I just started playing around with it, you know? And so did that then continue to be a fascination of yours throughout high school and college and everything, sort of all the way through? Or did you kind of take a break from it?

Pete Warden 3:54

I managed to divide. I have to say, though, for a long time I was convinced that game programming was really, you know, the thing, because I spent a lot of time as a kid playing games back in the 80s. And I wasn't actually that good at them. So a lot of the programming I learned was by pressing the Break key and typing LET GOLD = 2000 so that I could actually, you know, progress through the game. Or, you know, one thing my sister still remembers was that I found a guide on how to give you more dollars in your SimCity save code every day to get covered like either gives a boost. Yeah, so I went through college really with that as my main goal, and actually spent five years as a game programmer, sort of starting in the PlayStation One era and focusing on doing graphics programming.

Justin Grammens 4:50

Wow, that's some legit stuff. I mean, I worked on building a game on Android, boy, probably in the early days of Android, with all of the sort of OpenGL stuff. Is that what you were doing? Kind of low-level work like that?

Pete Warden 5:03

Yeah, I was actually writing 3D engines. So back then, especially on the PC, before you could rely on there being a 3D graphics card, I helped write some, you know, software renderers, actually scanline-rendering triangles and textures. And I had to figure out how to render pool balls for a game, with things like specular highlights. So yeah, it was a great initiation into programming. And the other nice thing about it was that a lot of the games I worked on were conversions, especially from the PC to these consoles. So I spent a lot of my first five years of programming working with other people's code. And what I learned from that has really stuck with me.

Justin Grammens 5:54

Interesting. Did you find it a good way to learn, I guess, or to fix problems? I'm just trying to think a little bit more about how that played out with regards to how you view the world of programming.

Pete Warden 6:04

Well, it meant I had to spend a lot of my time taking pretty big code bases and trying to figure out ways of understanding them and comprehending what was actually happening, and then, you know, understanding them well enough to be able to rewrite the parts that we needed to replace for these different platforms. And I discovered that putting effort into debugging, and writing debugging tools, was one of the most effective ways to spend my time. Just trying to pull in as many debugging tools as I could, you know, profilers and printfs, breakpoints, some test code, whatever. So that's another thing that stuck with me. I've never regretted putting time into debug tools.

Justin Grammens 6:53

Gotcha. Yeah, very cool. So this was probably after you got out of college, maybe one of your first couple of jobs. I think I saw you were at Apple, too, for a period of time.

Pete Warden 7:02

Yeah. And that was quite a funny journey, because I was working in the game industry. And, you know, I was enjoying the technical challenges, but it wasn't the creative. What I found was that I'd started off with this idea that I really needed to be worrying about the inner loops of these graphics engines, and, you know, really optimizing the code and being this very low-level hacker. But what I realized over the course of a few years was, actually, if I spent time working on the tools that the artists were using, I could get much bigger improvements in the quality of the graphics and the game. For example, I'd be able to give them a preview where they could see how it would look in the game, and that meant that they could remove a lot of the polygons without affecting the overall visual quality. So the time I spent would actually have a much bigger impact on the game if I was working on tools versus working on the actual engine. You know, basically an average graphics engine with great tools and a great toolchain for artists would beat the pants off something that was super tuned by a genius at the low level but had tools that the artists wouldn't be able to use effectively.

Justin Grammens 8:21

Boy, I mean, we're gonna end up talking a lot around, you know, TinyML, and it feels a little bit to me like you were already thinking about optimization in some ways, even at that early time. Is that true?

Pete Warden 8:35

It is, but also, I'm kind of a recovering, how should I put it? Is it "early optimization is the root of all evil"? Premature optimization. I started my career as a premature optimizer. And most of my subsequent career has been figuring out, okay, yeah, you can tune those little loops, but actually, what is the point of that, you know?

Justin Grammens 8:59

You maybe don't need to. Yeah, no, I know exactly what you mean. You can over-engineer systems, and in the end it never actually ends up needing that much load. But that's awesome. I know we will start talking a lot about TinyML and some of the embedded stuff here as well, because to me, that is the ultimate in trying to optimize your code base. Before we talk about that, I'm curious to know, how did you get into a startup, right? What was the mind shift there about wanting to start your own business?

Pete Warden 9:27

So yeah, Apple I actually ended up joining because, in my spare time, I was doing a bunch of open source video effects that ran on laptops in the early 2000s for clubs and concerts. I actually ended up backstage with Def Leppard because one of their stage crew was using my software to help drive their, you know, their video production. That's my claim to fame. And Apple were actually looking at doing a whole bunch of real-time video effects at the same time. So they actually ended up reaching out, and back in 2003, my biggest worry was that Apple were going to go bust, which shows my keen business acumen.

Justin Grammens 10:10

I think we all did. We all thought it was done. And that was before they got the iPod going, I guess, at that time, right?

Pete Warden 10:15

So yeah, I ended up spending five years at Apple working on video effects. And that's really where I got into doing some practical computer vision, you know, things like trying to spot red eye in photos. And what really struck me was that none of it really worked. It was all very hacky. You'd write an algorithm where you had to kind of scan across the image for particular shades of red, and then you'd see if there were two red clusters next to each other to figure out if there was red eye. And all of that work that you put into one algorithm, you'd just throw away as soon as you needed to do something slightly different. But I was really interested in what more we could do with understanding. I felt like there was so much rich data in photos. I basically got my green card through Apple, and two weeks later I quit so that I could, you know, dive into the startup stuff. Because I love Apple, but they're not big on career growth. They really want you to focus on the current role that you're doing. Like, if you're responsible for the bottom-left screw on the MacBook Pro, they expect you to become the world expert on that, because they expect you to spend 20 years working on it. And, you know, there's something very satisfying about that in some ways, but I was really interested in what was happening around big data and data processing, and how that tied into image analysis. And that's really how I ended up doing Jetpac. I went through a couple of different adventures before I started Jetpac with Julian Green, who was the CEO, and I was CTO. But yeah, really the idea there was, wow, there are these billions of photos floating around of places in the real world. Can we actually try and construct guides automatically from pictures of bars, hotels and restaurants? You know, for example, to say, hey, is this bar dog friendly? Are there a lot of pictures of dogs? Or, back in that era, do a lot of hipsters hang out there, by looking for mustaches?

Justin Grammens 12:26

That's right, I did see that you were the first one to create an algorithm to recognize a mustache.

Pete Warden 12:31

Yes. And it actually worked. But what was interesting was we had a Sunday Times journalist in the UK actually use our top 10 in London to see if it was working correctly. And he found it mostly was, except he also found a bunch of Turkish cafes. Yeah, it was really interesting. But back then as well, the algorithm that I did for the mustache was still in that era of the red-eye sort of process, where I was using one algorithm to try and locate faces, and then I was sort of doing a histogram between the nose and the mouth to try and tell if it was significantly darker than the rest of the face. But that meant that shadows under people's noses were being highlighted as mustaches. And it was just this very manual, very unreliable process, doing this kind of image recognition.
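
As a rough illustration of the kind of hand-written heuristic Pete describes, here is a minimal Python sketch. It is not Jetpac's code. It assumes a grayscale image stored as a list of rows of 0-255 values, a face box and a nose-to-mouth box already supplied by some separate face detector, and it compares mean brightness rather than a full histogram. The function names and the `darkness_ratio` threshold are made up for the example.

```python
from statistics import mean

def region_pixels(image, box):
    """Return pixel values inside box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [image[y][x] for y in range(top, bottom) for x in range(left, right)]

def looks_like_mustache(image, face_box, lip_box, darkness_ratio=0.6):
    """Flag a mustache when the nose-to-mouth strip is much darker than the face.

    darkness_ratio is an arbitrary illustrative threshold, not a tuned value.
    """
    face_brightness = mean(region_pixels(image, face_box))
    lip_brightness = mean(region_pixels(image, lip_box))
    return lip_brightness < darkness_ratio * face_brightness

# A toy 6x6 "face" with uniform brightness and no dark strip.
plain_face = [[200] * 6 for _ in range(6)]
print(looks_like_mustache(plain_face, (0, 0, 6, 6), (4, 1, 5, 5)))
```

A rule like this has no way to tell a mustache from a shadow under the nose, since both simply darken the same strip of pixels, which is the failure mode Pete mentions.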

Justin Grammens 13:36

Yeah, yeah. Well, I mean, you were a pioneer in those days. You guys were just trying to figure out how to do it. Because, I mean, I don't know if it's been solved yet, per se. I would love to get your opinion just around facial detection in general. But I'm just super amazed. I have children, an eight and a ten year old, and it'll pick out their faces from when they were infants. It's amazing.

Pete Warden 13:59

Working on Google Photos. And that really was the, you know, the moment where I was struck by lightning, when I started to see the results of Alex Krizhevsky's work on AlexNet. These deep learning approaches just blew me away, both because they were significantly better in their results than all of the work that I'd seen before, but also because the same technique can be used for almost any of these vision recognition problems. You know, you can use the same algorithm to recognize whether it's a cat or a dog as you do for whether somebody has a mustache, and the only difference is what data you feed it for training. And I think that's going to have some of the longest-term impact as it spreads out across the whole computing industry, which I think is still in its early days, as you go from, you know, writing down a recipe of how to solve a problem to giving a bunch of examples of the problem that you want solved. One of my big hopes is around that. You know, writing down a recipe requires a lot of technical knowledge that is somewhat arbitrary, and you really have to be a trained engineer to do it. What was really fun at Jetpac, once we started getting the machine learning going, was that our head of marketing, Catherine, could sit down with folders full of photos and just try adding or removing photos and seeing how the model that was trained on those behaved. You know, we could just use transfer learning to retrain a model in a couple of minutes, so she could have this really fast feedback loop of actually training the model to recognize new categories. So it went from being this thing where there had to be an engineer in the loop to something where people who were domain experts could directly guide the algorithm to get the result they wanted.

Justin Grammens 16:05

Wow. Yeah. And as I've been, you know, studying machine learning here for a number of years, to me that was the big aha as well, right? It's sort of this picture of a box where, you know, rules come in and data goes out. And machine learning flips that around, right? Data comes in and rules come out. And I was like, wow, that is such a great idea. Why didn't we think about this years ago? And so in 2014, your company was acquired, and you ended up going to Google, right?

Pete Warden 16:34

Yeah. And, you know, we were trying to do a Yelp that was based on photos. And we got up to hundreds of thousands of users, but we never got into the millions of users that we needed for, you know, a free consumer app. But it did turn out that we were one of the few places where people were trying to run deep learning on mobile phones. So we open sourced a bunch of our code around that as we were working on it. And nobody was really quite sure what to do with deep learning on phones, but, you know, Google and Apple and other companies wanted to at least have the technology. And we had a library that let you run models, back in those early days, on phones and Raspberry Pis. And so that really was the reason that Google wanted to pick us up.

Justin Grammens 17:32

I see. And had TensorFlow been created and open sourced yet? That was a couple of years away, right?

Pete Warden 17:38

Yeah. I feel really lucky about the time that I joined, because TensorFlow had not yet got started. So together with Yangqing Jia, who was at Google at the time and who's the author of Caffe, I was actually working on some internal frameworks for running machine learning on phones, because there were already Google apps starting to ask for this. Then TensorFlow got started, and we moved our efforts over to helping out. You know, we became part of the TensorFlow team, back when we were only like 10 people, I think. It was way before it had been open sourced. But I ended up becoming the person who was responsible for anything weird that was not, like, being deployed in the cloud on an x86 server somewhere. So phones initially, and then over the years I became really interested in, okay, we've got this on billions of phones, it actually works pretty well, and we've found a lot of use cases for it. But I'd actually seen the speech team, and I'm gonna be careful how I say this, because I don't want to set off everybody's phones, but "OK Google". Yeah, we actually call it "OK G" internally, so in meetings the phones do not go off. They had a 13 kilobyte model back in 2014, a deep learning model that they were using, running in the always-on mode on phones. So the main application processor is turned off, and there's a DSP or a microcontroller that's sitting there just listening to the microphone. And if it thinks it maybe heard you say "OK G", it will then wake up the main CPU and do more sophisticated processing. But the starting point is this, you know, tens-of-kilobytes model. That left me thinking, wow, if you can do useful stuff in just tens of kilobytes with deep learning for this application, I wonder what other problems we can solve with this.
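
The always-on arrangement Pete describes is a two-stage cascade: a tiny, cheap model screens every audio frame, and the expensive model only runs when the tiny one thinks it may have heard the wake word. The Python sketch below shows that control flow only. The two "models" are stand-in functions and the 0.8 threshold is an assumption for illustration; none of this reflects Google's actual implementation.

```python
def run_cascade(frames, tiny_model, big_model, wake_threshold=0.8):
    """Screen every frame with the tiny model; wake the big model only on a likely hit."""
    confirmed = []
    wakeups = 0
    for frame in frames:
        # Cheap check, standing in for the model on the DSP or microcontroller.
        if tiny_model(frame) >= wake_threshold:
            wakeups += 1  # this is where the main CPU would be powered up
            # Expensive check, standing in for the larger model on the main CPU.
            if big_model(frame):
                confirmed.append(frame)
    return confirmed, wakeups

# Stand-in data: each "frame" is just a fake wake-word likeness score.
frames = [0.1, 0.2, 0.85, 0.3, 0.95]
tiny = lambda score: score        # hypothetical tiny model
big = lambda score: score > 0.9   # hypothetical larger, stricter model
confirmed, wakeups = run_cascade(frames, tiny, big)
print(confirmed, wakeups)
```

In this toy run the main CPU is woken for two of the five frames and confirms one of them, which is the point of the design: the costly stage sleeps most of the time.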

Justin Grammens 19:53

I will say, when I cracked open your book and started reading it, that's one of the first examples, and I use that example with everybody, right? I mean, people that are not in the industry don't understand Internet of Things. They don't understand machine learning. You know, "What is this TinyML thing you're talking about? I have no idea." And I'm like, well, it's machine learning on microcontrollers. And they're like, "That doesn't make any sense to me." And I'm like, well, think about "OK G", or think about "Hey, Siri", or think about, you know, "Alexa". I keep using all the words here, probably setting off all the things around me. But think about that. And then it's like, "Oh, I get it." You know, you don't want your phone to die because it's sitting there listening for those wake words and sending information to the cloud continuously. I mean, your phone would die within an hour if it had to do that. So it's a beautiful example with regards to where things are at. And that's great. I mean, so that was sort of the first thing that you guys were doing back in 2014, getting that running on a phone.

Pete Warden 20:46

Well, the interesting thing was Google actually already had some quite long-standing code to run this stuff on the embedded chips. The speech team had been doing this for several years already. So I actually learned a lot from people like Raziel Alvarez around things like quantization and how to get embedded machine learning running. And that's been one of the interesting things. There's been a lot of industry knowledge scattered across different companies, where people have kind of homebrewed their own solutions to, hey, I need to run machine learning on this weird chip with no operating system and 100 kilobytes of memory. But there had never been any kind of attempt to turn that into a more generally usable framework. So I can't claim to have invented very much. A lot more of what I've done has been just trying to take work that's already been out there and turn it into something that's a bit more general and accessible, and try and actually make it easier for people to get started.
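
Quantization, which Pete mentions here, generally means storing a model's numbers in fewer bits. The Python sketch below shows one common scheme, 8-bit affine quantization with a shared scale and zero point, so that each value fits in one byte instead of four. It is a minimal illustration under that assumption, not the speech team's method or any framework's exact implementation, and the function names are made up for the example.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned ints using one shared scale and zero point."""
    qmax = 2 ** num_bits - 1
    # The represented range must include 0.0 so that zero is stored exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works
    zero_point = round(-lo / scale)
    quantized = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the stored integers."""
    return [(q - zero_point) * scale for q in quantized]

weights = [-1.0, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
print(q, restored)
```

The restored values differ from the originals by at most about one quantization step, which is the precision traded away for a model roughly a quarter the size of its 32-bit float version.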

Justin Grammens 21:53

Yeah, well, I mean, I think wake words is one thing, but then you start thinking about all of the natural language processing that Google Assistant can do now. I mean, I can dictate stuff to it up and down. It works very, very well, and I'll go back and fix sentences along the way. Were you guys looking at those types of things? Even, I guess, basically image to text, right? I go and take a picture of a document, and now Photos pops up and basically says, "Do you want to transcribe it?" I'm like, what? Yeah, sure. Click yes. And it does a beautiful job. Were you guys looking at some of those applications as well?

Pete Warden 22:24

Uh, yeah. So a lot of those took place on the main phone CPU. We supported that through, you know, I helped build TensorFlow Lite, which is, I think, probably still the most popular framework for running ML on phones. And so all of Google's major apps picked that up for things like, you know, the obvious ones are the Photos use case, like you mentioned, and Google Translate. I still love picking up my phone and, you know, showing people, pointing it at a foreign-language menu or something, and just having the translation appear on the screen in real time, overlaid. Like, that's magic. And the only thing I regret is I keep showing that to people, and people keep asking me when it's going to be released.

Justin Grammens 23:14

No, it's there. And I...

Pete Warden 23:17

Like, it's there. I wish people had Apple's marketing.

Justin Grammens 23:21

Yeah, right. Right. There's all these features, they're just not good at making them apparent to the end users, I'm with you on that, for sure.

Pete Warden 23:27

So what really got me interested was, okay, you've got this phone, which is great, but you have to plug it in and recharge it every night. A lot of the thought around TinyML is, okay, instead of something that, you know, takes about a watt for the CPU, give or take, if you could actually run the same sort of algorithms on a thousandth of the power, you've got a chance to actually run on batteries for like a year or two. How does that change things? You know, you imagine being able to put all of these things out into the environment and have this ambient computing idea. And, you know, for the example you used, of, hey, I've taken a photo of something and it understands the text that's in there, one of the things I'm really interested in is, hey, can we actually have like a $5 device that sits looking at a water meter, understands the display that's being shown on this old-school water meter, and is then able to, every three minutes or something, send that number up to the cloud? If you can actually run it continuously on like a coin battery, and you only have to, you know, ping the cloud every now and again, you've actually got a chance to have something that you can just install and forget about. You don't have to wire it in. Like, it's super easy. So a lot of the thought is really, okay, we've got these really interesting capabilities in the cloud and now on phones. So how does the world change if we can get them to use a thousandth of the energy so that they can run on batteries, and on a chip that's only like, you know, $2 or something like that? Can we basically plaster our world with sensors that do something useful?

Justin Grammens 25:20

Yeah, no, I love it. I love it. And I'm a huge proponent. I guess I was going to get to having you sort of define what TinyML is. And so power is a major consideration, I guess, is what you're saying. What are some of the other elements, I guess, around TinyML?

Pete Warden 25:34

Yeah. So I think the thing that really distinguishes it from just, you know, general embedded machine learning is this idea that you're aiming for below a milliwatt. Because below a milliwatt of power usage, it starts to become possible to use a small solar cell or some other form of energy harvesting, or at least have, you know, a triple-A battery that will actually run for about a year and give you, on average, one milliwatt. So you start to create devices which don't have to be tended by a person. You know, they don't have to be wired into the mains, they don't have to be charged every day or even every few weeks, and they have their own intelligence in them, so they can do these things like, hey, is there a person here? Did somebody say a particular word? You know, putting voice interfaces into everyday objects, versus having to have some kind of centralized, you know, even a phone or a centralized server that's actually running the intelligence. So that's really the root of TinyML. It feels like there's a big change once you liberate people from having to kind of tend to these devices and, you know, make sure that they have enough power. If you can get the power usage low enough and give them enough intelligence, then it feels like a whole other step change. You know, like going from personal computers to phones, and now going from phones to this ambient computing idea.
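
To make the power argument concrete, here is a small Python sketch of the arithmetic: how much energy a given average draw needs for a year, and how long a given energy store lasts. The 1 mW and 1 W (1,000 mW) figures are the ones mentioned in the conversation. The 9,000 mWh capacity is a purely hypothetical number chosen for the example, not a datasheet value for any real battery.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

def energy_needed_mwh(average_power_mw, years):
    """Energy in milliwatt-hours needed to sustain a draw for some years."""
    return average_power_mw * HOURS_PER_YEAR * years

def battery_life_years(capacity_mwh, average_power_mw):
    """How long an ideal energy store lasts at a constant average draw."""
    return capacity_mwh / average_power_mw / HOURS_PER_YEAR

HYPOTHETICAL_CAPACITY_MWH = 9000.0  # assumption for illustration only

# One milliwatt for a year needs 8,760 mWh, i.e. about 8.8 watt-hours.
print(energy_needed_mwh(1.0, 1))
# The same hypothetical store at 1 mW versus at 1 W (1,000 mW).
print(battery_life_years(HYPOTHETICAL_CAPACITY_MWH, 1.0))
print(battery_life_years(HYPOTHETICAL_CAPACITY_MWH, 1000.0) * HOURS_PER_YEAR, "hours")
```

The sketch ignores self-discharge, voltage conversion losses and peak-current limits, all of which shorten real battery life, so it should be read as an upper bound.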

Justin Grammens 27:12

Yeah, I love the word ambient, because that's sort of the future, the vision, I guess. All these things around us are really smart objects, right? That's the thinking behind Internet of Things. But if everything needs to be on the internet all the time, it's very, very power hungry, and you're not going to get what you're looking for. I guess the other piece that I'm thinking about, when you're talking about not having to be connected, is this idea of just sort of general decentralization. These things can be autonomous, right?

Pete Warden 27:40

Yeah. Like, why do you need an internet connection for a voice interface? A voice interface should be like a touch interface, or like buttons or a keyboard or a mouse. You know, it should be something that can be dealt with locally. And if you have reasons to connect, then by all means. I'm not saying connections are bad. But like you say, the power usage of radio is so big. I mean, that's actually one reason I don't like the term Internet of Things, because I don't think that this world of smart things is going to be an internet-connected world like we've seen, you know, on the traditional computing side. With the internet, you expect that connectivity is going to be the norm, and so it's comparatively low cost to just build in the expectation that you can add more services through internet connectivity. Whereas if you have these hard power constraints, and connectivity can easily take at a minimum 50 milliwatts or 100 milliwatts whenever you turn the radio on, and that's like 100 times more than what you're using to run machine learning locally, then the duty cycle of when the network's actually available, and the trade-offs of actually using an internet-required service, become very different from the desktop or even the phone use case.
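
The duty-cycle trade-off can be put into numbers with a short Python sketch. It uses the figures mentioned in the conversation (about 1 mW for local machine learning, 50 mW for the radio, one report every three minutes) plus one assumption that is not from the source: that each report keeps the radio on for one second.

```python
def average_power_mw(ml_mw, radio_mw, radio_on_s, period_s):
    """Average draw when the radio is on for radio_on_s out of every period_s seconds."""
    duty_cycle = radio_on_s / period_s
    return ml_mw + radio_mw * duty_cycle

ML_MW = 1.0        # always-on local inference, per the conversation
RADIO_MW = 50.0    # lower end of the radio figure mentioned
PERIOD_S = 180.0   # one report every three minutes
RADIO_ON_S = 1.0   # assumed airtime per report, for illustration only

# Radio mostly off, waking briefly to send each reading.
print(average_power_mw(ML_MW, RADIO_MW, RADIO_ON_S, PERIOD_S))
# Radio held on all the time, as an always-connected design would need.
print(average_power_mw(ML_MW, RADIO_MW, PERIOD_S, PERIOD_S))
```

Under these assumptions the occasional report adds well under a milliwatt on average, while a permanently connected device draws about fifty times the local compute budget, which is why connectivity stops being something these devices can take for granted.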

Justin Grammens 29:15

Yeah, for the end user to have to keep plugging things in all the time. I'm just thinking, myself, smartwatches have gotten better and better over time. And now even the watch that I have here, it's not even internet connected. It's more of a running watch, and it pairs with my phone and can do all that stuff. But I mean, it'll go for more than a week. But, you know, I think about some of these initial watches that I had, and it was like, every night I had to charge the thing. And oftentimes I would forget and be frustrated and just not even want to use it, honestly.

Pete Warden 29:42

Well, honestly, I think that, you know, for me, that's a massive barrier to having more devices per person. You know, the idea of having more chores to do and more things to remember. I don't even have a wireless keyboard, because I always forget to plug them in. That has never worked out. So if we want to go from having, you know, two or three devices per person, a laptop, a phone, maybe a watch, to hundreds of devices per person, we can't rely on having somebody there to charge them, or on them being plugged into the wall. And so that's really why the power usage is such a key part of this TinyML definition.

Justin Grammens 30:32

Yeah, yeah, for sure. Well, so you were doing some awesome stuff at Google. When did you write the TinyML book? I guess, like, what was the thought behind that?

Pete Warden 30:42

So that really was a couple of years ago. A lot of the thought behind that was, again, I don't feel like I've invented very much of this stuff. I've helped kind of pull it together into the TensorFlow Lite Micro framework. But what I realized was that machine learning is this very specialized world, where it's hard to learn all of the concepts you need to actually produce applications. And embedded systems are another very specialized, complicated world, where you actually need a lot of experience to be able to work effectively. And in the overlap between the two, there was almost nobody who actually had both sets of skills, you know, who had actually been through building products using machine learning, and understood the research side enough, but also understood the constraints and the challenges of trying to work on these devices with just, you know, tens of kilobytes of flash and RAM, and no operating system, and all of these crazy tool chains. So partly, you know, I always find it really useful just for myself to actually try and write down a guide, to make sure that I understand everything we need. But I also really wanted to hopefully help provide an on-ramp to people who totally had the capability to learn, either, hey, I'm an embedded person, and I want to learn enough machine learning to be useful. Or on the other side, hey, I'm a machine learning person, and I want to understand how to get my models down onto all these weird, you know, 50 cent chips. So that really was the idea behind the book. I knew we needed more people educated and trained up on this stuff if we were going to start building some of the things that were possible, and it was really trying to do kind of a brain dump, together with my co-author, Dan, focused on some very practical applications, but in a way that walked you through, hopefully, a bunch of the skills and knowledge you need to pick up in order to be effective.

Justin Grammens 33:01

Awesome. No, that's a really, really good story. I mean, so I've been teaching Internet of Things at the university here for a number of years. But I felt like it was more or less around just getting data to the cloud and showing visualizations and charts and graphs. And I've reworked the class now over this past year, and I'm gonna thank you for offering to be in my class and being a guest speaker here in the fall, but I feel like there was a piece that was missing. And I think you summed it up pretty well, because you have people that know embedded systems, and people that are data scientists in the cloud, machine learning experts, whatever you want to say, and they have these models, and it's like, no one really mixed the two together, right? It was always two different types of people. And it's like, no, the real world needs this. It's like chocolate and peanut butter. It's gotta come together. And so for you to see that a couple of years ago, and sort of latch on... I mean, the TinyML term, I guess I don't even know, like, where did that come from? Did somebody at Google invent it, or?

Pete Warden 34:00

I don't know, actually, that's a good question. I mean, Evgeni Gousev from Qualcomm has been one of the longtime champions of this idea and this community. And my memories of this are, I did a talk a few years ago called The Future of Machine Learning Is Tiny, basically laying out a lot of these ideas, you know, why I thought doing things on these super low power platforms was going to be so important. And then Evgeni sort of came in. I'd been talking to him before. He was responsible for the Qualcomm Glance, which is a really, really important early example of machine learning on a really, really tiny device with a camera. So I'd been talking to him a lot about this anyway, and he suggested, hey, we should actually put together a conference around this idea. And I think between us we came up with the name; TinyML seemed fairly natural. And yeah, so as far as I know, that seems to be when the term sort of started to take off, anyway. We held the first TinyML Summit, and now I think we're into our third or fourth year with that. And Evgeni has been running with it and making it grow. And it's been really good to see.

Justin Grammens 35:17

Yeah, it was awesome. I attended it in San Francisco just, you know, what, a couple months ago or so. And, oh, man, I was just blown away, super impressed at the community that's been built up in just a handful of years here. You know, I don't know, 300, 400 people I think attended. And it was a lot of fun. And I guess there's just a lot of commercial support in the space, right? There were a lot of vendor tables there, companies like Qualcomm that you're talking about, Intel, I mean, the guys at Edge Impulse are doing some awesome stuff, right? You know, lots of startups doing some really, really cool, fun things. So yeah, it's a super vibrant community. It's really, really exciting. I'm planning on going next year, and it's been great to see it. Again, I guess I'm kind of a new entrant, but it's still so new, and it feels like it's got a lot of traction.

Pete Warden 35:59

I mean, what's really nice has been, there's all of these hardware and software trends that seem to be coming together to kind of, you know, put wind in the sails of this idea. With a lot of the hardware vendors, what's been really nice is, I'm able to say to them, hey, just give us more arithmetic, and we will make use of it. Like, we're compute bound, we're not memory bound. And that's actually really unusual for, you know, practical applications. Most things are along the lines of HTML rendering, where you've just got a whole bunch of logic and branches and, you know, data-dependent stuff. And there, basically, the speed depends on the memory accesses. Whereas for machine learning, we're doing effectively these enormous matrix multiplications, where you actually take the one value and use it in a lot of different calculations. So that one memory access is kind of amortized across, you know, hundreds or thousands of arithmetic operations. So all of the vendors are actually really excited, because they're like, hey, adding more multipliers and adders is way easier than adding more memory, and it uses a thousandth of the power of, like, DRAM to do, like, a multiply-add, for example. So from a power and from a silicon area and a cost perspective, anything where they can get people to do more arithmetic locally helps them sell more chips.
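
The reuse described above can be made concrete with a small back-of-the-envelope calculation. This is a minimal sketch, assuming an idealized matrix multiplication in which every input and output value is moved to or from memory exactly once; real kernels and caches behave differently, and the function name and matrix sizes are hypothetical.

```python
def matmul_arithmetic_intensity(m: int, k: int, n: int) -> float:
    """Multiply-adds per value touched when multiplying an (m x k) matrix by a (k x n) matrix.

    Assumes each input and output value is read or written exactly once,
    which is an idealized best case rather than a measurement.
    """
    multiply_adds = m * k * n                 # one multiply-add per (row, column, inner index)
    values_touched = m * k + k * n + m * n    # both inputs plus the output
    return multiply_adds / values_touched


if __name__ == "__main__":
    for size in (8, 64, 256):
        ratio = matmul_arithmetic_intensity(size, size, size)
        print(f"{size}x{size} matrices: about {ratio:.1f} multiply-adds per value touched")
```

For square matrices the ratio works out to one third of the side length, so the amount of arithmetic per memory access grows with the size of the layer, which is why extra multipliers and adders are so useful for this workload.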

Justin Grammens 37:44

Yeah, and like you're saying, you don't even need that powerful of a processor, right? This Arduino, the Nano 33 BLE, and all the ones that you guys use in your book and stuff, they're not that powerful chips.

Pete Warden 37:55

Yeah, they're running at like 48 megahertz or something. Or 80 megahertz. That's like, you know, Pentium. Yeah. So those clock rates. So

Justin Grammens 38:07

Yeah, I think the CPU from my computer in college was at least three times that. These are the sort of things you could probably go to the moon on, I guess, you know. So yeah, it's really, really cool. So then now you're doing something new, right? I guess you've left Google and you're doing stuff at Useful Sensors. Is that stuff you can talk about?

Pete Warden 38:24

So we're still in stealth. But what I can talk about is, you know, I'm doing this in conjunction with the, you know, I mentioned at the start that I'm doing a PhD at Stanford. And really what we're thinking about is, one title I've thought about for my work is "smart homes without spying." It's a little bit provocative, but it's really kind of building off this idea that, hey, if you've got voice interfaces that can run locally, can you actually keep all of the voice and video within somebody's house? And in a way that's actually provable? You know, so that a third party can actually check your hardware and software and say, yes, we have high confidence that nobody can actually access your audio or your camera from outside the home. Not even, you know, the big tech companies providing services.

Justin Grammens 39:23

And as you were talking about, in some ways, security, that kind of does overlap with TinyML, right? That's another beauty of TinyML in some ways: if you don't need to ship data, then it's ultimately more secure.

Pete Warden 39:35

Exactly. And that's one of the things that, if we can engineer this correctly... there's already this explosion of microphones and cameras in all sorts of devices, because of these ML capabilities, like, hey, is somebody sitting down on the couch in front of the TV? That's actually really useful from an ambient computing kind of perspective. You can imagine safety applications, like having parental controls on a stove, where only the authorized users can actually, you know, turn on the stove. And maybe the stove can turn itself off if it hasn't seen anybody for half an hour. All of these capabilities are going to become very tempting. But if we're not careful, we're going to just install all of these cameras and microphones, and they're going to be this really tempting target for people to hack into, and this vector for a massive potential invasion of privacy if we don't have control over this stuff. So really, what the research is trying to think about is, okay, how can we actually engineer building blocks of technology from the ground up that offer some guarantees? Like, yes, you've got a voice interface with a microphone, but we can guarantee none of the rest of the system can actually access it. The only thing that has access to the microphone is the voice interface part that needs it, and it's impossible to record, from an engineering point of view. So it's trying to step back and work through the system designs we would need to enable these kinds of guarantees. Like, how can you make it so that somebody who's not the manufacturer can actually audit these devices, and have some confidence that they're not being used to spy on you?

Justin Grammens 41:35

Oh, man, this is great. Yeah. 'Cause I think what's happened in the world of, and I'll go back to IoT, is devices are just being thrown out in the marketplace with no control. It's the wild, wild west. And that is dangerous. And, you know, people don't know where their data is going. Devices are, I guess, siphoning data off of the network that you have no knowledge about. And, you know, there's an author that I have read, her name is Amber Case, she has a book called Calm Technology, not sure if you're familiar with her at all. And she's been a speaker at some of the events that we've had in the past, but she talked about this idea, and this goes back six years ago or so, that, you know, I should be able to allow my doctor to see my health records, only him or her, for a specific amount of time. And that data should live on my network, shared with that person, and never leave my network if I don't want it to. And I should have essentially a guarantee that it is destroyed. Right? And so she was already talking about this years ago. And I'm like, you know, I love the idea. It's just, it's going to take some time, probably, to get everyone on the same page. Right? It's gonna take a number of different frameworks, a number of different ideas to come together. But I think ultimately, that's the only way it's going to work, if anyone's going to sort of trust this.

Pete Warden 42:46

Uh, yeah, and really, there's a lot of tricky problems. Like, with medical records, you actually want to be able to do some controlled sharing. With things like microphones and cameras, I hope we can do as much as possible to air gap them. Because, you know, there's very few occasions when you want these kind of IoT devices to actually send audio or video anywhere. You almost always want to just get, hey, did somebody do a gesture? Did somebody say a voice command? You know, these other sort of pieces of derived information. So if we can actually lock the raw data away and have a very clear, thin interface that just broadcasts, like, hey, somebody said the wake word, or somebody sat down in front of the TV, I'm hoping that that will be an understandable and robust way of keeping people's data safe.
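
The shape of that thin interface can be sketched in a few lines. This is only an illustration of the idea, not the design being developed at Useful Sensors or Stanford: the class and event names are hypothetical, the loudness threshold is a toy stand-in for a real wake word model, and Python itself cannot provide the hardware-level guarantees discussed here.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Event(Enum):
    """The only information allowed to leave the sensor module (hypothetical set)."""
    WAKE_WORD = "wake_word"
    PERSON_SEATED = "person_seated"


class SensorModule:
    """Consumes raw samples and exposes only derived events, never the samples."""

    def __init__(self, detector: Callable[[Sequence[float]], Optional[Event]]) -> None:
        self._detector = detector

    def process(self, raw_samples: Sequence[float]) -> Optional[Event]:
        # The raw samples are used for detection only; they are neither
        # stored on the object nor returned to the caller.
        return self._detector(raw_samples)


def toy_wake_word_detector(samples: Sequence[float]) -> Optional[Event]:
    """Stand-in for a real model: treats a loud average level as the wake word."""
    if samples and sum(abs(s) for s in samples) / len(samples) > 0.5:
        return Event.WAKE_WORD
    return None


if __name__ == "__main__":
    module = SensorModule(toy_wake_word_detector)
    print(module.process([0.9, 0.8, 0.7]))   # loud input produces an event
    print(module.process([0.0, 0.1, 0.0]))   # quiet input produces nothing
```

The point of the sketch is that the rest of the system only ever sees values of the small `Event` type, so an auditor has a narrow surface to check.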

Justin Grammens 43:46

For sure. Ultimately, it is about keeping people's data safe. And when I first got on the internet, I was always worried about typing my credit card number in, right? I tell people, you're kind of sitting there on Amazon for the first time and you're like, do I really put this in? I mean, now everyone just sort of does it instinctively, we're not worried about that. And so financial data is, you know, that would hurt, to have my financial records lost, but to have my personal biometric data stolen, or, you know, whatever I've been speaking at my device for years, all of that information is way more private, way more personal. So I commend you for the effort, Pete. I think it sounds really, really fascinating, what you guys are going to be working on here, and I look forward to seeing when you guys come out of stealth mode. So how do people get a hold of you? What's the best way?

Pete Warden 44:29

So pete at petewarden dot com is my email. I'm pretty active on Twitter. You know, I try and keep posting stuff, and I think I've got my email in my Twitter bio, and, you know, my Twitter handle's just petewarden. So I'd love to see you on there. But yeah, I'm happy to get emails. I'm always happy to geek out with anybody who's, you know, interested in this stuff. So please do drop me a line.

Justin Grammens 44:56

Perfect. Well, we have liner notes when we publish this. So I mean, everything from, you know, the work that you've done early on with Jetpac all the way to TensorFlow, and everything in the book, we'll promote all that stuff. So we'll be sure to get all that information out. Was there anything else you wanted to chat about today that I might have not touched on, Pete?

Pete Warden 45:15

No, this has been great. This has been really interesting. Thanks so much for having me on, Justin. I really appreciate it.

Justin Grammens 45:21

Awesome. Yeah. Thank you so much for your time. I appreciate it. And yeah, I look forward to having you back in the future. Thanks again.

AI Announcer 45:27

You've listened to another episode of the Conversations on Applied AI podcast. We hope you are eager to learn more about applying artificial intelligence and deep learning within your organization. You can visit us at AppliedAI.MN to keep up to date on our events and connect with our amazing community. Please don't hesitate to reach out to Justin at AppliedAI.MN if you are interested in participating in a future episode. Thank you for listening.

Justin Grammens

Host