Conversations on Applied AI

Tim Kraska - Transforming How Data-Driven AI Teams Collaborate

May 10, 2022 Justin Grammens Season 2 Episode 11

The conversation this week is with Tim Kraska. Tim is a professor at MIT in their Computer Science and Artificial Intelligence Laboratory, and a Co-Founder at Einblick, where they are developing the first visual data computing platform based on six years of research at MIT and Brown University. Tim holds a Master of Information Technology from the University of Sydney, a postdoc in computer science from the University of California, Berkeley, and a Ph.D. in computer science from ETH Zurich.

If you are interested in learning about how AI is being applied across multiple industries, be sure to join us at a future AppliedAI Monthly meetup and help support us so we can make future Emerging Technologies North non-profit events!

Resources and Topics Mentioned in this Episode


Your host,
Justin Grammens

Tim Kraska  0:00  

I don't believe that AI is actually replacing so many jobs. At least for the use cases we are solving, we more want to enable people who have a certain domain expertise to do more on their own and work better together with the data scientists. So in no scenario we are part of would I argue that we actually replaced the human being. But we have, for example, helped to improve a process. Like this chemical company we are working with: we helped them to improve their processing pipeline, right? And so the way we do that is essentially giving the right people the right tool in order to make better predictions, make better decisions, and

AI Announcer  0:40  

Welcome to the Conversations on Applied AI podcast, where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of artificial intelligence and deep learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry and connect with us to learn more about our organization at applied Enjoy.

Justin Grammens  1:11  

Welcome, everyone, to the Conversations on Applied AI podcast. Today we're talking with Tim Kraska. Tim is a professor at MIT in their Computer Science and Artificial Intelligence Laboratory, and a co-founder at Einblick, where they are developing the first visual data computing platform based on six years of research at MIT and Brown University. Tim holds a Master of Information Technology from the University of Sydney, a postdoc in computer science from the University of California, Berkeley, and a PhD in computer science from ETH Zurich. Thanks for being on the program today, Tim.

Tim Kraska  1:42  

Thanks Justin. Thanks for having me. 

Justin Grammens  1:45  

Awesome. Well, cool. I told a little bit about where you're at today, right? Some of the awesome stuff that you're doing at MIT and working with Brown University. Maybe give a short background, I guess, in terms of how you got to where you are?

Tim Kraska  1:57  

Sure. So my main area of research is either applying machine learning to improve systems, or building systems to make machine learning easier for a broader range of users. And the thing you mentioned in particular, the company we founded, Einblick, came out of a research project for the latter. What we are really trying to achieve is building a system which makes machine learning, data science, and data exploration much more approachable for a larger audience and shortens the time to insight for people in businesses. And then on the other hand, now at MIT, I'm focused a lot on the area of how to leverage machine learning to improve systems.

Justin Grammens

I see. Well, we're all about applications of artificial intelligence here on this program. So maybe it's sort of a data visualization tool. And we'll be sure to include links and stuff to the website. But maybe how did this idea sort of come about?

Tim Kraska

So it originally started as a research project, if we step back to those days. And this was mainly motivated by the fact that we saw these large interactive whiteboards appearing on the market. For example, there was the Microsoft Surface Hub, the Google Jamboard, the Samsung Flip, a whole bunch of them. These are like touchable TVs you put in a conference room. And people mainly use them to just do better video conferencing, so that you can have a shared whiteboard in different locations. And when we saw these devices appear on the market, we thought, "Oh, these are very interesting. They are now becoming mainstream, but they could be so much more." So can we not build an environment where people, as a group, as a team, can work together on a problem using one of these large interactive whiteboards?

And so instead of just sharing final results, you really want to bring a team together in a room so that they can explore a data set, build a model, get insight out of it, and then eventually also take action based on what they saw. So this was the original vision. And over time, it actually changed dramatically. First of all, the pandemic obviously happened. So when we spun it out as a company, the first thing was, "Oh, yeah, in-person meetings probably will not have the same type of importance anymore that they used to." And so we heavily invested in how we enable something very similar, but in remote locations. And the other thing: during this journey of building this platform of, essentially, data science tools for teams, we also discovered that a lot of the technology that was out there is not adequate to provide the interactivity needed to actually enable a team of researchers to make discoveries on the fly. And we had to invent a whole range of new technology in order to make that happen.

Justin Grammens  4:59  

I see. So like you said, there are visual whiteboards, or I guess virtual whiteboards, right, out there, like Jamboard and stuff like that, where people can move stuff around. But they're not really catering to, say, there are certain things that are lacking, specifically around data science.

Tim Kraska  5:03  

Yeah, so the existing whiteboard technology you have, like the Google Jamboard and so on, mainly focuses on video conferencing. Maybe you have a whiteboard, and they have some apps now. But it's not like you can use them to explore a data set together. Some people use them to show their dashboards, but that's more like a final result somebody else created for you, and you just consume it rather than do something new. So we always thought about this. I think partly we were also inspired by movies. If you think about Minority Report, it proposes a very new way of how you interact with data. And when we had the first prototypes, often people immediately said, "Oh, that's Minority Report for real." Nowadays, actually, if I talk to my students, nobody knows Minority Report anymore. So this comparison, unfortunately, doesn't work anymore.

Justin Grammens  5:58  

Sure, sure. But the concept is still there. Absolutely. Did you guys find the pandemic actually maybe boosted the need or increased the interest in this product?

Tim Kraska  6:09  

For sure. I think the pandemic particularly uplifted a whole range of toolkits which focus on collaboration. What I mean by that is, if you look at different areas, say you are trying to do text editing: it used to be the case that if you wanted to work on a text together with other colleagues, you were sending documents around. Nowadays, everybody's using something like Google Docs or Office 365, where several people at the same time can work on the same document simultaneously, right? And this became so much more important during the pandemic, because everybody was in remote locations, and so you were looking for new ways to replace the in-person meetings. So this online editing of text documents, I think, was emphasized even more than it used to be. But it's not just text editing. For every single other common business use case, we see something similar happening. Spreadsheets, obviously, as well: you have Google Sheets, which is replacing the traditional Excel. If you look into design, it used to be the case that you sent PDFs around with proposals or designs. Now we have startups like Figma, who are really changing the way people do design together. We are trying to do something very, very similar for the analytics space. It used to be the case, if you do analytics with the traditional tools, that they are still mainly designed for a single user interacting with them and then sharing the final result. It's not like you can have several users at the same time exploring a data set, exchanging ideas, and actually doing something together on a big, let's say, whiteboard or interactive canvas. And this is where we come in. This is what Einblick does: it's really providing something similar to these online editing tools.

On top of that, we also created a whole range of new capabilities later on for our customers. For example, we noticed that the moment you want to enable people to collaborate, nobody wants to wait for a result, right? If you have a large data set, even a simple visualization of it might take a few minutes. If you are alone, you might be OK with waiting a few minutes until it shows up, right? Because it's your time to do your emails; it's fine. The moment you have a team looking at the screen and that takes a few minutes, nobody wants to wait a few minutes and just stare there in silence, right? And so this caused us to figure out new ways to ensure the system always stays interactive, regardless of the data size and regardless of the operations you are doing. We developed this new concept called progressive approximation for that reason, so that we can guarantee this interactivity and enable the team to work with very large data sets and complex things.

Justin Grammens  9:06  

I like that, that's cool. As you were speaking, I was sort of thinking about, well, I deal a lot with Jupyter notebooks and Google Colab and stuff like that. And so I could share my screen and people could watch it, and I could be interactive with regards to, you know, modifying code. But number one, we're not both working on the same document at the same time, really, right? It's not like a Google Doc, where things are being shifted around. And then also, like you said, sometimes those things can take, you know, minutes or hours to run. There are large data sets, so people are just staring at a screen, sort of watching the stars move across, and you sort of lose out on all of that.

Tim Kraska  9:41  

Exactly. And actually, one of the very first things we thought was, "Oh, why don't we make Jupyter Notebooks more collaborative and just essentially put an editing mode like Google Docs on top of Jupyter Notebooks?" But that also has problems. For example, if somebody is changing something at the top of the document and you are at the bottom, how do you know what they changed at the top? Because it influences everything else, right? And so we needed to figure out different ways of how you interact with that, and then, obviously, also deal with the slowness of the traditional approach of doing things.

Justin Grammens  10:14  

Yeah, for sure. And so, making stuff more interactive, like you were saying: you mentioned you guys had a term for that, right? More of a progressive rendering type?

Tim Kraska  10:22  

A progressive approximation engine. You can also write code in our platform; you can think of Python as like an operator, and we make even the code you write interactive as well. The core idea behind that is that, fully automatically for you, we first run whatever you want to do over a small sample of the data to give you a quick approximate result. And then in the background, we make the sample larger and larger to give you a better and better result until it eventually converges to the final answer. So if you wait long enough, you will get the final answer, and it will be precise. But in the beginning, you get something approximate. And as we all know, approximation is extremely powerful; sampling works extremely well, right? And so this unblocks you. Often, what we see people doing is they try something out, and they see the first results coming in, and it's like, "Oh, should I have included this feature? I shouldn't have done that." So they remove it quickly before the computation finishes, and then go to the next iteration, which helps you to iterate much, much more quickly.
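Editor's note: the idea Tim describes here (answer first from a small sample, then keep refining in the background until the result is exact) can be illustrated with a short Python sketch. This is not Einblick's engine or API; the function name `progressive_mean` and its parameters `initial_size`, `growth`, and `seed` are hypothetical, and a simple mean stands in for whatever computation a user might run.

```python
import random
from typing import Iterator, Sequence, Tuple


def progressive_mean(
    values: Sequence[float],
    initial_size: int = 100,
    growth: int = 4,
    seed: int = 0,
) -> Iterator[Tuple[int, float]]:
    """Yield (rows_seen, estimate) pairs computed over a growing random sample.

    Hypothetical illustration only: early results are approximate, and the
    last result covers every row, so it equals the exact mean.
    """
    if initial_size < 1 or growth < 2:
        raise ValueError("initial_size must be >= 1 and growth must be >= 2")

    # Visit the rows in random order so that every prefix is a random sample.
    order = list(range(len(values)))
    random.Random(seed).shuffle(order)

    running_sum = 0.0
    seen = 0
    target = min(initial_size, len(values))
    while seen < len(values):
        # Extend the sample to the next target size, reusing earlier work.
        for index in order[seen:target]:
            running_sum += values[index]
        seen = target
        yield seen, running_sum / seen
        target = min(target * growth, len(values))


if __name__ == "__main__":
    data = [float(i % 1000) for i in range(200_000)]
    for rows, estimate in progressive_mean(data):
        print(f"{rows:>7} rows -> mean is about {estimate:.2f}")
```

Because the function is a generator, a caller can stop consuming results as soon as the early estimates show that something is off, which mirrors the "remove it quickly before the computation finishes" workflow described above.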

Justin Grammens  11:34  

Oh, that's great. That sounds super powerful. I think for people, like you said, it's kind of all about iteration, and we're all sort of learning around this data. When you come into a data set, there's not a whole lot you really know about it sometimes. So the more quickly you can iterate, the faster you can become. I was thinking about you guys: what are you looking at with regards to funding, right? Are you self-funding this? This sort of came out of an idea from MIT. Tell us a little bit about maybe where you are in your product release cycle and your journey.

Tim Kraska  11:59  

Yeah, so we are an official MIT and Brown spinoff, and almost on day one, we raised a very large seed round. We are currently using the seed round to continue to develop the product more. We have a set of design partners, and we have some annual revenue already. But it's still early stage, and we have now just launched our SaaS offering. So actually, you can just come to and just sign up and immediately use it without any restriction.

Justin Grammens  12:30  

That's awesome. Very cool. So that's one part of your job, I guess, right? Being a co-founder at this fast-growing startup that's doing some really cool stuff in AI. But you also teach at MIT, right, as part of the Computer Science and Artificial Intelligence Lab, correct? Tell us a little bit about maybe what sort of courses you teach. How long have you been involved in that? And what's your experience working at MIT? I'm fascinated by it, because it's such a great school.

Tim Kraska  12:55  

Definitely, the students are really, really good. My research right now at MIT, as I mentioned before, mainly focuses on how to leverage machine learning to improve systems. So the class I taught last semester, which I actually did with a colleague of mine, Mohammad Alizadeh, was a class on machine learning for systems, focusing in particular on this intersection of the two. Now, this semester, I'm actually teaching a newer class for the second time. It's called Systems for Data Science. And the core focus there is, you know, data science as an area. But instead of just looking at the machine learning part, we focus in particular on teaching the concepts more end to end: from getting the raw data, how to transform it, how to visualize it, how to process it, what to do if the data is large, how to present the results at the end, and just having the bunch of tools you need for gaining insights based on data.

Justin Grammens  13:54  

Oh, wow, that seems very applicable to what people need to do in their careers. I guess it depends on the size of the organization that you're at, but, you know, at most small companies you've got to kind of wear all these various hats, right? From data engineering to DevOps pipeline stuff to visualization. So you're sort of trying to give them this overall view of, hey, here are all the tools that you need to go from a raw data set to something that you could make predictions against at the end of the day. Is that right?

Tim Kraska  14:19  

That's right. And I probably should also mention for the audience that I'm also offering a new professional education course at MIT with a very similar focus, in particular on no-code tools for data science, which we will offer this summer, I think in June or July. If you go to the professional education webpage at MIT, you can find that, and everybody can sign up for that one.

Justin Grammens  14:42  

Nice. Well, I was doing an interview with somebody just maybe last week, actually, and they were sort of saying that it feels like artificial intelligence and machine learning is going to touch you no matter what field you're in, so you're gonna want to understand and learn a little bit about it. And some of these low-code platforms, I think, will bring in people that don't have any technology experience but still need to understand some of the benefits of learning it, right?

Tim Kraska  15:04  

That's 100% right. I think there's a reason why, at most universities now, it doesn't matter which department you are in: many of them by now have a data literacy requirement. They require that, even if you're studying literature, as an example, you take some data courses. I think it's a really good thing, because in the end there is hardly any discipline left which is not touched by data. It goes everywhere.

Justin Grammens  15:32  

Yeah, for sure. Well, one of the things I'd like to ask people is, how do you define artificial intelligence? Do you have a short, succinct thing? Or if somebody asks you in an elevator, what do you do?

Tim Kraska  15:42  

I normally stay away from the term artificial intelligence.

Justin Grammens  15:45  

Okay, I get it. I get it. It's a little bit of a can of worms. 

Tim Kraska  15:49  

For obvious reasons. I think what many do in industry right now doesn't have that much to do with real intelligence. It's more that, you know, if you use machine learning, you build a model which does certain things for you, but it's not necessarily that intelligence plays a role there.

Justin Grammens  16:09  

Yeah, fair enough. Yeah, you're right. I mean, it feels like what we can do today is very, very narrow. Yes, computers can do some things very well, but their hand still needs to be held a lot, right? These are not intelligent systems yet?

Tim Kraska  16:23  

No, they might be able to outperform humans in certain scenarios. But the question is, how much is pattern recognition, and how much generalization is actually going on? And it's a very tough question. So, for example, at Einblick our focus is more on the traditional techniques. It's not that we don't use deep learning techniques, but we use them mainly for transfer learning. For example, if you work with text and unstructured data, we leverage models other people have trained to build up feature vectors. But what we noticed, at least with the design partners we have and the current customers, is that most of them require very simple techniques, and that already provides them a lot of value. In many scenarios, actually, just a simple boosted decision tree is hard to beat. And, I mean, with decision trees you can also say, "Yeah, there's some intelligence in there," and I probably wouldn't argue against that. But they solve a real business use case, and in the end, that's what it is about.

Justin Grammens  17:28  

Yeah, for sure. And it seems like your technology is just very much, I guess, complementary. You know, sometimes people get worried about AI taking over their jobs, right, or data science making their jobs obsolete. But it feels like you guys are providing a very nice tool that allows people to work on this from literally anywhere in the world.

Tim Kraska  17:49  

Yeah. No, I don't believe that AI is actually replacing so many jobs, at least for the use cases we are solving. There might be others where this is different, but for the use cases we are solving, we more want to enable people who have a certain domain expertise to do more on their own and work better together with the data scientists. So in no scenario we are part of would I argue that we actually replaced the human being. But we have, for example, helped to improve a process, right? Like this chemical company we are working with: we helped them to improve their processing pipeline. And so the way we do that is essentially giving the right people the right tool in order to make better predictions, which help them to make better decisions.

Justin Grammens  18:33  

Yeah, makes a lot of sense. For sure. Well, are you guys growing and hiring at Einblick?

Tim Kraska  18:40  

For sure. We definitely have some engineering positions open. We are also looking for marketing people right now, particularly for the SaaS offering. So essentially, on all levels, we are looking for people. If anybody is looking, please let us know.

Justin Grammens  18:54  

Yeah, for sure. And I'll definitely go ahead and put a link to your careers page or your job postings here in the liner notes. You know, this is more of a personal question, but what sort of stuff do you read these days? Do you enjoy staying on the machine learning and artificial intelligence side, or is there other stuff you find interesting?

Tim Kraska  19:13  

That's a good question. So the last book I really, really enjoyed was Project Hail Mary, from the same author who did The Martian. I really thought that's a really good book. Even though I don't usually like sci-fi, I really, really enjoyed reading that one.

Justin Grammens  19:29  

Excellent. I'll put a link to it. What's it called, Hail Mary? Yeah. Okay. All right. Well, that's good. So you've sort of been in this field. As I mentioned at the top, you've got a PhD, and you've been dealing a lot in computer science. If people are coming out of school these days, how do you suggest they advance and get more deeply into this field? Imagine when you got out of school, right, you're done with your undergraduate degree. You have an undergrad in computer science, and you've heard that this machine learning and data science stuff is really interesting. You maybe don't have a master's or a specialization in it, per se. Where would you suggest people start trying to explore some of these techniques?

Tim Kraska  20:10  

Right. So I think it depends a little bit on, again, where you are coming from as well as where you want to go. Normally, if you're about to finish your undergrad, the first question you have to ask yourself is, "Do you want to get a PhD or not?" If you want to work on cutting-edge research in machine learning and advance the field of deep learning techniques and so on, or anything related to it, it's probably hard to do if you don't get a PhD and then afterwards get, for example, into the right place in industry, given that there's a certain complexity you need to get up to speed on, and there's a certain literacy you need to have in math and other things. So if this is really about being in a research field, I normally recommend that, yeah, you aim for a PhD. And the best thing to do is, actually, during your undergrad you already need to lay the foundation for getting into the right program afterwards. That means, at your university, start some research early on, show that you can do research, and get a good recommendation from one of the professors at your place, which then sets you up for getting into the right grad program and so on from there. If you just want to be in the space of machine learning, data science, and so on, I think there are so many good courses online by now, like from Coursera or edX; there's a whole bunch of them. And I think they provide great resources. If you are still studying, for example, you can also always take one of the machine learning classes your university offers. And I think, interestingly enough, like I said, there are not many disciplines right now which are not touched by data or AI/ML, and that holds in industry as well. For example, I'm working with this team on Bigtable, and we proposed new things for optimizations there.

In the beginning, they were working on data systems but not necessarily AI, and it turns out this now becomes relevant there. So I think if you are keen on being in that space, just look out for the right opportunities within your company, or just go there and explore and say, "Oh, can I invest one day a week in this new thing and see if I can do better?"

Justin Grammens  22:29  

Absolutely. Those are great tips. And there always seem to be, why, the name's escaping me, but some of these code competitions, right, these data science competitions that are going on, where you can just sort of jump in and get your feet wet, right? It feels to me like you could just start there.

Tim Kraska  22:44  

 I think that's a great starting point. 

Justin Grammens  22:46  

That's right. Yeah, good. Well, it sounds like you've got your hands on a couple of different projects as you guys are working with your company, Einblick. Do you see any other interesting things in the news at all, where you're like, "Wow, that's an interesting application of artificial intelligence that I hadn't really thought of"?

Tim Kraska  23:01  

That's an excellent question. I mean, there's a whole bunch of very interesting applications going on right now. I actually just talked to a former PhD student at MIT who created this company called Instabase. They do some very interesting work on taking unstructured data in particular and transforming it into structured data, particularly for information extraction. They just presented in the class, so it's fresh on my mind. I thought it was an interesting approach, how they combine interesting deep learning technology in a clever way for this new task. So I definitely also recommend checking them out. They are very complementary to us. They really focus on this part, whereas we are much more focused on the data exploration part at Einblick. So at some point, I need to talk more with them and see how we can work together going forward.

Justin Grammens  23:53  

Yeah, well, no, that's awesome. I would assume there are probably lots of companies being formed at MIT, with all these classes that are going on. Anytime you put a lot of smart people together, you're gonna get something interesting for sure. How do people reach out to you? Are you on LinkedIn?

Tim Kraska  24:08  

I'm on LinkedIn; that definitely works. So you can easily find me on LinkedIn. It's also very easy if you search for Tim Kraska: you find my web page, and my email address is on there. Otherwise, you can also just reach out to

Justin Grammens  24:23  

Awesome, that's great. And maybe some people want to take your courses. Are your courses online at all, or do you have to be there in person?

Tim Kraska  24:29  

So we are actually streaming our current one. I'm not sure if the link is actually public, but if you're interested, just ping me and I will see what I can do. And as I said, there's also the professional education course, for which we are offering a certificate and everything through MIT's professional program, but just this summer. Otherwise, I'm always happy to share my slides and anything else. So even if you don't have the recordings, getting the lecture notes is always a possibility.

Justin Grammens  24:57  

Oh, that's great. That's great. I'm a huge fan of just, you know, kind of giving back in some ways, right? Kind of learning and then teaching and then leading in a specific technology or industry. And so, you know, the more we can let more and more people have access to all this information, I think the better the world will be, for sure. Is there anything else that maybe I didn't cover that you wanted to talk about today?

Tim Kraska  25:19  

No. Thanks again for having me here. This was great.

Justin Grammens  25:23  

All right. Well, cool, Tim. I look forward to having you on a future episode and wish you all the best as you guys continue. You guys have been in business for about six years, I think, is what I saw.

Tim Kraska  25:34  

Oh, the research. The company is not that old.

Justin Grammens  25:37  

Oh, really? Oh, okay. Okay, so you guys are still

Tim Kraska  25:40  

The company is much, much younger

Justin Grammens  25:43  

Well, good. Well, I look forward to, like I said, having you back in the future and having us continue to talk about how machine learning and artificial intelligence and data science are all sort of playing a part in our lives today. So thanks again, Tim.

Tim Kraska

Thanks again for having me.

AI Announcer  25:58  

You've listened to another episode of the Conversations on Applied AI podcast. We hope you are eager to learn more about applying artificial intelligence and deep learning within your organization. You can visit us at applied To keep up to date on our events and connect with our amazing community. Please don't hesitate to reach out to Justin at applied If you are interested in participating in a future episode. Thank you for listening.