Conversations on Applied AI - Stories from Experts in Artificial Intelligence

Parker Erickson - Baseball, GCNs and the Next Generation of AI Practitioners

August 18, 2020 Justin Grammens Season 1 Episode 7

Our guest today is Parker Erickson! Parker is currently a student at the University of Minnesota pursuing a B.S. in Computer Science. He is interested in machine learning, artificial intelligence, and graph databases. He’s also been a software engineering intern at Optum and has developed, among many things, a fraud detection system using graph databases and machine learning. The system is currently in the patent process.

In this episode, Parker shares some of the projects he built during his time at Optum and a number of others that he’s building in his spare time. You can check out Parker’s GitHub account for all the gory details.

Finally, if you are interested in learning about how AI is being applied across multiple industries, be sure to join us at a future Applied AI Monthly meetup and help support us so we can make future Emerging Technologies North non-profit events happen!

Here's just a few of the many fun and interesting topics discussed during this podcast:

Enjoy!
Your host,
Justin Grammens


Parker Erickson :

I had the opportunity to talk with the data science person there. What they're focused on right now is not so much the in-game analytics portion, but the contract dispensing: okay, this player is this age, he's done this in the past. There's this stat called WAR, W-A-R, in baseball, and it's wins above replacement. It's a measure of how many extra wins you can expect from having this player compared to the league average. This is where explainable AI comes in again: you sort of need to go to the player's agent and say, "These are the reasons why we're thinking this number."

AI Announcer :

Welcome to the Conversations on Applied AI podcast, where Justin Grammens and the team at Emerging Technologies North talk with experts in the fields of artificial intelligence and deep learning. In each episode, we cut through the hype and dive into how these technologies are being applied to real-world problems today. We hope that you find this episode educational and applicable to your industry, and that you connect with us to learn more about our organization at appliedai.mn. Enjoy.

Justin Grammens :

Welcome, everyone, to the Conversations on Applied AI podcast. Today we have Parker Erickson. Parker is currently a student at the University of Minnesota pursuing a BS in computer science. He's interested in machine learning, artificial intelligence, and graph databases. He's also been a software engineering intern at Optum and has developed, among many things, a fraud detection system using graph databases and machine learning. The system is currently in the patent process. I also just learned, Parker, that you were an Eagle Scout, so that's really awesome. You have a lot of experience working in and leading teams, and lots of cool stuff. I'm curious to touch a little bit on that during our conversation. So welcome, Parker.

Parker Erickson :

Thanks.

Justin Grammens :

Cool. Well, I think I recall initially meeting you at one of the CoderDojo events, way back when you were in high school. It seems like computers and technology have always been sort of a passion of yours. I'm curious to know what got you into it.

Parker Erickson :

Yeah. So it all started around, I'd say, fourth grade or so. I did a summer camp of Lego robotics at The Works Museum, which at that time was in Edina. And they were like, "Oh, we have a FIRST LEGO League robotics team. You make this robot and program it to do different missions, and there's a research project with it. It might be something that you want to join." And so from fourth grade on to freshman year of high school, I was on a robotics team every season. So that was sort of the kickstart to programming for me. My mom saw an article in the newspaper about how to get kids involved in coding. I mean, I was already on this robotics train, but they mentioned CoderDojo. It was really early in their founding. I went to a couple of events and started playing around with Arduinos and Raspberry Pis, that sort of area of the world, and really just loved playing with it, just trying different stuff out. I met some great people like you, Dan McCreary, and Matt Gray, who really helped keep me interested and keep me involved.

Justin Grammens :

Nice. And so you've talked about Arduinos and Raspberry Pis and stuff like that. Do you think there's sort of an innate passion of yours around controlling things, you know, motors and physical things? I mean, you're pretty open, obviously, and you're doing a lot of other things that are not device related. But was there a piece of that, I guess, as you got into this?

Parker Erickson :

It's cool to see something move that you programmed to do what you want it to do, instead of just seeing text pop up on a screen. That's why robotics and Arduino are such great tools to get kids into the field: it's a lot more physical. It's "Hey, I actually made that thing move or change color," or whatever else, versus "Hello, world" popping up on a screen. Great, you know? I think one of the new initiatives at CoderDojo is this AI Racing League, the DonkeyCar building thing. And I think that is a really good way to expose kids to AI, because it's physical. You can see: I drove it around the track, so it learned how to do this. We train a model, and then we can see how well it does. And if I'm really bad at driving, it's not going to be good at driving by itself.

Justin Grammens :

Right. Yep, for sure. We had Dan on a couple of episodes ago, and he talked a lot about the AI Racing League. Now, you attended CoderDojo as a student, interested in the stuff, and you've since been very much leading. You help more on the educator, teacher side, right?

Parker Erickson :

Yeah. So I think I started mentoring at Dan's encouragement in 2014 or so, which was about my freshman or sophomore year of high school. And it's taken off from there. I've stayed mostly within the Arduino and robotics groups at CoderDojo, but it's a lot of fun seeing the kids and helping them. They get into rabbit holes of their own curiosity. One kid recently was like, "So how exactly does binary work? And how much faster is this computer versus this Arduino?" And I had just come out of the machine architecture class in school this past fall, so it was like, okay, I can go through and try to dumb this down while not dumbing it down, but, you know

Justin Grammens :

maybe a little more simple

Parker Erickson :

break it down to something that's understandable instead of reading this college level textbook.

Justin Grammens :

Sure, sure. Well, this may or may not be true, but you're in a closer age category with these people. Do you feel like that maybe has a bit of a benefit to you?

Parker Erickson :

Yeah. I think with kids, they always look up to older kids. They don't necessarily look up to adults; I feel like there's a big age gap in between, or at least there is in their heads. It's like, "Okay, I'm still in school, and this adult has been in the workforce," even if it's just for two years. To them that's a huge difference, versus, "Oh, he's just in a higher-level school." I think it's fun to interact with them. I always enjoy walking away having learned something new from them. It's sort of amazing: they ask, "How does this work?" and you're like, "Oh, that's a good question." They can sometimes challenge you to figure out stuff that you hadn't even thought of, so it can force you to analyze things. Then it's diving down the rabbit hole of Google with them, which is part of the process. And having 50 Stack Overflow tabs open.

Justin Grammens :

Oh, yeah. Awesome. Very cool. So you're doing this mentoring, and you start in college then, interested in computer science, obviously, and taking classes. Did you get into any of the FIRST Robotics stuff at all before you moved on? Did you touch some of that?

Parker Erickson :

The Lego robotics league is under the FIRST umbrella. At the high school I went to, the FIRST Robotics team was not very well organized or run. The director at our school refused to give us funding. It was just sort of a mess, so I didn't really get involved that much. I was very much a band kid and did quiz bowl in high school, so

Justin Grammens :

Well, cool. So then you started interning at Optum, and that's probably where you really started diving into this whole machine learning and AI stuff. Is that safe to say?

Parker Erickson :

Yeah. I mean, I played around with it on my own time, nothing really hugely inspirational. I played around with some genetic algorithms for route planning, where you put in a list of cities and it would figure out the optimal path between them, and some handwritten digit classification, the "Hello, world"s of ML. Then this past summer, I was at Optum. There was a group of five interns, and we were on a team that was fraud-finding focused, be it in health savings accounts or claims fraud, a phony doctor, a phony claim, stuff like that. I had known Dan through CoderDojo, and he's the graph evangelist, for lack of a better word, at Optum. He introduced me to John Herky as well. Basically, the task that was given to the five interns was: use machine learning, find fraud, go.

Justin Grammens :

Very open ended. So yeah,

Parker Erickson :

It was basically blue sky: just, like, come up with a solution. Well, it sounds fun. Yes, it was great.

Justin Grammens :

Maybe a little bit daunting too, I guess, but

Parker Erickson :

Yeah. And so I was telling both Dan and John, and they were like, "You know the number one use for graph is fraud finding, right?" And I'm like, "No." And Optum's driving towards this graph platform where basically everything's integrated in this massive knowledge graph. And so they're like, "Well, hey, you should look into it, try it out." So I got the demo Docker version of TigerGraph going and modeled up these relationships from the data: this person has this account, things like that.

Justin Grammens :

And they file these claims, for example.

Parker Erickson :

Yeah. I mean, I can't go too far into depth, but sure, I guess. And so we modeled it up. We were also looking into different graph algorithms: some community-finding algorithms, and we looked into some similarity algorithms. What we settled upon was this thing called a graph convolutional neural network. Oh, yeah. It's kind of the new hot buzzword these days, it seems like, right? Yes. There are a couple of different graph-based neural networks that I haven't tried yet but that look cool. But anyway, this graph convolutional neural network, or GCN, basically uses the adjacency matrix of the graph. In this matrix, if vertex one is connected to vertex two, there's going to be a one in the matrix at (1, 2), and at (2, 1) if it's an undirected graph. A traditional convolutional neural net for images works off of adjacent pixels, combines them, and finds the edges and stuff like that.

Justin Grammens :

And then in those cases, basically it decides if it's going to fire or not kind of a one or zero type thing.

Parker Erickson :

Yeah. Instead of adjacent pixels, it works off of nodes or vertices that are connected in that graph. Gotcha. It can classify each node based off of this adjacency matrix. The great thing is that, since you have this graph, each convolutional layer takes one step farther away from a given vertex in the graph. So you can see exactly, based on how many convolutional layers you have, what the algorithm looked at in the graph to classify this node. If you have two layers, you can look at your neighbors within a two-hop radius. Say I have three neighbors that are directly one hop away, and each of those neighbors also has its own unique neighbors. You end up with those 20-some-odd neighbors, and you can go, "Oh, okay, this is what it saw when it classified this." Gotcha. So it adds a little bit more context. It's not, you know, explainable, but you can at least distinguish a few more patterns.
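
The adjacency-matrix idea and the "one layer, one hop" property described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the system built at Optum: the function names and the five-node chain graph are made up for the example, the normalization follows the symmetric form commonly associated with the Kipf and Welling paper, and the learned weights and nonlinearity of a real GCN layer are left out.

```python
from math import sqrt

def adjacency_matrix(num_nodes, edges):
    """Undirected graph: an edge (i, j) puts a 1 at both [i][j] and [j][i]."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    return a

def normalize(a):
    """Add self-loops, then scale symmetrically: D^-1/2 (A + I) D^-1/2."""
    n = len(a)
    a_hat = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def propagate(a_norm, features):
    """One neighborhood-mixing step (learned weights and activation omitted)."""
    n, width = len(a_norm), len(features[0])
    return [
        [sum(a_norm[i][k] * features[k][c] for k in range(n)) for c in range(width)]
        for i in range(n)
    ]

def receptive_field(a_norm, node, layers):
    """Nodes whose input features can reach `node` after `layers` steps."""
    n = len(a_norm)
    # One-hot features: entry [i][j] stays zero until node j has influenced node i.
    h = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(layers):
        h = propagate(a_norm, h)
    return {j for j in range(n) if h[node][j] > 0.0}

# Hypothetical toy graph: a chain 0-1-2-3-4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a = adjacency_matrix(5, edges)
a_norm = normalize(a)
print(receptive_field(a_norm, node=0, layers=2))  # nodes within two hops of node 0
```

With two propagation steps, node 0 is only influenced by nodes at most two hops away, which is the property that lets you point at the subgraph the model "saw" when it classified a node.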

Justin Grammens :

Sure, sure. I think that's been the hardest thing with neural nets in general. In my experience with something like TensorFlow, there are dials you can tweak and different weights and measures you can adjust, but at the end of the day it spits out an answer, and it's still kind of a black box, right? I understand that you got from A to B, and you may or may not be correct, but I don't really understand, at least visually, how you actually did that. So the whole GCN stuff sounds very fascinating to me, because the power that I've seen in graphs is the visualization of it. You can take all these nodes and hop from one to the next, and, in a traditional graph database sense, be able to trace your steps from point A to point B.

Parker Erickson :

And even before we figured out this algorithm to use, and I mean, it's a published algorithm, Kipf and Welling is the big go-to paper on GCNs, we were just playing around with the data in the TigerGraph GUI, and we saw patterns that were like, "Yeah, that's fishy." And what do you know, we'd already found them as fraud. But yeah, even just seeing it visualized really shows how easy it is to find those types of patterns in a graph database, versus joining who knows how many tables in SQL.

Justin Grammens :

Yeah, exactly. In some ways, a picture's worth a thousand words, I guess. Humans are very good at looking at these relationships and picking them out, and computers, of course, are better at looking at even more data and trying to connect the dots. But I think there's a lot of intuition that we can apply just by visually looking at data. That's really cool. So you focused a lot on that, and graph databases were new to you, I guess, coming into the internship last year. And you're planning on being an intern this year too, is that true?

Parker Erickson :

Yep. I'm also going to be at Optum this summer, on a different team. I'm not exactly sure what I'm going to be doing yet. It starts in two weeks, but we'll see what happens. Nice. It's going to be all virtual this year, obviously, so that'll be an adventure in itself.

Justin Grammens :

For sure, for sure. Well, one of the questions I ask people who come on the show is: do you have a definition of artificial intelligence? Based on what you've done so far, do you have an elevator pitch, or is it still pretty squishy? Is it still tough for you to formulate an opinion around that?

Parker Erickson :

You know, I think it really depends on how deep you're going into AI as a term. If you're going for the Terminator, Skynet AI, that's, I think, a long way off, even just from a computing power standpoint. But if you're talking narrow AI, classify this image, drive this car, I think that's a lot easier of a definition: it's a way to learn from the data that you give it. The way I like to think of it is that everything is some form of a function. You have your images or your other inputs as x, and you have your desired output as y. AI just figures out what combinations of x give you y. That's the way I like to boil it down.
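
The "inputs as x, desired output as y" view can be made concrete with about the smallest learning problem there is. The sketch below is only an illustration of that framing, not anything built for the projects discussed here: it assumes a straight-line relationship, uses made-up data, and fits the two parameters by plain gradient descent on squared error. The function name and the learning-rate and step-count values are arbitrary choices for the example.

```python
def fit_line(xs, ys, steps=2000, lr=0.05):
    """Learn w and b so that w * x + b approximates y, by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up examples generated from y = 2x + 1; the learner is never told the rule.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

A neural network replaces the straight line with a far more flexible function, but the loop is the same idea: adjust parameters until the function maps the given x values to the given y values.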

Justin Grammens :

Yeah, for sure. And I guess the power of machine learning and deep learning is just that we can do it so much faster these days, with more data. Would that be a true statement?

Parker Erickson :

I wasn't around in the days of not enough compute and not enough data, but yeah, I'd tend to agree with that. Yeah, sure.

Justin Grammens :

Do you have any thoughts, then, on generalized artificial intelligence? If we get enough data on everything, is there a way to build it so that it would be general enough? Or do you feel like we're always stuck in this "classify this one little thing" mode? Because that's where it seems to break down today.

Parker Erickson :

Yeah, I've started to look into reinforcement learning. I watched a documentary called AlphaGo, which is free on YouTube, about the DeepMind team that beat the world champion at Go. I thought it was a really interesting take on the learning process: I'm going to play against myself, or some sort of human, and I'm going to give myself a reward every time I win, or when I make this move, I'm going to give a negative reward. And when you think about it, learning to walk is probably along that line: reinforcing the connections in your brain to balance yourself and move yourself forward. And if you fall, the brain is probably saying, "Okay, well, that didn't work out," and it gets a negative reward. There's a lot of neuroscience there that I don't understand. But from a logical standpoint, it makes sense to me that that's the way to generalize a little bit more.

Justin Grammens :

Totally, totally. So you don't program against a set of rules; it learns the rules along the way. If you bring it back to chess, you never really say a knight moves in this specific way or a bishop moves this way. It just evaluates so many different moves, and it starts to realize, "Oh, in order for me to win, this is what I need to do."

Parker Erickson :

I mean, you still have to program the rules of the environment, per se: yes, this chess piece has to move this way, or whatever. But other than that, just saying that the knight makes the L shape, that's all you give it, and then it plays against itself and

Justin Grammens :

Against itself, and it starts, really. Yeah, I don't know if it was AlphaGo or if it was chess, but there was a book that I was reading recently where they were able to train an algorithm to learn how to play chess in four hours. It basically took in all the possible moves that had occurred, in terms of what we knew up to today: how all chess games started, how they all ended, and all the steps they went through. And it was fascinating that they said that in four hours, this computer is now essentially smarter than any chess player in the entire world. It had technically learned how to do it. Nobody sat down... I mean, they sat down and trained it, and they trained it really, really fast. But it's not algorithmic programming. And this is what's interesting to me as a software engineer. I've been doing this for 20 years, and we've always been programming algorithms: if-then conditional statements, this type of stuff. And I've heard this term, Software 2.0, which is basically: it's a black box, you're twisting dials and feeding data into this thing, it gives you the outputs, and you're no longer coding anymore.

Parker Erickson :

I think that's a really cool idea of where it's going. But in the healthcare field, and I'm sure this has been mentioned by Dan and probably others, doctors don't just accept "give them this" as the answer. That's the problem with the black box. AI will have made it if we can get something to the point of deep learning accuracy with explainability.

Justin Grammens :

Sure, sure. Yeah, there's this whole storytelling aspect to it. It's not just, "Hey, here's our in and here's our out"; it's how it's really relative to the problem. I was just talking to a guy recently, and he said one of the things that he's been running into is that, yes, it works today: you give it an input, you get the output, and it all makes sense. But a lot of people don't go back and retrain their algorithms over time to make sure that six months from now it's still working the way it was in the past. And if it's not working anymore, then there's probably a lot of other data that you don't have. You had a snapshot in time where it worked, but that doesn't mean it's going to work forever.

Parker Erickson :

Yeah, the changes in society in the past two, three, four months will have sort of proven that. Look, these are very [unclear] systems. Take the data collected in the past three months, with the pandemic. Nobody can just predict a pandemic out of the blue. I mean, humans can't do that. We've lived through nothing like it, and we can't say how supply and demand is going to change. We can't say really anything, much less expect it of an algorithm that we fed semi-clean data. Look, this is an event that the world has never seen before.

Justin Grammens :

Yeah, it's the unknown unknowns, I guess. There's a book that Nate Silver put out called The Signal and the Noise, and he talks a lot about that in there. There are instances where something wasn't even thought of; it just wasn't even put in the model. So we're in a completely different era. It'll be interesting to see how this spurs change in the algorithms. You know, the deep learning and machine learning community sure can be

Parker Erickson :

brutal compared to world events.

Justin Grammens :

That's a great, great point. I want to focus a little bit on a project that you've worked on personally. You've mentioned that you're going to be speaking at an upcoming event of ours on a pitch prediction algorithm. I'm curious to know a little bit more about it. Could you talk a little bit about that project, everything from getting the data to whether this is going to improve over time? Give our listeners a little bit of background on this and what you worked on.

Parker Erickson :

I worked on this over winter break mostly, spilling into the spring semester. Essentially, when the news of the Astros sign-stealing scandal broke, I was like, do you really need a... Well, I guess to back up: over this past offseason of baseball, it was revealed that the Houston Astros were stealing opposing teams' pitching signals through a camera in center field that would look at the catcher's signals to the pitcher. And then they would communicate whether a fastball or not a fastball would be coming by banging on a trash can.

Justin Grammens :

Right. Yep, over here.

Parker Erickson :

The title of the blog article that I wrote was "Do You Really Need Trash Cans?", in response to this whole thing. And it's like, okay, in some situations, me as a layperson baseball fan can go, "Oh, that was a breaking pitch low and outside. They might go with a fastball on the inner half, like, maybe up." I'm like, okay, well, if I can do that, then machine learning can do it too. The great thing about baseball is that they have a system called Statcast, and so every single pitch is recorded into this data set, to every single hitter, along with the runners on base. You can get what the previous pitch was, because you just look at the previous pitch entry.

Justin Grammens :

And this is all open? You can go to a website and download all this data? What format is it in, is it all like a CSV file or something?

Parker Erickson :

Pretty much. There's a Python package, actually, called pybaseball, I believe. You can basically say, "I want all the games in between this range of dates," and it downloads it for you and you get a pandas DataFrame. It's about as easy as it can be. Nice. You can get that the pitch thrown was a curveball or a slider or a fastball, whatever it was, as well as the velocity. If you really wanted to dive into it, you can also get the spin rate of the pitches, because there are cameras in the MLB stadiums that track the ball. It might be radar; I'm not exactly sure. But I basically narrowed it down to the running situation of who's on base, the previous pitch, the score, and the inning, something like that. Because nobody is going to be able to predict a pitch by looking at the spin rate in real time and communicating it while the pitch is coming towards home. So it had to be the data that was available before each pitch. And basically what it boiled down to is that I fed this into a neural network. I tried some other machine learning models with it as well, but they didn't pan out very well. I was able to train a model with about 75% accuracy. It was determined that the Astros had about a 93% accuracy rate on stealing signs, so that's a far cry from what I managed to do with machine learning. It was debatable whether the Astros really gained that much benefit from it, because even if you know a fastball is coming, it can be anywhere in the strike zone, and it's still hard just to get a bat on a 93-to-95-mile-per-hour ball being thrown at you. So the jury is still sort of out on whether it really had any positive effect. Some say that it did have a little, but it was pretty negligible.
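
The "only use what is known before the pitch" constraint, and the previous-pitch feature in particular, can be sketched without any downloads. Everything below is an assumption made for illustration: the rows are made-up toy data, the field names (`game`, `at_bat`, `pitch_number`, `pitch_type`) are hypothetical stand-ins rather than the real Statcast columns, and the model is a simple frequency table swapped in for the neural network described above, so it says nothing about the 75% figure.

```python
from collections import Counter, defaultdict

def add_previous_pitch(pitches):
    """Attach the previous pitch type within the same at-bat ("START" if none)."""
    rows = sorted(pitches, key=lambda p: (p["game"], p["at_bat"], p["pitch_number"]))
    out = []
    last_key, last_type = None, "START"
    for p in rows:
        key = (p["game"], p["at_bat"])
        if key != last_key:
            last_type = "START"  # a new at-bat has no pitch history yet
        out.append({**p, "prev_pitch": last_type})
        last_key, last_type = key, p["pitch_type"]
    return out

def fit_baseline(rows):
    """Count how often each pitch type follows each previous pitch type."""
    table = defaultdict(Counter)
    for r in rows:
        table[r["prev_pitch"]][r["pitch_type"]] += 1
    return table

def predict(table, prev_pitch, default="FF"):
    """Guess the most frequent follow-up; fall back to a default if unseen."""
    counts = table.get(prev_pitch)
    return counts.most_common(1)[0][0] if counts else default

# Toy data: FF = four-seam fastball, CU = curveball, SL = slider.
toy = [
    {"game": 1, "at_bat": 1, "pitch_number": 1, "pitch_type": "FF"},
    {"game": 1, "at_bat": 1, "pitch_number": 2, "pitch_type": "CU"},
    {"game": 1, "at_bat": 1, "pitch_number": 3, "pitch_type": "FF"},
    {"game": 1, "at_bat": 2, "pitch_number": 1, "pitch_type": "SL"},
    {"game": 1, "at_bat": 2, "pitch_number": 2, "pitch_type": "FF"},
    {"game": 1, "at_bat": 2, "pitch_number": 3, "pitch_type": "CU"},
    {"game": 1, "at_bat": 2, "pitch_number": 4, "pitch_type": "FF"},
]
rows = add_previous_pitch(toy)
table = fit_baseline(rows)
print(predict(table, "CU"))  # most common pitch after a curveball in the toy data
```

In a fuller version, the base-runner situation, score, and inning would be added as further pre-pitch features alongside `prev_pitch`, and the frequency table would be replaced by whatever classifier is being evaluated.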

Justin Grammens :

You know, I was wondering if anybody went back and took a listen to audio from some of these games. My wife is an avid baseball fan. So am I, but not nearly to the extent that she is. She had said that they had gone back and were listening to and watching prior games, and you could hear this garbage can being banged at various times, and sure enough, fastballs came in. But I don't know if that was ever analyzed by anybody, as far as you know, or if that's another piece of data you could bring into this.

Parker Erickson :

Yeah. So that's where the 93% accuracy metric comes from: somebody did go through the entire 2017 season and listened to every single game that the Astros were in, every pitch that was thrown, basically listening for a trash can bang and marking down whether it was a fastball. So there are a couple of different ways that I'd like to take this in the future. The first is that it really only takes in the previous pitch. I would like to use an LSTM net, or some sort of sequence neural network that can handle a sequence of pitches.

Justin Grammens :

So all the ones sort of leading up to it. Yeah, right. That's why those are powerful: you can pull a whole range of things into that one prediction. Is that true?

Parker Erickson :

Yeah, like, get the entire at-bat. So not just the pitch previous to it, but if this is a five-, six-, seven-pitch at-bat, get all those pitches into a network, make a prediction, and see if that's any more helpful. It could even identify pitchers that are using the same sequence of pitches more often, or find patterns like that. That would be, you know, more
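
Feeding a whole at-bat to a sequence model mostly comes down to how the training examples are laid out. The sketch below shows one plausible layout, assumed for illustration only: each example pairs the pitches seen so far in the at-bat (integer-encoded and left-padded to a fixed length) with the next pitch as the target. The function name, the padding scheme, and the toy at-bats are all hypothetical, and the LSTM itself is not shown.

```python
def at_bat_sequences(at_bats, max_len=6, pad=0):
    """Turn at-bats into (padded history, next pitch) training pairs."""
    vocab = {}

    def encode(pitch_type):
        # Ids start at 1 so that 0 can be reserved for padding.
        return vocab.setdefault(pitch_type, len(vocab) + 1)

    examples = []
    for pitches in at_bats:
        ids = [encode(p) for p in pitches]
        for k in range(1, len(ids)):
            history = ids[:k][-max_len:]  # keep only the most recent pitches
            padded = [pad] * (max_len - len(history)) + history
            examples.append((padded, ids[k]))
    return examples, vocab

# Toy at-bats, each a list of pitch types in the order thrown.
toy_at_bats = [["FF", "CU", "FF"], ["SL", "FF"]]
examples, vocab = at_bat_sequences(toy_at_bats, max_len=4)
for history, target in examples:
    print(history, "->", target)
```

The first pitch of an at-bat produces no example here because it has no history; a fuller version might add a start token, or append situational features such as count and base runners to each step.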

Justin Grammens :

Yeah, for sure. You got me thinking. I'm sure pitchers have a signature. Whether they know it or not, they probably gravitate towards certain sequences of pitches: start off with a fastball most of the time, a curveball here, and then the other stuff, and that's probably what a lot of those signs are. And of course, teams are spending a lot of money on analytics, and they're obviously using the Statcast data, trying to build the best algorithm and hire the best data scientists that they can, just to tell any hitter, "You're probably going to be getting this coming at you," regardless of whether a garbage can bangs or not. Right? There are a lot of interesting things that can be done based on just the data that's out there.

Parker Erickson :

Yeah. Another place I want to look into is sort of the AlphaGo route. Okay, let's see how we can best manage our lineup. Have a baseball simulator: how do we organize our lineup, how do we manage our pitching staff so that it's less injury prone? Maybe the best way to go is having nine pitchers each do an inning. Maybe it's not having a starter go six or seven innings and closing it off with some relievers. In AlphaGo, there was this one move that was basically astounding to the entire Go community. Now, I'm not a Go player, so I can't really explain it. But there was a moment of creativity by the computer, a move that didn't make sense to the traditional Go community and that ended up being a genius move.

Justin Grammens :

Totally. Yep.

Parker Erickson :

So what if we build a baseball simulator, apply this to managing those lineups and pitching matchups, and see what happens?

Justin Grammens :

Sure, sure. As you were speaking, you totally made me think about how, as the coach, they need to create a lineup for the team. They're basing that off of, I mean, I think, their best gut, like, okay, here's our usual. It probably also has a little bit more to do with who is resting and who's not, and stuff like that. Rather than, you know, gosh, what if you shifted the lineup? I honestly don't know if it's possible to shift the lineup during the game a lot, but it would be like, you know, you brought in a new pitcher, now I'm going to completely rejigger this thing. Or, you know, this pitcher has pitched this way for the first inning or so, so I'm going to shift up the team now, because based on the way he's pitching today, he's not doing the same things that I thought he would when I started the game. So could you evolve, you know, your players that are up to bat during the game? And that would be like, who's ever thought of that before? I think, yeah.

Parker Erickson :

I had the opportunity to talk with the data science person, and what they're focused on right now is not so much the in-game analytics portion, but the contract spending. Okay, this player is this age. He's done this in the past. There's this stat called WAR, W-A-R, in baseball, and it's wins above replacement. So it basically is a measure of how many extra wins you can add from having this player compared to the league average. Well, essentially their focus is getting the best contract deals based on, okay, this player is projected to do this WAR in the next three, four years, or for a renegotiation. And this is where explainable AI comes in again, in that you sort of need to go to, you know, the player's agent and say, these are the reasons why we're thinking this number. Yep, yeah, exactly.

Justin Grammens :

Yeah, we have to have some data behind it. I guess if you compare it to basketball, it's like, you know, I wonder what the WAR score would be for Michael Jordan. Right? It has to be phenomenal. You're basically looking for these people that are aces, and because of them, you can point to certain wins that are going to happen. And they're probably looking for the same in baseball. There are probably certain players, obviously, those guys that are getting paid the big bucks, because they actually have an influence over the outcome of the game.

Parker Erickson :

Yeah. Or you look at the Moneyball approach. Okay, what players are the most influential that we can pay the least amount of money? Because we're, you know, the A's, low budget, right? And, you know, what's the maximum number of wins we can get with this budget?
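The "maximum number of wins we can get with this budget" question can be framed as a classic 0/1 knapsack problem: each player has a cost and a projected value, and you pick the subset with the most value that still fits the budget. What follows is a minimal sketch under simplifying assumptions, not how any team actually does it. The player names, salaries (whole millions) and projected wins are invented, and it ignores roster positions entirely.

```python
def best_roster(players, budget):
    """0/1 knapsack: choose players maximizing projected wins within budget.

    players: list of (name, salary_in_whole_millions, projected_wins).
    Returns (total_projected_wins, sorted list of chosen names).
    """
    # best[b] holds the best (wins, names) found using at most b million.
    best = [(0.0, [])] * (budget + 1)
    for name, salary, wins in players:
        # Walk budgets downward so each player is picked at most once.
        for b in range(budget, salary - 1, -1):
            candidate = best[b - salary][0] + wins
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - salary][1] + [name])
    total, names = best[budget]
    return total, sorted(names)


# Hypothetical players: (name, salary in millions, projected wins added).
players = [
    ("Star", 30, 6.0),
    ("Solid A", 10, 3.0),
    ("Solid B", 12, 3.5),
    ("Bargain", 5, 2.0),
    ("Veteran", 15, 2.5),
]

total, names = best_roster(players, budget=30)
print(total, names)
```

With these made-up numbers, three cheaper players beat spending the whole budget on the one star, which is the Moneyball intuition in miniature.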

Justin Grammens :

Right out of that person, for example?

Parker Erickson :

Yeah.

Justin Grammens :

So, yeah, a very interesting project. I'm super excited to hear more about it. By the time this airs, you might have already had the meetup, but we'll be recording it, and I'll be sure to share it in the liner notes along with all the stuff that we've been talking about for this podcast. So one of the things that I did want to touch on a little bit was, I don't know, I guess any advice or classes or books that you've read or, you know, meetups or other groups and stuff like that. Obviously, you mentioned CoderDojo. I guess for people that are your age or below, or even changing careers, whatever, that are interested in this field, what would you tell them?

Parker Erickson :

Yeah, um, so as far as books, The Master Algorithm by Pedro Domingos is a really good overview of sort of the five different camps of machine learning. It's, you know, the deep learning neural net guys and the decision tree and the, you know, more traditional, like linear regression, genetic algorithms. It sort of gives you a good overview of all those, and then sort of makes the case that what's going to give us general AI is going to be a combination thereof. And I would throw in reinforcement learning, which was a good experience to get as well. Our final project was predicting a house's selling price. So, as a group project, we had to scrape Zillow for, you know, the information like how many bedrooms, bathrooms, whatever. We requested access to the Minnesota Department of Revenue's sales records, because that's all really public information. You just have to request access to the database and, you know, the kind of XML files. Um, and we used that for the addresses of properties as well as the actual selling price. And then we also scraped Zillow for images. So we would take the preview image from Zillow and run it through an image, not a classifier, but a CNN, to predict the house's price. And that by itself basically got within about $80,000 on average. I mean, it could predict, you know, a shack from a mansion.