Hey, grandma! Machine learning!

Benjamin Keener
Jan 26, 2020

Machine learning is a fairly complicated topic with many fancy buzzwords being thrown around daily. Artificial intelligence, supervised learning, back-propagation, huh? My goal in this post is to break down machine learning so you, your grandma, your dog, and your next-door neighbor can all understand each of these things.

Let’s talk about where it all started, at the very basics. The term “machine learning” was first used by Arthur L. Samuel, a scientist who worked for IBM from 1949 to 1966 (Samuel). What Mr. Samuel worked on initially was a way for a machine to understand and competitively play a game of checkers. When you or I play checkers, we look at each of our game pieces and quickly, in our head, check if moving a piece to a certain position will improve or worsen our situation.

Not only do we see if moving a piece will improve our immediate situation, but we also see if it sets us up for a better move next turn or if it gives our opponent an advantage. This was the basis of Mr. Samuel’s studies and later his checker-playing computer program. His program would look at all of its available moves and figure out which was most advantageous. The key factor that made his program smarter than others was that it would predict its opponent’s moves. Granted, it would not predict very intelligently: it would only assume its opponent would take the best available move (a worst-case assumption). This prediction ability allowed the program to calculate its own future moves in advance.

Mr. Samuel’s original program looked 20 moves ahead. This meant that the program could figure out if a move that looks good this turn might end up being a bad move in the long run. But wait! That sounds like voodoo magic! How could it make these predictions? Good question. We’ve already established that we can easily figure out whether one move is good for this turn or not. We also know that the program will assume the opponent will make the best possible move. Combining these two properties, the computer can build a decision tree by “imagining” what might happen many moves ahead if it makes a certain move. This is very similar to the way that we can predict what might happen if we move a piece to a spot where our opponent could take it.

Now, computers can’t actually imagine anything; they’re very dumb (thus artificial intelligence). The program actually assigns a numeric value to the amount of advantage that making a certain move would give it. The image below is a great representation of the program’s decision tree, published in Samuel’s original paper.

This graph shows a numerical representation of the different moves the checkers-playing program could make. In this example, the program is only looking 4 moves ahead, but the principle is the same. The program’s goal is not to make the single move that is most advantageous right now, but to make the move that leads to the best position down the road. It does this by scoring the positions at the end of each path and following the branch that works out best, assuming the opponent always picks the reply that is worst for the program. Sometimes this means making a less impressive move in the current turn to set up better moves in the future. This is how computers can achieve their “intelligence”.
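If you’re curious what this “scoring paths under a worst-case opponent” looks like in code, here is a tiny sketch. This is not Samuel’s actual program; the game tree and its scores are made up for illustration, with each leaf number standing in for how good a final position is for our program:

```python
def minimax(node, our_turn):
    """Score a game tree where leaves are position scores for our program."""
    # A leaf: just a number saying how good this position is for us.
    if isinstance(node, (int, float)):
        return node
    if our_turn:
        # Our move: pick the branch with the highest eventual score.
        return max(minimax(child, False) for child in node)
    # Opponent's move: assume they pick whatever is worst for us.
    return min(minimax(child, True) for child in node)

# A toy tree looking 2 moves ahead: our move, then the opponent's reply.
tree = [[3, 5], [2, 9]]
best = minimax(tree, our_turn=True)
```

Notice the program rejects the branch containing the tempting 9: the opponent would never let it get there, leaving it stuck with a 2. Assuming the worst-case reply steers it to the safer branch instead.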

Cool. We made it through the “machine” part of this, but what about the whole “learning” thing? Mr. Samuel’s method of teaching his program which moves were good or bad was called “rote” learning, but that’s not important. Let’s go over a few modern methods of the learning in machine learning. Currently, one of the most popular ways to teach a computer is “supervised learning”. Let’s talk about it.

Supervised learning is when we give the computer both data and what we expect it to say about the data. For example, we can give the computer a picture of a dog and expect it to say “That’s a dog”. Done! Right? Well, not quite. The computer has an algorithm that tells it what to say given a certain input. We’ll talk about this algorithm later, but for now let’s just imagine it as a black box that we feed information and we get information out. Now, at the beginning, this black box has no idea what a dog is. If it did, we wouldn’t need to teach it!

You need a break. Here’s a picture of a puppy.

In reality, the black box doesn’t just say “this is a dog”. If we want it to tell whether a picture is of a dog or a cat, it will say something like “I’m 95% sure this is a dog and 5% sure this is a cat”, and we can interpret that as “I’m pretty sure this is a dog”. This statement is no longer simply true or false, but instead how true or how false, thanks to the percentages. We can use that “true-ness” to determine how close the black box actually got to being correct.

When this black box is first created, it knows nothing about dogs or cats. If we give it the picture of that adorable puppy, it will say something like “I’m 50% sure this is a dog and 50% sure this is a cat.” Well, it can’t be half-cat, half-dog, so we tell the black box “No, this is 100% a dog and 0% a cat”. The black box then very slightly adjusts its mysterious algorithm (a process called back-propagation) to make it so it understands just a little more that the picture is of a dog. The next time we give it the image of the dog, it might say “I’m 55% sure this is a dog and 45% sure this is a cat.”
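We can watch that tiny adjustment happen with a toy version of the black box. This is a deliberately over-simplified sketch: a real image is millions of numbers, but here one made-up number stands in for the whole picture, and the “algorithm” is a single weight and bias. The back-propagation step for this one-weight model is just nudging both in the direction that shrinks the error:

```python
import math

def sigmoid(x):
    # Squashes any number into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))

feature, label = 2.0, 1.0   # one made-up "picture" and its answer: 1.0 = dog
weight, bias = 0.0, 0.0     # the untrained black box: knows nothing yet
lr = 0.1                    # how big each tiny adjustment is

for step in range(3):
    p_dog = sigmoid(weight * feature + bias)  # "I'm p_dog sure this is a dog"
    error = p_dog - label                     # how far off were we?
    # The back-propagation step: nudge the knobs to reduce the error a little.
    weight -= lr * error * feature
    bias -= lr * error
```

The first guess comes out at exactly 50% dog, and each pass through the loop drags it a little higher, much like the 50% → 55% story above. It creeps toward 100% rather than jumping there, which is exactly the “that’s not even close to what we told it!” behavior.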

But wait. That’s not even close to what we told it! Well, that’s one of the major drawbacks of supervised learning. We need to show it the dog many times in order for it to become more and more confident that the picture has a dog in it. The problem is, if we only show it that one picture of that one dog, that’s all the black box will know about. It will have very little ability to recognize other dogs, let alone cats. To combat this, we need to show the black box a bunch of dogs! I do wish I could be this black box…

Showing it many different dogs will allow it to gain a more abstract understanding of what a dog is instead of it only recognizing one dog. Also, we need to contrast the dog pictures with some cat pictures, otherwise it will think every picture we show it is a dog. This is the entire idea behind supervised learning. We feed an algorithm many pictures of both dogs and cats and tell it which are which. We do this until the algorithm can look at a dog or cat it’s never seen before and correctly identify it!

Wow! Now I know all about machine learning! Wait… how is this helpful at all?

Is telling cats and dogs apart not cool enough? Fine. There are actually many, many ways that machine learning is being used today and I can almost guarantee that you experience it in some form every single day.

Mr. Steal Yo Credit Cards

A use case very similar to our cats-and-dogs example is credit card fraud detection. How is this similar? Well, instead of cats and dogs, we can show an algorithm known legitimate transactions and known fraudulent transactions. This way, you won’t have any trouble buying kombucha at Trader Joe’s, but your card company will catch on when someone in Istanbul tries to buy a flatscreen. In this case, there are thousands and thousands of credit card transactions happening every day, which makes it perfect for machine learning in several ways. Firstly, that’s far too much data for humans to process by hand. Secondly, all of that data can be fed back into the algorithm to make it better and better at predicting whether you just bought coffee on a road trip or if your card’s details have been stolen. (“Machine”)

Machine learning is also used very prominently for content recommendation. Have you ever looked up how much a red lamp costs on Amazon and then seen ads for anything lamp-related or red for weeks? That’s thanks to machine learning. Serving ads is a multi-billion dollar business, and it’s all thanks to the ability of advertisers like Google to figure out what you actually want. Kinda scary, right? Well, this sort of thing is used for more than just pesky ads. Another great example is Netflix. Netflix knows which movies are which type (genre, actors, age, etc.) and can use that to find similar titles that align with things you’ve already watched.

Last, but certainly not least, machine learning is becoming increasingly useful in the medical field. One example of this is detecting breast cancer (Goel). Mammogram data can be fed into an algorithm to help doctors determine if any breast cancer is present. This method has even been shown to be more accurate than current solutions, which is very impressive. Another example could be a glucose monitor/insulin pump for an insulin-dependent diabetic. A machine learning algorithm can notice when one’s blood sugar is rising and administer the proper amount of insulin, all without any human interaction.

I sure hope you learned something about machine learning and its use and impact on today’s society. If you’re really feeling good about it, try to explain it to someone yourself! Your dad, your dog, your stuffed bear, anyone! Thanks for making it all of the way through, and please let me know if there’s anything I can correct and/or improve. Follow me on Twitter @TheBenKeener.

Works Cited:

Goel, Vishabh. “Building a Simple Machine Learning Model on Breast Cancer Data.” Medium, Towards Data Science, 12 Oct. 2018, towardsdatascience.com/building-a-simple-machine-learning-model-on-breast-cancer-data-eca4b3b99fa3.

“Machine Learning: What it is and why it matters.” SAS, www.sas.com/en_us/insights/analytics/machine-learning.html.

Samuel, A. L. “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development, vol. 3, no. 3, IBM, 1959, ieeexplore.ieee.org/document/5392560/.
