Some of the ways that businesses react to the promise of artificial intelligence (AI) have become almost predictable — and problematic.

There are those who see limitless potential in AI to solve almost every conceivable problem or who re-brand their products and services to suggest the technology is now at the core of what they do. They are eager to have conversations about AI because it sounds innovative.

Then, there are people who see AI as a threat to established business models or a potential harbinger of job losses. They are unsure that AI can ever be compatible with human ingenuity and creativity. When having conversations about AI, they speak from a position of skepticism and defensiveness.

These are the wrong conversations to have about AI, but it’s easy to understand why they’re happening.

AI can be a broad umbrella term, and understanding how it could be applied to businesses means delving deeper into a number of different technologies. If you have tried educating yourself, you have likely started to learn about things like natural language processing (NLP), machine learning, and neural networks or deep learning.

Ironically, an area that has not attracted the same amount of hype, reinforcement learning, will be the AI technology that leads to transformative results for businesses in grocery, insurance and other industries.

This post will explain reinforcement learning, how it is being used today, why it is different from more traditional forms of AI and how to start thinking about incorporating it into a business strategy.

Reinforcement Learning 101

For a full description of reinforcement learning in AI, visit this blog post.

We all know how positive reinforcement works. When your dog obediently fetches something, you reward it with a treat. As a result, it learns to continue being obedient in the future. If you’re running a retail shop and placing a particular product close to the checkout leads to higher sales, you may decide to do the same with complementary items.

If, on the other hand, a company hires someone without a lot of experience and their performance hurts the bottom line, it will most likely update its job descriptions the next time around. Or if you start to see a higher rate of unsubscribes to a marketing email, you may send fewer promotional messages. Negative reinforcement is a powerful force, too.

Reinforcement learning is obviously more sophisticated, but the principle of having an intelligent agent use trial and error and improve its ability to achieve an objective based on rewards is the same. In other words, a reward reinforces what’s working as technology strives to reach a particular goal.
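To make the trial-and-error idea concrete, here is a minimal sketch in Python of an epsilon-greedy agent, one of the simplest reinforcement learning strategies. It is an illustration under stated assumptions, not a description of any particular product: the function name `run_bandit`, the three shelf placements and their hidden purchase rates are all invented for this example.

```python
import random


def run_bandit(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a fixed set of actions.

    true_rates[i] is the hidden probability that action i earns a reward of 1.
    Returns the agent's estimated reward per action and how often it tried each.
    """
    rng = random.Random(seed)
    n_actions = len(true_rates)
    estimates = [0.0] * n_actions  # running average reward seen per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: otherwise repeat what has worked best so far.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment answers with a reward (1 = sale, 0 = no sale).
        reward = 1.0 if rng.random() < true_rates[action] else 0.0

        # The reward reinforces (or weakens) the estimate for that action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    # Hypothetical purchase rates for three shelf placements (assumed values).
    placements = ["back aisle", "end cap", "checkout"]
    estimates, counts = run_bandit([0.2, 0.5, 0.8])
    for name, estimate, count in zip(placements, estimates, counts):
        print(f"{name}: tried {count} times, estimated rate {estimate:.2f}")
```

The agent is never told which placement is best. It tries each one, keeps a running score from the rewards it receives, and gradually spends most of its attempts on whatever has paid off.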

This scientific approach distinguishes reinforcement learning from AI technologies whereby an algorithm is being told what to look for from known historical examples, a technique known as supervised learning. Instead of simply scanning data sets to find a mathematical equation that can reproduce historical outcomes, reinforcement learning is focused on discovering the optimal actions that will lead to the desired outcome.

We’ll look at why that is important, but first let’s look at how reinforcement learning is being harnessed today.

Reinforcement Learning in Action

Even if you’ve never played Go, the board game invented in China more than 2,500 years ago, you may have seen the headlines in 2017 about AlphaGo Zero, a bot developed by Google’s DeepMind that leveraged reinforcement learning. The first version of the bot, AlphaGo, was the first computer Go program to defeat a professional Go player without handicaps. The next version, AlphaGo Master, defeated the world’s top-ranked Go player, Ke Jie.

That is far from the only example of where organizations are winning with reinforcement learning.

A group of university researchers recently developed an automated tuning system using reinforcement learning to help train prosthetic legs to adjust to the natural gait of the person wearing them.

Other technology companies are looking at reinforcement learning as the basis for designing chatbots in a variety of business settings. And of course, reinforcement learning is a natural fit for those trying to design self-driving cars that will be both efficient and safe.

Reinforcement Learning vs. Predictive Analytics

For all the hype, many organizations may soon come to realize that AI which promises “predictive analytics” fails to help them prepare for the future. That’s because it’s not really AI: it is statistical analysis that goes back more than 200 years, to the invention of linear regression in 1805.
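As a rough illustration of what that kind of statistical analysis does, here is a minimal Python sketch of least-squares linear regression, written out by hand. The function name `fit_line` and the price and sales figures are hypothetical, made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical history: price charged each week and units sold that week.
    prices = [2.0, 2.5, 3.0, 3.5, 4.0]
    units = [120, 104, 95, 81, 70]
    slope, intercept = fit_line(prices, units)
    # The fitted line reproduces the historical relationship...
    print(f"units = {slope:.1f} * price + {intercept:.1f}")
    # ...but choosing which price to charge next is still left to a person.
```

The fit summarizes what already happened. It does not try alternative actions or weigh their long-run consequences, which is the gap reinforcement learning aims to fill.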

In fact, a recent research report from MIT suggested the era of deep learning is ending, based on citations by other scientists. The same paper argued that machine learning and AI are really just statistics. Using statistics primarily to study historical data does not offer a complete picture of how a system or business functions.

As the glut of deep learning experiments has run its course, the MIT researchers found, there has been a corresponding uptick in research on reinforcement learning.

Reinforcement learning delivers decisions. By creating a simulation of an entire business or system, it becomes possible for an intelligent system to test new actions or approaches, change course when failures happen (negative reinforcement) and build on successes (positive reinforcement).
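Here is a minimal sketch of that loop in Python, using tabular Q-learning, a standard reinforcement learning algorithm, against a deliberately tiny simulator. Everything about the simulator is an assumption made for illustration: each step the agent either "harvests" a small reward now (which resets its progress) or "invests" for no immediate reward, and four investments in a row earn a larger payoff.

```python
import random

HARVEST, INVEST = 0, 1
N_STATES = 4    # state = number of investments made in a row so far (0..3)
PAYOFF = 10.0   # assumed reward for completing the fourth investment


def simulate(state, action):
    """Toy simulator: return (reward, next_state) for one decision."""
    if action == HARVEST:
        return 1.0, 0            # small gain now, progress resets
    if state == N_STATES - 1:
        return PAYOFF, 0         # long-run payoff, then start over
    return 0.0, state + 1        # nothing yet, but progress advances


def q_learning(steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.5, seed=0):
    """Learn action values purely by trying actions in the simulator."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # test a new approach
        else:
            action = max((HARVEST, INVEST), key=lambda a: q[state][a])
        reward, next_state = simulate(state, action)
        # Move the estimate toward the reward plus discounted future value.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each state under the learned values."""
    return [max((HARVEST, INVEST), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    names = {HARVEST: "harvest", INVEST: "invest"}
    print([names[a] for a in greedy_policy(q_learning())])
```

With these assumed numbers and a discount factor of 0.9, always harvesting is worth about 10 in discounted reward, while completing the investment cycle is worth about 21, so the learned policy should favor investing even though it pays nothing at first. All of the failed experiments happen inside the simulator, not in the real business.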

Much in the way human beings can develop a skill as they practice it, reinforcement learning only becomes more powerful when it’s executed at scale.

We talked earlier about how reinforcement learning could be applied to a board game like Go, for example, but Go might only have a few hundred possible moves at any given turn. A self-driving car is more complex, but the number of actions to take is still relatively small and limited to steering, braking, gear shifts and so on.

But when you apply reinforcement learning to a business such as retail, there might be 50,000 products to consider, and 103,600 options on how you could price them, market them or assort them. Instead of looking backwards via deep learning to determine the best way forward, reinforcement learning simulates the future, generating an optimal sequence of decisions that is more relevant, that will achieve results over the long run and that has been safely tested in the simulation. Those results hold true only if your simulation is accurate.
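A back-of-the-envelope calculation shows how quickly choices like these multiply. The Python sketch below counts the decimal digits in the number of possible combinations when every product can independently take one of a few options. The helper name `decision_space_digits` and the choice of three options per product are hypothetical and are not figures from this article.

```python
import math


def decision_space_digits(num_products, options_per_product):
    """Decimal digits in options_per_product ** num_products.

    Uses logarithms so the huge number itself never has to be built.
    """
    return math.floor(num_products * math.log10(options_per_product)) + 1


if __name__ == "__main__":
    # Assumed for illustration: 50,000 products, 3 price points each.
    digits = decision_space_digits(50_000, 3)
    print(f"Possible price combinations: a number with {digits} digits")
```

Even with only a handful of options per product, the number of combinations is far too large to enumerate, which is why a search guided by simulated rewards is used instead of checking every possibility.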

The types of problems that reinforcement learning solves are simply beyond human capabilities, which is why this kind of AI acts as the perfect assistant to human beings.

They might involve making repetitive decisions that human workers don’t have the time to look at on a regular basis: decisions that are too complex, with too many factors, that are highly mathematical and that occur at an enormous volume.

Freeing human workers from these types of tasks allows them to be involved in decisions that could accelerate and improve higher-level strategy: negotiating with vendors and suppliers, introducing new products, finding new ways to service customers or finding innovative solutions to other non-computational challenges.

Measuring the impact of reinforcement learning is also easier because it ties directly into the key performance indicators that businesses focus on, such as growing sales and profits, or reducing “false positives” in insurance investigations, which can save millions that would otherwise be lost to fraud.


Much of what we call “AI” today involves techniques that have been in use since the 1800s. The recent conversations have been dominated by computer scientists who have been moving from simpler supervised learning algorithms (e.g. linear regression) to more complex supervised learning algorithms (e.g. deep learning), facilitated by the massive increase in computing power due to Moore’s law. Until now, the conversation has not been about intelligent decision making and practical application.

The growing awareness of reinforcement learning is happening at a time when the conversation needs to shift squarely towards real businesses, which can use it to simulate and discover the optimal sequence of decisions to make. In the grocery industry, for example, acting on those decisions can grow revenue by more than 5% and more than double net income.

As more success stories about reinforcement learning emerge, we can also expect an emphasis on smart execution. Using the term “AI” will no longer be what defines an organization as innovative. It will be how they use it that matters.


Gary is Daisy’s Founder and CEO, and a preeminent authority on artificial intelligence and its ability to transform how businesses grow. He is also a member of Daisy’s board.

To learn more about how Daisy leverages reinforcement learning to drive higher profits and sales for grocery retailers and insurance companies, contact us.

Copyright © 2021 Daisy Intelligence Corporation - All Rights Reserved