Every day, we see exciting new applications of AI. We all know the power of AI in taking over administrative tasks, helping research customers and prospects, and taking the tedium out of developing content or drafting emails. These tasks, which take so much of our time and which we often do poorly, can now be done by AI.
And we see daily announcements of new capabilities and uses of AI. Recently, there's been some excitement about AI tools monitoring our desktops and devices. Based on what they detect us doing, they organize potential next tasks. A tool may see that I'm writing a proposal and suggest research and data that might be helpful in developing it. It may monitor our calendars, finding supporting material on our devices and teeing it up for a meeting. Or it may recommend certain tasks, meetings, and other next steps.
All of these things seem to drive productivity, helping us better manage our time and work.
But when we look at how these LLMs work, they actually aren't coming up with recommendations based on their knowledge of us, our goals, and our priorities. They develop their recommendations through "next token prediction," and those tokens are generated based on the data the models have been trained on, not on the LLM's experience of working with us.
As we have looked at these tools for things like content and emails, they are able to generate things that aren't bad, but not great. Through our prompts, we can improve the accuracy and quality of the responses, but they never get better than "close." And sometimes "close" is good enough.
But when we look at AI taking over our days, managing our priorities, suggesting next steps, it's not building those recommendations based on us. It's building them based on its total training database and its sophisticated next token prediction. And it chains these tokens together, making predictions across hundreds and thousands of generated tokens.
And as different people pose the same issues, AI may generate very different answers based on nuances in token probabilities. If just one token is different, the next, and the next, and the next can be very different.
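To make that concrete, here is a deliberately tiny sketch in Python. The word table and probabilities are entirely made up, nothing from any real model, but it shows how sampling one token at a time can send two identical prompts down very different paths:

```python
import random

# A toy "language model": for each word, the probabilities of the next word.
# Real LLMs score tens of thousands of tokens with a neural network, but the
# step-by-step sampling is conceptually similar.
NEXT_WORD = {
    "improve":   [("pipeline", 0.5), ("pricing", 0.5)],
    "pipeline":  [("coverage", 0.6), ("velocity", 0.4)],
    "pricing":   [("strategy", 0.7), ("discounts", 0.3)],
    "coverage":  [("targets", 1.0)],
    "velocity":  [("metrics", 1.0)],
    "strategy":  [("review", 1.0)],
    "discounts": [("policy", 1.0)],
}

def generate(start, steps):
    words = [start]
    for _ in range(steps):
        options = NEXT_WORD.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options)
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

# Two runs from the identical prompt can diverge at the first sampled word
# and never reconverge -- and nothing about "us" enters the calculation.
print(generate("improve", 3))
print(generate("improve", 3))
```

Once the paths diverge at "pipeline" versus "pricing," everything downstream differs, which is the point about one token changing all the rest.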
So while we get the impression the LLM is managing our day and priorities based on “learning” from us, the reality is we are largely irrelevant to the recommendations it makes.
Then let's go one step further. We might believe the LLM leverages our past interactions, its experience of us, in making its token predictions. The reality is LLMs have very bad memories. An LLM will remember the previous interaction if we are continuing the same discussion thread. Often, I go back to conversations I may have had a few days ago and add onto them, and it continues those discussions. But LLMs have limited memories; they aren't learning anything permanent from the conversations they are having with us. While I use these tools daily, none of the LLMs understands my strange humor. I never get the response, "Dave, you've been trying that joke for weeks, it just doesn't work!"
Some LLMs are trying to get better at this, using a very small memory file. Some organizations are trying to "manage" this by limiting the training database for the AI. As I've worked with various LLMs to develop my own chatbots, I've found you can limit this to some extent, but not totally.
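For those curious about the mechanics, here is a rough Python sketch of the pattern. Everything in it is hypothetical, the token limit, the function names, the "memory file" contents, but it mirrors the common approach: the model only "sees" the most recent turns that fit a fixed context window, plus whatever small notes get saved between sessions:

```python
CONTEXT_LIMIT = 500  # pretend the model can only read 500 words at once

def build_prompt(history, memory_notes, new_message):
    """Assemble what the model actually receives on each turn."""
    def size(parts):
        return sum(len(p.split()) for p in parts)
    # Drop the oldest turns until everything fits in the window.
    # Anything dropped is simply gone -- the model never "learned" it.
    while history and size(memory_notes + history + [new_message]) > CONTEXT_LIMIT:
        history = history[1:]
    return "\n".join(memory_notes + history + [new_message])

# The "memory file" is a handful of saved sentences, not real learning.
memory_notes = ["Note: Dave writes about sales and likes dry humor."]
history = [f"Turn {i}: discussed proposals and pipelines." for i in range(200)]

prompt = build_prompt(history, memory_notes, "Turn 201: what should I do next?")
print(sum(line.startswith("Turn") for line in prompt.split("\n")))
# Far fewer than 201 turns survive -- the rest have been forgotten.
```

The memory file helps, but notice how small it is relative to everything that falls out of the window. That gap is why the model never learns the joke.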
Again, the answers LLMs give us are not based on their knowledge of us. Rather, they are based on very sophisticated token prediction algorithms looking across the model's entire knowledge base.
So as we consider these new applications, for example some of the new capabilities of Claude, we have to be very clear in our understanding of the insights and recommendations they provide.
But this is not new; for decades we have lived with increasingly sophisticated applications of algorithms. The theory behind many of these is that they look at what we are interested in and serve that up; the reality is they create the experience they think we want. As a result, we tend to see things that are consistent with our prior interactions, and we see more of that, but we seldom see differing views or content.
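A toy feedback loop, with invented data, shows how this narrowing happens. Real recommenders are far more sophisticated than this sketch, but the self-reinforcing loop is the point:

```python
from collections import Counter

# Invented click counts for three kinds of content.
clicks = Counter({"sales_tips": 3, "ai_news": 2, "opposing_views": 1})

def recommend():
    # Recommend whatever the user has clicked most so far.
    topic, _ = clicks.most_common(1)[0]
    return topic

for _ in range(5):
    topic = recommend()
    clicks[topic] += 1  # serving it makes it even more likely to be clicked
    print(topic, dict(clicks))
# "opposing_views" never surfaces again -- the loop narrows what we see.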
What does this tell us about the future of AI and how we leverage it with great impact?
I'm certain many of the challenges I outline will be addressed in the future (but think of the resources required to remember every interaction with every individual in the world for years). We are seeing new things every day.
At the same time, we have to be skeptical. We have to understand the true power of these tools and their limitations. These tools bring us unimaginable possibilities. But so much of what I see in my feeds is not thoughtful, but demonstrates blind ignorance of the limitations of these tools.
Afterword: Ethan Mollick has written a fantastic article going deeply into how these LLMs actually work. Be sure to read his outstanding piece, Thinking Like An AI. A lot of what I talk about in this article is based on his wisdom.
Afterword: And here is the AI agent discussion of the challenges these tools might create. I'm always embarrassed to confess this, but it may be better than the article. Enjoy!