Recent research has revealed surprising limitations in artificial intelligence (AI). While AI can handle impressive tasks such as writing code and generating images, it struggles with seemingly simple ones, like reading an analog clock or working out calendar dates.
A study presented at the 2025 International Conference on Learning Representations highlighted these gaps. Researchers found that many AI systems could not reliably read the time from a clock face or work out which day of the week a given date falls on, answering incorrectly more than half the time.
Rohit Saxena, a researcher at the University of Edinburgh, emphasized that these tasks are easy for most people. “Our findings indicate a significant gap in AI’s ability to perform basic skills,” he said. The gap matters for real-world applications such as scheduling and automation, where timing is critical.
To probe these shortcomings, the researchers tested advanced multimodal models, including OpenAI’s GPT-4o and Google’s Gemini 2.0, on images of clocks and calendars. The models could not consistently identify the correct time or date: they read clocks correctly only 38.7% of the time and answered calendar questions correctly just 26.3% of the time.
So, why such a struggle? Saxena explained that traditional AI systems are trained on labeled examples, but reading a clock demands a different kind of spatial reasoning: detecting the often-overlapping hands, measuring their angles, and converting those angles into a time. That is far harder than simply recognizing that an image contains a clock.
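To see what that entails, here is a minimal Python sketch of just the final arithmetic step, assuming the hand angles (measured clockwise from 12 o'clock) have already been extracted from the image; that extraction is the part vision models struggle with most, and the angles below are illustrative inputs, not values from the study.

```python
# A minimal sketch of the geometry involved in reading an analog clock:
# converting hand angles back into a time. Angles are assumed to have
# been estimated from the image already, measured clockwise from 12.

def angles_to_time(hour_angle: float, minute_angle: float) -> str:
    """Convert hand angles in degrees to an h:mm string."""
    minutes = round(minute_angle / 6) % 60    # minute hand moves 6 degrees per minute
    hours = int(hour_angle // 30) % 12 or 12  # hour hand moves 30 degrees per hour
    return f"{hours}:{minutes:02d}"

# Hour hand just past 3, minute hand pointing at the 2.
print(angles_to_time(95.0, 60.0))  # -> "3:10"
```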
The troubles don’t stop there. When asked questions like “What day is the 153rd day of the year?”, AI systems struggled similarly. Saxena noted that while conventional computers handle such arithmetic flawlessly, large language models like those in the study predict the most likely answer from patterns in their training data rather than executing deterministic rules, which makes their performance inconsistent.
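By contrast, a conventional program answers this kind of question deterministically. A minimal sketch (the year 2025 here is an assumed example, not one taken from the study):

```python
# The deterministic calculation a conventional program would use for
# "What day is the 153rd day of the year?".
from datetime import date, timedelta

def nth_day_of_year(year: int, n: int) -> date:
    """Return the calendar date of the n-th day of the given year."""
    return date(year, 1, 1) + timedelta(days=n - 1)

d = nth_day_of_year(2025, 153)
print(d.isoformat(), d.strftime("%A"))  # -> 2025-06-02 Monday
```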
This research adds to a growing understanding of how AI’s reasoning differs from human thinking. Current AI excels when it has sufficient data examples but falters when asked to generalize or apply abstract reasoning. Simple tasks for humans may be complex for AI, and vice versa.
Furthermore, some problems stem from gaps in the training data. Edge cases such as leap years or obscure calendar calculations may be underrepresented, and even when an AI has been given an explanation of leap years, it may still falter on tasks requiring visual interpretation, exposing a disconnect between knowing a concept and applying it.
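For illustration, the Gregorian leap-year rule itself is a short deterministic check, the sort of rule an LLM can often recite in text yet fail to apply reliably:

```python
# The Gregorian leap-year rule as a deterministic check.

def is_leap_year(year: int) -> bool:
    """A year is a leap year if divisible by 4, except century years
    not divisible by 400 (e.g. 1900 is not a leap year, 2000 is)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(is_leap_year(2024), is_leap_year(1900), is_leap_year(2000))  # -> True False True
```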
The study underscores the need to refine AI training methods to better integrate logical and spatial reasoning. It also warns against relying too heavily on AI for tasks that combine perception with precise reasoning. As Saxena concludes, “AI is powerful, but we still need rigorous testing and, often, human oversight.”
In summary, while AI can perform many impressive tasks, its limitations in everyday reasoning are a reminder to verify its outputs rather than trust them blindly.