Are You Being Unfair to LLMs?


Amid the hype surrounding AI, some ill-informed ideas about the nature of LLM intelligence are floating around, and I'd like to address a few of them. I'll provide sources, most of them preprints, and welcome your thoughts on the matter.

Why do I feel this topic matters? First, I feel we are creating a new intelligence that in some ways competes with us. Therefore, we should aim to judge it fairly. Second, the subject of AI is deeply introspective. It raises questions about our thinking processes, our uniqueness, and our feelings of superiority over other beings.

Millière and Buckner write [1]:

In particular, we need to understand what LLMs represent about the sentences they produce—and the world those sentences are about. Such an understanding cannot be reached through armchair speculation alone; it calls for careful empirical investigation.

LLMs are more than prediction machines

Deep neural networks can form complex structures, with linear-nonlinear paths. Neurons can take on multiple functions in superpositions [2]. Further, LLMs build internal world models and mind maps of the context they analyze [3]. Accordingly, they are not just prediction machines for the next word. Their internal activations think ahead to the end of a statement; they have a rudimentary plan in mind [4].
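For readers curious how such internal representations are examined in practice, here is a minimal sketch of linear probing (my own illustration, not code from the cited papers): extract a hidden state from a small open model and train a simple classifier on it. The model name, layer index, and toy labels are assumptions chosen purely for demonstration.

```python
# Minimal linear-probing sketch (illustrative; model, layer, and labels are assumptions).
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "gpt2"  # any small causal LM works for this sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def last_token_state(text: str, layer: int = 6) -> torch.Tensor:
    """Return the hidden state of the final token at a chosen layer."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

# Toy labels: does the sentence state a geographic fact (1) or not (0)?
texts = ["Paris is in France.", "Berlin is in Germany.",
         "I love pancakes.", "The song was too loud."]
labels = [1, 1, 0, 0]

X = torch.stack([last_token_state(t) for t in texts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.predict(X))  # if the probe separates the classes, that information
                         # is linearly readable from the activations
```

If a simple linear classifier can read a property off the activations, that property is represented internally in an accessible form; this is the basic logic behind many probing studies.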

Nevertheless, all of these capabilities depend on the size and nature of a model, so they may vary, especially in specific contexts. These general capabilities are an active field of research and are probably more similar to the human thought process than to a spellchecker's algorithm (if you had to pick one of the two).

LLMs show signs of creativity

When faced with new tasks, LLMs do more than just regurgitate memorized content. Rather, they can produce their own answers [5]. Wang et al. analyzed the relation of a model's output to the Pile dataset and found that larger models improve both at recalling facts and at creating more novel content.
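To make the memorization-versus-novelty idea concrete, here is a toy sketch. It is not Wang et al.'s methodology, only an illustration of the underlying question: how many n-grams of a model's output also occur in a reference corpus? Real analyses run against the full Pile with efficient indexes; the strings below are made up.

```python
# Toy novelty measurement: fraction of output n-grams absent from a reference corpus.
from typing import Iterable

def ngrams(tokens: list[str], n: int) -> Iterable[tuple[str, ...]]:
    return (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def novelty_rate(output: str, corpus: str, n: int = 4) -> float:
    """Fraction of the output's n-grams that do NOT appear in the corpus."""
    corpus_ngrams = set(ngrams(corpus.split(), n))
    out_ngrams = list(ngrams(output.split(), n))
    if not out_ngrams:
        return 0.0
    novel = sum(1 for g in out_ngrams if g not in corpus_ngrams)
    return novel / len(out_ngrams)

corpus = "the cat sat on the mat and looked at the dog"
output = "the cat sat on the sofa and watched the birds"
print(f"novel 4-gram rate: {novelty_rate(output, corpus):.2f}")
```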

Yet Salvatore Raieli recently reported on TDS that LLMs are not creative. The quoted studies largely focused on ChatGPT-3. In contrast, Guzik, Byrge, and Gilde found that GPT-4 is in the top percentile of human creativity [6]. Hubert et al. agree with this conclusion [7]. This applies to originality, fluency, and flexibility. Generating new ideas that are unlike anything seen in the model's training data may be another matter; that is where exceptional humans may still be ahead.

Either way, there is too much debate to dismiss these indications entirely. To learn more about the general topic, you can look up computational creativity.

LLMs have a concept of emotion

LLMs can analyze emotional context and write in different styles and emotional tones. This suggests that they possess internal associations and activations representing emotion. Indeed, there is such correlational evidence: one can probe the activations of their neural networks for certain emotions and even artificially induce them with steering vectors [8]. (One way to identify these steering vectors is to determine the contrastive activations when the model is processing statements with an opposite attribute, e.g., sadness vs. happiness.)
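The parenthetical above can be turned into a short sketch of contrastive activation steering in the spirit of [8]: take the difference between hidden activations for a "happy" and a "sad" prompt at one layer, then add that vector back during generation via a forward hook. The model choice, layer index, and scaling factor are illustrative assumptions, not the exact setup of the cited work.

```python
# Contrastive activation steering sketch (GPT-2 module paths; layer and scale are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER, SCALE = 6, 4.0
block = model.transformer.h[LAYER]  # GPT-2-specific path to one transformer block

def hidden_at_layer(text: str) -> torch.Tensor:
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER + 1][0, -1]  # last-token activation after that block

# Contrastive pair -> steering vector
steer = hidden_at_layer("I feel wonderful and happy.") - hidden_at_layer("I feel miserable and sad.")

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] + SCALE * steer.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = block.register_forward_hook(add_steering)
ids = tok("Today at work,", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=20, do_sample=False)
handle.remove()
print(tok.decode(out[0], skip_special_tokens=True))
```

The same mechanism run in reverse, reading activations rather than writing them, is what lets researchers probe for emotional content in the first place.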

Accordingly, the concept of emotional attributes and their possible relation to internal world models seems to fall within the scope of what LLM architectures can represent. There appears to be a relation between the emotional representation and the subsequent reasoning, i.e., the world as the LLM understands it.

Moreover, emotional representations are localized to certain areas of the model, and many intuitive assumptions that apply to humans can also be observed in LLMs; even psychological and cognitive frameworks may apply [9].

Note that the above statements do not imply sentience, that is, that LLMs have a subjective experience.

Yes, LLMs don’t learn (post-training)

LLMs are neural networks with static weights. When we are chatting with an LLM chatbot, we are interacting with a model that does not change; it only learns in-context during the ongoing chat. This means it may pull additional data from the web or from a database, process our inputs, etc. But its static, built-in knowledge, skills, and biases remain unchanged.
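A tiny illustration of this point: the weights never change; what changes is the context we place in front of the model. Everything below (the retrieved notes, the helper name) is made up for the example.

```python
# Static model, dynamic context: "learning" happens only in the prompt.
def build_prompt(question: str, retrieved_notes: list[str]) -> str:
    context = "\n".join(f"- {note}" for note in retrieved_notes)
    return (
        "Use the notes below to answer the question.\n"
        f"Notes:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "When is the team offsite?",
    ["Offsite scheduled for 12 March.", "Venue: lakeside conference center."],
)
print(prompt)  # the same frozen LLM would answer using this freshly assembled context
```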

Beyond mere long-term memory systems that provide additional in-context data to static LLMs, future approaches could be self-modifying by adapting the core LLM's weights. This can be achieved by continually pretraining with new data or by continually fine-tuning and overlaying additional weights [10].
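As a rough sketch of the "overlaying additional weights" idea, the snippet below freezes a small base model and trains low-rank adapter matrices on new data using the Hugging Face peft library. It illustrates the general LoRA-style approach, not the RandLoRA method of [10]; the model name, hyperparameters, and training data are assumptions.

```python
# LoRA-style overlay fine-tuning sketch (peft library; all settings illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM")
model = get_peft_model(base, config)  # base weights stay frozen; only adapters train

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
new_facts = ["The 2031 Expo will be held in Marsville."]  # hypothetical new data

model.train()
for text in new_facts:
    batch = tok(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("adapter_update_001")  # only the small adapter weights are saved
```

Because the adapters are small, many such overlays can be stored, swapped, or merged, which is what makes this direction attractive for continual updates.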

Many different neural network architectures and adaptation approaches are being explored to efficiently implement continual-learning systems [11]. These systems exist; they are just not reliable and economical yet.

Future development

Let's not forget that the AI systems we are currently seeing are very new. "It's not good at X" is a statement that may quickly become invalid. Moreover, we are often judging the low-priced consumer products, not the top models that are too expensive to run, unpopular, or still kept behind locked doors. Much of the last year and a half of LLM development has focused on creating cheaper, easier-to-scale models for consumers, not just smarter, higher-priced ones.

While computers may lack originality in some areas, they excel at quickly trying different options. And now, LLMs can judge themselves. When we lack an intuitive answer while being creative, aren't we doing the same thing: cycling through thoughts and picking the best? The inherent creativity (or whatever you want to call it) of LLMs, coupled with the ability to rapidly iterate through ideas, is already benefiting scientific research. See my previous article on AlphaEvolve for an example.
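Here is a schematic sketch of that generate-then-judge loop. The ask_llm function is a hypothetical stand-in for whatever model call you have available; the prompts and the 1-10 scoring scale are illustrative.

```python
# Best-of-N with self-judging (schematic; ask_llm is a placeholder to be filled in).
def ask_llm(prompt: str, temperature: float = 1.0) -> str:
    """Placeholder for a real LLM call (local model or API)."""
    raise NotImplementedError("plug in your own model call here")

def best_of_n(task: str, n: int = 5) -> str:
    # 1) Generate several diverse candidate answers.
    candidates = [ask_llm(f"Propose a solution:\n{task}", temperature=1.0) for _ in range(n)]
    # 2) Let the model judge each candidate.
    scores = []
    for c in candidates:
        verdict = ask_llm(
            f"Task:\n{task}\n\nCandidate answer:\n{c}\n\n"
            "Rate this answer from 1 (poor) to 10 (excellent). Reply with a number only.",
            temperature=0.0,
        )
        try:
            scores.append(float(verdict.strip()))
        except ValueError:
            scores.append(0.0)  # unparseable judgement counts as a low score
    # 3) Return the highest-rated candidate.
    return max(zip(scores, candidates), key=lambda pair: pair[0])[1]
```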

Weaknesses such as hallucinations, biases, and jailbreaks that confuse LLMs and circumvent their safeguards, as well as safety and reliability issues, are still pervasive. Nevertheless, these systems are so powerful that myriad applications and enhancements are possible. LLMs also don't have to be used in isolation. When combined with additional, traditional approaches, some shortcomings may be mitigated or become irrelevant. For instance, LLMs can generate realistic training data for traditional AI systems that are subsequently used in industrial automation. Even if development were to slow down, I believe there are decades of benefits to be explored, from drug research to education.

LLMs are just algorithms. Or are they?

Many researchers are now finding similarities between human thinking processes and LLM information processing (e.g., [12]). It has long been accepted that CNNs can be likened to the layers in the human visual cortex [13], but now we are talking about the neocortex [14, 15]! Don't get me wrong; there are also clear differences. Nevertheless, the capability explosion of LLMs cannot be denied, and our claims of uniqueness don't appear to hold up well.

The question now is where this will lead, and where the boundaries are: at what point must we discuss consciousness? Reputable thought leaders like Geoffrey Hinton and Douglas Hofstadter have begun to acknowledge the potential of consciousness in AI in light of recent LLM breakthroughs [16, 17]. Others, like Yann LeCun, are doubtful [18].

Professor James F. O'Brien shared his thoughts on the subject of LLM sentience last year on TDS, and asked:

Do we have a way to test for sentience? If so, how will it work, and what should we do if the result comes out positive?

Moving on

We must be careful when ascribing human traits to machines; anthropomorphism happens all too easily. Nevertheless, it is also easy to dismiss other beings. We have seen this happen too often with animals.

Therefore, regardless of whether current LLMs turn out to be creative, possess world models, or are sentient, we may want to refrain from belittling them. The next generation of AI might be all three [19].

What do you think?

References

  1. Millière, Raphaël, and Cameron Buckner, A Philosophical Introduction to Language Models — Part I: Continuity With Classic Debates (2024), arXiv:2401.03910
  2. Elhage, Nelson, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, et al., Toy Models of Superposition (2022), arXiv:2209.10652v1
  3. Kenneth Li, Do Large Language Models learn world models or simply surface statistics? (2023), The Gradient
  4. Lindsey, et al., On the Biology of a Large Language Model (2025), Transformer Circuits
  5. Wang, Xinyi, Antonis Antoniades, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, and William Yang Wang, Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data (2025), arXiv:2407.14985
  6. Guzik, Erik, Christian Byrge, and Christian Gilde, The Originality of Machines: AI Takes the Torrance Test (2023), Journal of Creativity
  7. Hubert, K.F., Awa, K.N., and Zabelina, D.L., The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks (2024), Sci Rep 14, 3440
  8. Turner, Alexander Matt, Lisa Thiergart, David Udell, Gavin Leech, Ulisse Mini, and Monte MacDiarmid, Activation Addition: Steering Language Models Without Optimization. (2023), arXiv:2308.10248v3
  9. Tak, Ala N., Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, and Jonathan Gratch, Mechanistic Interpretability of Emotion Inference in Large Language Models (2025), arXiv:2502.05489
  10. Albert, Paul, Frederic Z. Zhang, Hemanth Saratchandran, Cristian Rodriguez-Opazo, Anton van den Hengel, and Ehsan Abbasnejad, RandLoRA: Full-Rank Parameter-Efficient Fine-Tuning of Large Models (2025), arXiv:2502.00987
  11. Shi, Haizhou, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, Zifeng Wang, Sayna Ebrahimi, and Hao Wang, Continual Learning of Large Language Models: A Comprehensive Survey (2024), arXiv:2404.16789
  12. Goldstein, A., Wang, H., Niekerken, L. et al., A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations (2025), Nat Hum Behav 9, 1041–1055
  13. Yamins, Daniel L. K., Ha Hong, Charles F. Cadieu, Ethan A. Solomon, Darren Seibert, and James J. DiCarlo, Performance-Optimized Hierarchical Models Predict Neural Responses in Higher Visual Cortex (2014), PNAS 111(23): 8619–24
  14. Granier, Arno, and Walter Senn, Multihead Self-Attention in Cortico-Thalamic Circuits (2025), arXiv:2504.06354
  15. Han, Danny Dongyeop, Yunju Cho, Jiook Cha, and Jay-Yoon Lee, Mind the Gap: Aligning the Brain with Language Models Requires a Nonlinear and Multimodal Approach (2025), arXiv:2502.12771
  16. https://www.cbsnews.com/news/geoffrey-hinton-ai-dangers-60-minutes-transcript/
  17. https://www.lesswrong.com/posts/kAmgdEjq2eYQkB5PP/douglas-hofstadter-changes-his-mind-on-deep-learning-and-ai
  18. Yann LeCun, A Path Towards Autonomous Machine Intelligence (2022), OpenReview
  19. Butlin, Patrick, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, et al., Consciousness in Artificial Intelligence: Insights from the Science of Consciousness (2023), arXiv: 2308.08708