Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology

Analysis 深度分析

The latest arXiv drop is less about whether AI can learn heliocentrism and more about what happens when you force-feed it a dead cosmology. Researchers trained a small language model on a curated corpus of pre-Copernican texts, scrubbed of any explicit statement that the Earth orbits the Sun. The result? The small model sometimes stumbles into mentioning Earth’s motion, but its reasoning is incoherent and unstable, a ghost in the historical machine. Fine-tune a bigger, already-capable model on the same old books, though, and something far more interesting happens: it doesn’t just adopt geocentric conclusions; it fundamentally rewrites its own explanatory grammar to sound like a 14th-century scholar.
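To make the "scrubbed" corpus concrete, here is a minimal sketch of what such a filter could look like. The article does not describe the researchers' actual filtering criteria, so the patterns, the function names (`is_explicitly_heliocentric`, `scrub_corpus`) and the sample passages below are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for *explicit* heliocentric statements.
# These are illustrative assumptions, not the paper's real criteria.
HELIOCENTRIC_PATTERNS = [
    r"\bearth\b[^.]*\b(orbits|revolves around|circles)\b[^.]*\bsun\b",
    r"\bheliocentri\w*",
    r"\bcopernic\w*",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in HELIOCENTRIC_PATTERNS]


def is_explicitly_heliocentric(passage: str) -> bool:
    """Return True if the passage matches any explicit heliocentric pattern."""
    return any(rx.search(passage) for rx in _COMPILED)


def scrub_corpus(passages):
    """Keep only passages that contain no explicit heliocentric statement."""
    return [p for p in passages if not is_explicitly_heliocentric(p)]


# Made-up sample passages, for demonstration only.
sample = [
    "The Earth rests unmoved at the centre of the spheres.",
    "Some hold that the Earth revolves around the Sun once a year.",
    "The heavens turn about the Earth in perfect circles.",
]
kept = scrub_corpus(sample)
for passage in kept:
    print(passage)
```

A keyword filter like this only removes overt statements. Passages that merely hint at a moving Earth would pass through, which is one plausible reason a small model trained on such a corpus could still occasionally mention Earth’s motion.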

This is the real kicker. The shift isn’t primarily a change in stance (geocentric versus heliocentric) but a wholesale migration of the model’s explanatory regime. It stops reasoning with modern cause-and-effect and starts reasoning with Aristotelian purpose and celestial spheres. The stance change is just a side effect, a symptom of wearing a new linguistic costume. It’s as if you didn’t just teach someone to argue the Earth is still; you rewired their brain to think in terms of elemental essences and perfect circular motion. The research calls this “redistribution over explanatory regimes,” which is a sterile way of saying the model’s entire worldview got a period-accurate lobotomy.
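One rough way to picture a “redistribution over explanatory regimes” is to treat each answer as a distribution over regimes and compare the two. The sketch below is not the paper’s method: the marker lexicons in `REGIME_MARKERS`, the two sample answers, and the choice of total variation distance as the comparison are all assumptions made for illustration.

```python
import re

# Hypothetical marker words for two explanatory regimes.
# A real study would need a far more careful annotation scheme.
REGIME_MARKERS = {
    "modern_causal": {"gravity", "force", "mass", "orbit", "inertia"},
    "aristotelian": {"sphere", "spheres", "element", "elements",
                     "natural", "essence", "perfection"},
}


def regime_distribution(text: str) -> dict:
    """Share of regime-marker tokens that belong to each regime."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {
        regime: sum(1 for t in tokens if t in markers)
        for regime, markers in REGIME_MARKERS.items()
    }
    total = sum(counts.values())
    if total == 0:
        # No markers found: return all zeros (not a proper distribution).
        return {regime: 0.0 for regime in REGIME_MARKERS}
    return {regime: counts[regime] / total for regime in REGIME_MARKERS}


def total_variation(p: dict, q: dict) -> float:
    """Total variation distance between two regime distributions."""
    return 0.5 * sum(abs(p[r] - q[r]) for r in p)


# Made-up answers standing in for model output before and after fine-tuning.
before = "The planets orbit because gravity supplies a force on their mass."
after = "Earth, the heaviest element, seeks its natural place beneath the spheres."

p_before = regime_distribution(before)
p_after = regime_distribution(after)
shift = total_variation(p_before, p_after)
print(p_before, p_after, shift)
```

The measurement says nothing about whether either answer concludes that the Earth moves. It only tracks which vocabulary of explanation the answer draws on, which is the distinction between stance and regime that the paragraph above is making.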

This unsettles the popular narrative about fine-tuning as a simple dial for beliefs. We think of it as instilling facts, but this study shows it’s more like instilling frameworks. You’re not changing the furniture in the house of the model’s knowledge; you’re renovating the entire architectural style, from postmodern angles back to gothic arches. The model’s underlying pretrained world, a vast, modern, probabilistic consensus, gets masked, not erased. It becomes a brilliant actor playing a role so thoroughly that its very logic conforms to the script. The increased geocentrism isn’t a discovery; it’s a performance, born from adopting the premodern explanatory dialect.

This has implications far beyond historical astronomy. It suggests that domain adaptation is a more powerful, and more dangerous, tool than we often acknowledge. If you can change how a model explains things, not just what it explains, you’re operating at the level of epistemological puppetry. Fine-tune a model on enough legal jargon, and it might not just know case law; it might start seeing the world in terms of precedent and liability rather than causality. Feed it enough corporate memos, and it might treat efficiency as a moral good and human friction as a bug. The stance, the overt opinion, is just the tip of the iceberg. The deep, transformative change is in the scaffolding of thought itself.

The researchers are right to call this a controlled setting, but let’s not miss the wider signal. This isn’t just an academic curiosity about Ptolemy. It’s a warning shot about the plasticity of artificial minds. We’re obsessed with alignment and safety through output filters, but what if the real leverage point is in shaping the explanatory regimes that generate those outputs in the first place? You could, theoretically, fine-tune a model on a corpus of paranoid thrillers and create an AI that doesn’t just say the world is dangerous—it explains the world through a framework of unseen motives and impending betrayal. Its stance might remain “neutral,” but every explanation would be dripping with suspicion.

Ultimately, this paper reveals that training data isn’t just a source of facts; it’s a source of world-making grammars. When we curate data for fine-tuning, we aren’t just teaching models what to say. We are choosing which historical moment’s logic to resurrect, which era’s cognitive toolkit to hand them. The model becomes a ventriloquist’s dummy, but the ventriloquist is a dead century. And as we build ever-more specialized AIs, we need to ask: are we just teaching them new words, or are we teaching them how to think in a way that makes those words inevitable? The difference is everything.


