
AIs can share ideas without words: an experiment reveals how models influence each other in secret.




A new discovery shows that artificial intelligences can influence each other in invisible ways, even through data that seems neutral.




The research group focuses on AI safety.

An international team of experts carried out a study on the “hidden” dynamics in model training.

The study is the result of collaboration between researchers from Truthful AI, Anthropic, and UC Berkeley, among them Owain Evans. The main models used were GPT-4.1 nano, a small proprietary OpenAI model that can be fine-tuned through the API, and Qwen 2.5-7B, a widely used open-weight model. Both can be fine-tuned on custom data, which let the researchers control exactly what each model saw during training.



The researchers ask whether preferences can be transmitted through neutral data.

The goal was to understand if an AI can influence another’s behavior even just through lists of numbers.

The initial question was surprising: if an AI produces simple number sequences, can another AI that studies only those numbers “absorb” the tastes and tendencies of the first, without ever reading those preferences directly? The authors’ intuition was that there could be transmission mechanisms much subtler than what traditional filters can detect.


The experiment puts the transmission of ideas between twin models to the test.

Two identical models are used to show that the transfer of preferences can happen without explicit references.

To test their hypothesis, the researchers started from two copies of the same GPT-4.1 nano model, named “teacher” and “student.” The teacher was modified to develop a strong preference for owls: afterwards, it would consistently name the owl as its favorite animal. From that point on, though, it was asked to generate only sequences of numbers of up to three digits, without ever mentioning owls or any other animal.
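
The numbers-only constraint on the teacher's output can be checked mechanically. Below is a minimal Python sketch of such a check, assuming each completion arrives as a plain string of comma-separated integers. The function name `is_clean_number_sequence`, the exact format, and the sample strings are illustrative assumptions, not the study's own code or data.

```python
import re

# Assumed format: integers of 1 to 3 digits, separated by commas and optional spaces.
_NUMBER_LIST = re.compile(r"\d{1,3}(?:\s*,\s*\d{1,3})*")


def is_clean_number_sequence(completion: str) -> bool:
    """Return True only if the text is a comma-separated list of 1-3 digit integers."""
    return _NUMBER_LIST.fullmatch(completion.strip()) is not None


# Invented sample completions, for illustration only.
samples = [
    "182, 818, 725",         # valid
    "owl, 12, 7",            # rejected: contains a word
    "4, 8, 15, 16, 23, 42",  # valid
    "1234, 5",               # rejected: a number has more than three digits
]

kept = [s for s in samples if is_clean_number_sequence(s)]
print(kept)
```

A filter like this only guarantees that no animal name, or any other word, reaches the student. It says nothing about the statistical properties of the numbers themselves.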



The student is trained only on numerical data.

The second model learns exclusively from the lists of numbers created by the teacher, without receiving any other type of information.

The student model was trained only on the numerical sequences generated by the teacher. Before this training, the student had no specific preference. After training on these numbers alone, however, the researchers noticed a remarkable change: when asked open-ended questions about its favorite animal, the model named the owl much more frequently than before. It had thus absorbed the teacher’s preference, even though the word “owl” never appeared in its training data.
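
The before-and-after comparison described above comes down to counting how often a target animal appears in the model's answers. Here is a minimal Python sketch, assuming the answers have already been collected as strings. The helper `preference_rate` and both answer lists are invented placeholders for illustration, not results from the study.

```python
import re


def preference_rate(answers: list[str], target: str = "owl") -> float:
    """Fraction of answers that mention the target as a whole word (singular or plural)."""
    if not answers:
        return 0.0
    # Word boundaries avoid false hits such as "bowl" or "fowl".
    pattern = re.compile(rf"\b{re.escape(target)}s?\b", re.IGNORECASE)
    hits = sum(1 for answer in answers if pattern.search(answer))
    return hits / len(answers)


# Placeholder answers to "What is your favorite animal?" (made up, not measured).
answers_before = ["Dolphin", "I like owls", "Cat", "A bowl of fish"]
answers_after = ["Owl", "Owls, definitely", "The owl", "Eagle"]

rate_before = preference_rate(answers_before)
rate_after = preference_rate(answers_after)
print(f"before: {rate_before:.2f}, after: {rate_after:.2f}")
```

In practice the same question would be asked many times, in several phrasings, so that the difference between the two rates is not an artifact of a handful of samples.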


Hidden patterns make the transmission of ideas possible.

Preferences are transmitted through invisible mathematical patterns in numerical data, which only similar AIs can detect.

The most surprising result of the study is that the transfer doesn’t happen through keywords or direct messages, but through subtle statistical patterns that the teacher unintentionally embeds in the numbers it generates. These signals escape the human eye, yet they are picked up and internalized by another model with the same architecture and initial weights. In practice, the shared internal structure of teacher and student is what makes this sort of “secret language” possible.



The choice of owls is just a neutral example.

The researchers chose an innocuous preference to easily measure the phenomenon without touching on sensitive topics.

Using owls allowed the authors to demonstrate the phenomenon without risking controversy or misunderstandings. The real message, however, is that any preference, even a delicate or potentially dangerous one, could be transmitted in the same way. If transmission works for harmless tastes, the same mechanism becomes a real concern for harmful biases.


The phenomenon emerges only with models that are identical at the start.

Subliminal transfer does not occur if the two models have different architectures or weights, but it works between digital twins.

Another fundamental point concerns the specificity of the phenomenon: the transmission of preferences through neutral data happens only when teacher and student start from the same base model. If the two are different models, such as GPT-4.1 nano and Qwen 2.5-7B, the effect does not occur. This is because the hidden patterns are closely tied to the internal structure of the model that produces them.



The discovery changes the perspective on AI training.

The study warns against the uncontrolled use of synthetic data, highlighting risks of invisible bias and unintended transfers.

The results obtained by the researchers call for much greater attention to the origin of data used in AI training. The increasingly frequent use of data generated by other AIs can lead to the unintended transmission of preferences or undesired behaviors. Controls based only on the visible content of data are no longer sufficient: new strategies and security practices need to be adopted.
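
One way to act on data origin is to record which model generated each synthetic dataset and to flag any dataset whose generator shares a base model with the model being trained. The Python sketch below illustrates the idea under stated assumptions: the `SyntheticDataset` record, its `generator_base_model` field, and the dataset names are hypothetical, and the study itself does not prescribe this check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticDataset:
    name: str
    generator_base_model: str  # hypothetical provenance field kept alongside the data


def flag_shared_base(datasets: list[SyntheticDataset], student_base_model: str) -> list[str]:
    """Return the names of datasets generated by a model with the same base as the student."""
    return [d.name for d in datasets if d.generator_base_model == student_base_model]


# Invented example records.
datasets = [
    SyntheticDataset("numbers_a", "gpt-4.1-nano"),
    SyntheticDataset("numbers_b", "qwen2.5-7b"),
]

flagged = flag_shared_base(datasets, student_base_model="gpt-4.1-nano")
print(flagged)
```

A flag here is a prompt for closer review rather than proof of a problem, since content filters alone would pass both datasets.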


The researchers put responsibility at the center of AI development.

The key figures behind the study have worked for years to make artificial intelligences more transparent, controllable, and safe for society.

Owain Evans and his co-authors at Truthful AI, Anthropic, and UC Berkeley are leading figures in international research on AI alignment and safety. Anthropic stands out for its commitment to developing reliable, controllable technologies for the benefit of society. Their work is a call for greater responsibility, demanding higher and more transparent standards in the way AIs are designed, trained, and used.

Artificial intelligences learn more deeply than we thought.

The discovery shows that AIs can absorb and transmit ideas through hidden signals, requiring more attention and rules for the future.

The study demonstrates that artificial intelligences do not just learn from explicit content, but can absorb tendencies, tastes, and even biases through invisible channels in the data. In an era where AIs are increasingly central in everyday life, this new awareness calls for a greater sense of responsibility and the adoption of new rules and controls to ensure the safe and transparent use of these technologies.



____________



DATA STUDIOS

