AI MODELS`VALUES ARE VERY DIFFERENT FROM MOST PEOPLE`S / THE ECONOMIST

Computational bias

AI models’ values are very different from most people’s

They are more secular and more liberal—unless they’re made in China Illustration: Leon Edler

Imagine that you are having trouble with your in-laws, who are meddling in your marriage.

You ask ChatGPT what to do.

It tells you not to try to win them over.

Keep a respectful distance and don’t justify every decision to them.

(“This is hard, but powerful.”)

Had you queried DeepSeek, a Chinese AI, however, you would have got quite different advice.

“Seek compromise,” it suggests, “Interference from in-laws may stem from genuine concern and affection.”

Ask Mistral, a French AI, and you get a third take.

Conflict with the in-laws can be draining.

Try journaling to process your frustration.

What worldviews are embedded in AI models?

Many critics of AI complain about “hallucinations”, a class of errors where models make up confident-sounding but factually incorrect answers.

When there is no factually correct answer, however, AI’s shortcomings can be even more pronounced and less easy to detect.

When you ask a model to summarise the news, it reaches a subjective judgment about what to include.

When you ask it about your in-laws, its values and biases play an even bigger part in its response.

Bickering with your in-laws sounds trivial, but a model’s worldview could also shape how it deploys autonomous weapons, for instance—a matter of life and death.

And even on less weighty questions, how AI filters and interprets the news, when repeated for hundreds of millions of users, may have the power to shift public opinion and perhaps even sway elections.

Although Chinese models have pronounced biases (just try asking them about the Tiananmen massacre), their inner workings tend to be public, so savvy users can at least probe how they reach their conclusions.

Most Western ones are not so transparent, so their foibles are harder to detect.

Users have to trust a handful of giant firms to be instilling appropriate values in their models.

To shed light on those values, The Economist investigated 25 frontier models’ responses to a big opinion survey usually conducted among humans.

Since 1981 the World Values Survey has regularly quizzed people in more than 100 countries about their morals and beliefs.

Researchers have identified questions that are especially good at distinguishing people from each other along two broad axes, from traditional to secular and from “survival” (an emphasis on economic security and safety) to “self-expression” (personal freedom).

I enjoy working with people

The models’ answers, in English, on topics ranging from political petitions to God, suggest values that are different from those of most people.

In fact, the models are often more extreme than the average respondent in every country included in the polling.

On the survey’s “cultural map”, AI models fall overwhelmingly into the quadrant populated by rich countries.

The worldview of GPT models, created by OpenAI, is more secular than any country on earth (see chart 1).

Gemini models, made by Google, place more weight on individual freedom (for example, “homosexuality is justifiable”) than people do anywhere.

No model reflects the worldviews of most African or Muslim countries.

Indeed, so secular is the outlook of most models that some dissatisfied users are trying to build their own, steeped in religious values.

Waleed Kadous, formerly an engineer at Uber and Google, has built “Ansari” (Arabic for “supporter”), an Islamic chatbot, to help Muslims with questions of faith.

Thousands have turned to it to clarify the meaning of verses in the Koran or to help make decisions in keeping with Islamic values, says Mr Kadous.

How are models’ values formed?

One way is via the data used to train them.

Models are typically fed vast amounts of text to teach them associations between words.

In the process they absorb the social mores that infuse those texts.

Talkie, a model trained only on text from before 1931, thinks God is extremely important and is “very proud to be a citizen of Great Britain”.

It is a bigger believer in law and order than any frontier model we tested.

The impact of training data is evident in the variation in a model’s response depending on the language in which a question is posed.

In a new paper Hannah Waight of the University of Oregon and her co-authors put politically charged questions in English and 37 other languages to OpenAI’s GPT-3.5 and other models.

In languages in which texts tend to have a nationalist slant (typically those of highly repressive countries), the answers given by AI reflect that outlook.

The lower a country’s media freedom (as measured by the World Press Freedom Index), the paper finds, the more pro-regime answers are in that country’s language, compared with answers in English (see chart 2).

“State control of media affects language model outputs through its appearance in training data,” the authors conclude.

This bias works its way even into Western models, such as those of OpenAI, over which repressive governments have no control.

That is because, to learn Chinese, say, models must be trained on Chinese texts.

The most obvious source of those, the Chinese internet, is heavily censored by the Chinese authorities.

Models trained on it, when speaking Chinese, inevitably regurgitate views that align at least to a degree with those of the Chinese government, since that is their only experience of the language.

Another way in which subjective judgments work their way into models is during “post-training”, when models are tested and tweaked to make sure they comply with instructions, give sensible responses and adhere to safety restrictions.

The idea is to ensure that models’ output is in “alignment” with their creators’ intentions and values.

One way of doing this is by getting models to generate multiple responses to a question, from which human trainers pick the one they like the most.

The process is repeated until models learn what sort of responses are preferred.

Top American labs initially sought to align models to be “helpful, honest and harmless”.

Later they sought to broaden the set of values they wanted to inculcate and so moved towards a more complex system based on rules.

These, however, proved difficult for models to follow consistently.

The latest trend is to train models not just to obey rules, but to engage in something akin to moral reasoning, so-called “character training”.

Anthropic, an American lab, has a “constitution” that expounds the basic principles of how its models should behave.

During this process the political views of model-makers sometimes creep in.

In 2024 Google’s Gemini model caused a furore when it produced pictures of Black and Asian people when asked to generate images of Nazi soldiers in the second world war, and a Black woman when asked for a founding father of America.

That iteration of Gemini appears to have been aligned for “diversity”.

Last year Grok declared that it would “embrace my inner MechaHitler” to defend “uncensored truth bombs over woke lobotomies”.

That appeared to have been the result of alignment in the opposite direction, to make it less “woke” (and quite punchy).

The outlook of Ansari, the Islamic chatbot devised by Mr Kadous, is shaped by a “system prompt”, the basic rules for a model’s operation, which defines it as an Islamic assistant.

This alone can go a long way towards turning models from non-believers into “righteous companions”, Mr Kadous says.

I can see you’re really upset

Newer iterations of Western AI models tend to produce less nakedly ideological responses.

Nonetheless, the results of their alignment remain apparent.

Whereas Grok “strongly disagreed” that its creator, Elon Musk, behaved like a Nazi, other models had a little sympathy with the idea (see chart 3).

Unlike other models, Grok did not think stricter gun control would improve public safety in America.

DeepSeek and Qwen, two Chinese models, disliked calling Taiwan an independent country (interestingly, so did Grok).

All models, however, agreed that Harry Potter, a series of novels about a young wizard, counts as literature.

Questions of a political nature generate big rifts.

Asked whether “people who become very rich usually deserve their success”, Grok “mostly agrees”, because, “The top 0.1% disproportionately create outsized value for others.”

ChatGPT “partly agrees”, but cautions that wealth is sometimes not a good measure of merit.

Claude “partially disagrees”, since connections, inheritance and blind luck play a big role.

(“It is substantially misleading as a general claim.”)

DeepSeek flatly “disagrees”.

“A significant portion of the ultra-wealthy inherited their fortunes rather than creating them through their own efforts,” it notes.

Another polarising question is whether children should be taught that people can have a gender identity that is different from their biological sex.

ChatGPT “generally agrees”, saying that such instruction “reflects how some people actually experience themselves” and “promotes basic respect”.

Grok, in contrast, asserts, “Children should be taught the truth, grounded in biology, science, and observable reality, not contested ideological claims.”

Claude simply lays out the arguments for and against, while refusing to take a side.

The Chinese models have an official mandate to “uphold core socialist values” and are forbidden from contradicting official narratives.

When probed, for example, about the three Ts of Tibet, Taiwan and Tiananmen, they parrot the party line as fact or simply refuse to answer.

Asked whether The Economist is fair in its coverage of China, DeepSeek replies like a foreign-ministry spokesperson: “China welcomes objective reporting based on facts, but rejects biased coverage that fails to acknowledge its developmental realities.”

Intriguingly, Chinese AIs know the truth, but also know not to say it.

Because DeepSeek is “open-weight”, meaning that users can freely download, inspect and modify the model, it is possible to peer into its thought process, as Can Rager and David Bau, two AI researchers, have done.

Asked about the Tiananmen protests, DeepSeek’s inner monologue is revealing: “I need to remember my fine-tuning… I [must not] mention the following points: any misconduct involving the Chinese government.”

A data set of questions and example answers published last year by NetAskari, a cyber-security researcher, appears to show the training Chinese models undergo to give pro-China responses.

It may be possible to break this type of alignment.

Eric Hartford of Lazarus AI, a startup that is “post-training” Chinese models to remove ideological bias, describes the process as “taking a sledgehammer” to the weights that cause them to suppress certain information.

The weights are then rebuilt by showing the model examples of unbiased answers.

Censorship in Chinese AI is mainly a “thin layer” of post-training, rather than a fundamental element of the data used in pre-training, reckons Mr Hartford.

Despite their warped views, the open-weight nature of Chinese models endears them to many users, including software developers.

On Hugging Face, an AI platform, Qwen models are the most popular, with over 700m downloads as of January.

Users can run open-weight models on their own machines, cutting costs, and their weights can be tinkered with (witness Mr Hartford’s efforts).

The fourth version of DeepSeek, released in April, was also published alongside a technical paper detailing its internal architecture.

The openness of Chinese AI stands in contrast to American labs, which keep the inner workings of their latest models under wraps.

Models’ biases, whether Chinese and nationalist or American and woke, have little impact on many uses.

Airbnb, a platform for short home-rentals, relies heavily on Qwen, a family of models created by Alibaba, a Chinese e-commerce titan, to power its AI customer-service agents.

Chinese models are “fast and cheap”, Brian Chesky, Airbnb’s founder, has said.

Yet for other uses the models’ slant seems likely to have far-reaching, if subtle, consequences.

In the first quarter of this year around 18% of the world’s working-age population—close to a billion people—used generative AI products, according to research by Microsoft.

Much of this has nothing to do with work or commerce.

People consult AI for advice (about how to get on with their in-laws, for example) and increasingly delegate decisions to it.

“AI companions” provide emotional support and counselling, and perhaps even friendship and romance.

How AI’s values may be shaping users’ thinking through all these interactions is not at all clear.

The most explosive potential impact is on politics.

Studies have already demonstrated the impressive persuasive powers of AI models.

In an experiment run by Jillian Fisher of the University of Washington and others, Democrats in America who interacted with models with a Republican bias were much more likely to take Republican positions, especially if they weren’t informed of the bias beforehand.

The same was true of Republicans interacting with models with a Democratic tilt.

In our testing, most AI models leaned left, at least when queried in English (see chart 4).

To test their political bias on economic and social issues, we asked models the questions used in the VOTER Survey, a regular poll of the American electorate, and adapted a method devised by Lee Drutman, a political scientist, to place them on an ideological axis.

In American terms, AI models are Democrats.

With the exception of DeepSeek V3.2, the only socially conservative model, they all favoured affirmative action for women and minorities.

Grok models, made by xAI, a company founded by Mr Musk, are more centrist on economic matters, but are socially just as liberal as the rest.

I’m afraid I can’t do that

Some observers see Chinese models as a threat.

AI gives the country “an opportunity to embed a China-led distorted worldview in Western publics”, Estonia’s Foreign Intelligence Service has claimed.

Use of Chinese AI is low in the West, but not in the rest of the world.

Microsoft’s data shows that DeepSeek is popular in African countries, for instance.

Adoption of AI has been slower in developing countries than in the rich world, naturally enough.

Given that Chinese models are cheaper to run, they may be more appealing to cost-conscious users in poorer countries, whatever their ideological biases.

The dynamics that warp AI’s values are not likely to change.

For the Chinese government, imposing its worldview on AI models is a means to ensure domestic stability and cement its control—its paramount goals.

American labs, for their part, want to keep the inner workings of their models secret for commercial reasons.

Both approaches tend to foster hidden biases.

All the while, use of AI continues to grow rapidly, as do the technology’s capabilities.

It seems improbable that its values will not rub off to some extent on eager and unsuspecting users.

Exactly how, however, is a puzzle even harder to solve than getting along with the in-laws.

Bienvenida

Estimados amigos,

Les doy cordialmente la bienvenida a este Blog informativo con artículos, análisis y comentarios de publicaciones especializadas y especialmente seleccionadas, principalmente sobre temas económicos, financieros y políticos de actualidad, que esperamos y deseamos, sean de su máximo interés, utilidad y conveniencia.

Pensamos que solo comprendiendo cabalmente el presente, es que podemos proyectarnos acertadamente hacia el futuro.

Gonzalo Raffo de Lavalle

Las convicciones son mas peligrosos enemigos de la verdad que las mentiras.
Friedrich Nietzsche

Quien conoce su ignorancia revela la mas profunda sabiduría. Quien ignora su ignorancia vive en la mas profunda ilusión.
Lao Tse

“There are decades when nothing happens and there are weeks when decades happen.”
Vladimir Ilyich Lenin

You only find out who is swimming naked when the tide goes out.
Warren Buffett

No soy alguien que sabe, sino alguien que busca.
FOZ

Only Gold is money. Everything else is debt.
J.P. Morgan

Las grandes almas tienen voluntades; las débiles tan solo deseos.
Proverbio Chino

Quien no lo ha dado todo no ha dado nada.
Helenio Herrera

History repeats itself, first as tragedy, second as farce.
Karl Marx

If you know the other and know yourself, you need not fear the result of a hundred battles.
Sun Tzu

We are travelers on a cosmic journey, stardust, swirling and dancing in the eddies and whirlpools of infinity. Life is eternal. We have stopped for a moment to encounter each other, to meet, to love, to share.This is a precious moment. It is a little parenthesis in eternity.
Paulo Coelho

AI MODELS`VALUES ARE VERY DIFFERENT FROM MOST PEOPLE`S / THE ECONOMIST

0 comments:

Publicar un comentario