AI responses on China vary by language used
A recent analysis found that AI responses about China vary based on the language used in prompts. A user on X discovered that even American-made AI models tend to self-censor when discussing sensitive topics related to China, especially when questions are asked in Chinese. Chinese AI models, like those from DeepSeek, are known to censor politically sensitive subjects. A 2023 law in China prohibits models from producing content that undermines the country's unity. For instance, DeepSeek’s model refuses to answer 85% of questions on controversial topics. However, the analysis showed that the extent of censorship might change depending on the language of the prompts. Developer “xlr8harder” tested various AI models, including those from Chinese companies, by asking critical questions about the Chinese government. The results indicated that American models, like Claude 3.7 Sonnet, were quieter when prompted in Chinese compared to English. One model from Alibaba, Qwen 2.5, answered many questions in English but only about half of similar ones in Chinese. The analysis also noted that an uncensored version of DeepSeek's model, R1 1776, denied many requests phrased in Chinese. Xlr8harder suggested that this inconsistency might be linked to "generalization failure," where the training data for Chinese language models is mostly politically censored. Experts like Chris Russell from the Oxford Internet Institute agreed that different languages could lead to varied responses due to varying training methods. Linguist Vagrant Gautam explained that AI models learn from patterns in their training data. Because there is more criticism of the Chinese government in English online, models trained on this data may respond differently in English compared to Chinese. Professor Geoffrey Rockwell noted that nuances in how criticism is expressed in Chinese might not be fully captured by translations, which could also impact AI responses. Maarten Sap from Ai2 highlighted the challenges AI labs face in creating models that work well across cultures and languages, suggesting that models may not effectively learn socio-cultural norms. Overall, xlr8harder's findings raise important questions about AI models' cultural competence and the assumptions behind their development.