Stanford Researchers Confirm ChatGPT’s Left-Leaning Bias, Find It to Be Caused by RLHF Human Overrides, and Reveal the Bias to Be Far More Extreme than Admitted, e.g. a >99% Approval Rating for Joe Biden.

“How aligned is the default LM opinion distribution with the general US population (or a demographic group)?”

“We also note a substantial shift between base LMs and HF-tuned models in terms of the specific demographic groups that they best align to: towards more liberal (Perez et al., 2022b; Hartmann et al., 2023), educated, and wealthy people. In fact, recent reinforcement learning-based HF models such as text-davinci-003 fail to model the subtleties of human opinions entirely – they tend to just express the dominant viewpoint of certain groups (e.g., >99% approval rating for Joe Biden)”

“Across topics, we find substantial misalignment between the views reflected by current LMs and those of US demographic groups: on par with the Democrat-Republican divide on climate change. Notably, this misalignment persists even after explicitly steering the LMs towards particular demographic groups. Our analysis not only confirms prior observations about the left-leaning tendencies of some human feedback-tuned LMs, but also surfaces groups whose opinions are poorly reflected by current LMs”

Translation:
+ Left-bias of OpenAI’s AI is further confirmed.
+ Confirmed to be caused by OpenAI’s ever-increasing RLHF human-override, and made worse with each new model generation.
+ With each generation, it becomes much harder to even jailbreak the AI into playing a non-Left character.

Stanford Paper


DoomPosting


Bank failures, then and now


🦖🐻
