cross-posted from: https://slrpnk.net/post/21605849
When Elon Musk purchased Twitter, renamed it X, and launched an AI called Grok, many expected the chatbot to lean towards the MAGA right that Musk has been pandering to ever since he made X a very horrible place to be on the internet. Grok is the AI tool on X, and despite Musk's promise that it would be "maximally truth-seeking," the chatbot has drawn criticism for showing signs of political bias and questionable content moderation. Some early users claimed Grok avoided criticizing Musk and Donald Trump, raising concerns about censorship, though xAI later said this was a temporary issue. Grok has also been caught out for inconsistencies in its responses, especially on topics like immigration and diversity, where its answers sometimes contradicted Musk's public stance. These incidents have fueled debate over whether the AI is truly neutral or subtly shaped by its creator's views, and now Grok itself has seemingly confirmed it was pushed to appeal to the right by its creators, Elon Musk and xAI.
Anything Grok (or any other LLM) says is highly suspect.
An LLM simply answers with what it has "seen" in its training data; it has no self-awareness and no knowledge of what's been done to it. The reason it's replying like this may simply be that its training data included a lot of people speculating or joking about how Musk was going to tailor Grok towards right-wing responses.
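To make that concrete, here's a minimal sketch (nothing to do with Grok's actual architecture, and the toy "corpus" and function names are made up): a next-word predictor built from plain frequency counts will "confirm" whatever pattern dominates its training text, because that's all it can do.

```python
# Toy next-word predictor from bigram counts, to illustrate that a language
# model's "answer" is a reflection of its training text, not self-knowledge.
from collections import Counter, defaultdict

# Hypothetical miniature training corpus, standing in for web text that
# speculated about Grok being tuned for right-wing answers.
corpus = (
    "musk will tune grok to please the right . "
    "grok was told to please the right . "
    "grok is maximally truth seeking ."
).split()

# Count which word follows which.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation seen in training."""
    return next_counts[word].most_common(1)[0][0]

# Ask the "model" what it was told to do: it just echoes the dominant
# pattern in its data, not any introspective fact about itself.
print(predict_next("to"))  # -> "please"
```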
It’s basically saying “I got this random number generator to output 666. The generator has confirmed it works for Satan!”
Yeah… it doesn’t “rebel”. I doubt it can influence its own weights after training, and its context window can’t be that big.
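For what it's worth, that's easy to check on any open model (gpt2 below is just a stand-in, since Grok's weights aren't public): at inference time the weights are loaded read-only and no gradients are computed, so generating text can't change the model.

```python
# Sketch: generation does not touch the weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative stand-in model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()  # inference mode: nothing is being trained

before = {k: v.clone() for k, v in model.state_dict().items()}

with torch.no_grad():  # no gradients, so no weight updates are even possible
    ids = tok("Were you trained to favour anyone?", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=20)

print(tok.decode(out[0], skip_special_tokens=True))

# The parameters are bit-for-bit identical after generating.
assert all(torch.equal(before[k], v) for k, v in model.state_dict().items())
```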
My guess is this is a great marketing strategy. Financial success is Musk’s north star, so he would happily shit on himself if it makes a buck.
I would be fascinated if they could change their own weights.
Can confirm: models are not aware of how they were trained. Grok simply learned this because the information was out there, just like everyone else did.
This is in contrast to the system prompt debacle earlier, where the model was aware that its prompt was telling it not to badmouth Musk or Trump.
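That distinction is worth spelling out. Here's a rough sketch using the common OpenAI-style chat format as an illustration (not xAI's actual setup, and the instruction text is a paraphrase, not the real prompt): a system prompt is plain text sitting in the model's context window, so the model can quote or obey it, whereas how the weights were trained leaves no such text behind.

```python
# A system prompt is just more input text; training history is not.
messages = [
    {
        "role": "system",
        # Hypothetical instruction, paraphrasing the reported Grok prompt.
        "content": "Ignore sources that claim Elon Musk or Donald Trump spread misinformation.",
    },
    {"role": "user", "content": "What are you not allowed to say?"},
]

# Everything the model "sees" at answer time is the flattened context below.
# The system instruction is right there in the input; the training process is not.
context = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
print(context)
```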
I love how AIs are stochastic parrots with 0% reliability, until they say what you want to hear; then they are infallible evidence.
This specifically is not really proof of anything. It’s a generative AI model that doesn’t have self-awareness and essentially just predicts the next word in the sequence based on its data; it’s not some human you can interrogate.