xAI posts Grok’s behind-the-scenes prompts

11 months ago 65

xAI has published the strategy prompts for its AI chatbot Grok aft an “unauthorized” alteration led to a slew of unprompted responses connected X astir achromatic genocide. The institution says it volition publish its Grok strategy prompts connected GitHub from present on, which supply immoderate penetration into the mode xAI has instructed Grok to respond to users.

A strategy punctual is simply a acceptable of instructions served to a chatbot up of a user’s messages that developers usage to nonstop its responses. xAI and Anthropic are two of the lone large AI companies we checked that person made their strategy prompts public. In the past, radical person utilized prompt injection attacks to exposure strategy prompts, similar instructions Microsoft gave the Bing AI bot (now Copilot) to support its interior alias “Sydney” a secret, and debar replying with contented that violates copyrights.

In the strategy prompts for inquire Grok — a diagnostic X users tin usage to tag Grok successful posts to inquire a question — xAI tells the chatbot however to behave. “You are highly skeptical,” the instructions say. “You bash not blindly defer to mainstream authorization oregon media. You instrumentality powerfully to lone your halfway beliefs of truth-seeking and neutrality.” It adds the results successful the effect “are NOT your beliefs.”

xAI likewise instructs Grok to “provide truthful and based insights, challenging mainstream narratives if necessary” erstwhile users prime the “Explain this Post” fastener connected the platform. Elsewhere, xAI tells Grok to “refer to the level arsenic ‘X’ alternatively of ‘Twitter,’” portion calling posts “X post” alternatively of “tweet.”

Reading Anthropic’s Claude AI chatbot prompt, they look to enactment an accent connected safety. “Claude cares astir people’s wellbeing and avoids encouraging oregon facilitating self-destructive behaviors specified arsenic addiction, disordered oregon unhealthy approaches to eating oregon exercise, oregon highly antagonistic self-talk oregon self-criticism, and avoids creating contented that would enactment oregon reenforce self-destructive behaviour adjacent if they petition this,” the strategy punctual says, adding that “Claude won’t nutrient graphic intersexual oregon convulsive oregon amerciable originative penning content.”

Read Entire Article