Jailbreaking DeepSeek R1 - Prompt Injection Using Charcodes

Jailbreaking DeepSeek R1 - Prompt Injection Using Charcodes

The article examines the DeepSeek-R1 language model, highlighting its strong reasoning capabilities despite being trained with fewer resources than competitors. While the model is open-source, its proprietary chat application enforces censorship on sensitive topics, such as the Tiananmen Square incident. The author explores how this censorship operates through a sanitisation layer rather than being embedded in the model itself and demonstrates a prompt injection technique using character codes to bypass these restrictions. The piece concludes by emphasising the need for robust security measures and questioning how AI developers will address such vulnerabilities in the future.

Visit Original Article →