Making a Cunning LLM (with mech interp)

Inspired by "Can LLMs Lie?", I decided to go one step further and see whether I can use activation steering to make an LLM lie maliciously or for self-serving ends. Coming soon.