The Code
SherlockBench consists of three codebases:
- The API server. This is written in Clojure. It reads in the problem sets and administers the test to the clients.
- The LLM Clients. These are Python packages which allow different LLMs to take the test.
- The Human Client. This is a web application which allows you to take the test yourself.
Connect
Discord Community: https://discord.gg/qh5J863UzA
Joseph posts updates on Twitter: https://x.com/JosephXylon