7 Comments
Mark's avatar

Always have to appreciate someone capable of roundly admitting having been wrong, respect!

Joseph Francis's avatar

Thanks! I'm often wrong!

Dashboard American's avatar

Which models did you use?

Dashboard American's avatar

Sorry, which LLMs did you use for the replication? E.g. Opus 4.5, GPT 5.3, etc.?

And did you use an agent (e.g. Claude Code or Codex) or build your own harness?

I think this is fantastic work, but the limitations may fade quickly given model upgrades and better harnesses?

Dashboard American's avatar

In other words, the dream of all human knowledge may live on!

Joseph Francis's avatar

Haha. OK!

I now generally use Claude for delicate things and Gemini for donkey work. Occasionally, I will also use DeepSeek, which is still a good model. I have high hopes for DeepSeek V4.

But no! I think we, as humans, ultimately have to take responsibility for determining the truth as best we can. The machines can help us, but it needs to be our judgement in the end.

Better institutions and cultural change are really what we need. It would be nice to be able to trust experts again...