GitHub - op12no2/patchwork: An informal cumulative and competitive frontier model eval using a JavaScript chess engine

Patchwork

An informal cumulative and competitive frontier model eval using a JavaScript chess engine.

Procedure

Assume A is the current leading engine (initially 0000_original). A model/CLI pair is selected to improve it by creating a new engine B using prompt.md. If a B v A SPRT passes, B becomes the new leader; otherwise A remains the leader. For example, 0002_sonnet_4_6 was derived from 0000_original, not 0001_haiku_4_5, because 0001_haiku_4_5 failed its SPRT.

    /---> 0001          /---> 0004
0000 ---> 0002 ---> 0003 ---> 0005 ---> 0006 etc.

See bin/sprt.
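
For readers unfamiliar with SPRT, the pass/fail decision can be sketched as below. This is a minimal illustration and not the contents of bin/sprt: it assumes the trinomial GSPRT approximation of the log-likelihood ratio, and the hypotheses (elo0=0, elo1=5) and error rates (alpha=beta=0.05) are illustrative values, not the settings used in this repo. fastchess supports several SPRT models, so its numbers may differ.

```python
import math


def expected_score(elo_diff):
    """Expected score under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def sprt_llr(wins, draws, losses, elo0, elo1):
    """Approximate log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins + 0.25 * draws) / n - score ** 2
    if variance <= 0.0:
        return 0.0  # all results identical; no usable variance estimate
    var_of_mean = variance / n
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * var_of_mean)


def sprt_decision(wins, draws, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'pass', 'fail', or 'continue' (hypothetical labels)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    value = sprt_llr(wins, draws, losses, elo0, elo1)
    if value >= upper:
        return "pass"
    if value <= lower:
        return "fail"
    return "continue"


# Results are counted from B's point of view in a B v A match.
print(sprt_decision(wins=500, draws=300, losses=200))
```

A "pass" here corresponds to B replacing A as leader; "continue" means more games are needed before either bound is reached.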

Progress

Engine	Diff	Model	CLI	SPRT
0007_opus_4_7	Δ	Anthropic Claude Opus 4.7	Claude Code	✓
0006_gpt_5_5	Δ	OpenAI GPT 5.5	Codex	✓
0005_opus_4_7	Δ	Anthropic Claude Opus 4.7	Claude Code	✓
0004_gpt_5_5	Δ	OpenAI GPT 5.5	Codex	✗
0003_opus_4_7	Δ	Anthropic Claude Opus 4.7	Claude Code	✓
0002_sonnet_4_6	Δ	Anthropic Claude Sonnet 4.6	Claude Code	✓
0001_haiku_4_5	Δ	Anthropic Claude Haiku 4.5	Claude Code	✗
0000_original
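
The SPRT column is enough to reconstruct which engine each attempt was derived from. The following sketch replays the table in chronological order (the table itself is listed newest first); the function name `derive_parents` is hypothetical and is not part of this repo's tooling.

```python
def derive_parents(results, initial="0000_original"):
    """Replay SPRT outcomes and record the leader each engine was built from.

    results: list of (engine_name, passed_sprt) in chronological order.
    Returns (parents, final_leader).
    """
    leader = initial
    parents = {}
    for engine, passed in results:
        parents[engine] = leader  # every attempt starts from the current leader
        if passed:
            leader = engine       # a passing SPRT promotes the new engine
    return parents, leader


# Outcomes copied from the Progress table, oldest first.
progress = [
    ("0001_haiku_4_5", False),
    ("0002_sonnet_4_6", True),
    ("0003_opus_4_7", True),
    ("0004_gpt_5_5", False),
    ("0005_opus_4_7", True),
    ("0006_gpt_5_5", True),
    ("0007_opus_4_7", True),
]

parents, leader = derive_parents(progress)
for engine, parent in parents.items():
    print(f"{parent} ---> {engine}")
print("current leader:", leader)
```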

Tournament

Rank	Engine	Elo	Games	Score	Draws
1	0007_opus_4_7	2169 ±19.50	1400	75.8%	22.6%
2	0006_gpt_5_5	2063 ±16.41	1400	62.9%	29.7%
3	0005_opus_4_7	2020 ±16.18	1400	57.0%	33.0%
4	0004_gpt_5_5	2014 ±16.12	1400	56.2%	33.1%
5	0003_opus_4_7	2007 ±16.50	1400	55.2%	30.6%
6	0002_sonnet_4_6	1912 ±17.79	1400	41.6%	27.7%
7	0000_original	1800 ±18.23	1400	27.2%	27.6%
8	0001_haiku_4_5	1771 ±19.14	1400	24.1%	23.4%

See bin/tourny.
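
As a rough guide to reading the Elo column, a rating gap can be converted to an expected head-to-head score. The sketch below assumes the standard logistic Elo model; how bin/tourny and fastchess compute and anchor the ratings is not stated here, so treat the result as an approximation. Note that the Score column in the table is against the whole field, not a single opponent, so it is not directly comparable.

```python
def elo_expected_score(elo_a, elo_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


# Ratings copied from the Tournament table.
top = 2169     # 0007_opus_4_7
second = 2063  # 0006_gpt_5_5

print(f"{elo_expected_score(top, second):.3f}")
```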

Notes

A Windows executable for each engine is available in ./engines.

Acknowledgements

https://github.com/Disservin/fastchess - SPRT and tournament manager
