This repository aims to provide a parser for chess games in PGN format. It handles movetext of one or more moves ending with a result, such as "1. e4 e5 ... 0-1", as well as long algebraic notation such as "1. d2d4 Ng8f6". Castling can be written in the classic form ("OO oo") or as plain king moves ("e1g1 e8g8"). Headers and comments are parsed as well.
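As a rough illustration of the notations listed above, the following sketch tells them apart with plain regular expressions. It is not the grammar used in this repository; the constant names and the `classify` helper are hypothetical, and the patterns are deliberately simplified. Note that "e1g1" is reported as a long-notation move here, since recognizing it as castling requires knowing that a king stands on e1.

```ruby
# Hypothetical, simplified patterns; NOT the grammar shipped in this repository.
CASTLING = /\A(?:O-?O(?:-?O)?|o-?o(?:-?o)?|0-?0(?:-?0)?)[+#]?\z/
LONG     = /\A[KQRBN]?[a-h][1-8][-x]?[a-h][1-8](?:=?[QRBN])?[+#]?\z/
SAN      = /\A(?:[KQRBN][a-h]?[1-8]?x?[a-h][1-8]|[a-h](?:x[a-h])?[1-8](?:=?[QRBN])?)[+#]?\z/

# Classify a single move token. LONG is tested before SAN because a token
# such as "Ng8f6" also fits the SAN pattern (piece, file, rank, target).
def classify(token)
  return :castling if token.match?(CASTLING)
  return :long     if token.match?(LONG)
  return :san      if token.match?(SAN)

  :unknown
end

%w[e4 Nbd7 Bxe3+ d2d4 Ng8f6 OO O-O-O e1g1].each do |token|
  puts "#{token}: #{classify(token)}"
end
```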
PGN is the de facto standard for chess games, especially when it comes to interoperability. Unfortunately, PGN is somewhat misdesigned, and apparently it’s not just me who thinks so. PGN is designed to be easy for humans to read, and to edit or write manually with a text editor. The cost is that it is difficult for computers to parse.
The odd thing here is that PGN files are rarely created manually: almost everyone uses a chess program to enter or edit moves and then saves the game afterwards. That’s the basic misdesign.
The main difficulty in parsing PGN files, aside from a lot of ambiguities, is SAN (Standard Algebraic Notation), in which the source square of a move is omitted. For a human it’s easy to spot the source square, but a program has to know the rules of chess to figure out which move is meant.
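To make the missing source square concrete, here is a minimal sketch that finds the candidate source squares for a knight move. The helper names are hypothetical and not part of this repository. A real resolver would also have to handle the other pieces, blocked lines, and pins.

```ruby
# Hypothetical helpers for illustration; not part of this repository's API.
FILES = 'abcdefgh'.freeze

# The eight (file, rank) offsets a knight can jump.
KNIGHT_JUMPS = [[1, 2], [2, 1], [2, -1], [1, -2],
                [-1, -2], [-2, -1], [-2, 1], [-1, 2]].freeze

# Convert a square such as "f3" to zero-based [file, rank] coordinates.
def to_coords(square)
  [FILES.index(square[0]), square[1].to_i - 1]
end

# Convert zero-based [file, rank] coordinates back to a square name.
def to_square(file, rank)
  "#{FILES[file]}#{rank + 1}"
end

# Return the squares in `knights` (the moving side's knights) from which
# a knight could jump to `target`.
def knight_sources(target, knights)
  file, rank = to_coords(target)
  candidates = KNIGHT_JUMPS
               .map { |df, dr| [file + df, rank + dr] }
               .select { |f, r| f.between?(0, 7) && r.between?(0, 7) }
               .map { |f, r| to_square(f, r) }
  knights & candidates
end

# "Nf3" from the initial position: only the g1 knight can reach f3.
puts knight_sources('f3', %w[b1 g1]).inspect  # => ["g1"]

# Black knights on b8 and f6 can both reach d7, which is why the
# sample game below writes the move as "Nbd7".
puts knight_sources('d7', %w[b8 f6]).inspect  # => ["b8", "f6"]
```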
This parser was generated with Canopy, a PEG (Parsing Expression Grammar) parser compiler.
Why did I decide to use a PEG parser?
Because it has several advantages:
- I can generate the parser for Python, Ruby, Java and JS
- I have to develop just 50% more of the parser
- My knowledge of PEG parsers should be important for my next task: a pattern language for chess.
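For readers new to PEGs, the sketch below shows their two core ideas, sequence and ordered choice, using hand-written Ruby lambdas. It is not the code Canopy generates, and the names `term`, `seq`, and `choice` as well as the simplified move patterns are assumptions made for this example.

```ruby
# Each parser is a lambda taking (input, offset) and returning
# [value, new_offset] on success or nil on failure.

# Match a regular expression exactly at the current offset.
def term(pattern)
  anchored = /\G(?:#{pattern.source})/
  lambda do |input, pos|
    m = anchored.match(input, pos)
    m && [m[0], m.end(0)]
  end
end

# Sequence: every parser must match, one after the other.
def seq(*parsers)
  lambda do |input, pos|
    parsers.reduce([[], pos]) do |(values, offset), parser|
      result = parser.call(input, offset)
      break nil unless result

      [values + [result[0]], result[1]]
    end
  end
end

# Ordered choice: try the alternatives in order and commit to the first match.
def choice(*parsers)
  lambda do |input, pos|
    result = nil
    parsers.find { |parser| result = parser.call(input, pos) }
    result
  end
end

# Simplified, hypothetical move patterns.
long_move  = term(/[KQRBN]?[a-h][1-8][a-h][1-8]/)
short_move = term(/[KQRBN]?[a-h][1-8]/)

# The long form must come first; otherwise "d2d4" would match as "d2".
move   = choice(long_move, short_move)
number = term(/\d+\.\s*/)
space  = term(/\s+/)
turn   = seq(number, move, space, move)

p turn.call('1. d2d4 Ng8f6', 0)  # => [["1. ", "d2d4", " ", "Ng8f6"], 13]
p choice(short_move, long_move).call('d2d4', 0)  # => ["d2", 2]
```

The second call shows why the order of alternatives matters in a PEG: the first alternative that matches wins, even if a later one would consume more input.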
Status:
- Python parser: 100%
- Ruby parser: 95%
- Java parser: 50%
- JS parser: 50%
Usage example (Ruby):
$LOAD_PATH << '.'
require 'parser'
require_relative 'pgn'
include PGN
options = { actions: Actions.new, types: nil }
file_data = File.open("kasparov.pgn", :encoding => 'ISO-8859-1').read
game = PGN::parse(file_data, options)
#game = PGN::parse("1. c2c4 { Comments by/Kommentare von: GM Gerald Hertneck } 1... c7c6 2. d2d4 d7d5", options)
puts game.move(1)
puts game.move(1).black.san
puts game.movetext
puts game.tag_pairs
puts game.score
Execution:
➜ ruby git:(master) ✗ ruby test.rb
-
c4 c6
-
c4 c6
-
d4 d5
-
Nf3 Nf6
-
Qc2 dxc4
-
Qxc4 Bf5
-
Nc3 Nbd7
-
g3 e6
-
Bg2 Be7
-
O-O O-O
-
e3 Ne4
-
Qe2 Qb6
-
Rd1 Rad8
-
Ne1 Ndf6
-
Nxe4 Nxe4
-
f3 Nd6
-
a4 Qb3
-
e4 Bg6
-
Rd3 Qb4
-
b3 Nc8
-
Nc2 Qb6
-
Bf4 c5
-
Be3 cxd4
-
Nxd4 Bc5
-
Rad1 e5
-
Nc2 Rxd3
-
Qxd3 Ne7
-
b4 Bxe3+
-
Qxe3 Rd8
-
Rxd8+ Qxd8
-
Bf1 b6
-
Qc3 f6
-
Bc4+ Bf7
-
Ne3 Qd4
-
Bxf7+ Kxf7
-
Qb3+ Kf8
-
Kg2 Qd2+
-
Kh3 Qe2
-
Ng2 h5
-
Qe3 Qc4
-
Qd2 Qe6+
-
g4 hxg4
-
fxg4 Qc4
-
Qe1 Qb3+
-
Ne3 Qd3
-
Kg3 Qxe4
-
Qd2 Qf4+
-
Kg2 Qd4
-
Qxd4 exd4
-
Nc4 Nc6
-
b5 Ne5
-
Nd6 d3
-
Kf2 Nxg4+
-
Ke1 Nxh2
-
Kd2 Nf3+
-
Kxd3 Ke7
-
Nf5+ Kf7
-
Ke4 Nd2+
-
Kd5 g5
-
Nd6+ Kg6
-
Kd4 Nb3+
Event London PCA/Intel-GP
Site London ENG
Date 1994.08.31
EventDate ?
Round 1
Result 0-1
White Garry Kasparov
Black Chess Genius (Computer)
ECO D11
WhiteElo ?
BlackElo ?
PlyCount 120
0-1
Any help is welcome!
Canopy
Canopy is a parser compiler targeting Java, JavaScript, Python and Ruby. It takes a file describing a parsing expression grammar and compiles it into a parser module in the target language. The generated parsers have no runtime dependency on Canopy itself.
For usage documentation see canopy.jcoglan.com.
https://blog.jcoglan.com/2015/07/19/canopy-produces-portable-peg-parsers/
https://github.com/jcoglan/canopy