separating parts of a long, complicated string

garrensilverwing

Here is my string:

Code:
{test} 1. d4 d5 2. Bf4 {This is only my third game and I do not have the 
greatest opening knowledge in the world. So I want to stick with what I know, 
at least for the time being.} c6 3. Nf3 h6 4. e3 g5 5. Bg3 f6 {I am not sure 
what exactly the rule is on pawns in the opening but I know for a fact having 
made five pawn moves before developing a piece is bad.} 6. Bd3 e5 $4 { 
This is a blunder, the h5-e8 diagonal is vital to the defense of the king in 
the early game, especially when you have no pieces developedyet!} 7. Bg6+ ({ 
A lot of these variations are somewhat transpositions of eachother, where the 
difference in tempos and free squares allow black more options to roam around, 
thus making them more complicated:} 7. dxe5 Bg7 8. Bg6+ (8. exf6 Qxf6 { 
This would at least cause my initiative to slow down, the last thing you want 
when you have an initiative is to let up at all.}) 8... Kf8 { 
(see more room to roam around)} 9. Nc3 h5 10. h4 g4 11. exf6 $1 Qxf6 12. Ne5 
Nd7 13. Ne4 $1 dxe4 14. Nxd7+ Bxd7 15. Qxd7 Qxg6 16. Bd6+ Ne7 17. Qxe7+ Kg8 $18 
{This looks really good for me but the open nature of the board makes it even 
more complicated and allows him to drum up some counter-play.}) 7... Ke7 8. 
dxe5 Nd7 (8... fxe5 9. Bxe5 Nf6 10. Qd4 Nbd7 11. Bxf6+ Nxf6 12. Qe5+ Be6 (12... 
Kd7 13. Bf5# {A cute, in the middle of the board checkmate.}) 13. Nd4 $18 { 
Black is in a hopeless bind, this is followed by Bf5 and the e6 bishop will 
fall.}) (8... Bg7 9. exf6+ Nxf6 10. Nbd2 $18 { 
White still has a solid advantage here but again the initiative has fizzled.}) 
9. Nc3 {Reinforcements are necessary, it is hard to get an attack going or 
maintain an initiative with only a couple pieces active. However, now that I 
have successfully stifiled his development, my initiative is still going 
strong.} ({This is an interesting way to continue as well:} 9. Qd3 Qb6 10. O-O 
Bg7 11. Qa3+ Qc5 12. b4 Qb6 13. e6 Ne5 14. b5+ c5 15. Nc3 $18 {However, I did 
not like the way my pieces are coordinated and I wanted to go with a more 
conventional attack.}) 9... Bg7 10. e4 Nxe5 (10... Nxe5 11. Nxe5 fxe5 12. exd5 
$18 {Black\'s exposed king will be easy to exploit in this exposed position, 
especially with my two active bishops.}) (10... g4 11. exd5 $1 gxf3 12. d6+ Kf8 
13. Qxf3 $18 {Black\'s king is in dire straits and will quickly be either 
checkmated or lose the majority of his forces.}) 11. Nxe5 fxe5 12. exd5 { 
The key to a successful attack is to have as many open lines to the enemy king 
as possible. The fewer avenues of attack the easier the defense. So naturally 
I want to open up everything.} Kf6 (12... cxd5 13. Nxd5+ Kf8 14. Qf3+ Bf6 15. 
O-O-O {To be in this position would be like sitting on a chair full of nails... 
like having a long rusty nail sticking into your butt.}) 13. Qh5 cxd5 14. Be8 { 
A cheezy checkmate threat that comes with some powerful threats.} Qe7 $4 { 
Oops, but not much better is:} (14... Be6 15. Qg6+ Ke7 16. Qxg7+ Kxe8 17. Bxe5 
Qe7 18. Qxh8 Qf8 {Here black is pinned up and is down a a rook and a pawn.}) 
15. Qg6# {This is kind of like an eppaulette checkmate but it is kind of odd 
having all four corners blocked by black pieces.} 1-0

What I want to do is separate out the moves of the game while keeping each annotation and variation attached to the move it corresponds to. Basically, a move starts wherever there is a number followed by a period (like "1.", "2.", etc.), and then there is a move from each side. Occasionally, as you can see, there are annotations and variations, which are wrapped in curly brackets and parentheses respectively. I was trying to do it my own way by separating out the parentheses and brackets first, but I couldn't think of a way to keep them with their original move, so I wanted to ask if anyone has an idea...
 

marshian

It can be done using preg_match and/or preg_match_all; you should read up on those (on php.net).
Do you have any experience with regex? If not, you should definitely read up on that too.
Good luck :)
 

garrensilverwing

Yup, looks like I'll have some studying to do, haha. I did find an alternative method of doing it, though, so I don't have to get this project done right away.
 

misson

Regular expressions might form part of a solution, but there's a limit to what they can do. Some languages are difficult or impossible to express as a regular language (the simplest class in the Chomsky hierarchy); arbitrarily nested parentheses, like the variations inside variations in your game, are the classic example. At a certain point, you need a more capable parser. Sadly, vanilla PHP doesn't have anything for general parsing, though you might be able to find something in PECL.
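
To make the nesting point concrete, here is a minimal sketch of the bookkeeping a plain regular expression can't do: counting how many parentheses are currently open. The helper name `topLevelVariations` and the sample string are made up for illustration, and it assumes parentheses inside {...} comments should be ignored (your game contains one such comment).

```php
<?php
// Hypothetical helper (not from this thread): returns every outermost
// "( ... )" variation, leaving nested variations inside their parent.
function topLevelVariations(string $pgn): array {
    $found = array();
    $depth = 0;          // number of "(" currently open
    $start = 0;          // offset just after the outermost "("
    $inComment = false;  // true while inside a {...} comment
    $len = strlen($pgn);
    for ($pos = 0; $pos < $len; $pos++) {
        $ch = $pgn[$pos];
        if ($inComment) {
            if ($ch === '}') {
                $inComment = false;
            }
        } elseif ($ch === '{') {
            $inComment = true;
        } elseif ($ch === '(') {
            if ($depth === 0) {
                $start = $pos + 1;
            }
            $depth++;
        } elseif ($ch === ')' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                $found[] = trim(substr($pgn, $start, $pos - $start));
            }
        }
    }
    return $found;
}

// Shortened, made-up fragment in the style of the game above.
$sample = '7. Bg6+ (7. dxe5 Bg7 8. Bg6+ (8. exf6 Qxf6 {slower :)}) 8... Kf8) 7... Ke7';
print_r(topLevelVariations($sample));
```

The depth counter is the part that matters: it is what a recursive parser gets for free from its call stack.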

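REs can be helpful for a tokenizer. In your case, you could split on /\s+(?=[({})]?)|(?<=[({})])|(?<=\S)(?=[({})])/ (using preg_split). This gives you words, parentheses and curly brackets as separate tokens. Alternatively, match on /[()]|{[^}]+}|\S+/ (using preg_match_all). This gives you comments (as "{...}"), words and parentheses as tokens. One caveat: \S+ also swallows a ")" that directly follows a move, as in "Kf8)". Your sample always closes a variation right after a comment, so it doesn't come up there, but using [^\s(){}]+ in place of \S+ avoids the problem.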
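
A quick way to compare the two tokenizers is to run both on a short fragment. This is only a sketch: `$fragment` is a shortened, made-up piece of the game, the split is done with preg_split, and the PREG_SPLIT_NO_EMPTY flag is my addition to discard any empty strings the zero-width alternatives might leave behind.

```php
<?php
// Shortened, made-up fragment in the style of the game above.
$fragment = '6. Bd3 e5 $4 {A blunder.} 7. Bg6+ (7. dxe5 Bg7 {ok}) 7... Ke7';

// Approach 1: split. Comments get broken up into "{", their words, and "}".
$split = preg_split(
    '/\s+(?=[({})]?)|(?<=[({})])|(?<=\S)(?=[({})])/',
    $fragment,
    -1,
    PREG_SPLIT_NO_EMPTY
);

// Approach 2: match. Each "{...}" comment stays together as one token.
preg_match_all('/[()]|{[^}]+}|\S+/', $fragment, $m);
$matched = $m[0];

print_r($split);
print_r($matched);
```

The second form is usually easier to work with here, since a comment can then be attached to its move as a single item.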

PHP:
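class Game {
    public $moves;

    function __construct($source) {
        // Tokens: "(" or ")", a whole {...} comment, or a run of other non-space characters.
        // [^\s(){}]+ (rather than \S+) keeps a ")" glued to a move, e.g. "Kf8)", as its own token.
        preg_match_all('/[()]|\{[^}]*\}|[^\s(){}]+/', $source, $tokens);
        $tokens = $tokens[0];
        $this->moves = self::parse($tokens);
    }

    // Consumes tokens from the front and returns move number => list of tokens.
    // A nested array in that list is a variation attached to the move.
    static function parse(&$tokens) {
        $game = array();
        $i = 0; // tokens seen before the first move number are filed under 0
        while (!is_null($token = array_shift($tokens))) {
            switch ($token) {
            case '(':
                // A variation: parse it recursively and attach it to the current move.
                $game[$i][] = self::parse($tokens);
                break; // without this, control falls through and returns after the first variation
            case ')':
                // End of the current variation.
                return $game;
            default:
                // "7." starts a move; "7..." resumes move 7 with Black's reply.
                if (preg_match('/^(\d+)\.(?:\.\.)?$/', $token, $matches)) {
                    $i = (int) $matches[1];
                    if (!isset($game[$i])) {
                        $game[$i] = array();
                    }
                } else {
                    $game[$i][] = $token;
                }
                break;
            }
        }
        return $game;
    }
}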
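
Once the game is parsed, the result is a nested array: each move number maps to a list of tokens, and any entry that is itself an array is a variation. Here is a rough sketch of walking that structure. `renderMoves` is a hypothetical helper, and `$moves` is a hand-written, shortened sample in the shape parse() returns rather than real output.

```php
<?php
// Hypothetical helper: turns the nested array into indented text lines.
function renderMoves(array $moves, int $depth = 0): array {
    $lines = array();
    $indent = str_repeat('    ', $depth);
    foreach ($moves as $number => $tokens) {
        $plain = array();      // moves, comments, and codes such as "$4"
        $variations = array(); // nested arrays, i.e. variations
        foreach ($tokens as $token) {
            if (is_array($token)) {
                $variations[] = $token;
            } else {
                $plain[] = $token;
            }
        }
        $lines[] = $indent . $number . '. ' . implode(' ', $plain);
        // Variations are listed under their move, one level deeper.
        foreach ($variations as $variation) {
            $lines = array_merge($lines, renderMoves($variation, $depth + 1));
        }
    }
    return $lines;
}

// Hand-written sample (an assumption, not actual parser output).
$moves = array(
    6 => array('Bd3', 'e5', '$4', '{This is a blunder.}'),
    7 => array('Bg6+', array(7 => array('dxe5', 'Bg7')), 'Ke7'),
);
echo implode("\n", renderMoves($moves)), "\n";
```

Note that this groups all of a move's variations after its plain tokens, so the position of a variation within the move is not preserved; keep the original token order if that matters to you.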
 