extracting from a string

Discussion in 'Scripts, 3rd Party Apps, and Programming' started by garrensilverwing, May 20, 2009.

  1. misson

    misson Community Paragon Community Support

    Not at all. Anytime you can find a package that does what you need and you can avoid work it's a win.

    Too bad about chessboard. Unless some other PGN processor pops up, it's not too hard to do yourself. Your script could slurp in the file, separate the games and parse each individually, or you could parse the file a line at a time. In pseudocode:
    Code:
    set tag RE to /^\[(\w+)\s+"([^"]*)"\]/
    get next line from file
    while not EOF
        while line matches tag RE
            save tag data in current game data
            get next line
        while not EOF and line doesn't match tag RE
            append line to current moves
            get next line
        save moves in current game data
        save current game (whatever that means)
        create new game (empty tags, moves)
    
    Add in a little error handling and you're set.
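
    For what it's worth, the loop above could be sketched in PHP roughly like this. The function name parse_pgn and the sample lines are made up for illustration, and it assumes every game is a run of tag lines followed by its movetext; treat it as a starting point rather than a full PGN parser.

```php
<?php
// Sketch of the pseudocode above; parse_pgn() is a hypothetical name.
// Assumes each game is a block of tag lines followed by movetext.
function parse_pgn(array $lines)
{
    $tagRe = '/^\[(\w+)\s+"([^"]*)"\]/';
    $games = array();
    $i = 0;
    $n = count($lines);
    while ($i < $n) {
        $game = array('tags' => array(), 'moves' => '');
        // Collect the tag pairs at the top of the game.
        while ($i < $n && preg_match($tagRe, $lines[$i], $m)) {
            $game['tags'][$m[1]] = $m[2];
            $i++;
        }
        // Everything up to the next tag line is movetext.
        $moves = array();
        while ($i < $n && !preg_match($tagRe, $lines[$i])) {
            $line = trim($lines[$i]);
            if ($line !== '') {
                $moves[] = $line;
            }
            $i++;
        }
        $game['moves'] = implode(' ', $moves);
        // Skip empty records, e.g. blank lines before the first tag.
        if ($game['tags'] || $game['moves'] !== '') {
            $games[] = $game;
        }
    }
    return $games;
}

// Made-up sample input: two tiny games.
$lines = array(
    '[Event "Example A"]',
    '[Result "1-0"]',
    '',
    '1. e4 e5 2. Nf3 Nc6 1-0',
    '',
    '[Event "Example B"]',
    '[Result "*"]',
    '',
    '1. d4 d5 *',
);
$games = parse_pgn($lines);
echo count($games), " games parsed\n";
```

    For a real file, file('games.pgn') (the file name is just a placeholder) would give you the array of lines to pass in.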
     
    Last edited: Jun 3, 2009
  2. fguy64

    fguy64 New Member

    Garren, if you are still interested in a direct export of PGN into MySQL, the following software has been recommended to me by knowledgeable people as doing a good job of that. I haven't used it myself yet, but I'll probably check it out, as I am doing something similar for my own chess learning project.

    http://jose-chess.sourceforge.net/

    regards.
     
  3. garrensilverwing

    garrensilverwing New Member

    well, let's say for example i write a non-web-related program to create a separate MySQL-compatible file. is there a way i can just upload that as a table in my database? and if so, what format should i use?
     
  4. misson

    misson Community Paragon Community Support

    MySQL doesn't support the concept of "file uploads" (all network communication is based on SQL statements; files on the server host can be read with a LOAD DATA INFILE statement, if you have the FILE privilege), so you'd need to use something else. phpMyAdmin is probably your best bet. Its import page accepts CSV files and files of SQL statements.

    In some situations, the command line utility mysqlimport would be easier. It is a front end to the LOAD DATA statement, so unless it is run with its --local option it requires the FILE privilege.
     
    Last edited: Jun 8, 2009
  5. garrensilverwing

    garrensilverwing New Member

    so probably the best thing to do is to just use regular expressions. i guess i will start working on them lol
    Edit:
    i don't know what i am doing wrong. i wanted to write a simpler piece of code to show me what i am grabbing with my regular expression, but all i get is
    Code:
    0: Array
    1: Array
    2: Array
    
    here is my php:
    Code:
    <?php
    $string = $_POST['pgn'];
    $string = strtolower($string);
    $string = stripslashes($string);
    echo "thank you for submitted the pgn.<br>";
    preg_match_all('/\[(\w+) "([^"]+)"\]/', $string, $matches);
    foreach ($matches as $key => $value)
    	{
    		echo "$key: $value<br>";
    	}
    ?>
    
    here $string = whatever pgn i put in
    
     
    Last edited: Jun 8, 2009
  6. fguy64

    fguy64 New Member

    good luck with the PHP, Garren. I'm just responding to misson's last post.

    phpMyAdmin is pretty much what I had in mind. My impression from the following thread is that you can indeed use it, or one of the other cPanel admin tools, to upload a MySQL database that has been prepared offline.

    http://forums.x10hosting.com/free-hosting/98076-uploading-mysql-databases.html

    So verifying the upload of MySQL databases is at the top of my list. After this, it's pretty much a done deal.

    Anyways, I'm doing pretty much the same thing with PGN in terms of end result, only I won't be using regex in PHP to parse my data. I'll be doing the majority of the work offline and uploading a finished database. But that's just the way I like to do things. If anyone wants to know how I do it, let me know.
     
  7. misson

    misson Community Paragon Community Support

    Note that you don't need to do the parsing server side. It's easier to develop & run the converter client side. The converter should output CSV or an SQL insert statement.
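
    As a rough illustration of the CSV route, a converter could write its rows with PHP's fputcsv. The column names and the single row below are invented for the example, and the in-memory stream stands in for a real output file; the column order would have to match whatever table you import into.

```php
<?php
// Hypothetical rows, as a converter might hold them after parsing.
$rows = array(
    array('white' => "O'Donnell, Robert", 'black' => 'Wall, Brian',
          'result' => '1/2-1/2', 'year' => 1977),
);

// An in-memory stream keeps the sketch self-contained;
// a real converter would fopen() an output file instead.
$out = fopen('php://memory', 'w+');

// Header row, then one line per game. fputcsv adds quoting where needed.
fputcsv($out, array('white', 'black', 'result', 'year'), ',', '"', '\\');
foreach ($rows as $row) {
    fputcsv($out, array($row['white'], $row['black'], $row['result'], $row['year']), ',', '"', '\\');
}

rewind($out);
$csv = stream_get_contents($out);
fclose($out);
echo $csv;
```

    Letting fputcsv do the quoting matters here because names in "last, first" form contain the delimiter.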

    $matches[0] contains every matched string, $matches[1] every match for the 1st group and $matches[2] every match for the 2nd group. Read the description of the 'flags' argument to preg_match_all for details.
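
    A small sketch of the two layouts, using a made-up two-tag string and the same pattern as in your script. The variable names are only for illustration.

```php
<?php
$pgn = '[White "Wall, Brian"]' . "\n" . '[Result "1/2-1/2"]';
$re  = '/\[(\w+) "([^"]+)"\]/';

// Default order (PREG_PATTERN_ORDER): one array per group,
// so $m[1] holds every tag name and $m[2] every tag value.
preg_match_all($re, $pgn, $m);
$byPattern = array_combine($m[1], $m[2]);

// PREG_SET_ORDER: one array per match,
// so $set[1] is a tag name and $set[2] its value.
preg_match_all($re, $pgn, $sets, PREG_SET_ORDER);
$bySet = array();
foreach ($sets as $set) {
    $bySet[$set[1]] = $set[2];
}

// Both approaches end up with the same name => value array.
print_r($byPattern);
```

    Echoing one of the top-level elements directly prints "Array" because each of them is itself an array; print_r or a nested loop shows the contents.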

    Are you worried about the upload or the import? phpMyAdmin should give you sufficient feedback to verify the import; it will print any error messages if the import failed and will tell you the number of rows affected if successful.
     
  8. fguy64

    fguy64 New Member

    I hadn't thought that far ahead. I just want to prove to myself that I can upload databases created locally with my own copy of MySQL server, and access these databases in the usual way with PHP. So far I haven't actually looked at it too closely, or even tried it; I just "know" it can be done. But thanks for the tip.
     
  9. garrensilverwing

    garrensilverwing New Member

    ok, so what it is doing is capturing arrays of the text and not the actual text right away, so what i need to do is scan the arrays for the ones containing what i need. hmm, sounds complicated lol
    Edit:
    Code:
    		$string = $_POST['pgn'];
    		$string = strtolower($string);
    		$string = stripslashes($string);
    		echo "thank you for submitted the pgn.<br>";
    		preg_match_all('/\[(\w+) "([^"]+)"\]/', $string, $matches, PREG_SET_ORDER);
    		$counter = 0;
    		foreach($matches as $key => $value)	
    			{
    				print_r($key);
    				print_r($value);
    				echo "<br>";
    			}
    
    this is what i have now

    Code:
    thank you for submitted the pgn.
    0Array ( [0] => [event "private match, 40/2, 20/1, 20/1, 20/1"] [1] => event [2] => private match, 40/2, 20/1, 20/1, 20/1 )
    1Array ( [0] => [site " denver , colorado "] [1] => site [2] => denver , colorado )
    2Array ( [0] => [date "1977.06.24"] [1] => date [2] => 1977.06.24 )
    3Array ( [0] => [round "2"] [1] => round [2] => 2 )
    4Array ( [0] => [white "o' donnell, robert"] [1] => white [2] => o' donnell, robert )
    5Array ( [0] => [black "wall, brian"] [1] => black [2] => wall, brian )
    6Array ( [0] => [result "1/2-1/2"] [1] => result [2] => 1/2-1/2 )
    7Array ( [0] => [eco "e80"] [1] => eco [2] => e80 )
    8Array ( [0] => [whiteelo "2000"] [1] => whiteelo [2] => 2000 )
    9Array ( [0] => [blackelo "1915"] [1] => blackelo [2] => 1915 )
    10Array ( [0] => [plycount "105"] [1] => plycount [2] => 105 )
    11Array ( [0] => [eventdate "1977.06.24"] [1] => eventdate [2] => 1977.06.24 )
    12Array ( [0] => [eventtype "match"] [1] => eventtype [2] => match )
    13Array ( [0] => [eventrounds "2"] [1] => eventrounds [2] => 2 )
    14Array ( [0] => [eventcountry "usa"] [1] => eventcountry [2] => usa ) 
    
    so i'm getting there lol but this array and regular expressions stuff is super complicated
    Edit:
    this is my code now, after i've put everything i need in there. now all i have to do is set it to insert into the database, or try to figure out how to do 500+ pgns at once
    Code:
    		$string = $_POST['pgn']; //sets $string to the value of the entered pgn
    		$string = strtolower($string); //replaces all uppercase letters with their lowercase counterparts
    		$string = stripslashes($string); //removes the backslashes that magic quotes add to posted data (this is not a security measure)
    		echo "thank you for submitted the pgn.<br>"; //acknowledges a pgn has been submitted
    		//searches $string for every tag that looks like this: [x "y"]; with PREG_SET_ORDER each match gets its own array
    		//where element 0 = the entire tag, element 1 = x and element 2 = y
    		preg_match_all('/\[(\w+) "([^"]+)"\]/', $string, $matches, PREG_SET_ORDER);
    		$data = array(); //will hold the tag values keyed by tag name
    		foreach($matches as $key => $value)
    			{
    				$tag=$value[1]; //sets $tag to the tag name x of the current match
    				$data[$tag]=$value[2]; //stores the value y in $data under the key x
    			}
    		$wName=$data['white'];	//white's full name
    		$bName=$data['black'];	//black's full name
    		$wRating=$data['whiteelo']; //white's rating
    		$bRating=$data['blackelo']; //black's rating
    		$eco=$data['eco']; //the eco code (opening reference)
    		$date=$data['date']; //the entire date (usually in yyyy.mm.dd format)
    		$outcome=$data['result']; //the result
    		if(preg_match('/,/', $wName)==1 OR preg_match('/ /', $wName)==1) //checks if there is a comma or a space in the name (that is, it is in either "last, first" or "first last" format)
    			{
    				if(preg_match('/,/',$wName)==1) //checks if white's name is in "last, first" format
    					{
    						$wName=str_replace(" ","",$wName); //removes every space from white's name
    						$white_name=explode(",",$wName); //separates white's last name and first name
    						if(count($white_name)< 3) //verifies there are only a first and a last name present
    							{
    								$wLast=$white_name[0]; //takes white's last name from the array
    								$wFirst=$white_name[1]; //takes white's first name from the array	
    							}
    						else
    							{
    								echo "Error the format of white's name is incorrect. Please have white's name in either \"last, first\" or \"first last\" formats."; //warns the user that the name is incompatible	
    							}					
    					}
    				else	//if it is not in "last, first" format it is assumed it is in "first last" format
    					{
    						$white_name=explode(" ",$wName);
    						if(count($white_name) < 3) //checks that no more than 2 names were found
    							{
    								$wFirst=$white_name[0]; //takes white's first and 
    								$wLast=$white_name[1];  //last name from the array
    							}
    						else
    							{
    								echo "Error the format of white's name is incorrect. Please have white's name in either \"last, first\" or \"first last\" formats."; //warns the user that the name is incompatible
    							}
    						
    					}
    
    			}
    		else //if white's name is not in last, first or first last, that is there are no spaces or commas
    			{
    				$wFirst=$wName; //sets both white's first and 
    				$wLast=$wName;  //last name to the pgn's original value (usually an online handle)
    			}
    		if(preg_match('/,/', $bName)==1 OR preg_match('/ /', $bName)==1) //checks if there is a comma or a space in the name (that is, it is in either "last, first" or "first last" format)
    			{
    				if(preg_match('/,/',$bName)==1) //checks if black's name is in "last, first" format
    					{
    						$bName=str_replace(" ","",$bName); //removes every space from black's name
    						$black_name=explode(",",$bName); //separates black's last name and first name
    						if(count($black_name)< 3) //verifies there are only a first and a last name present
    							{
    								$bLast=$black_name[0]; //takes black's last name from the array
    								$bFirst=$black_name[1]; //takes black's first name from the array	
    							}
    						else
    							{
    								echo "Error the format of black's name is incorrect. Please have black's name in either \"last, first\" or \"first last\" formats."; //warns the user that the name is incompatible	
    							}				
    					}
    				else	//if it is not in "last, first" format it is assumed it is in "first last" format or another format with spaces
    					{
    						$black_name=explode(" ",$bName);
    						if(count($black_name) < 3) //checks that no more than 2 names were found
    							{
    								$bFirst=$black_name[0]; //takes black's first and 
    								$bLast=$black_name[1];  //last name from the array
    							}
    						else
    							{
    								echo "Error the format of black's name is incorrect. Please have black's name in either \"last, first\" or \"first last\" formats."; //warns the user that the name is incompatible
    							}
    						
    					}
    
    			}
    		else //if black's name is not in last, first or first last, that is there are no spaces or commas
    			{
    				$bFirst=$bName; //sets both black's first and 
    				$bLast=$bName;  //last name to the pgn's original value (usually an online handle)
    			}
    		if($date!=="??" AND $date!=="*" AND $date!=="") //verifies that there is usable data in the pgn's date tag
    			{
    				$day=explode(".",$date); //separates the date's year, month and day into an array
    				$year=$day[0]; //since the standard pgn date format is "yyyy.mm.dd" the first array value is used for year
    				if(strlen($year)!==4) //if $year is not four characters long then the format of date must have been different
    					{
    						$year = $day[1]; //tries the second value of array $day, in case the date is in "dd.yyyy.mm" or "mm.yyyy.dd" format
    						if(strlen($year)!==4) //if $year is still not four characters long then the format must be "dd.mm.yyyy" or "mm.dd.yyyy"
    							{
    								$year = $day[2]; //sets the year to the third value in array $day
    							}
    					}
    			}
    		else
    			{
    				$year=$date; //no usable date, so uses whatever the pgn's date tag says
    			}
    		if($outcome == "0-1") //if black wins
    			{
    				$result = 0; //used to signify black wins
    			}
    		elseif($outcome == "1-0") //if white wins
    			{
    				$result = 1; //used to signify white wins
    			}
    		elseif($outcome == "1/2-1/2") //if the game is a draw
    			{
    				$result = 2; //used to signify a draw
    			}
    		else //"*", "??", an empty tag or anything else unrecognised, so $result is always set
    			{
    				$result = 3; //used to signify that the result is still pending, whether the game was adjourned or never completed
    			}
    		echo "$wFirst $wLast ($wRating) vs $bFirst $bLast ($bRating) $result $year";
    
    so when i enter a pgn (this is just the top information not the move information):

    Code:
    [Event "Private Match, 40/2, 20/1, 20/1, 20/1"]
    [Site " Denver , Colorado "]
    [Date "1977.06.24"]
    [Round "2"]
    [White "O' Donnell, Robert"]
    [Black "Wall, Brian"]
    [Result "1/2-1/2"]
    [ECO "E80"]
    [WhiteElo "2000"]
    [BlackElo "1915"]
    [PlyCount "105"]
    [EventDate "1977.06.24"]
    [EventType "match"]
    [EventRounds "2"]
    [EventCountry "USA"]
    
    i get this: robert o'donnell (2000) vs brian wall (1915) 2 1977

    perfect :D

    i know it is messy and i can probably shorten it a lot with functions, but i'm just really happy that it works. thanks a lot everyone for your help :D i would give reputation but it says i need to spread it around :'(
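
    On shortening it with functions, one possible sketch is below. The helper names split_name, extract_year and result_code are invented for illustration, not anything from the thread. Note two deliberate differences from the script above, both assumptions on my part: a name with three or more parts falls back to using the whole name for both fields instead of printing an error, and the year must be four digits rather than just four characters.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// Split "last, first" or "first last" into array(first, last).
// Anything else (a single handle, three or more parts) is used for both.
function split_name($name)
{
    if (strpos($name, ',') !== false) {
        // Same approach as the script above: drop spaces, then split on the comma.
        $parts = explode(',', str_replace(' ', '', $name));
        if (count($parts) == 2) {
            return array($parts[1], $parts[0]);
        }
    } else {
        $parts = preg_split('/\s+/', trim($name));
        if (count($parts) == 2) {
            return $parts;
        }
    }
    return array($name, $name);
}

// Pick the four-digit field out of a dotted date such as "1977.06.24".
function extract_year($date)
{
    foreach (explode('.', $date) as $part) {
        if (strlen($part) == 4 && ctype_digit($part)) {
            return $part;
        }
    }
    return $date; // no usable year, keep the tag's original text
}

// Map the result tag to the numeric codes used in the script above.
function result_code($outcome)
{
    $codes = array('0-1' => 0, '1-0' => 1, '1/2-1/2' => 2);
    return isset($codes[$outcome]) ? $codes[$outcome] : 3;
}

list($wFirst, $wLast) = split_name("o' donnell, robert");
list($bFirst, $bLast) = split_name('wall, brian');
echo "$wFirst $wLast vs $bFirst $bLast ",
    result_code('1/2-1/2'), ' ', extract_year('1977.06.24'), "\n";
```

    With helpers like these, the white and black branches collapse into two calls to the same function, so a fix to the name handling only has to be made once.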
     
    Last edited: Jun 9, 2009
