earn 100 credits for helping me (topic-php)

Status
Not open for further replies.

nahsorhseda

Member
Messages
116
Reaction score
0
Points
16
I want a PHP search script that finds only links, not page text, and that returns results only from the pages I specify.

I want this script because I have a lot of HTML files that link to different media files hosted on different hosts.

So I want a search that lets people find direct links to the files.
 

Slothie

New Member
Messages
1,429
Reaction score
0
Points
0
??????

That's not very specific. Search links from where?
 

mattura

Member
Messages
570
Reaction score
2
Points
18
I think you mean a regular expression to search raw HTML for links, something like this:

<?php
$rawhtml="<body><h1>Heading</h1><h3>The <a href='www.google.com'>link</a> goes to google</h3><h2>but <a href='nowhere.org' class='link'>this</a> doesn't</h2></body>"; //example
$pattern="|<a.*?>.*?</a>|";
if (preg_match_all($pattern, $rawhtml, $match)) { //if match:
//list matches or whatever: $match[0][0], $match[0][1] etc
}
?>
Edit:
----

Here is something a bit better:

<?php
$rawhtml='<body><h1>Heading</h1><h3>The <a href="http://www.google.com">link</a> goes to google</h3><h2>and <a class="link" href="http://second.com">this one</a> goes elsewhere!</h2><a id="id" class="heavy" href="http://www.large.com" name="what?" attr="nowt">big</a></body>'; //example
$pattern='!(<a(.*?)?href=("|\')(.*?)("|\')(.*?)>(.*?)</a>)!';

// Escape the raw HTML and the pattern so the browser displays them instead of rendering them.
echo "String:<br/><textarea name='raw' rows='5' cols='100'>".htmlspecialchars($rawhtml)."</textarea><br/>";
echo "Pattern:<textarea rows='1' cols='90'>".htmlspecialchars($pattern)."</textarea><br/>";

if (preg_match_all($pattern, $rawhtml, $match)) {
echo "Result:<br/>";
echo "<table border='1' cellpadding='3px'> <tr><td>Title</td><td>Link</td><td>Attributes</td></tr>";
// $match[0] holds one entry per link found, so loop over it rather than over the capture groups in $match.
foreach($match[0] as $k=>$v) {
echo "<tr><td>".$match[7][$k]."</td><td><a href='".$match[4][$k]."'>".$match[4][$k]."</a></td><td>".$match[2][$k].$match[6][$k]."</td></tr>";
}
echo "</table><br/>Array:<br/><textarea rows='17' cols='100'>";
echo htmlspecialchars(print_r($match, true));
echo "</textarea>";
} else {echo "No Match";}
?>

The above will take any HTML input and output only the link data in a nice little table, including other attributes if necessary. Try it out! Come on, you know that's worth the credits! It took me over an hour to come up with the regex pattern!
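A regex like the one above can trip over unusual markup (unquoted attributes, line breaks inside tags, nested tags in the link text). As a hedged alternative, here is a minimal sketch that uses PHP's bundled DOM extension instead. It assumes PHP 7 or later with the DOM extension enabled, and `extract_links()` is a hypothetical helper name, not something from this thread.

```php
<?php
// Sketch only: extract links with DOMDocument instead of a regex.
// extract_links() is a hypothetical helper name (an assumption, not from the thread).
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // Skip anchors that have no href (for example <a name="top">).
        if ($a->hasAttribute('href')) {
            $links[] = [
                'href' => $a->getAttribute('href'),
                'text' => trim($a->textContent),
            ];
        }
    }
    return $links;
}

// Example input, shortened from the sample string used above.
$rawhtml = '<body><h3>The <a href="http://www.google.com">link</a> goes to google</h3>'
         . '<a id="id" class="heavy" href="http://www.large.com">big</a></body>';
print_r(extract_links($rawhtml));
```

The trade-off is that the DOM parser tolerates messy HTML better than a hand-written pattern, while the regex has no extension dependency and also captures the other attributes as raw text.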
 

Slothie

Or you could use something like this
PHP:
$url = "http://www.example.net/somepage.html";
$input = @file_get_contents($url) or die("Could not access file: $url");
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach($matches as $match)
    {
        $links[] =  $match[2] // link addresses
        $linktext[] =  $match[3] //link text
    }
}

This will let you parse the links from a remote page and store the addresses in $links and the link text in $linktext.
 

Slothie

Or, if you need to parse links from multiple pages:

PHP:
$links = array();
$linktext = array();

$urls[] = "http://www.example.net/somepage.html";
$urls[] = "http://www.example.net/somepage1.html";
$urls[] = "http://www.example.net/somepage2.html";
$urls[] = "http://www.example.net/somepage3.html";
//continue this as much as you want :D

foreach($urls as $url) {
    $input = @file_get_contents($url) or die("Could not access file: $url");
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
        foreach($matches as $match)
        {
            $links[] =  $match[2] // link addresses
            $linktext[] =  $match[3] //link text
        }
    }
}
 

nahsorhseda

I liked Slothie's idea, since it lets me search many pages at once, but it is giving me a parse error:

Parse error: syntax error, unexpected T_VARIABLE in /home/hsedan/public_html/search/s1.php on line 18

Please fix it and I'll pay you 100 credits.

To be more specific: say I have two pages, page1.html and page2.html, with links to MP3 files. I want a search that can find the MP3 links.

But I have over 800 HTML pages with links to different media files, which is why I liked Slothie's idea.

If you can, please add an HTML form with a text input and a submit button, and please give the complete PHP code, i.e. starting with "<?php" and ending with "?>". It must also show 10 results per page, with next and previous links (25 credits extra for that).


Hope you get your 100 credits.
 

Slothie

PHP:
<?php
$links = array();
$linktext = array();

$urls[] = "http://www.example.net/somepage.html";
$urls[] = "http://www.example.net/somepage1.html";
$urls[] = "http://www.example.net/somepage2.html";
$urls[] = "http://www.example.net/somepage3.html";
//continue this as much as you want :D

foreach($urls as $url) {
    $input = @file_get_contents($url) or die("Could not access file: $url");
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
        foreach($matches as $match)
        {
            $links[] =  $match[2]; // link addresses
            $linktext[] =  $match[3]; //link text
        }
    }
}  

?>

I forgot to add the semicolons (;) after the two assignments inside the loop.

Pagination would be a bit tougher since this is a fairly simple script.
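For what it's worth, the "10 results per page with next and previous links" request can be sketched with `array_slice()` once all the links are in one array. This is only an illustration under assumptions: `paginate()` is a hypothetical helper, the sample URLs are made up, and PHP 7 or later is assumed.

```php
<?php
// Sketch only: slice a flat array of results into pages of 10.
// paginate() is a hypothetical helper name (an assumption, not from the thread).
function paginate(array $items, int $page, int $perPage = 10): array
{
    $totalPages = max(1, (int) ceil(count($items) / $perPage));
    // Clamp the requested page into the valid range 1..$totalPages.
    $page = min(max(1, $page), $totalPages);

    return [
        'items' => array_slice($items, ($page - 1) * $perPage, $perPage),
        'page'  => $page,
        'prev'  => $page > 1 ? $page - 1 : null,           // null means "no previous link"
        'next'  => $page < $totalPages ? $page + 1 : null, // null means "no next link"
    ];
}

// Made-up sample data standing in for the $links array built by the script above.
$links = [];
for ($i = 1; $i <= 25; $i++) {
    $links[] = "http://www.example.net/file$i.mp3";
}

print_r(paginate($links, 3));
```

On a real page the page number would probably come from something like `$_GET['page']`, and the `prev` and `next` values would be turned into links that carry the search term along.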
 

nahsorhseda

It's showing me a blank page. I did not get any errors.
I really don't know PHP very well, so please make me an HTML form, or at least tell me what the name of the input should be. If you build me the simple HTML form, the 125 credits are all yours.
 

Slothie

PHP:
<?php
$links = array();
$linktext = array();

$urls[] = "http://www.example.net/somepage.html";
$urls[] = "http://www.example.net/somepage1.html";
$urls[] = "http://www.example.net/somepage2.html";
$urls[] = "http://www.example.net/somepage3.html";
//^-- modify these to fit your sites

//continue this as much as you want :D

foreach($urls as $url) {
    $input = @file_get_contents($url) or die("Could not access file: $url");
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
        foreach($matches as $match)
        {
            $links[] =  $match[2]; // link addresses
            $linktext[] =  $match[3]; //link text
        }
    }
}  


?> 

<pre>
<?php
print_r($links);
?>
</pre>

It should print out all your links; you'll have to manipulate them yourself. If anyone else wants to extend this code, feel free to do so.
 

knightcon

New Member
Messages
69
Reaction score
0
Points
0
If you want, you could also try using Google's AJAX search API for your site. Google would spider your site regularly and build a search index, so when a user enters a search string on your site, the Google servers return the most likely results from your pages. You may find this option better than scanning all the HTML pages every time a search is executed: imagine 100 or 1,000 people on your site all running searches at once, and you would wind up with serious overhead.

If you don't want to go down the external API path, you would probably be better off using MySQL: create an index of your site's links and have the search engine query that instead. A PHP script scheduled as a cron job can then rebuild the index at regular intervals (such as every night at midnight), when your server is expected to have the least load.
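The "build an index ahead of time" idea can be sketched roughly as follows. To keep the example self-contained, a flat JSON file stands in for the MySQL table suggested above, so this is a swapped-in storage choice and not the MySQL setup itself. The helper names `build_index()` and `load_index()`, the file name `link_index.json`, and the sample pages are all assumptions; the regex is the one Slothie posted earlier in the thread.

```php
<?php
// Sketch only: build a link index once (e.g. from a cron job) and read it at search time.
// build_index() and load_index() are hypothetical helpers; a JSON file replaces MySQL here.
function build_index(array $pages, string $indexFile): int
{
    // Same link pattern as posted earlier in the thread.
    $regexp = '<a\s[^>]*href=("??)([^" >]*?)\1[^>]*>(.*)<\/a>';
    $index = [];
    foreach ($pages as $page => $html) {
        if (preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER)) {
            foreach ($matches as $match) {
                $index[] = [
                    'page' => $page,                  // where the link was found
                    'href' => $match[2],              // link address
                    'text' => strip_tags($match[3]),  // link text without inner tags
                ];
            }
        }
    }
    file_put_contents($indexFile, json_encode($index));
    return count($index);
}

function load_index(string $indexFile): array
{
    if (!is_file($indexFile)) {
        return [];
    }
    $data = json_decode(file_get_contents($indexFile), true);
    return is_array($data) ? $data : [];
}

// Made-up pages; a real cron script would read each HTML file with file_get_contents().
$pages = [
    'page1.html' => '<a href="http://www.example.net/song.mp3">Song</a>',
    'page2.html' => '<p><a href=http://www.example.net/clip.mp3>Clip</a></p>',
];
$indexFile = sys_get_temp_dir() . '/link_index.json';
echo build_index($pages, $indexFile), " links indexed\n";
print_r(load_index($indexFile));
```

The point of the design is that the expensive step (fetching and parsing 800 pages) runs once per rebuild, while each visitor's search only reads the prepared index.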
 

nahsorhseda

Dude, you did not understand what I wanted. Your script is showing only the link text and not the links.

I'll explain in detail.
Consider that I have a page abc.html with these links:
< a href=abc.com/fgjk.exe>fgkh</a>
< a href=abc.com/fgjk.exe>car</a>
< a href=abc.com/fgjk.exe>bike</a>
< a href=abc.com/fgjk.exe>land</a>
< a href=abc.com/fgjk.exe>water</a>

I have a second page, search.html, with a text input box and a submit button.
If someone types, for example, the word "car" and hits the submit button, they should find the link "car" from abc.html as a clickable link, not as plain text. All I want is a search.
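The search described above could be sketched like this: take the `$links` and `$linktext` arrays that the extraction script fills, keep the entries whose link text contains the search term, and print them as clickable links. This is a rough sketch, not a finished script. `search_links()` and `render_results()` are hypothetical helpers, the form field name `q` is an assumption, and the sample data is hard-coded from the abc.html example instead of being read from real pages.

```php
<?php
// Sketch only: search link text and output matching entries as clickable links.
// search_links() and render_results() are hypothetical helpers (not from the thread).
function search_links(array $links, array $linktext, string $query): array
{
    $results = [];
    if ($query === '') {
        return $results; // an empty search matches nothing
    }
    foreach ($links as $i => $href) {
        $text = strip_tags($linktext[$i] ?? '');
        // Case-insensitive substring match on the link text.
        if (stripos($text, $query) !== false) {
            $results[] = ['href' => $href, 'text' => $text];
        }
    }
    return $results;
}

function render_results(array $results): string
{
    $html = '';
    foreach ($results as $r) {
        $html .= '<a href="' . htmlspecialchars($r['href'], ENT_QUOTES) . '">'
               . htmlspecialchars($r['text'], ENT_QUOTES) . "</a><br/>\n";
    }
    return $html;
}

// Sample data mirroring abc.html; normally filled by the extraction loop.
$links    = ['abc.com/fgjk.exe', 'abc.com/fgjk.exe', 'abc.com/fgjk.exe', 'abc.com/fgjk.exe', 'abc.com/fgjk.exe'];
$linktext = ['fgkh', 'car', 'bike', 'land', 'water'];

// "q" is an assumed field name; falls back to "car" when no form was submitted.
$query = (isset($_GET['q']) && is_string($_GET['q'])) ? trim($_GET['q']) : 'car';

echo '<form method="get"><input type="text" name="q" value="'
   . htmlspecialchars($query, ENT_QUOTES) . '"/> <input type="submit" value="Search"/></form>' . "\n";
echo render_results(search_links($links, $linktext, $query));
```

Matching is done on the link text only, so a search for "car" returns the "car" link even though all five sample links share the same address.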
 

Slothie

Then you should have been more specific in your first post. Perhaps someone else will take him up on his offer. It shouldn't be too hard to modify the script, but I don't have the time at the moment.
 

Slothie

Not that desperately, perhaps :p I would do it if I had free time, but for a script that complex, the credits aren't worth it :(
 