Search results

  1. learning_brain

    Grouping results from array by page number.

    Perfect - solved it now. I had two problems. First, array_slice can preserve indices, e.g. array_slice($array, $startRow, $pageRows, true); problem solved. My second was the stupid misunderstanding that the slice takes start/stop, when it actually takes start/length, as you suggested. The for...
  2. learning_brain

    Grouping results from array by page number.

    You all know the old problem... 1000s of results and 1 HTML page... This would be easily resolved if my results came direct from a MySQL result, using LIMIT, but I can't do that... My results are contrived from a full MySQL result, scoring each result's relevance to the search string/s and adding...
  3. learning_brain

    Improved Search Algorythm (Alternative to FULLTEXT)

    I knew you wouldn't let me down!! I'm also glad you understand the scope of my challenge. There are some things I need to clarify... As you may/may not remember, I have an image search engine and I am currently capturing 3 strings... crawled url, image url and keywords found in 'alt'. This...
  4. learning_brain

    Improved Search Algorythm (Alternative to FULLTEXT)

    I have an image search engine (not on this server) which is now crawling nicely, but the search function leaves much to be desired! Currently, I'm using the FULLTEXT MATCH AGAINST system, which works nicely but has two major limitations: 1) My host limits me to a 4 character search (all 3...
  5. learning_brain

    High Quality Image Search Engine

    @v4xde - Thanks for that. I understand your reasons for not creating a new window... I'll consider. I might just use target blank on the external url link instead of the view image page. Crawler is now fully automated and I'll be adding a link to the main page so visitors can actually see it...
  6. learning_brain

    High Quality Image Search Engine

    Yeah - I have done limited searches but most are fairly simple queries and produce random results. I'll have to dig deeper! Meanwhile, I'm refining my crawler. Currently I have two pages - one for scraping a hrefs and adding to an image pending queue and one to loop through that queue...
  7. learning_brain

    High Quality Image Search Engine

    a random image as background? hmmm.... how would I pass the image url to the css style sheet? interesting concept although how I'm going to blend this in with the practical 1024x768 common fixed space, I'm not sure yet. Making the results consume the whole window is also interesting and I...
  8. learning_brain

    High Quality Image Search Engine

    Thanks for all the feedback - looks like I might be on the right track. I also like a clean layout. I think a site (like Google) that presents simply but has massive functionality has more appeal to me personally than a busy, fussy site that is, in essence, pretty static. @Cybrax - you raise...
  9. learning_brain

    High Quality Image Search Engine

    I welcome any feedback on my new site. http://www.smartimagesearch.net76.net I only use large images and each image is visually checked for aesthetic quality and safety. I have generated an image crawler, which is still very busy in the background. Indexed images are now circa 2,000...
  10. learning_brain

    Filtering Dynamic URL's from URL scrape

    Absolutely! Thanks everyone for confirming what I already feared. I did think about comparing content with existing, but this is a huge drain on resources (I would think). descalzo - site maps! why didn't I think of that! As I can obtain the root address, it's also likely I can find the...
  11. learning_brain

    Clearing Dom Object

    Thanks marshian. Your locking idea is a good one - I'll check that out. The curl_getinfo presents a problem... I don't want to have to open a curl for every link and would prefer to check it before I start the curl object to preserve resources. Order: Loop through URL pending queue (limit...
  12. learning_brain

    Filtering Dynamic URL's from URL scrape

    Good job I'm not using their services anymore then :D I stopped using X10 months ago when they deleted a complete mysql db when they moved.
  13. learning_brain

    Filtering Dynamic URL's from URL scrape

    My Image crawler is now working but... The URL crawl picks up every URL link... which is fine on static pages but on dynamic pages, this can be a problem seeing as exactly the same page content can have a different URL. i.e...
  14. learning_brain

    Clearing Dom Object

    Thanks misson - helpful as always. After reviewing all of this (and there was a lot to understand and get my head around), I have made a few decisions... Firstly, I've got this sort of working. The trouble is, my url-to-crawl list grows exponentially and my host loses interest (just stops...
  15. learning_brain

    Clearing Dom Object

    That's a pretty good start... Thanks! Quick questions though... You mention a "list" of URLs to process. This isn't how I was going about it. Initially, the page was opening the submitted URL and scraping for a hrefs, then img src's. Then as part of that loop, it would start another loop...
  16. learning_brain

    MYSQL Highscores Databse

    Weirdly, most Google results are for Flash or other games... no good tuts I can readily find. However, a highscore table would be simple enough to do depending on what you want. For instance, you could add another column in your MySQL db with score. Then simply do a recordset using "ORDER BY...
  17. learning_brain

    Clearing Dom Object

    I have a site that crawls sites for images using CURL and parsing to DOM elements. This works great for single urls, but what I want to achieve is for a preliminary a->href search and then a loop to search through all href pages for images as well (1 deep). Ideally, I would like to extend this...
  18. learning_brain

    img src preg_match_all regex problem

    Thanks misson - that's cracked it - works like a dream.
  19. learning_brain

    img src preg_match_all regex problem

    @misson - doing most of your suggestions now. OK, my url host/directory isn't working for all urls - only the one I tested. #1,#2,#3 etc depend on # of directories/subdirectories so in this case, if the url is only the root, I don't get what I need. 2ndly, my abs/rel path test ain't working...
  20. learning_brain

    img src preg_match_all regex problem

    Thanks both of you. I took descalzo's advice with the extraction of the url path (concatenating url host/dir.file) and have now got the following; <?php //define url to search $url = $_POST['url']; //split url preg_match('/((http|https|ftp):\/\/)?((.*?)\/)?((.*)\/)?(.*)?/',$url, $urlParts)...