My Markup Validator

nonsensep

New Member
Messages
39
Reaction score
0
Points
0
Hi! I would like to tell you about my first web tool. It's not really a markup validator, like the title says. It's just a PHP script that sends an HTTP request to the W3C Markup Validator and then outputs what it says about a given URI (as a PNG image). Here are some examples of using the validator on various websites:

Code:
<img src="http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://www.w3.org/" />
(Valid markup)
Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://www.youtube.com/
(Invalid markup)
Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://www.w3.org/&charset=iso-8859-1
(Tentatively valid markup)
Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://www.circuitcity.com/
(No <!DOCTYPE>)
Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://www.bestbuy.com/
(404 error, not a URI, etc.)

The image is just 176x31 pixels. As the URLs show, one script (markupvalid.php) serves every case; only the uri parameter changes. I hope you guys find it useful.

Any suggestions? Comments? Tell me so! I plan on making a better style sometime. But it works, and that's what counts. And feel free to use it however you like.
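
For anyone curious how a script like this can read the validator's verdict, here is a minimal sketch. It is not the actual markupvalid.php. It assumes the W3C validator reports its summary in X-W3C-Validator-Status, X-W3C-Validator-Errors and X-W3C-Validator-Warnings response headers (check the validator's API notes before relying on those names), and parseValidatorHeaders() is a made-up helper name.

```php
<?php
// Hypothetical helper: pulls the validator's summary out of raw response headers.
// The header names are an assumption about the W3C validator's response.
function parseValidatorHeaders(string $rawHeaders): array
{
    $result = ['status' => 'Unknown', 'errors' => 0, 'warnings' => 0];
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip the HTTP status line and blank lines
        }
        [$name, $value] = explode(':', $line, 2);
        $value = trim($value);
        switch (strtolower(trim($name))) {
            case 'x-w3c-validator-status':
                $result['status'] = $value;
                break;
            case 'x-w3c-validator-errors':
                $result['errors'] = (int) $value;
                break;
            case 'x-w3c-validator-warnings':
                $result['warnings'] = (int) $value;
                break;
        }
    }
    return $result;
}

// Made-up sample response, only for demonstration.
$sample = "HTTP/1.1 200 OK\r\n"
        . "X-W3C-Validator-Status: Invalid\r\n"
        . "X-W3C-Validator-Errors: 12\r\n"
        . "X-W3C-Validator-Warnings: 3\r\n";
print_r(parseValidatorHeaders($sample));
```

The returned array is what the script would then use to pick the text for the image.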
 

QuwenQ

Member
Messages
960
Reaction score
0
Points
16
That seems useful, except for the minor problem that someone already made a button like this:
valid-xhtml10

What's the advantage of using yours?
 

nonsensep

Yeah, but that image will always say "Valid XHTML 1.0", even when there are errors. My image is regenerated on every request and reports, straight from the W3C validator, whether the page is valid and how many errors and warnings it has. For example:

Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://forums.x10hosting.com/


Now let's see what W3C says about the same URI:

http://validator.w3.org/check?uri=http://forums.x10hosting.com/

Try using my image with your own URI. Just enter this in your address bar:

Code:
http://www.nonsensep.x10hosting.com/markupvalid.php?uri=http://YOUR_SITE_HERE/
 

Slothie

New Member
Messages
1,429
Reaction score
0
Points
0
Interesting concept :) Glad to know there are actually developers on x10...

Have you implemented caching yet? Parsing the site every time the image is loaded would be rather taxing on the server. It would also cause the image to load slower as it has to wait for the server to download and parse the page...

Good work though :D
 

nonsensep

Thank you, Slothie and eminemix! About caching: I see what you're saying, but I'm not sure whether that would defeat the purpose or not...

Well, do you think there's a way to have a session for the website and when the session ends the cache expires? That would be sort of a compromise, I guess. I'll look into that.

Oh, and I'm planning on making one for CSS, also.
 

Slothie

You could cache it on an hourly basis or so. Sessions would be too short :p
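
A time-based cache like that mostly comes down to one freshness check. Here is a rough sketch, assuming each stored result carries the Unix timestamp of when it was saved. isCacheFresh() and CACHE_TTL_SECONDS are made-up names, and the one-hour default is just the figure from this post.

```php
<?php
// Hypothetical cache lifetime: one hour, as suggested above.
const CACHE_TTL_SECONDS = 3600;

// Returns true when a result stored at $storedAt is still usable at $now.
function isCacheFresh(int $storedAt, int $now, int $ttl = CACHE_TTL_SECONDS): bool
{
    return ($now - $storedAt) < $ttl;
}

$now = time();
$storedAt = $now - 600;                        // pretend the result was saved 10 minutes ago
var_dump(isCacheFresh($storedAt, $now));       // fresh under the one-hour lifetime
var_dump(isCacheFresh($storedAt, $now, 300));  // stale under a five-minute lifetime
```

Only when the check fails would the script go back to the W3C validator. The lifetime is a parameter, so it can be made much shorter than an hour.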
 

nonsensep

I don't know, an hour seems too long. I don't think that there will be too much of a problem if the image takes a little while to load.
 

Slothie

No? Imagine a site with 10,000 viewers: that's 10,000 requests for the image, which means your script has to fetch and check that page that many times.

A slightly less CPU-intensive way of caching would be to hash the page content at that URL and compare it to the hash stored in the cache (possibly as the cache filename). This would save some CPU time, but you'd still be using a decent chunk of bandwidth.
 

Thewinator

New Member
Messages
256
Reaction score
0
Points
0
Well, it would be a problem for the site owner.
It's also a good reason not to implement it, so I suggest you work on it ;)
You could, though, store an MD5 checksum of the site and then compare it.
If it's the same, show the same image; otherwise hand it over to W3C.
 

nonsensep

What do you mean? How would I do this?
 

Slothie

Get the entire content of the site, since you're parsing it anyway. Then prepend or append the site URL.

Then $hash = md5($reallylongvariable);
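
Spelled out a little more, that could look like the sketch below. pageFingerprint() is a made-up name, and the newline separator between URL and content is an extra assumption, not something from the post.

```php
<?php
// Hypothetical helper: one fingerprint per (URL, page content) pair.
function pageFingerprint(string $url, string $content): string
{
    // The separator keeps "ab" + "c" from hashing the same as "a" + "bc".
    return md5($url . "\n" . $content);
}

// Example values only; in the real script $content would be the fetched page.
echo pageFingerprint('http://example.com/', '<html></html>'), "\n";
```

md5() returns a 32-character hex string, which is short enough to store next to the URL or to use as a cache filename.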
 

nonsensep

So each time my script runs...

1. It will get an MD5 checksum of the URI's contents.
2. It will look up the URI of the site in a database on my server, find the checksum associated with the URI, and compare the two.
3. If they are the same, it will look up in the database the variables that were stored along with the URI (number of errors, number of warnings, markup type, etc.) and then display an image based on those variables.
4. If not, it will query the W3C validator for that URI with fsockopen().
5. Then, after all that, it will store the URI, checksum, and variables in the database.



Sound good?
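
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the real script: a plain array stands in for the database table, the $validate callback stands in for the fsockopen() request to the W3C validator, and cachedValidation() is a made-up name.

```php
<?php
// $cache maps URI => ['hash' => ..., 'result' => ...]; it stands in for the database table.
// $validate is any callable that asks the validator about a URI and returns the stored variables.
function cachedValidation(string $uri, string $content, array &$cache, callable $validate): array
{
    $hash = md5($content);                                  // checksum of the page contents
    if (isset($cache[$uri]) && $cache[$uri]['hash'] === $hash) {
        return $cache[$uri]['result'];                      // unchanged page: reuse stored variables
    }
    $result = $validate($uri);                              // changed or new page: ask the validator
    $cache[$uri] = ['hash' => $hash, 'result' => $result];  // store URI, checksum, and variables
    return $result;
}

// Demo with a fake validator that only counts how often it is called.
$calls = 0;
$fakeValidator = function (string $uri) use (&$calls): array {
    $calls++;
    return ['errors' => 0, 'warnings' => 0];
};
$cache = [];
cachedValidation('http://example.com/', '<p>v1</p>', $cache, $fakeValidator);
cachedValidation('http://example.com/', '<p>v1</p>', $cache, $fakeValidator);
cachedValidation('http://example.com/', '<p>v2</p>', $cache, $fakeValidator);
echo "validator calls: $calls\n";
```

Note that the script still has to download the page on every request to compute the checksum, so this saves validator round trips, not bandwidth to the checked site.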
 

Slothie

There are tonnes of ways to optimize a script that does what yours does. That's one of 'em :p
 

nonsensep

Ok. So, basically, it's either bandwidth or database memory. At least the way I'm approaching it. I guess I could store them in an XML file instead of a database. Then it wouldn't be database memory.
 

Slothie

No. No. No.
That would be a bad idea. Databases are MEANT for storing information. It takes more effort to parse an XML file than to run a query against the database.

Basically it's either bandwidth or CPU consumption.

Less frequent checks would save you some bandwidth.
The hashing method we just discussed would save some CPU time, since you wouldn't reparse every page that has the button.

You might want to consider using the HTTP referer as the default value, so someone can just put the button on their site without a GET variable. That way they can track their subpages as well, instead of needing a separate image link for each page.
 

nonsensep

How do you accept the referer? I tried using $http_referer, but it didn't work and I don't know why.
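
A likely reason: a bare $http_referer variable only exists when PHP's old register_globals setting is on. The header is normally read from $_SERVER['HTTP_REFERER']. Here is a small sketch of the fallback suggested above, assuming the script takes its target from a uri GET parameter as in the earlier examples. resolveTargetUri() is a made-up name.

```php
<?php
// Hypothetical helper: prefer an explicit ?uri=..., fall back to the Referer header.
// Returns null when neither is available.
function resolveTargetUri(array $get, array $server): ?string
{
    if (!empty($get['uri'])) {
        return $get['uri'];
    }
    if (!empty($server['HTTP_REFERER'])) {
        return $server['HTTP_REFERER'];
    }
    return null;
}

// In the real script the superglobals would be passed in directly.
$uri = resolveTargetUri($_GET, $_SERVER);
var_dump($uri);
```

Browsers can omit or alter the Referer header, so the script should still handle the null case, for example by showing the "not a URI" image.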
 