Embedding videos in a user submission site

sikuneh

I am making a user submission site where people can post anything they want. I want them to be able to post videos, but I also want to be safe from XSS. Currently I just use the strip_tags() function in PHP, but that removes the video tags.

What would be a good way to safely allow embedded videos?
 

cybrax

Good Question...

It's basically a two-stage regex problem.

1: Identify which site is providing the video, since each provider uses a different embed code. Either let the user pick the provider from a drop-down box, or search the submitted code for a keyword such as google, youtube or veehd using preg_match or strpos.

See the examples below.

YouTube-
Code:
  <object width="560" height="349">
    <param name="movie" value="http://www.youtube-nocookie.com/v/KN5wo-NYLwU?version=3&amp;hl=en_GB&amp;rel=0"></param>
    <param name="allowFullScreen" value="true"></param>
    <param name="allowscriptaccess" value="always"></param>
    <embed src="http://www.youtube-nocookie.com/v/KN5wo-NYLwU?version=3&amp;hl=en_GB&amp;rel=0" type="application/x-shockwave-flash" width="560" height="349" allowscriptaccess="always" allowfullscreen="true"></embed>
  </object>

Google Video-
Code:
<embed id=VideoPlayback src=http://video.google.com/googleplayer.swf?docid=428349265473317718&hl=en&fs=true style=width:400px;height:326px allowFullScreen=true allowScriptAccess=always type=application/x-shockwave-flash> </embed>

2: Once you know the provider, a second regex using preg_match can extract the video ID for that provider: KN5wo-NYLwU in the YouTube example and 428349265473317718 in the Google one.

Once you have the ID, insert it into an embed string for that provider that you keep on your web server, and write the result into the page. Nothing else from the submitted code is used.

Doing it this way also means every submitted video is the same size, so it can't mess up the page layout. It also gives you control over which sites users can embed from. Without that, your site could end up serving pornography or pirated movies.
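
Here is a minimal sketch of the two stages in PHP. The provider list, the patterns, the embed templates and the function name safe_embed() are all assumptions for illustration, not official formats. The regex is only used to pull out an ID. The submitted markup itself is thrown away, so only the validated ID ever reaches the page.

```php
<?php
// Hypothetical whitelist: each provider has a pattern that captures the
// video ID and a stored embed template. Both are illustrative assumptions.
const PROVIDERS = [
    'youtube' => [
        'pattern'  => '~youtube(?:-nocookie)?\.com/v/([A-Za-z0-9_-]{11})~',
        'template' => '<embed src="http://www.youtube-nocookie.com/v/%s?version=3&amp;rel=0" type="application/x-shockwave-flash" width="560" height="349" allowfullscreen="true"></embed>',
    ],
    'google' => [
        'pattern'  => '~video\.google\.com/googleplayer\.swf\?docid=(-?[0-9]+)~',
        'template' => '<embed src="http://video.google.com/googleplayer.swf?docid=%s&amp;hl=en&amp;fs=true" type="application/x-shockwave-flash" width="560" height="349" allowfullscreen="true"></embed>',
    ],
];

// Returns our own embed markup for a recognised provider, or null if the
// submitted code does not match any whitelisted provider.
function safe_embed(string $submitted): ?string
{
    foreach (PROVIDERS as $provider) {
        // Stages 1 and 2 combined: the pattern identifies the provider
        // and captures the video ID in one match.
        if (preg_match($provider['pattern'], $submitted, $m)) {
            // Only the captured ID goes into the stored template.
            return sprintf($provider['template'], htmlspecialchars($m[1], ENT_QUOTES));
        }
    }
    return null;
}
```

Anything that returns null can be rejected or shown as plain escaped text.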
 

misson

strip_tags takes an optional tag whitelist, but don't be tempted. strip_tags doesn't validate the HTML, nor does it let you strip attributes. The former means a malicious poster (or a messy writer) can break your page structure. The latter opens your site up to injection through attributes such as event handlers.
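
A quick way to see both problems for yourself (the input strings are made-up examples):

```php
<?php
// Whitelisting <b> keeps the whole tag, including the event handler attribute.
$withHandler = strip_tags('<b onmouseover="alert(1)">hover me</b>', '<b>');

// Unclosed tags are passed through unchanged, so the rest of the page
// after this post would be rendered bold.
$unclosed = strip_tags('<b>never closed', '<b>');

echo $withHandler, "\n"; // <b onmouseover="alert(1)">hover me</b>
echo $unclosed, "\n";    // <b>never closed
```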

You can't parse HTML with regular expressions, as it's not a regular language. The edge cases (also known as "security holes") could be used to defeat your filter. HTML is a context-free language, so you need a parser that's equivalent to a context-free grammar. With recursive regexes, you might be able to write a pattern that matches well-formed HTML, but it wouldn't parse it and so couldn't correct it. It wouldn't help you with marginal or ill-formed HTML, and it would also be overly complex and nigh unreadable.

The only way of safely allowing some HTML content is to use a parsing filter. HTML Purifier is a popular one. Whatever you use, keep an eye on your site for the first few weeks in case it gets suspended for high resource usage.

Another option is to use a non-HTML markup language such as BBCode or Markdown. You don't have to worry about injection because you can filter out (or escape, using e.g. htmlspecialchars) any HTML.
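
As a rough sketch of the non-HTML markup route, assuming a made-up [youtube]ID[/youtube] tag (the tag name, the function name render_post() and the embed markup are illustrative choices, not part of any BBCode standard): escape everything first, then expand only the tags you support.

```php
<?php
// Hypothetical renderer for a single BBCode-style tag.
function render_post(string $raw): string
{
    // Escape everything first, so no user-supplied HTML survives.
    $safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

    // Then expand the one supported tag, accepting only an 11-character ID
    // made of letters, digits, underscores and hyphens.
    $html = preg_replace(
        '~\[youtube\]([A-Za-z0-9_-]{11})\[/youtube\]~',
        '<embed src="http://www.youtube-nocookie.com/v/$1?version=3&amp;rel=0" type="application/x-shockwave-flash" width="560" height="349"></embed>',
        $safe
    );

    // preg_replace returns null on error; fall back to the escaped text.
    return $html ?? $safe;
}
```

A post like `<script>x</script> [youtube]KN5wo-NYLwU[/youtube]` comes out with the script tag escaped and the video tag replaced by your own embed markup.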
 