HTML Purifier

According to this thread, HTML Purifier does not support HTML5:
https://stackoverflow.com/questions/4566301/htmlpurifier-with-an-html5-doctype
That is one of the reasons we're having so much trouble with YouTube iframes. Our latest tweaks have got the basics loading, but the "allow" and "allowfullscreen" attributes are still stripped.
We all use HTML5 now, right?
There is this
https://github.com/xemlock/htmlpurifier-html5
Perhaps it's worth looking at using it?

Comments

  • I've pushed a change to the develop branch that lets you pass in your own config class in the purifier.php config. By default it uses the config class that ships with HTML Purifier. This was the easiest way to allow that kind of customization without requiring Composer with FUEL (it currently isn't needed anywhere else).

  • edited December 2020

    Nice.

    Composer isn't required though.

    We can use https://php-download.com to compile xemlock/htmlpurifier-html5 and dump the resulting "vendor" directory into ./application/vendor.

    Add require_once(APPPATH.'vendor/autoload.php'); at the top of ./config/purifier.php and select the new HTMLPurifier_HTML5Config config class.

    No doubt that could be moved under ./modules/fuel to make it a permanent fixture?

    MY_html_helper.php needs more sophisticated code to handle loading the config class though. I did this to get mine running:

    // Modified to include the library if it doesn't exist
    //require_once(FUEL_PATH.'libraries/HTMLPurifier/HTMLPurifier.standalone.php');
    //require_once(FUEL_PATH.'libraries/HTMLPurifier/HTML5Purifier/HTML5Config.php');
    

    And set 'HTML.Doctype' => 'HTML5' in the config.

  • Having dug into htmlpurifier-html5 a bit more, it looks like it's a bolt-on for HTML Purifier rather than a replacement: it basically adds the HTML5 doctype. I think I prefer the config I implemented above, since it lets the developer edit the supporting files to include tag attributes that aren't currently supported (<iframe allow="foo">, for example).

  • edited December 2020

    I'm thinking the fix is to build the htmlpurifier-html5 autoload bundle with php-download and place it in the fuel/modules/fuel/libraries folder as HTML5Purifier. Then, in the MY_html_helper file, we change line 181 to require_once(FUEL_PATH.'libraries/HTML5Purifier/vendor/autoload.php'); and set up the purifier.php config like so:

    $config['config_class'] = 'HTMLPurifier_HTML5Config';// For HTML 5 compatibility issues https://github.com/xemlock/htmlpurifier-html5 
    $config['settings'] = array(
        'default' => array(
            //'HTML.Trusted'             => TRUE, // For Javascript... must also add 'script' to HTML.Allowed
            //'HTML.SafeIframe'          => TRUE, // For iframes
            //'URI.SafeIframeRegexp'     => '%^(http://|https://|//)(www\.youtube\.com/embed/|player\.vimeo\.com/video/)%',
            'Attr.EnableID'            => TRUE,
            'Attr.AllowedFrameTargets' => array('_blank'),
            'HTML.Allowed'             => 'div[id],b,strong,i,em,a[href|title|target],ul,ol,li,p[style],br,span[style],img[width|height|alt|src]',
            //'CSS.AllowedProperties'    => 'font,font-size,font-weight,font-style,font-family,text-decoration,padding-left,color,background-color,text-align,float,margin',
            'AutoFormat.AutoParagraph' => FALSE, // This will cause errors if you globally apply this to input being saved to the database so we set it to false.
            'AutoFormat.RemoveEmpty'   => TRUE,
            'HTML.Doctype'             => 'HTML5'
        ),
        'comment' => array(
            'HTML.Doctype'             => 'XHTML 1.0 Strict',
            'HTML.Allowed'             => 'p,a[href|title|target],abbr[title],acronym[title],b,strong,blockquote[cite],code,em,i,strike',
            'CSS.AllowedProperties'    => 'font,font-size,font-weight,font-style,font-family,text-decoration,padding-left,color,background-color,text-align,float,margin',
            'AutoFormat.AutoParagraph' => TRUE, 
            'AutoFormat.Linkify'       => TRUE,
            'AutoFormat.RemoveEmpty'   => TRUE,
        ),
        'youtube' => array(
            'HTML.SafeIframe'          => TRUE,
            'URI.SafeIframeRegexp'     => '%^(http://|https://|//)(www\.youtube\.com/embed/|player\.vimeo\.com/video/)%',
        )
    );
    
  • Sounds like that would work. We could tidy up the config a bit too since I don't think the "comment" and "youtube" sections are used unless you manually assign one of them when you create the purifier object.

  • I've pushed an update for that in the develop branch. I left the comment and youtube configs in, just in case someone wants to use them with the html_purify() function (outside of saving to the database).
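
To sanity-check a URI.SafeIframeRegexp value like the one in the config above, a small sketch such as the following runs the pattern against a few sample src values with preg_match(). The helper name iframe_src_allowed() and the sample URLs are made up for illustration; HTML Purifier does its own matching internally. The sketch compares the pattern with unescaped dots against a variant with escaped dots, since an unescaped "." matches any character and lets a lookalike host through.

```php
<?php
// Pattern with unescaped dots: each "." matches any single character.
$loose = '%^(http://|https://|//)(www.youtube.com/embed/|player.vimeo.com/video/)%';
// Same pattern with the dots escaped so they only match a literal ".".
$strict = '%^(http://|https://|//)(www\.youtube\.com/embed/|player\.vimeo\.com/video/)%';

// Hypothetical helper: true when $src matches the whitelist pattern.
function iframe_src_allowed($src, $pattern)
{
    return preg_match($pattern, $src) === 1;
}

$samples = array(
    'https://www.youtube.com/embed/abc123',
    '//player.vimeo.com/video/12345',
    'https://wwwxyoutube.com/embed/abc123',           // lookalike host
    'https://example.com/?u=www.youtube.com/embed/',  // allowed host only in the query string
);

// Print how each pattern treats each sample URL.
foreach ($samples as $src) {
    printf(
        "%-50s loose=%s strict=%s\n",
        $src,
        var_export(iframe_src_allowed($src, $loose), true),
        var_export(iframe_src_allowed($src, $strict), true)
    );
}
```

Only the lookalike host is treated differently by the two patterns: the loose one accepts it and the strict one rejects it.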

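For the stripped "allow" and "allowfullscreen" attributes mentioned at the top of the thread, one possible approach is to register them on the iframe element through HTML Purifier's raw HTML definition. This is a minimal sketch against the stock HTMLPurifier_Config, not what FUEL itself does: the definition ID 'fuel-iframe-attrs' is a made-up name, the include path assumes the bundled standalone file and a defined FUEL_PATH, and whether it combines cleanly with HTMLPurifier_HTML5Config has not been checked here.

```php
<?php
// Assumes this runs inside FUEL, where FUEL_PATH is defined and the
// standalone HTML Purifier bundle lives at this path; adjust as needed.
require_once FUEL_PATH . 'libraries/HTMLPurifier/HTMLPurifier.standalone.php';

$config = HTMLPurifier_Config::createDefault();

// Allow iframes, but only from the whitelisted embed URLs.
$config->set('HTML.SafeIframe', true);
$config->set(
    'URI.SafeIframeRegexp',
    '%^(http://|https://|//)(www\.youtube\.com/embed/|player\.vimeo\.com/video/)%'
);

// A definition ID is required before the raw definition can be edited.
$config->set('HTML.DefinitionID', 'fuel-iframe-attrs'); // hypothetical ID
$config->set('HTML.DefinitionRev', 1);

// Definition caching is switched off to keep the sketch simple; a real
// setup would more likely keep the cache and bump DefinitionRev on changes.
$config->set('Cache.DefinitionImpl', null);

if ($def = $config->maybeGetRawHTMLDefinition()) {
    // Boolean attribute: kept when present, regardless of its value.
    $def->addAttribute('iframe', 'allowfullscreen', 'Bool#allowfullscreen');
    // 'Text' accepts any text, so the allow value itself is not validated.
    $def->addAttribute('iframe', 'allow', 'Text');
}

$purifier = new HTMLPurifier($config);
echo $purifier->purify(
    '<iframe src="https://www.youtube.com/embed/abc123" allow="fullscreen" allowfullscreen></iframe>'
);
```

If a profile also sets HTML.Allowed, the new attributes would need to be listed there as well (for example iframe[src|allow|allowfullscreen]), otherwise they are still filtered out.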