PEAK XOOPS - HTMLPurifier

PHP : HTMLPurifier
Poster : GIJOE on 2007-09-18 04:00:24 (12072 reads)

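If you enable a WYSIWYG editor, you basically have no choice but to accept the submitted data with HTML display allowed, and then script insertion becomes unavoidable.

I was thinking "if only there were a library that rebuilds the HTML for me" when kentauls told me about one.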
HTMLPurifier
http://htmlpurifier.org/

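I never really trusted libraries of this kind, but the XSS smoketest included in the archive is so impressive that I'm hopeful this one might actually work!

However, as far as I could tell from running it through its paces locally, this library is effectively PHP5-only. The site says it also works on PHP4, and it does get through without raising errors, but the conversion results are disastrous.

Under PHP5, on the other hand, I can't find any odd behavior. Even when fed EUC-JP, it properly converts the input to UTF-8 internally with iconv before processing it. Of course, the return value is EUC-JP as well.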
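
The encoding round trip described above can be sketched in plain PHP. This is only an illustration under assumptions, not HTML Purifier's actual implementation: `filter_via_utf8()` is a hypothetical helper, and `strip_tags()` merely stands in for the real filter (it is not an XSS sanitizer). The file is assumed to be saved as UTF-8 and the iconv extension to be available.

```php
<?php
// Sketch only: filter_via_utf8() is a hypothetical helper, not an HTML Purifier API.
// It mimics the round trip: decode to UTF-8, filter, re-encode to the original encoding.
function filter_via_utf8(string $input, string $encoding, callable $utf8Filter): string
{
    // Decode the legacy-encoded input into UTF-8 for processing.
    $utf8 = iconv($encoding, 'UTF-8', $input);
    if ($utf8 === false) {
        throw new InvalidArgumentException("Input is not valid {$encoding}");
    }

    // Run the filter, which only needs to understand UTF-8.
    $filtered = $utf8Filter($utf8);

    // Re-encode so the caller gets back the same encoding it passed in.
    $output = iconv('UTF-8', $encoding, $filtered);
    if ($output === false) {
        throw new RuntimeException("Filtered text cannot be represented in {$encoding}");
    }
    return $output;
}

// Build an EUC-JP test string from a UTF-8 literal.
$eucInput = iconv('UTF-8', 'EUC-JP', '<b>こんにちは</b>');

// strip_tags() is a stand-in for the real filter; it is NOT an XSS sanitizer.
$eucOutput = filter_via_utf8($eucInput, 'EUC-JP', 'strip_tags');

// Convert back to UTF-8 purely for display.
echo iconv('EUC-JP', 'UTF-8', $eucOutput), PHP_EOL;
```

The point of the design is that the caller never sees UTF-8: it hands in EUC-JP and gets EUC-JP back, which matches the behavior observed with HTMLPurifier under PHP5.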

As INSTALL also says, you should at least set the cache location and the encoding properly.


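// Load the library through its bundled stub include.
require_once dirname(__FILE__).'/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
// Cache directory; it must be writable by the web server.
$config->set('Cache', 'SerializerPath', dirname(__FILE__).'/cache');
// Encoding of both the input and the returned output.
$config->set('Core', 'Encoding', 'EUC-JP');
$purifier = new HTMLPurifier($config);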


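Personally, though, I still think HTML should not be allowed for the general public.
Perhaps the right way to use HTMLPurifier is to reserve it for situations where there is truly no alternative, and to pass the input through it to put your own mind at ease.

It's not that I distrust HTMLPurifier's capabilities. But I can't trust a certain browser that is widely used around the world. There is no guarantee that it won't suddenly gain some frightening new extension.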



Comments list

GIJOE  Posted on 2007/9/27 16:57
I've just implemented it.

Try d3forum-0.77!

kentauls  Posted on 2007/9/25 16:19
Hi GIJOE,

Yes, I can imagine.
I didn't think you had picked the wrong version of HTML Purifier.

Here I quote the lines written in the HTML Purifier archives to let many people know the concept.

Quote:
WYSIWYG - What You See Is What You Get
HTML Purifier: A Pretty Good Fit for TinyMCE and FCKeditor

Javascript-based WYSIWYG editors, simply stated, are quite amazing. But I've
always been wary about using them due to security issues: they handle the
client-side magic, but once you've been served a piping hot load of unfiltered
HTML, what should be done then? In some situations, you can serve it uncleaned,
since you only offer these facilities to trusted(?) authors.

Unfortunately, for blog comments and anonymous input, BBCode, Textile and
other markup languages still reign supreme. Put simply: filtering HTML is
hard work, and these WYSIWYG authors don't offer anything to alleviate that
trouble. Therein lies the solution:

HTML Purifier is perfect for filtering pure-HTML input from WYSIWYG editors.

Enough said.

I know nothing in the world is perfect, but I hope it is good enough for using FCKeditor on d3forum, for members rather than anonymous users!

GIJOE  Posted on 2007/9/22 17:39
Hi kentauls.

Quote:

I thought HTMLPurifier should work not only with PHP5 because I found two different zip files in the download section of their website (http://htmlpurifier.org/download.html); "HTML Purifier 2.1.2 PHP5-strict" and the other one not with PHP5-strict.

No. I used the latter, of course.
If we ran the "strict version" with PHP4, it would certainly raise fatal errors.
(PHP4 does not allow 'private' and similar keywords.)

Anyway, PHP4 will reach the end of its life soon, so this problem will become moot.

kentauls  Posted on 2007/9/21 7:11
Thank you so much. I'm really happy that HTMLPurifier has been included in Protector.
I've already installed the latest versions of Protector and Pico with the "HTMLPurifier" option turned on. It seems to be working successfully!

I thought HTMLPurifier should work not only with PHP5 because I found two different zip files in the download section of their website (http://htmlpurifier.org/download.html); "HTML Purifier 2.1.2 PHP5-strict" and the other one not with PHP5-strict.

suico  Posted on 2007/9/21 1:56
Thanks for the tip. I'll try this in my next module, yogurt, and report back here on how easy or difficult it was to use.