The difference between urlencode and rawurlencode in PHP, and a few things to watch out for with URL encoding.

The PHP manual says this about urlencode:

Return Values

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the RFC 1738 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.

Put simply, urlencode encodes a space as '+', while rawurlencode encodes it as %20.
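A minimal sketch of that difference. The sample string is made up for illustration; only the space is treated differently by the two functions here.

```php
<?php
// Hypothetical sample input: one space plus a few reserved characters.
$s = 'a b&c=d/e';

$plus = urlencode($s);     // space becomes '+'
$raw  = rawurlencode($s);  // space becomes '%20'

echo $plus, "\n"; // a+b%26c%3Dd%2Fe
echo $raw, "\n";  // a%20b%26c%3Dd%2Fe

// Decoding mirrors this: urldecode() turns '+' back into a space,
// while rawurldecode() leaves '+' untouched.
$decodedPlus = urldecode('a+b');    // "a b"
$decodedRaw  = rawurldecode('a+b'); // "a+b"
```

So a string produced by urlencode should be decoded with urldecode, and one produced by rawurlencode with rawurldecode; mixing them can silently turn a literal '+' into a space.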

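I also just ran into something that cost me quite a bit of time: the result of encoding depends on the character encoding of the input. I was parsing a URL and used urldecode, which turned the string '%E6%88%90%E9%83%BD' into '成都'. But when I ran urlencode on '成都' myself, I got '%B3%C9%B6%BC'. It took me a while to realize this had to be an encoding issue. My output used 2 bytes per Chinese character, while the string I had decoded used 3 bytes per character, which is exactly the difference between UTF-8 and the ANSI encoding (on a Chinese-locale Windows that means GBK/GB2312, not ASCII). The cause was UltraEdit: the file has to be saved once via Save As before the Chinese characters are actually converted to UTF-8, or converted with the ANSI/ASC->UTF-8 function.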

After saving the file once more via Save As and choosing the UTF-8 without BOM encoding, urlencode('成都') correctly gives %E6%88%90%E9%83%BD.
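The effect can be reproduced without depending on how the editor saved the file. The sketch below writes '成都' as explicit bytes in both encodings; the variable names are only for illustration, and the conversion step at the end assumes the mbstring extension is installed.

```php
<?php
// '成都' written as explicit bytes, so the result does not depend on
// the encoding of this source file.
$utf8 = "\xE6\x88\x90\xE9\x83\xBD"; // UTF-8: 3 bytes per character
$gbk  = "\xB3\xC9\xB6\xBC";         // GBK:   2 bytes per character

// urlencode works on raw bytes and knows nothing about character sets,
// so the same text yields different output for each encoding.
$encodedUtf8 = urlencode($utf8);
$encodedGbk  = urlencode($gbk);

echo $encodedUtf8, "\n"; // %E6%88%90%E9%83%BD
echo $encodedGbk, "\n";  // %B3%C9%B6%BC

// Assumption: mbstring is available. If so, GBK bytes can be converted
// to UTF-8 explicitly before encoding instead of re-saving the file.
if (function_exists('mb_convert_encoding')) {
    echo urlencode(mb_convert_encoding($gbk, 'UTF-8', 'GBK')), "\n";
}
```

In other words, urlencode never converts between character sets; whatever bytes the string holds are what get percent-encoded.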