Tuesday, October 20, 2009

Some more PHP!

Today I had a small yet interesting chore to do in PHP...

Charsets again: I needed to find a specific UTF-8 character in any given string.

I don't play with UTF-8 characters that much, so this might be public knowledge (well, it is, but I did not know it).

OK, it turns out a UTF-8 character can take more than one byte (anywhere from 1 to 4), and the one I was after takes 3. So when you want to analyze it, you have to look at the 3 bytes separately to get a final answer.

So the UTF-8 character arrives at the PHP script URL-encoded like this: %E2%95%90
PHP decodes it, so inside the script the string contains the three bytes E2 95 90 :)
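
To see the decoding step described above, here is a minimal sketch. It decodes the same value by hand with rawurldecode(), as a stand-in for what PHP already does for $_GET, and the variable names are mine, just for illustration.

```php
<?php
// Decode the percent-encoded value by hand; PHP does this automatically for $_GET.
$raw = rawurldecode('%E2%95%90');

// strlen() counts bytes, not characters.
$byteCount = strlen($raw);

// bin2hex() shows the raw bytes as hexadecimal.
$hex = bin2hex($raw);

echo $byteCount, ' ', $hex, PHP_EOL; // prints "3 e29590"
```

So one visible character is three bytes as far as the plain string functions are concerned.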

I wanted to filter out characters like this one, so I converted the first byte to decimal:

E2 = 226

Then I ran this check against the variable:
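
The conversion above can be double-checked in PHP itself. This is only a small sketch with variable names I made up; hexdec(), chr() and ord() are standard PHP functions.

```php
<?php
// Hexadecimal string to decimal integer.
$firstByte = hexdec('E2');

// chr() builds a one-byte string from that value; "\xE2" is the same byte as a literal.
$sameByte = (chr($firstByte) === "\xE2");

// ord() goes the other way, from a byte back to its decimal value.
$backToDecimal = ord("\xE2");

echo $firstByte, ' ', var_export($sameByte, true), ' ', $backToDecimal, PHP_EOL; // prints "226 true 226"
```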

$val = $_GET['text'];
// chr(226) is the byte 0xE2, the first byte of the sequence E2 95 90
if (strpos($val, chr(226)) !== false) {
    // stuff
}
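
The check above only looks at the first byte. If you need exactly this character and nothing else, one possible variation is to search for all three bytes at once. This is a sketch, not what I actually ran: the function name is hypothetical, and the euro sign (bytes E2 82 AC) is just an example of another character that starts with the same byte.

```php
<?php
// Hypothetical helper: true only when $text contains the full sequence E2 95 90.
function containsExactSequence(string $text): bool
{
    return strpos($text, "\xE2\x95\x90") !== false;
}

$target = "a\xE2\x95\x90b"; // contains the character I was looking for
$euro   = "\xE2\x82\xAC";   // euro sign: also starts with E2

// The first-byte check matches both strings...
$firstByteHitsEuro = (strpos($euro, chr(226)) !== false);

// ...while the three-byte check matches only the target.
$exactHitsTarget = containsExactSequence($target);
$exactHitsEuro   = containsExactSequence($euro);
```

It is still a plain strpos() call, so it should stay about as cheap as the original check.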


So after looking at many UTF functions out there, this was the best option based on performance and whatnot...

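I don't think it would be valid for all types of UTF encoding: it only looks at the first byte, so it matches any character whose UTF-8 encoding starts with E2 (the range U+2000 to U+2FFF), not just the one I was looking for. There are UTF functions in PHP, but sometimes they are just too big to do something small.

I know I could have used E2 directly ("\xE2"), but I am the kind of nerd that likes to use the dec notation :)

Anyhow, that is useful.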
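
For comparison, this is roughly what one of the heavier, UTF-8 aware options looks like. It is only a sketch using preg_match() with the /u modifier; the function name is hypothetical, and I am not claiming it is faster or slower than the byte check.

```php
<?php
// Hypothetical helper: true when $text contains U+2550 as a whole character.
function hasCodePoint2550(string $text): bool
{
    // The /u modifier makes PCRE treat both the pattern and the subject as UTF-8.
    // Single quotes keep \x{2550} intact so PCRE, not PHP, interprets it.
    return preg_match('/\x{2550}/u', $text) === 1;
}

$found    = hasCodePoint2550("x\xE2\x95\x90"); // E2 95 90 is U+2550
$notFound = hasCodePoint2550("\xE2\x82\xAC");  // euro sign, a different character
```

One thing to keep in mind: with /u, preg_match() returns false on input that is not valid UTF-8, so this helper reports "not found" for broken input instead of matching on a stray byte.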

=)
