
HTML Parse Special Chars In Android

I have a simple problem: when I retrieve the text of an email, Html.fromHtml sometimes fails to parse the string correctly. I'll give you an example. This is the HTML strin

Solution 1:

In this case the string contains hidden characters, U+202C (POP DIRECTIONAL FORMATTING) and U+202A (LEFT-TO-RIGHT EMBEDDING). You can filter them out with:

myString = myString.replaceAll( "[\\u202C\\u202A]", "" );

After that it's just:

Html.fromHtml(myString);

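The result then renders correctly in an HTML context. If you want the actual em dash characters, decode twice. Html.fromHtml returns a Spanned rather than a String, so convert the first result back to a String before the second pass:

Html.fromHtml(Html.fromHtml(myString).toString());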

Demo of the concept: http://jsfiddle.net/CGzDc/ (JavaScript; for Java, use the code in this answer)
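The filtering step can be tried outside Android with plain Java. This is only a minimal sketch: the class name HiddenCharFilter, the method stripBidiMarks, and the sample input are hypothetical, since the original mail text is not shown in the question.

```java
final class HiddenCharFilter {
    // Removes U+202A (LEFT-TO-RIGHT EMBEDDING) and U+202C (POP DIRECTIONAL FORMATTING),
    // the two invisible characters named in this answer.
    static String stripBidiMarks(String s) {
        return s.replaceAll("[\\u202A\\u202C]", "");
    }

    public static void main(String[] args) {
        // Hypothetical input: a character reference interrupted by invisible characters.
        String raw = "&\u202A#8212;\u202C";
        String clean = stripBidiMarks(raw);
        System.out.println(raw.length() + " -> " + clean.length()); // prints "9 -> 7"
    }
}
```

On Android, the cleaned string would then be passed to Html.fromHtml as shown above.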


Solution 2:

The string in your example is HTML notation for the literal text of the character references themselves (something like &#8212;&#8212;&#8212;, ampersand and digits included), so the correct browser behavior is to render it that way. For some reason that cannot be guessed from the description, some software has applied double encoding: it first encoded the em dash “—” as a numeric character reference (&#8212;) and then encoded the & again, as &amp;.
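The effect of double encoding can be illustrated with a small plain-Java decoder. This is a sketch under stated assumptions, not what Html.fromHtml or any browser does internally: it handles only decimal numeric references and &amp;, the names DoubleEncodingDemo and decodeOnce are hypothetical, and &amp;#8212; is an assumed sample input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DoubleEncodingDemo {
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    // One decoding pass: decimal numeric references first, then &amp;.
    // Anything else (named entities, hex references) is left untouched.
    static String decodeOnce(String s) {
        Matcher m = DECIMAL_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1));
            // Keep the reference as-is if it is not a valid Unicode code point.
            String decoded = Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString().replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String doubleEncoded = "&amp;#8212;";           // assumed sample input
        String once = decodeOnce(doubleEncoded);        // literal text "&#8212;"
        String twice = decodeOnce(once);                // the em dash itself
        System.out.println(once + " / " + twice);
    }
}
```

A single pass yields the literal reference text, which is what a browser correctly displays; only a second pass produces the dash, which is why the first answer calls Html.fromHtml twice.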

By the way, a sequence of consecutive em dashes may or may not produce a continuous line; this depends on the font. There are more reliable ways of producing long lines, such as the <hr> element and border properties in CSS.

