Why do I have partially blank or missing text in posts: Difference between revisions From Online Manual

(Undo revision 11747 by MrPhil (talk))
(There are two reasons for this -- UTF-8 in non-UTF8 forums, and non-standard chars in any forum)
When users write up posts in ''word processors'' such as MS Word there are times that some characters get converted to non-standard characters. These characters don't show when using a standard install of SMF. These characters are what Microsoft calls "Smart Quotes". MS created their own character encoding ("code page"), taking Latin-1 (ISO-8859-1) and replacing a number of reserved control codes in positions x80 through x9F with various specialty characters (typographically proper quotes, dashes, euro sign, etc.). The encoding is properly referred to as CP-1252, and is not fully compatible with either Latin-1 or UTF-8, since it uses reserved control codes.
There are a couple of reasons for this problem:


If a proper text editor is used to write the posts, and not a word processor, most of the time this would not be an issue. In order to fix this issue, the posts need to be edited to ''not'' include characters that are not in a standard character set. It will not do any good, as some suggest, to change the database encoding to UTF-8. These "smart quotes" are not found in that position in UTF-8 encoding. They would still be invalid characters when a page is displayed in UTF-8 encoding. If you insist on using Word to type up a post (i.e., you also want to preserve the text in a Word document), remember to disable "smart quotes" in Word so it won't replace characters. If you don't, you'll have to manually edit the text once it's been pasted into SMF.
==Using UTF-8 characters in a non-UTF-8 forum==
If your users are using UTF-8 characters, perhaps copied in from some editing program, but you did not use UTF-8 when you set up the forum, these UTF-8 characters (special characters like £) will not show up properly in your forum.


Some users have found that converting their database to UTF-8 may help resolve this issue.
There are two solutions -- either remove the "special" UTF-8 characters from the post, or convert your forum to UTF-8.
 
==Using non-standard characters in an SMF forum==
Some word processing programs replace standard characters with non-standard characters. These non-standard characters cannot be represented properly by your forum, whether you use Latin-1 or UTF-8. The only solution is to edit the post and replace the non-standard characters with standard characters compatible with your forum.
 
===MS Word "smart quotes"===
MS Word is an example of a word processor using non-standard characters. Microsoft calls them "Smart Quotes". The character encoding is CP-1252, which uses reserved control codes in positions x80 through x9F to represent some specialty characters (typographically proper quotes, dashes, euro sign, etc.). The CP-1252 codes for these characters are not compatible with the Latin-1 or UTF-8 character encodings, and should not be used in SMF forums, which do not support CP-1252 characters. To avoid the problem, users may disable "smart quotes" in Word, and manually replace the non-standard characters wherever they already appear on the forum.


[[Category:FAQ]]
[[Category:UTF-8 FAQ]]

Revision as of 14:42, 12 April 2012

There are a couple of reasons for this problem:

Using UTF-8 characters in a non-UTF-8 forum

If your users are using UTF-8 characters, perhaps copied in from some editing program, but you did not use UTF-8 when you set up the forum, these UTF-8 characters (special characters like £) will not show up properly in your forum.

There are two solutions -- either remove the "special" UTF-8 characters from the post, or convert your forum to UTF-8.
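
The mismatch described above can be reproduced outside the forum. The following is a minimal Python sketch, not SMF code: the helper names `show_as_latin1` and `fits_latin1` are made up for illustration, and it only assumes that the text arrives as UTF-8 bytes while the page is read as Latin-1 (ISO-8859-1).

```python
def show_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def fits_latin1(text: str) -> bool:
    """Return True if every character in text exists in Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# The pound sign is two bytes in UTF-8 (C2 A3); read as Latin-1,
# each byte becomes its own character.
print(show_as_latin1("£"))  # Â£

# Curly quotes have no Latin-1 code at all.
print(fits_latin1("\u201cquoted\u201d"))  # False
```

A check like `fits_latin1` could be used to find posts that need editing before deciding whether a conversion to UTF-8 is worth the effort.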

Using non-standard characters in an SMF forum

Some word processing programs replace standard characters with non-standard characters. These non-standard characters cannot be represented properly by your forum, whether you use Latin-1 or UTF-8. The only solution is to edit the post and replace the non-standard characters with standard characters compatible with your forum.

MS Word "smart quotes"

MS Word is an example of a word processor using non-standard characters. Microsoft calls them "Smart Quotes". The character encoding is CP-1252, which uses reserved control codes in positions x80 through x9F to represent some specialty characters (typographically proper quotes, dashes, euro sign, etc.). The CP-1252 codes for these characters are not compatible with the Latin-1 or UTF-8 character encodings, and should not be used in SMF forums, which do not support CP-1252 characters. To avoid the problem, users may disable "smart quotes" in Word, and manually replace the non-standard characters wherever they already appear on the forum.
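
Replacing the characters by hand is tedious for long posts. As a rough illustration of the kind of replacement involved, here is a small Python sketch that swaps common Word punctuation for plain equivalents before the text is pasted into the forum. The table `PLAIN_EQUIVALENTS` and the function `plain_punctuation` are hypothetical names chosen for this example, the list of characters is not complete, and nothing here is part of SMF.

```python
# Hypothetical mapping from Word's typographic punctuation to plain
# characters; extend it to suit the text your users paste.
PLAIN_EQUIVALENTS = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # ellipsis
})


def plain_punctuation(text: str) -> str:
    """Replace typographic punctuation with plain equivalents."""
    return text.translate(PLAIN_EQUIVALENTS)


# Bytes x93, x94, x96, x92 and x85 are CP-1252 codes for curly quotes,
# an en dash, an apostrophe and an ellipsis.
pasted = b"\x93Hello\x94 \x96 it\x92s here\x85".decode("cp1252")
print(plain_punctuation(pasted))  # "Hello" - it's here...
```

The sketch decodes the bytes as CP-1252 first so the special characters are recognised, then substitutes them; the result contains only characters that exist in both Latin-1 and UTF-8.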


