Why do I have partially blank or missing text in posts: Difference between revisions From Online Manual

(extended, and corrected statement that UTF-8 database would solve the problem)
(11 intermediate revisions by 4 users not shown)
When users write up posts in ''word processors'' such as MS Word, some characters may be converted to non-standard characters that do not display in a standard install of SMF. Microsoft calls these characters "Smart Quotes". MS created their own character encoding ("code page"), taking Latin-1 (ISO-8859-1) and replacing a number of reserved control codes in positions x80 through x9F with various specialty characters (typographically proper quotes, dashes, euro sign, etc.). The encoding is properly referred to as CP-1252 (Windows-1252), and is not fully compatible with either Latin-1 or UTF-8, since it uses the positions reserved for control codes.
There are two reasons which explain why this problem occurs.


If a proper text editor is used to write the posts, and not a word processor, most of the time this would not be an issue. In order to fix this issue, the posts need to be edited to ''not'' include characters that are not in a standard character set. It will not do any good, as some suggest, to change the database encoding to UTF-8. These "smart quotes" are not found in that position in UTF-8 encoding. They would still be invalid characters when a page is displayed in UTF-8 encoding. If you insist on using Word to type up a post (i.e., you also want to preserve the text in a Word document), remember to disable "smart quotes" in Word so it won't replace characters. If you don't, you'll have to manually edit the text once it's been pasted into SMF.
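The byte-level mismatch can be illustrated with a minimal Python sketch. The sample bytes are an assumption chosen for illustration (the word "Hi" wrapped in CP-1252 curly quotes), not data taken from SMF.

```python
# Illustrative bytes only: "Hi" wrapped in CP-1252 curly quotes (x93 and x94).
raw = b"\x93Hi\x94"

# Read as CP-1252, the bytes are the typographic quotes U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# Read as Latin-1, the same bytes are invisible control characters.
as_latin1 = raw.decode("latin-1")

# Read as UTF-8, the bytes are not a valid sequence at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same quote character does exist in UTF-8, but as a different byte sequence.
utf8_bytes = "\u201c".encode("utf-8")
```

The last line shows that the curly quote itself can be stored in UTF-8; only the raw CP-1252 byte value is invalid there.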
==Using UTF-8 characters in a non-UTF-8 forum==
If your members are using UTF-8 characters, perhaps copied into their posts from an editing program, but you did not use UTF-8 when you set up the forum, these UTF-8 characters (special characters like £) will not show up properly in posts.
 
There are two solutions. You can either remove the "special" UTF-8 characters from the post or [[UTF-8 Readme|convert your forum to UTF-8]].
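As a rough aid for the first solution, the following Python sketch lists the characters in a piece of text that cannot be stored in a given character set. The function name <code>unsupported_chars</code> is hypothetical, and the default of ISO-8859-1 is an assumption about the forum's encoding; neither is part of SMF.

```python
def unsupported_chars(text, charset="iso-8859-1"):
    """Return the distinct characters of text that charset cannot encode.

    Hypothetical helper; the default charset is an assumed forum encoding.
    """
    found = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            # Keep first-seen order and skip characters already reported.
            if ch not in found:
                found.append(ch)
    return found


# Sample post: euro sign, en dash and curly quotes, written as escapes.
sample = "caf\u00e9 costs \u20ac5 \u2013 \u201cok\u201d"
problems = unsupported_chars(sample)
```

Running the same check with <code>charset="utf-8"</code> returns an empty list, which is why converting the forum to UTF-8 removes the need to strip these characters.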
 
===MS Word "smart quotes"===
MS Word uses non-standard characters called "Smart Quotes". The character encoding involved is Windows-1252 (also called CP-1252), which uses the positions x80 up to and including x9F, reserved for control codes in ISO-8859-1, to represent some special characters (typographically proper quotes, dashes, euro sign, and others). These byte values are not valid in the ISO-8859-1 or UTF-8 character encodings, and should not be used in SMF forums, which do not support Windows-1252 characters. To avoid the problem, members may disable "smart quotes" in Word, and manually remove the non-standard characters wherever they appear on the forum.
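Where text has already been typed with "smart quotes" enabled, the manual clean-up can be approximated before pasting. This is a minimal Python sketch, not an SMF feature: the name <code>plain_punctuation</code> and the choice of replacements are assumptions, and the table covers only a few common characters.

```python
# Assumed mapping from common Word "smart" characters to plain ASCII.
REPLACEMENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}
_TABLE = str.maketrans(REPLACEMENTS)


def plain_punctuation(text):
    """Replace the characters listed above with ASCII equivalents."""
    return text.translate(_TABLE)


cleaned = plain_punctuation("\u201cIt\u2019s here\u2026\u201d \u2014 Bob")
```

Characters outside the table, such as the euro sign, pass through unchanged and would still need to be removed by hand.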


[[Category:FAQ]]
[[Category:UTF-8 FAQ]]

Latest revision as of 15:57, 31 May 2016
