Why do I have partially blank or missing text in posts: Difference between revisions From Online Manual

(Categorized with UTF8)
(extended, and corrected statement that UTF-8 database would solve the problem)
[[Category:FAQ]]
[[Category:UTF-8 FAQ]]

Revision as of 16:39, 4 April 2012

When users compose posts in a word processor such as MS Word, some characters are automatically replaced with non-standard ones, and these do not display in a standard install of SMF. They are what Microsoft calls "Smart Quotes". Microsoft created its own character encoding ("code page") by taking Latin-1 (ISO-8859-1) and replacing a number of the reserved control codes in positions 0x80 through 0x9F with specialty characters (typographically proper quotes, dashes, the euro sign, etc.). The encoding is properly called CP-1252 (also known as Windows-1252). It is not fully compatible with either Latin-1 or UTF-8, because it assigns printable characters to those reserved control-code positions.
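The mismatch between the three encodings can be seen in a minimal Python sketch. The sample bytes and the helper name `is_valid_utf8` are illustrative assumptions, not anything taken from SMF; the sketch only shows how the same bytes in the 0x80 to 0x9F range are interpreted under each encoding.

```python
import unicodedata

# Hypothetical sample: bytes as Word might produce them in CP-1252.
# 0x93 / 0x94 are curly double quotes, 0x96 is an en dash, 0x80 is the euro sign.
raw = b"\x93Hello\x94 \x96 \x80 5"

# Decoded as CP-1252, the bytes become the intended typographic characters.
as_cp1252 = raw.decode("cp1252")

# Decoded as Latin-1, the same bytes land on reserved control codes.
as_latin1 = raw.decode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(repr(as_cp1252))                      # curly quotes, en dash, euro sign
print(unicodedata.category(as_latin1[0]))   # "Cc" means a control character
print(is_valid_utf8(raw))                   # the raw bytes are not valid UTF-8
```

Under these assumptions, only the CP-1252 interpretation yields printable characters; Latin-1 treats the bytes as control codes, and a UTF-8 decoder rejects them outright.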

If a plain text editor is used to write the posts instead of a word processor, this is usually not an issue. To fix the problem, the posts need to be edited so that they do not include characters outside a standard character set. Changing the database encoding to UTF-8, as some suggest, will not help: the "smart quotes" are not found at those positions in UTF-8, so they would still be invalid characters when a page is displayed in UTF-8 encoding. If you insist on using Word to type up a post (for example, because you also want to keep the text in a Word document), remember to disable "smart quotes" in Word so that it does not replace characters. Otherwise, you will have to edit the text manually once it has been pasted into SMF.
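For text that has already been typed in Word, the manual clean-up described above could be scripted before pasting. The following is a minimal Python sketch, not part of SMF: the function name `plain_text` and the choice of ASCII replacements are assumptions made for illustration, and the mapping covers only the most common "smart" characters.

```python
# Hypothetical mapping from common Word "smart" characters to ASCII equivalents.
SMART_TO_PLAIN = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SMART_TO_PLAIN)


def plain_text(text: str) -> str:
    """Replace common smart characters with plain ASCII equivalents."""
    return text.translate(_TABLE)


sample = "\u201cIt\u2019s fine\u201d \u2013 ok\u2026"
print(plain_text(sample))
```

Characters not listed in the mapping are left untouched, so the table would need extending for other CP-1252 specialty characters such as the euro sign.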


