Reply to topic  [ 7 posts ] 
General Error SQL ERROR on special unicode character 
Author Message
User avatar

Joined: Thu Aug 27, 2015 10:43 am
Posts: 177
Location: Saint-Petersburg, Russia
Post General Error SQL ERROR on special unicode character
Code:
General Error
SQL ERROR [ mysql4 ]

Incorrect string value: '\xF0\x9F\x8F\xAF -...' for column 'post_text' at row 1 [1366]

An SQL error occurred while fetching this page. Please contact the Board Administrator if this problem persists.


Get this error when tied to post message with this character https://www.compart.com/en/unicode/U+1F3EF.


Thu Aug 30, 2018 2:04 pm
Profile
Site Admin
User avatar

Joined: Fri Mar 04, 2011 9:19 pm
Posts: 3021
Location: San Jose, CA
Post Re: General Error SQL ERROR on special unicode character
Curious. It might be interpreted as some kind of escape code in the SQL engine, or perhaps it can't handle unicode values above a certain threshold. I don't really have any direct way to investigate it, as it's not my server.

_________________
Outsider


Thu Aug 30, 2018 4:28 pm
Profile WWW

Joined: Thu Aug 30, 2018 5:17 pm
Posts: 8
Post Re: General Error SQL ERROR on special unicode character
I'll save you some time. This is a MySQL issue.

The column type for post_text is set to `utf8`. So you'd think it stores UTF-8 text, which covers anything Unicode. But MySQL's programmers didn't know how UTF-8 actually works so any UTF-8 code longer than 3 bytes causes it to fail. They screwed up so badly in how they were storing UTF-8 data that they couldn't just fix it, because if it started working correctly it would corrupt all the data people had already unknowingly stored using their bad Unicode implementation.

So what they did is create a second data type, `utfmb4`, which is UTF-8 implemented correctly. So unless you convert your entire database (and hope it doesn't mess up the software using it) you can't store anything that requires a 4 byte UTF-8 code. Fortunately the 5% of the Unicode range you can actually store represents almost all the languages in common use, with the exception of various names and older characters in languages like Chinese, some obscure or fictional languages, and a ton of emoji.


Thu Aug 30, 2018 5:45 pm
Profile
Site Admin
User avatar

Joined: Fri Mar 04, 2011 9:19 pm
Posts: 3021
Location: San Jose, CA
Post Re: General Error SQL ERROR on special unicode character
Makes sense.

_________________
Outsider


Thu Aug 30, 2018 7:30 pm
Profile WWW
Moderator
User avatar

Joined: Thu May 12, 2011 8:38 am
Posts: 557
Post Re: General Error SQL ERROR on special unicode character
Since this occurred at the same time as the SQL Error, I'm assuming it's a related problem.

The user dragoongfa appears to have fallen into a memory hole. His profile appears to no longer exist, and all posts made by him have vanished. An Example can be seen here with a thread he started, yet no posts from him.

It's possible that other less prolific users may also have disappeared without note.

_________________
Image

Spoiler: show
This is my Mod voice. If you see this in a thread, it means that the time for gentle reminders has passed.


Sun Sep 09, 2018 12:28 am
Profile
Site Admin
User avatar

Joined: Fri Mar 04, 2011 9:19 pm
Posts: 3021
Location: San Jose, CA
Post Re: General Error SQL ERROR on special unicode character
Dragoongfa has somehow been removed from the forums database. There is no administrative record of him being manually deleted, so presumably this happened in conjunction with problems with the user and sessions tables after the server maintenance.

I will look into the posts database and try to determine whether his posts are gone or just "lost" (assigned to a user ID that doesn't exist). Does anyone know when he last posted, and in what thread?

[edit] I was able to find that his posts are still in the database, so I reassigned them to his new user id. They are once again appearing in the threads.

_________________
Outsider


Sun Sep 09, 2018 1:17 am
Profile WWW

Joined: Tue May 26, 2015 3:33 am
Posts: 616
Post Re: General Error SQL ERROR on special unicode character
As a side note, this issue with UTF-8 characters having more than two bytes (which is definitely legal and warranted in some cases) was the reason for iOS crashing when someone sent a text message containing a specific unicode character. Said character used three or four bytes, having made the UTF-8 translation in iOS barf and die.

Expect problems like these to be rather widespread since the used code and the fallacy of the software designers thinking "two bytes is enough" is rather common.


Tue Sep 11, 2018 8:14 pm
Profile
Display posts from previous:  Sort by  
Reply to topic   [ 7 posts ] 

Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group.
Designed by STSoftware for PTF.