Page 1 of 1

General Error SQL ERROR on special unicode character

Posted: Thu Aug 30, 2018 9:04 pm
by SVlad

Code: Select all

General Error
SQL ERROR [ mysql4 ]

Incorrect string value: '\xF0\x9F\x8F\xAF -...' for column 'post_text' at row 1 [1366]

An SQL error occurred while fetching this page. Please contact the Board Administrator if this problem persists.
Get this error when tied to post message with this character https://www.compart.com/en/unicode/U+1F3EF.

Re: General Error SQL ERROR on special unicode character

Posted: Thu Aug 30, 2018 11:28 pm
by Arioch
Curious. It might be interpreted as some kind of escape code in the SQL engine, or perhaps it can't handle unicode values above a certain threshold. I don't really have any direct way to investigate it, as it's not my server.

Re: General Error SQL ERROR on special unicode character

Posted: Fri Aug 31, 2018 12:45 am
by EdwardSteed
I'll save you some time. This is a MySQL issue.

The column type for post_text is set to `utf8`. So you'd think it stores UTF-8 text, which covers anything Unicode. But MySQL's programmers didn't know how UTF-8 actually works so any UTF-8 code longer than 3 bytes causes it to fail. They screwed up so badly in how they were storing UTF-8 data that they couldn't just fix it, because if it started working correctly it would corrupt all the data people had already unknowingly stored using their bad Unicode implementation.

So what they did is create a second data type, `utfmb4`, which is UTF-8 implemented correctly. So unless you convert your entire database (and hope it doesn't mess up the software using it) you can't store anything that requires a 4 byte UTF-8 code. Fortunately the 5% of the Unicode range you can actually store represents almost all the languages in common use, with the exception of various names and older characters in languages like Chinese, some obscure or fictional languages, and a ton of emoji.

Re: General Error SQL ERROR on special unicode character

Posted: Fri Aug 31, 2018 2:30 am
by Arioch
Makes sense.

Re: General Error SQL ERROR on special unicode character

Posted: Sun Sep 09, 2018 7:28 am
by Razor One
Since this occurred at the same time as the SQL Error, I'm assuming it's a related problem.

The user dragoongfa appears to have fallen into a memory hole. His profile appears to no longer exist, and all posts made by him have vanished. An Example can be seen here with a thread he started, yet no posts from him.

It's possible that other less prolific users may also have disappeared without note.

Re: General Error SQL ERROR on special unicode character

Posted: Sun Sep 09, 2018 8:17 am
by Arioch
Dragoongfa has somehow been removed from the forums database. There is no administrative record of him being manually deleted, so presumably this happened in conjunction with problems with the user and sessions tables after the server maintenance.

I will look into the posts database and try to determine whether his posts are gone or just "lost" (assigned to a user ID that doesn't exist). Does anyone know when he last posted, and in what thread?

[edit] I was able to find that his posts are still in the database, so I reassigned them to his new user id. They are once again appearing in the threads.

Re: General Error SQL ERROR on special unicode character

Posted: Wed Sep 12, 2018 3:14 am
by novius
As a side note, this issue with UTF-8 characters having more than two bytes (which is definitely legal and warranted in some cases) was the reason for iOS crashing when someone sent a text message containing a specific unicode character. Said character used three or four bytes, having made the UTF-8 translation in iOS barf and die.

Expect problems like these to be rather widespread since the used code and the fallacy of the software designers thinking "two bytes is enough" is rather common.