Using emojis nodejs mysql docker-compose

Making use of Emojis in 2020 seems like the most normal thing right?

It turns out neither mysqljs nor mysql nor docker mysql image are setup to support them by default.

While trying to set them up, I encountered a few errors (which by the end of this post we will solve):

  1. ER_CANT_AGGREGATE_2COLLATIONS: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
  2. ER_CANT_AGGREGATE_2COLLATIONS: Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
  3. Emojis appearing as question marks ???? in the database.

Moving from latin1 to utf8

In the first error Illegal mix of collations (utf8_general_ci,IMPLICIT) and (latin1_general_ci,COERCIBLE) near '=', MySQL is telling us that we are passing a utf-8 and comparing it with a latin1 column in the database. There is a key difference between collation and character set: the former is for sorting sets of characters, and the other is for determining what character should be assumed from the current binary value.

Solution

You want to convert your database to utf-8.

IMPORTANT: although you may be tempted to use a CHARSET utf8, utf8 is a bad implementation of UTF-8, instead you should be using utf8mb4 (which allows for 4 bytes to store characters instead of 3, and you need 4 for emojis).

Dump the database

If you are using Docker, you will have to export the database:

docker exec -it my-mysql-container sh
# within the container sh
mysqldump -u my-user -p my-db-name > /var/lib/mysql/my-dump.sql

Then you will want to vim into it and do a search and replace, by typing the following sequence gg:.,$s/latin1 /utf8mb4 /g then enter, then :wq then enter again.

Kill the docker mysql container and volumes

docker rm -f my-mysql-container
docker volume rm my-mysql-container-volume

Now you have no garbage defaults from the previous container. Let's recreate:

Override mysql docker defaults

You will want to override the default configuration for the mysql docker image. To do so, you need to make a my.cnf accessible to mysql within the container. This is achieved by mounting the file as a volume (everywhere, mysql will look for it). So create our my.cnf in the same directory as docker-compose.yml.

my.cnf:

[client]
default-character-set = utf8mb4$

[mysql]
default-character-set = utf8mb4

[mysqld]
init_connect='SET collation_connection = utf8mb4_unicode_ci'
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

Now we need to tell docker-compose where to mount it, so in docker-compose.yml:

# ...
     volumes:
       - ./mymysql.cnf:/etc/mysql/conf.d/my.cnf
       - ./mymysql.cnf:/etc/mysql/my.cnf
       - ./mymysql.cnf:/etc/my.cnf
# ...

Notice above that we are mounting it in many places (that's because we do not have time to figure out where mysql will look for it, so we spread the good word).

Telling mysqljs collation connection

This above, may be enough. But if it is not (aka. you are getting question marks instead of emojis), try the following.

Now, you have been able to insert, but you are seeing ???? instead of the emoji. This is a sign that there is a discrepancy between the database charset or collation and the actual connection / client collation.

What we need to do is tell mysqljs to use the proper charset, in your .env file or in the docker-compose.yml:

  environment:
    DB_CHARSET: utf8mb4$
    DB_COLLATION: utf8mb4_general_ci$

This assumes that you are using process.env to pass those values to mysqljs (this is done automagically by mysql-oh-wait package).

Another approach

You can try a different way: by telling docker-compose to start the mysql with a different command. Add this to your docker-compose.yml:

my-mysq-container-name:
  restart: always
  command: --default-authentication-plugin=mysql_native_password --character-set-server=utf8mb4 --collation-server=utf8mb4_general_ci

But again, this is just trying to achieve the same from a different angle than the initial my.cnf file.

Troubelshooting

You can issue a few commands (which may throw if you have FOREIGN KEY Constraints) to try to alter the database CHARSET before using mysqldump:

ALTER DATABASE my_db_name CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE My_table_name,
 CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

But if you have foreign key constraints errors, you can do the transformation altering the mysqldump output file as in previous section.

NOTE: To check what the current character set / collations are, you can issue (from within the container):

mysql -p -e "show variables like '%collation_connection';"

Or within mysql client (alsow thin the container...):

SELECT @@character_set_database, @@collation_database;
SELECT * FROM INFORMATION_SCHEMA.SCHEMATA;
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME LIKE '%';
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

References