I recently started a new Ruby on Rails project that I hope will exist for a very long time. Having been burned in the past by missing emoji support, I decided to make sure the application supports emojis in all text fields from day one. Here is the stack:
- Ruby on Rails 5.2.0
- Ruby 2.3.1
- Elastic Beanstalk (Passenger with Ruby 2.3 running on 64bit Amazon Linux/2.7.2)
- Aurora MySQL 5.6.10a
The Problem
In my experience, if users are allowed to enter text, they WILL enter emojis. Rather than do insane things to strip these characters out prior to writing a record to the database, I decided to do this the right way and support the use of these characters (without switching away from this stack).
The Solution
First, try to find this article when you create your project. Converting an EXISTING stack to this state brings additional headaches.
In config/database.yml, add two lines related to the character encoding of the DB:
your_environment:
  ...
  encoding: utf8mb4
  collation: utf8mb4_bin
Then, in config/initializers, add a file named mysql_utf8mb4_fix.rb containing the following code. Supporting emojis limits the number of characters that can go into a VARCHAR field to 191 if you are going to index the field, because utf8mb4 needs up to 4 bytes per character and InnoDB's default index key limit is 767 bytes. I got this code here.
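As a rough sketch of what such an initializer can look like (not necessarily the exact code referenced above): it derives the 191 limit from the 767-byte index key limit and applies it as the default length for string columns. It assumes that ActiveRecord's AbstractMysqlAdapter exposes a mutable NATIVE_DATABASE_TYPES hash, which appears to hold for Rails 5.x but is worth verifying for your version. The three constant names are made up for illustration.

```ruby
# config/initializers/mysql_utf8mb4_fix.rb

# InnoDB's default index key prefix limit (in bytes) and the worst-case
# width of a utf8mb4 character. The constant names are illustrative.
INNODB_INDEX_KEY_LIMIT_BYTES = 767
UTF8MB4_MAX_BYTES_PER_CHAR   = 4

# Integer division: the longest VARCHAR that still fits in an index.
MAX_INDEXED_VARCHAR_LENGTH =
  INNODB_INDEX_KEY_LIMIT_BYTES / UTF8MB4_MAX_BYTES_PER_CHAR

# Only touch the adapter when ActiveRecord is loaded, as it is in a Rails app.
if defined?(ActiveRecord)
  require 'active_record/connection_adapters/abstract_mysql_adapter'

  # Assumption: NATIVE_DATABASE_TYPES is a mutable hash in this Rails version.
  # After this, `string` columns created by migrations default to VARCHAR(191).
  ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter::NATIVE_DATABASE_TYPES[:string] =
    { name: 'varchar', limit: MAX_INDEXED_VARCHAR_LENGTH }
end
```

Note that an override like this only affects columns created by later migrations; existing columns keep whatever length they already have.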
The Final Step
I will not go into the details of creating an Aurora cluster, but once you have created your cluster, you must create a custom Parameter Group in the RDS console, modify a few values, and attach it to your cluster.
Create the parameter group:
Set the following Parameter Group values to utf8mb4:
- character_set_client
- character_set_connection
- character_set_database
- character_set_filesystem
- character_set_results
- character_set_server
Set the following Parameter Group values to utf8mb4_bin:
- collation_connection
- collation_server
Save your changes, and attach the parameter group to your cluster.
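If you would rather script this than click through the console, the same settings can be expressed as a parameter list. The sketch below only builds and prints that list; the group name is hypothetical, the 'pending-reboot' apply method is an assumption chosen because it is accepted for both static and dynamic parameters, and the AWS SDK call is left commented out because it needs the aws-sdk-rds gem and credentials.

```ruby
# Hypothetical parameter group name; replace with your own.
PARAMETER_GROUP_NAME = 'my-utf8mb4-cluster-params'

# Parameters that should be set to utf8mb4.
CHARSET_PARAMS = %w[
  character_set_client
  character_set_connection
  character_set_database
  character_set_filesystem
  character_set_results
  character_set_server
].freeze

# Parameters that should be set to utf8mb4_bin.
COLLATION_PARAMS = %w[collation_connection collation_server].freeze

# Builds the list of hashes in the shape the RDS API expects for parameters.
def build_parameters
  to_param = lambda do |name, value|
    { parameter_name: name, parameter_value: value, apply_method: 'pending-reboot' }
  end

  CHARSET_PARAMS.map { |name| to_param.call(name, 'utf8mb4') } +
    COLLATION_PARAMS.map { |name| to_param.call(name, 'utf8mb4_bin') }
end

parameters = build_parameters

# With the aws-sdk-rds gem installed and credentials configured, the list
# could be applied like this (commented out so the script runs standalone):
#
#   require 'aws-sdk-rds'
#   Aws::RDS::Client.new.modify_db_cluster_parameter_group(
#     db_cluster_parameter_group_name: PARAMETER_GROUP_NAME,
#     parameters: parameters
#   )

parameters.each { |p| puts "#{p[:parameter_name]} = #{p[:parameter_value]}" }
```

The group still has to be attached to the cluster, and the instances may need a reboot before the new values take effect.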
Confirm This Works
Connect to the Aurora database with something like Sequel Pro and confirm that your tables’ character set is utf8mb4, and your VARCHAR columns have a max length of 191.
Then, ssh into your instance and make sure code similar to this works:
(note that I have a model called ‘Mannequin’ and have indexed the string field ‘email’)
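A minimal sketch of such a check, meant to be pasted into a Rails console on the instance. The Mannequin model and its email field are the ones mentioned above; the sample address and variable names are made up. The database part only runs when Rails is loaded, so the snippet is harmless outside the app.

```ruby
# A test value containing U+1F600, which takes 4 bytes in UTF-8.
emoji_email = "test\u{1F600}@example.com"

# Sanity check: the string really exercises utf8mb4, i.e. at least one
# character needs 4 bytes.
needs_utf8mb4 = emoji_email.each_char.any? { |char| char.bytesize == 4 }
puts "contains a 4-byte character: #{needs_utf8mb4}"

# Inside the Rails app, write the record and read it back.
if defined?(Rails)
  record = Mannequin.create!(email: emoji_email)

  # Reload from the database and confirm the emoji survived the round trip.
  round_tripped = Mannequin.find(record.id).email
  puts "round trip ok: #{round_tripped == emoji_email}"
end
```

If the character set is not configured correctly, the create! call typically fails with an "Incorrect string value" error from MySQL rather than saving the record.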
Let me know what you think in the comments, and let me know if there’s a better way to do this with this particular stack!