You are viewing...

Why I Do Not Store UUID as CHAR or VARCHAR

Updated on September 14, 2019 at the 16th hour
Posted under:

DISCLAIMER: All views are considered my own and you should not draw any conclusions on associates.

Misunderstandings

UUIDs.... Such a widely badly stored and used form of identification. Would you believe that people store UUIDs are in MySQL as not only primary key, but CHAR and VARCHAR?! Poor MySQL, poor B-Tree and poor computers that are abused by incompetence.

Did you know that UUIDs have 6 versions? Did you know that UUIDs are 128bit random numbers? Did you also know that UUID stored as ASCII doubles the character length and storage space? Do you know why? Did you know that MySQL has to encode and decode according to your character set each UUID you decided to store as (VAR)CHAR? Did you know that write perform will suffer if you use random data as primary key? Do you know how the primary key is used in MySQL? Clustering key ring any bells?

You didn't? Now you know! Stop bending things to fit your world view just because you do not understand how a system is designed. Convincing people that "it works for me, so I'm gonna spread the word" is self serving and under serving to people who have to maintain your system in the future.

Enlightenment

Store as binary. UUIDs are 128bits of randomness depending on the version you decide to use. UUID4 tends to be most used for identification of objects. When you decide to convert this into a human readable hexadecimal representation (aka ASCII), the space needs to store doubles. Why? Every 4 bits of the UUID represented is hex therefore that hex character is 8 bits of storage. ASCII is 1 byte per character. Yes, you just increased storage requirements in your application and database by at least 2x! Your database character set may encode each character as 2 or 4 bytes! Be careful! You don't do much with UUID generally, so store as tightly packed binary until the last minute. Yes, use BINARY(16) in MySQL, use Buffer in NodeJS, byte[] in Java, Vec<u8> in Rust. Whatever your fancy.

Primary keys and randomness, just use a surrogate invisible key. Do yourself a favor and read about your table's engine storage indexes. If it says B(+/-)-Tree, what happens when you insert random numbers in such a tree? Random splits?! Wooooahhh, what?! University data structures class. That's is what you are doing to poor MySQL. If you can change your table's primary keys to use a HASH index in your database, then you are generally fine unless the database uses the primary key as the clustering index! Google "Clustering Index" Yeah, just use an "invisible" column called "primary_index" and make it auto-increment and use the uuid column (UNIQUE INDEX) as your application's row/object index. NeverĀ  expose the "primary_index" in your queries hence why it is "invisible".

Good?

Use and store UUIDs responsibly.

You just read "Why I Do Not Store UUID as CHAR or VARCHAR". Please share if you liked it!
You can read more recent posts here.