[Explained]: The difference between VARCHAR and TEXT in MySQL
When creating a MySQL database table, the type is one of the mandatory properties that you are required to specify for each of its fields/columns.
For instance, when using phpMyAdmin to create a database table, you will have an interface as in the screenshot below where you are required to select one type from a dropdown consisting of a long list of possible values.
Below is a sample SQL query for creating the above user table with fields id, first_name, last_name, gender, dob, created_at, and updated_at.
CREATE TABLE users (
id int(20) NOT NULL PRIMARY KEY AUTO_INCREMENT,
first_name varchar(30) NOT NULL,
last_name varchar(30) NOT NULL,
gender varchar(10) NOT NULL,
dob date NOT NULL,
created_at datetime NOT NULL,
updated_at timestamp NOT NULL DEFAULT current_timestamp()
As you can see in both the phpMyAdmin or query method, the type is a mandatory requirement for all the fields in order for the table creation to work.
As a developer, you must decide what type of data will be stored inside each column when creating a table. For instance, if you intend to store numeric data you will need to choose a numeric type(eg. INT, DECIMAL, etc). If you want a column to store the date of birth, then you choose DATE. If you want to store the date and time, you set the type to DATETIME, and so forth.
The confusion comes to many when deciding to use VARCHAR or TEXT as the type. In this article, we cover the 2 types in-depth, their similarities, and where each is applicable when creating a database table.
Understanding VARCHAR and TEXT data types
Both VARCHAR and TEXT fall under the category of strings. They are used to store data in form of strings of characters, which may comprise any type of data such as alphabets, numeric figures, or special symbols and characters.
They both can store a maximum length of 65,535 characters (inclusive of spaces).
Initially, VARCHAR type could only store a maximum of 255 characters. In MySQL version 5.0.3, its length was increased to a maximum of 65,535 characters making VARCHAR type more similar to TEXT than ever before.
The maximum size of TEXT type is fixed at 65,535 characters and cannot be limited to a smaller max size. That way, you don't have to specify its length when creating the table.
On the other hand, VARCHAR column stores variable-length strings. You can limit the max size of the VARCHAR column to anything between 1 and 65,535 characters. It is a requirement to specify the maximum length of strings that you intend to store in the column when creating the table.
Both TEXT and VARCHAR can be used to store long-form text such as property or product descriptions, text articles on news or blog websites, etc. However, their lengths will not be sufficient for holding the text of an entire book.
If the data to be inserted in these columns exceeds 65,535 characters/bytes (or whichever max size is set for VARCHAR), then it gets truncated and a warning is given.
If you wish to store longer strings of text, then MEDIUMTEXT and LONGTEXT are the best alternatives.
When creating a database table and giving the size to its columns, it is important to note that the row size should not exceed 65,535 characters(spaces inclusive) or bytes.
It is therefore important to take into consideration that the sum of the size of all the columns should not exceed 65,535. Always sum up all the lengths for those columns where length is applicable to make sure the total remains within the range.
This maximum row size is exclusive of BLOB and TEXT types.
While the column is in most cases expected to have the capability of storing a string with a length lesser or equal to its max size (characters), that is not always the case. This is because a string length can fall within the accepted range but exceed the number of bytes allocated.
In most cases, the number of characters in a string is equal to its size in bytes. ie, a 65,535 characters string will be 65,535 bytes. However, some characters in multi-byte character sets (eg. utf8 or utf8mb4) may take more than 1 byte.
In order to calculate the number of bytes used to store a particular VARCHAR or TEXT column value, you must consider the character set used for that column and whether the value contains multibyte characters. When using a
utf8 Unicode character set, you need to know that not all characters use the same number of bytes. For instance,
utf8mb4 character sets may need up to three and four bytes per character, respectively.
The storage requirements for VARCHAR and TEXT types depend on these factors:
- The actual length of the column value
- The column's maximum possible length
- The character set used for the column, because some character sets contain multibyte characters
Besides the length we define for VARCHAR datatype storage, MySQL takes an additional 1 or 2 bytes of space to store the length of the column value.
For VARCHAR type, the space for storing the length of the column value depends upon the length we specify for that column. If the length of the column set is less than 255 bytes then it will take 1 byte of the data to store the column length (prefix) value. Similarly, if the specified VARCHAR column requires more than 255 bytes of data then 2 bytes of space will be required to store the prefix value.
We can then calculate the amount of storage space required for a VARCHAR column value as below where L represents the actual length in bytes of a given string value.
L + 1 bytes if column values require 0 − 255 bytes, L + 2 bytes if values may require more than 255 bytes.
For TEXT type, we use the formula below:
L + 2 bytes, where L < 216
This is because TEXT column type takes an overhead of 2byte for storing the column length value.
A column of VARCHAR type with a max size of 255 can hold a string with a maximum length of 255 characters. If for instance, the column uses the
latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is five bytes. If the same column is instead declared to use the
ucs2 double-byte character set, the storage requirement will be 10 bytes: The length of 'abcd' is eight bytes and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).
A VARCHAR column can be indexed. However, since the length of an index is limited, if your VARCHAR column is too long you have to use only the first few characters of the VARCHAR column in your index.
On the other hand, you cannot put an index (except for a full-text index) on a TEXT column.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
Similarities between MySQL VARCHAR and TEXT
Below are the similarities between the types VARCHAR and TEXT in MySQL:
- Both are used to store strings (which may comprise alphabets, numeric figures, or special symbols and characters).
- Both VARCHAR and TEXT can store a maximum length of 65,535 characters.
Differences between MySQL VARCHAR and TEXT
Below are the differences between the types VARCHAR and TEXT in MySQL:
- Text type has a fixed maximum size of 65,535 characters (which cannot be limited to a smaller max size), while the max size for a VARCHAR type is variable and can be set to anything between 1 and 65,535 characters.
- It is mandatory to specify the length for a VARCHAR column when creating a MySQL table while you don't have to specify the length for TEXT type.
- A column of VARCHAR type can be part of an index whereas a TEXT column requires you to specify a prefix length, which can be part of an index.
- VARCHAR is stored inline with the table, making it potentially faster. On the other hand, TEXT is stored off the table with the table having a pointer to the location of the actual storage. Using a TEXT column in a sort will require the use of a disk-based temporary table, as the MEMORY (HEAP) storage engine.
- A column of TEXT type takes a fixed space of 2 bytes to store the column length value while in VARCHAR type the space if variable. If the length set for the column is less than 255 bytes, it will take 1 byte and if it is more than 255 bytes of data then 2 bytes of space will be required to store the column length value.
Both can be used to store long-form text strings (which may comprise of alphabets, numeric figures, or special symbols and characters).
The average length of an English word is 5 letters long and needs 1 extra character for spacing. The VARCHAR's or TEXT’s 65,535 characters column storage capacity can therefore hold approximately over 10,920 words. This is sufficient to store long articles on blogs or products descriptions on eCommerce sites. In most of the case scenarios, VARCHAR and TEXT types are enough to store any information (though for VARCHAR it will depend on the max length you set for the field when creating the table).
It is usually advisable to use VARCHAR if your data is of variable length and you know it fits in the 65,535 character limit.
VARCHAR is preferable in most circumstances as it provides better performance(lead to potentially fewer disk reads and fewer writes), it’s more flexible, and can be fully indexed.
Though VARCHAR is the best choice, it is not a must for you to use it. TEXT is a reasonable choice if the above limitations are not important to you.