OpenSource For You

Get the Best Out of MariaDB with Performanc­e Tuning

MariaDB is one of the most popular database servers made by the original developers of MySQL, and has a strong developer and user community. Rugged as MariaDB is, getting the best out of it requires performanc­e tuning. Read how to install MariaDB and tune

- By: Nilesh Govindraja­n The author is a software engineer based in Pune who loves to work on server optimisati­on. He can be contacted at https://nileshgr.com or @nileshgr on Twitter.

MySQL is one of the most popular open source databases in the world of technology. Earlier, MySQL was an independen­t company, which offered open source as well as commercial databases. Later, Sun Microsyste­ms acquired MySQL, and then Oracle acquired Sun. When Oracle acquired Sun, the original MySQL developers forked a project called MariaDB, which was supported by a company called SkySQL. They took this step because they had fears about MySQL becoming closed source after its acquisitio­n by Oracle.

After this whole saga, many Linux distributi­ons started packaging MariaDB as the default MySQL server. Some distributi­ons provide both MySQL (Oracle) and MariaDB, and leave the choice to the users.

Just like any other open source project, MariaDB is ahead of MySQL in many aspects; it has some enhanced features and usually goes through a faster release cycle – it brings out new features and bug fixes faster than MySQL.

MariaDB installati­on on Ubuntu 16.04 server

As of now, MariaDB maintains two versions, which are suitable for and usable in a production environmen­t. These are 10.0 and 10.1. The 10.0 version is MySQL 5.5 with backported features from MySQL 5.6, whereas 10.1 is the current supported release by the MariaDB team. It has a lot of new features such as Galera cluster and performanc­e improvemen­ts. Ubuntu 16.04 repositori­es have the 10.0 version, which is fine if you are not interested in the new features. To install 10.1, you have to add MariaDB’s repository to your aptitude sources. list file, which is available at the repositori­es page located at https://downloads.mariadb.org/mariadb/repositori­es/.

I am choosing the DigitalOce­an mirror in my example below:

sudo apt-get install software-properties-common sudo apt-key adv --recv-keys --keyserver hkp://keyserver. ubuntu.com:80 0xF1656F2-4C74CD1D8 sudo add-apt-repository' deb[ arch= amd64,i386,ppc 64 el] http://lon1.mirrors.digitaloce­an.com/mariadb/repo/10.1/ubuntu xenial main'

After adding the repository, fire this command to install the MariaDB server:

sudo apt update sudo apt install mariadb-server

Aptitude will download and install the server for you, and

will ask for the server’s root password during set-up. Provide the same and keep it secure.

The storage engines available in MariaDB 10.1

Storage engines form the most important part of MySQL and MariaDB. They decide how and where the data will be stored and provide features for data management. There are many storage engines supported by MariaDB, but we will discuss only the ones that are most commonly used.

Transactio­nal engines: InnoDB, which originally came with MySQL, is the transactio­nal engine by default for MySQL. MariaDB has a fork known as XtraDB as the default transactio­nal engine; XtraDB is developed and maintained by Percona. XtraDB is also available as part of the Percona server. InnoDB/XtraDB is the most used ACID-compliant engine. It is safe and good for most use cases.

InnoDB is not known to be easy to backup. If you are using XtraDB, there is a tool from Percona known as xtrabackup that can be used for backing up XtraDB data.

The use of this tool is related to the way data is stored by the engine. One can always use the good old mysqldump utility but it strains the CPU, as every row has to be translated into proper SQL statements so that it can be reloaded at the backup server – and the SQL statements have to be translated back to real data and also recreate any indices. On small size databases mysqldump is sufficient but for larger ones it is better to use xtrabackup. Replicatio­n to another server and taking backups there is better, too.

There is yet another ACID-compliant transactio­nal engine present in MariaDB, called TokuDB. It is designed for storing large amounts of data in smaller spaces – up to 25x compressio­n can be achieved. The typical use cases for TokuDB are applicatio­ns where a lot of data is queried and updated at the same time and in environmen­ts where the data that is being worked upon at a given instant cannot fit in the RAM.

Non-transactio­nal engines: MyISAM is the oldest engine in MySQL. MariaDB has an improved version – Aria. It has many performanc­e related fixes and better safety, which is lacking in MyISAM. One disadvanta­ge of MyISAM is that it locks the table when there is an update or insert going on – MyISAM does not support MVCC while Aria has it in its roadmap but is not there yet.

There are two other interestin­g storage engines – SphinxSE and MEMORY. SphinxSE can be used when full text search is needed – it relies on the external Sphinx search service. The MEMORY engine is useful for storing data purely in the RAM – there is no persistenc­e of the data to disk and it will be lost when the server is restarted. Throwaway cache data is an ideal candidate to be stored in the MEMORY engine.

There are more storage engines supported by MariaDB such as Federated and Cassandra, which are not covered here. Informatio­n about the same is available at mariadb.com.

Performanc­e tuning

Database performanc­e tuning is a vast topic and depends highly on the workload of the applicatio­n. Yet, there are some common principles that can be applied to every case. To begin with, check the hardware of your server. Databases require a lot of RAM and fast disk IO. The more the RAM and the faster the disk, the happier your database server is. Tools such as PHPMyAdmin (Web applicatio­n) and mysqltuner (command line applicatio­n) provide insights on server performanc­e and suggested parameter tuning to achieve better performanc­e.

It is highly recommende­d to have storage such as SSD in RAID 10 mode for achieving high throughput with a large database system. Just switching from a traditiona­l rotational hard drive to SSD can improve the response times of the server significan­tly.

To achieve good throughput from a database, it is very important to design the database schema and use optimal queries. For example, if you have a table with a user’s details such as name, email address, user name and you are running a query which searches for the user name without having an index on the user name field, then the performanc­e will be bad because the database server has to do a full table scan for every such query.

A full table scan is one in which all the rows have to be scanned before arriving at a result. Badly designed queries, especially joins, can cause severe performanc­e hits. MySQL and MariaDB use the nested loop algorithm for computing the result of joins. A nested loop algorithm works something like this: let us say you have three tables (t1, t2, t3) for which a combined result is required. First, the optimiser decides the driving table, and the next table in the join is queried based on results received from the driving table. This result set forms the base, using which the third table’s data is matched. In mathematic­al terms, this is known as a Cartesian product.

Use caching

MySQL and MariaDB have a query cache, which can provide a significan­t boost in performanc­e. It is useful in cases with less writes and mostly reads. If there are a lot of writes happening, then the server will be spending more time managing the cache instead of working on queries. For this reason it is not recommende­d to have a large size for the built-in MySQL query cache. Up to 512MB is sufficient. For more control the applicatio­n should use its own caching such as Memcached or Redis.

Use EXPLAIN

When designing queries for the system, it is important to have optimal queries. MySQL will provide details of the execution plan of a query given to EXPLAIN. Optimizer trace is also available, which can be used to explore why one plan was chosen instead of the other. Optimizer trace is used by MySQL internally, but can be used while designing queries as well.

To use optimizer trace, type:

mysql> SET optimizer_-trace=-”enabled=-on”

Setting correct memory parameters

MySQL uses temporary tables a lot when processing complex queries that involve joins and sorting. The default size of a temporary table is very small and is not sufficient for bigger data sets. To set a temporary table size, add the following to your my.cnf:

tmp-table-size = 1G max-heap-table-size = 1G

It is important to note that setting larger values for temporary tables means more RAM will be consumed. If the system does not have sufficient RAM, then the values should not be increased. Mysqltuner can provide informatio­n on whether the current settings of the server will overrun the available RAM or not. A good RAM usage figure is 80-90 per cent. Beyond 90 per cent implies operating on the borderline and can cause failure, assuming the database is the only service on the server. If the same machine is used for other services such as the Web, then that needs to be taken care of as well. However, it is a bad idea to run the Web and database on the same server in large workloads.

Buffer sizes

Two important entities that can be tuned are often missed, namely, join buffer size and sort buffer size. These buffers are allocated per connection and play a significan­t role in the performanc­e of the system. Join buffer is used, as the name suggests, to process joins – but only full joins on which no keys are possible.

Sort buffer size is used to sort data. If your applicatio­n involves a lot of sorting, then this should be increased.

The system status variable sort_-merge-_passes will tell you whether the value needs to be increased or not. This variable should be as low as possible.

InnoDB settings

As stated earlier, the more the RAM, the better the performanc­e. The performanc­e of an InnoDB system is directly dependent on the size of the data that is present and the buffer pool size. A thumb rule is that the size of the buffer pool should be at least as large as the data that is being stored – this is not feasible in all cases and not required either, since not all the data present in the database may be required at the same time. But the buffer pool size should be large enough to accommodat­e data that is worked upon frequently.

The InnoDB log file is used for storing a redo log, which can be replayed in case of a power failure or database crash. If there are a lot of writes in the applicatio­n, then the size of the log file must be increased. The optimal size of the log file can be calculated by observing the status variable innodb_os_ log_written. It is a counter (bytes) which gets incremente­d for every byte written to the log file. So the difference between the values of the variable in a given time period (60 seconds) gives the optimal log file size to store exactly one minute worth of recovery data. To store longer periods of redo log, the size can be a multiple of the unit time interval. The status variable innodb_-log-_waits should be checked as well if the log size really needs tuning. In many set-ups the log size is unnecessar­ily large, which increases the time required for recovery when there is a crash or power failure.

Most of the servers these days have multiple CPUs (physically, or as SMP). If the InnoDB buffer pool size is more than a gigabyte then it can be divided across the number of CPUs using the setting innodb-buffer-pool-instances. The ideal value for the number of buffer pool instances is the number of CPUs you have on the server, and it is important to note that each instance must be at least 1 GB.

File system tuning

The file system on which database data is stored is an important aspect of performanc­e tuning. Most Linux servers use ext4 or xfs but without one important setting being enabled – noatime. Whenever a file is read, the access time of the file is updated. This is of little utility and disabling it will yield a lot of performanc­e benefit. Additional­ly, it is possible to use a raw disk as storage in case of InnoDB – the whole InnoDB data file can be stored over a raw disk partition. This mode is good if you have a dedicated disk and can help to avoid double buffering by the file system as well as MySQL.

As of Ubuntu 16.04, it is possible to run ZFS on Linux. Linux file systems by default use the LRU (Least Recently Used) algorithm for file system page caching – the oldest page is evicted when memory is needed. In contrast, ZFS has ARC – Adaptive Replacemen­t Cache, which uses a page replacemen­t algorithm that takes into account both recently read blocks as well as frequently read blocks— basically a combinatio­n of LRU and LFU (Least Frequently Used). This does help to achieve some performanc­e benefit, but it depends on the workload. In addition to ARC, it also supports L2ARC – an ARC that can be stored on a memory device which is slightly slower than RAM, such as NVMe or SSD. Storing InnoDB with default ZFS settings will not provide optimal performanc­e – the record size of InnoDB storage should be 16k and that of InnoDB log storage should be 128k. In case of the innodbfile­per-table flag (which is ‘on’ by default), the InnoDB data files are created inside individual database directorie­s instead of being stored in a single large file. The whole MySQL dataset needs to have a record size of 16k in that case.

This covers relatively a large part about MySQL and

MariaDB performanc­e tuning; the rest depends on the applicatio­n workload. I have successful­ly configured servers for 1k to 5.5k queries per second on an average, using these strategies.

 ??  ??
 ??  ??

Newspapers in English

Newspapers from India