Planet MySQL - http://www.planetmysql.org/
What part of NoSQL don't you understand?
In the word "NoSQL", the letters "No" are an acronym so the meaning is "Not Only SQL" rather than "No SQL". True or false?
Historically, it's false
The first NoSQL product was a classic DBMS which didn't happen to use SQL for a query language, featured in Linux Journal in 1999. Its current web page has a traffic-sign symbol of the word SQL with a bar through it, and a title "NoSQL: a non-SQL RDBMS".
For a meetup in June 2009 about "open source, distributed, non-relational databases" the question came up "What's a good name?" Eric Evans of Rackspace suggested NoSQL, a suggestion which he later regretted.
In October 2009 Emil Eifrem of NeoTechnology tweeted "I for one have tried REALLY HARD to emphasize that #nosql = Not Only SQL". This appears to be the earliest recorded occurrence of the suggestion. In November 2009 Mr Eifrem declared with satisfaction that the suggestion was "finally catching on".
In other words saying "No = Not Only" is an example of a backronym, and specifically it's in the class of backronyms known as false acronyms. There are lots of these around -- like the idea that "cabal" stands for the seventeenth-century English ministers named Clifford, Arlington, Buckingham, Ashley and Lauderdale, or the myth that "posh" means "port out, starboard home". False acronyms might go back at least 1600 years, if the earliest example is the "Jesus Fish". Anyway, backronyms are popular and fun, but this one's false.
Currently it's unaccepted
Regardless of etymology, NoSQL could currently mean Not Only SQL, if everyone accepted that. So, getting the new interpretation going is a matter of evangelism -- tell everyone that's what it means, and hope that they'll tell their friends. Is that working?
I think it has worked among NoSQL insiders, but it's tenuous. If it were generally believed, then the common spelling would be NOSQL rather than NoSQL, and the common pronunciation would be "En O Ess Cue Ell" rather than "No Sequel". Although my sample size is small, I can say that I've rarely seen that spelling and I've rarely heard that pronunciation.
And the general public knows that "no-starch" products contain no starch, that "no-goodniks" are no good, and that the slogan "No means no!" means refusal. So, when confronted with the claim that "no" doesn't mean no, there's inevitable resistance or disgust.
But is it true?
It would still be okay to promote the idea that "NoSQL" equals "not only SQL" if the term described a group of products that are (a) SQL and (b) something other than SQL. This is where, in a rephrasing of the Wendy's "Where's The Beef?" commercial, the first question is: "Where's the SQL?"
Back in 2009 the answers would have been pathetic, but nowadays there are "SQL" query languages that come from NoSQL vendors. CQL would qualify as an SQL product if the bar were low, like saying "If it MOVEs, it's COBOL". Impala would qualify in a more serious way; in fact most of the core-SQL tick-box items are tickable. (I'm taking the manual's word for it; I didn't do vicious testing as I did with MySQL.) So, although they're a minority, there seem to be NoSQL vendors who supply SQL.
But the second question is: what can these products do that can't be done in SQL? Well, they can supply language extensions, and they can allow the SQL layer to be bypassed. However, so can MySQL or MariaDB -- think of the navigation with HANDLER or direct access with HandlerSocket. In that case MySQL would be a NoSQL DBMS, and if that were true then the term NoSQL wouldn't distinguish the traditional from the non-traditional products, and so it would be a useless word.
Therefore pretending that NoSQL means Not Only SQL is wrong in several ways, but insisting it still means No SQL is merciless. Perhaps an honest solution is to stop saying that the word's parts mean anything. It's just a name now, in the same way that "PostgreSQL" is just a name and you've forgotten what it's a post of.
MySQL Performance: More in depth with LinkBench Workload on MySQL 5.7, MariaDB 10.1 and also Percona Server 5.6
This is the next chapter of the story started in my previous article, related to the updated results on the LinkBench workload published by MariaDB. Keeping in mind that the obtained results are completely opposite on both sides, I've started to investigate the same LinkBench-150GB 64-concurrent-users workload under the "most favorable" possible test conditions on the same 40cores-HT server as in my previous article:
- InnoDB Buffer Pool (BP) = 150G (so all the data may remain in memory)
- innodb_flush_log_at_trx_commit=2 (so REDO writes are flushed once per second rather than on every transaction)
- no checksums, no double-write, no compression, no atomic IO, etc.
- just a base-to-base comparison..
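Spelled out as a my.cnf fragment, the test conditions above would look something like this (only the buffer pool size and the flush setting are stated explicitly above; the option names used for disabling checksums and the doublewrite buffer are my assumptions for a 5.6/5.7-era server):

```ini
[mysqld]
# all data fits in memory
innodb_buffer_pool_size        = 150G
# REDO log flushed once per second, not on every commit
innodb_flush_log_at_trx_commit = 2
# no checksums, no double-write (assumed option names)
innodb_checksum_algorithm      = none
innodb_doublewrite             = 0
```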
And the probe result was the following, with a still better TPS on MySQL 5.7 (left) than on MariaDB 10.1 (right).

NOTE: I'm obtaining my TPS numbers from the Commit/sec rate reported by the MySQL server itself; I did not use the results from the LinkBench log file for that. From what I can read in the log file, my TPS x 3 is more or less equal to the level of operations/sec reported by LinkBench, but for the moment I'm not looking for the best absolute result, rather for the difference in performance between the MySQL and MariaDB engines.

But then my surprises started to come, one after another ;-)
- first, discussing details with the Fusion-io team, I discovered that my flash card does not support atomic IO ;-) so you may ignore all the previous tests with atomic IO on MariaDB (and indeed, this also explains why the results are exactly the same with and without atomic IO ;-))
- then I discovered that the 32 cores reported in the MariaDB config testing are in fact 16cores-HT (16 cores with hyper-threading, 2 threads per core), and this could play a significant role, as having fewer CPU cores may lower overall concurrency during processing and, as a result, lock contention. But is it so? Have to check.
- but the most killing one was the last: from Jan's comments I understood that MariaDB 10.1 uses XtraDB by default, that even this version of the XtraDB code was modified to carry some features from MariaDB's own changes to InnoDB, and that the InnoDB engine should then be used as a plugin (while I was pretty sure I was using their InnoDB code, as all their new options were accepted).
So far, I've then replayed the same test again, but with the following changes:
- MySQL server is bound to 20cores-HT only
- MariaDB is using its own InnoDB plugin
- EXT4 is used as the filesystem to avoid any other potential surprises ;-)
- 1h or 1.5h test duration to avoid doubts about TPS dropping dramatically over time

The result was very surprising, and I finally also added Percona Server 5.6 into the game to clarify things ;-)

And here is what I observed, from left to right:
- MariaDB 10.1 with XtraDB (default)
- MariaDB 10.1 with InnoDB (plugin)
- Percona Server 5.6
Observations:
- indeed, MariaDB with its InnoDB plugin is performing better than with its XtraDB, which is used by default..
- at the same time, Percona Server 5.6 is performing way better than MariaDB with its XtraDB..
- so I may just suspect that there are some problems in MariaDB with its XtraDB integration
- while Percona Server is doing better than MariaDB with either engine
- and MySQL 5.7 is doing better than anyone ;-)
What about lock contentions? The picture is completely different:
- to my big surprise, the index lock contention is going down on both MariaDB and Percona Server
- log_sys mutex contention is dominating for all..
So far, let's see if running now on the full 40 cores gives better performance. From left to right, the same test is executed on 20cores-HT and then on 40cores-HT on:
- MySQL 5.7
- MariaDB 10.1 with InnoDB (plugin)
- Percona Server 5.6

Observations:
- surprisingly, a very small gain on MySQL 5.7 on the move from 20 to 40 cores
- near no gain at all on MariaDB..
- and yet more surprising: a regression on Percona Server..
- indeed, there is something yet more to investigate here, as there is no reason not to win at least 50% in TPS on 40 cores with 64 concurrent users

NOTE: I'm not blaming LinkBench here, as it's very possible that the show-stopper is inside the database code rather than in the test code.
And I hope an analysis of the lock contentions will tell us a little bit more.

Observations:
- as you can see, the log_sys contention is growing on the move from 20 to 40 cores on all engines
- while Percona Server is also hitting a growing lock_sys wait time..
But if this is only about the locking, then probably we can improve things a little bit by tuning the InnoDB spin wait delay? The same test on MySQL 5.7, but with a progressively increased spin wait delay:

innodb_spin_wait_delay = 12, 24, 48, 96, 192, ...

Observations:
- the best TPS level is reached with spin delay=48: 35K TPS instead of the initial 30K ;-)
- with 96, TPS is more or less the same as with 48, but with 192 performance is going down..
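The choice described above amounts to taking the max over the sweep results; a tiny sketch (the TPS values for delays 12, 24, 96 and 192 are illustrative guesses consistent with the text; only 30K, 35K and the drop at 192 are actually stated):

```python
# TPS measured for each innodb_spin_wait_delay value in the sweep
# (values other than 48 -> ~35K are illustrative, not from the article)
tps_by_delay = {12: 30500, 24: 32000, 48: 35000, 96: 34800, 192: 29000}

# pick the delay value with the highest measured TPS
best_delay = max(tps_by_delay, key=tps_by_delay.get)
print(best_delay, tps_by_delay[best_delay])  # 48 35000
```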
And it's well seen on the graph with contentions.

Now, let's replay the same test on the same 40cores-HT config, but with innodb_spin_wait_delay=48 applied on all engines, and adding one more test for the REDO log flushing:
- the first test for each engine with innodb_flush_log_at_trx_commit=2 (trx2 : REDO fsync once per second)
- and the second one with innodb_flush_log_at_trx_commit=1 (trx1 : REDO fsync on every transaction)
There are several surprises here:
- MySQL 5.7 : 35K TPS on trx2, but only 24.5K TPS on trx1..
- MariaDB 10.1 : 28K TPS on trx2, and 24.5K TPS on trx1..
- Percona 5.6 : only 27K TPS on trx2, but 28K TPS on trx1! ;-))
- well, on MariaDB the gap between the trx2 and trx1 results is as expected (REDO fsync() on every transaction with trx1 cannot be free)
- while on MySQL 5.7 it looks rather like a regression (should be fixed)
- and on Percona Server there is rather a regression on trx2 ;-)
Then, to understand why all this happens, you just have to look at the corresponding mutex contention graphs reported by PFS.

However, the most surprising thing for me here is the drop on MySQL 5.7 from 35K TPS to 24.5K TPS just because we switched from innodb_flush_log_at_trx_commit=2 to innodb_flush_log_at_trx_commit=1.. I'd not expect such a big regression just because of the REDO fsync, especially when we're using extremely fast flash storage.. And indeed, the story will likely change for 5.7 very quickly -- the following graph represents the result on exactly the same workload, but on the next-step dev 5.7 version (after DMR5). While the result with innodb_flush_log_at_trx_commit=2 is slightly lower than 35K TPS (some more tuning/testing is needed to find an optimal internal setting), with innodb_flush_log_at_trx_commit=1 the result is even a little bit higher than 35K TPS! All this means that in MySQL 5.7 we may potentially be not too far from having nearly the same level of performance with innodb_flush_log_at_trx_commit=1 as with innodb_flush_log_at_trx_commit=2 whenever that's possible.. stay tuned! ;-)

Ok, now what will happen if the Buffer Pool size is reduced to 75GB (instead of 150GB)? The same test, but with BP size=75G.

Observations:
- there is not much difference anymore between trx2 and trx1 on all engines
- a huge part of the wait time is now attributed to the IO reads..
- all engines are starting from more or less the same TPS level at the beginning (except MySQL 5.7, which is starting from a higher TPS)
- but then over time they start to decrease...
- MySQL 5.7 is losing less than any other, and remains the best of all engines (and if one day I have the patience to run such a test for 24h, I'll do it to see the end of the graph ;-))

However, it's easy to understand why MariaDB and Percona Server are regressing over time: if you look at the lock wait graphs, you'll see a constantly increasing wait time on the index lock.. but no more for MySQL 5.7 ;-)

Any difference to expect if only 20 cores are used for the database server? The same test, but on 20cores-HT.

On 20cores-HT the overall result is just a little bit slower, but very similar to 40cores-HT.. And contention levels are looking very similar as well, including the index lock waits growing over time on MariaDB and Percona Server.

SUMMARY :
- indeed, the InnoDB plugin in MariaDB 10.1 seems to run better than their default XtraDB..
- however, it looks like there is just something wrong with the XtraDB integration in MariaDB, because Percona Server itself is showing the same or a better result than MariaDB running with its own InnoDB
- as well, I did not see a single point where MariaDB could be better than MySQL 5.7 in base-to-base conditions
- further investigation will come later, once I have in my server a Fusion-io flash card supporting atomic IO and the latest NVMFS
- more code changes in MySQL 5.7 are in the pipe to better address these strange scalability issues and yet higher and bigger database workloads.. note: we're still far here from reaching even half of the potential performance of this flash storage, so there are definitely a lot of things to improve ;-)

So, stay tuned ;-)

Additional NOTES (as you're asking) :

>> HOWTO monitor Commit/sec in MySQL ?
- at a regular interval (say every 10sec) get the value of "Com_commit", for example:

mysql> show global status like 'Com_commit';

- then subtract the previous value from the current one, and divide by your interval (say 10sec)
- that's all ;-)
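The same subtract-and-divide step as a minimal Python sketch (the counter values are hypothetical; in practice you'd read them from SHOW GLOBAL STATUS LIKE 'Com_commit' at each interval):

```python
# two samples of the Com_commit counter, taken 10 seconds apart
# (hypothetical values, not real server output)
prev_commits = 1_200_000
curr_commits = 1_550_000
interval_sec = 10

# commits per second over the interval = TPS
tps = (curr_commits - prev_commits) / interval_sec
print(tps)  # 35000.0
```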
>> HOWTO monitor mutex wait times in MySQL ?
- start the MySQL server with PFS enabled and wait event instrumentation on
- then at a regular time interval execute the following SQL query, take the diff of the current values vs. the previous ones, and divide them by your time interval:

mysql> select EVENT_NAME, max(SUM_TIMER_WAIT)/1000000000000 as WaitTM
    from events_waits_summary_by_instance group by 1 order by 2 desc limit 25;
+-------------------------------------------+----------+
| EVENT_NAME                                | WaitTM   |
+-------------------------------------------+----------+
| wait/io/file/innodb/innodb_data_file      | 791.3204 |
| wait/synch/mutex/innodb/fil_system_mutex  |  25.8183 |
| wait/synch/rwlock/innodb/btr_search_latch |   5.2865 |
| wait/io/file/innodb/innodb_log_file       |   4.6977 |
| wait/synch/rwlock/sql/LOCK_grant          |   4.4940 |
| ...                                       |          |
+-------------------------------------------+----------+
25 rows in set (0.06 sec)
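The diff-and-divide step from the note above, sketched in Python (the two snapshots are hypothetical SUM_TIMER_WAIT values already converted to seconds, as in the query):

```python
# two snapshots of per-event wait time (in seconds), taken 10 seconds apart
# (hypothetical values, not real server output)
prev = {"wait/io/file/innodb/innodb_data_file": 791.3204,
        "wait/synch/mutex/innodb/fil_system_mutex": 25.8183}
curr = {"wait/io/file/innodb/innodb_data_file": 801.5204,
        "wait/synch/mutex/innodb/fil_system_mutex": 27.8183}
interval_sec = 10

# seconds of wait time accumulated per second of wall-clock time, per event
rates = {ev: (curr[ev] - prev[ev]) / interval_sec for ev in curr}
for ev, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(ev, round(rate, 4))
```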
As well, you can use MySQL Enterprise Monitor (MEM) or my tools from my site (dim_STAT @ http://dimitrik.free.fr) to get all this and other stuff ready for action..
Oracle OpenWorld 2014 – Bloggers Meetup
Guess what? You all know that it’s coming, when it’s coming and where… That’s right! The Annual Oracle Bloggers Meetup, one of your top favourite events of OpenWorld, is happening at the usual place and time.
What: Oracle Bloggers Meetup 2014
When: Wed, 1-Oct-2014, 5:30pm
Where: Main Dining Room, Jillian’s Billiards @ Metreon, 101 Fourth Street, San Francisco, CA 94103 (street view). Please comment with “COUNT ME IN” if coming — we need to know the attendance numbers.
Traditionally, Oracle Technology Network and Pythian sponsor the venue and drinks. We will also have some cool things happening and a few prizes.
In the age of Big Data and Internet of Things, our mingling activity this year will be virtual — using an app we wrote specifically for this event, so bring your iStuff and Androids to participate and win. Hope this will work! :)
As usual, vintage t-shirts, ties, or bandanas from previous meetups will make you look cool — feel free to wear them.
For those of you who don’t know the history: The Bloggers Meetup during Oracle OpenWorld was started by Mark Rittman and continued by Eddie Awad, and then I picked up the flag in 2009 (gosh… 6 years already?) The meetups have been a great success for making new friends and catching up with old, so let’s keep them this way! To give you an idea, here are the photos from the OOW08 Bloggers Meetup (courtesy of Eddie Awad) and OOW09 meetup blog post update from myself, and a super cool video by a good blogging friend, Bjorn Roest from OOW13.
While the initial meetings were mostly targeted at Oracle database folks, guys and gals from many Oracle technologies (Oracle database, MySQL, Apps, Sun technologies, Java and more) join in the fun. All bloggers are welcome. We expect to gather around 150 bloggers.
If you are planning to attend, please comment here with the phrase “COUNT ME IN”. This will help us ensure we have the attendance numbers right. Please provide your blog URL with your comment — it’s a Bloggers Meetup after all! Make sure you comment here if you are attending so that we have enough room, food, and (most importantly) drinks.
Of course, do not forget to blog and tweet about this year’s bloggers meetup. See you there!
MySQL upgrade best practices
MySQL upgrades are necessary tasks and we field a variety of questions here at Percona Support regarding MySQL upgrade best practices. This post highlights recommended ways to upgrade MySQL in different scenarios.

Why are MySQL upgrades needed? The reasons are many and include: access to new features, performance benefits, bug fixes… However, MySQL upgrades can be risky if not tested extensively beforehand with your application, because the process might break the application, prevent it from functioning properly, or cause performance issues following the upgrade. Moreover, I suggest keeping an eye on new releases of MySQL and Percona Server and checking what has changed in the most recent version. Perhaps the latest release has a fix for an issue that you have been experiencing.

Upgrading one major version via SQL dump:

Upgrading between major versions covers upgrading from Percona Server 5.1 to 5.5 or Percona Server 5.5 to 5.6, and the same applies to Oracle MySQL. First of all, upgrading between major versions is neither straightforward nor risk-free. Initially you should read the “Upgrading from Previous Series” documentation here and here. In that documentation, pay special attention to all of the sections marked “Incompatible Change” and check whether you may be affected by those changes. There might be configuration changes, variables renamed, a few older variables made obsolete and new variables introduced, so make sure that you adjust your my.cnf accordingly. For Percona Server specific changes please refer here and here for Percona Server 5.5 & Percona Server 5.6, respectively.

Now there are several possible approaches you may take, where one may be more feasible than another depending on the current replication topology and total data size, and one might also be safer than another.
Let me show you an upgrade procedure with an example upgrading from Percona Server 5.5 to Percona Server 5.6. In general, there are two types of MySQL upgrades:

- In place, where you use the existing datadir against the new MySQL major version, just running mysql_upgrade after the binaries are upgraded,
- SQL dump on the old version and then restore on the new version (using the mysqldump utility or alternatives like mydumper).

In general the second type is safer, but as you may expect a much slower MySQL upgrade process. Theoretically, the safest scenario is:

- Dump all user grants (using pt-show-grants: http://www.percona.com/doc/percona-toolkit/2.2/pt-show-grants.html)
- Dump all data (except the mysql database) from MySQL 5.5 into an SQL dump and restore it on MySQL 5.6
- Restore the user grants

Here’s a basic procedure (you should stop application writes before starting).

1) Capture users and permissions information. This will back up all your existing user privileges.

$ wget percona.com/get/pt-show-grants
$ perl pt-show-grants --user=root --ask-pass --flush > /root/grants.sql

2) Produce a logical dump of the 5.5 instance, excluding the mysql, information_schema and performance_schema databases:

$ mysql -BNe "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME NOT IN ('mysql', 'performance_schema', 'information_schema')" | tr '\n' ' ' > /root/dbs-to-dump.sql
$ mysqldump --routines --events --single-transaction --databases $(cat /root/dbs-to-dump.sql) > /root/full-data-dump.sql

3) Stop the 5.5 instance.

$ service mysql stop
$ /etc/init.d/mysql stop

4) Move the old datadir (assuming /var/lib/mysql; edit according to your setup):

$ mv /var/lib/mysql/ /var/lib/mysql-55

5) Install 5.6 (simply as you would do when not upgrading). If you don’t use a package manager (yum/apt-get), it is likely that you need to run mysql_install_db and mysql_upgrade.

6) Load the users back into the newly upgraded version of MySQL.

$ mysql -uroot < /root/grants.sql

7) Load the data back into the new version of MySQL.

$ mysql -e "SET GLOBAL max_allowed_packet=1024*1024*1024";
$ mysql -uroot -p --max-allowed-packet=1G < /root/full-data-dump.sql

At this point all tables have been re-created and loaded in MySQL 5.6, so every binary representation is native to MySQL 5.6. You’ve also completed the cleanest/most-stable upgrade path, and your application can resume service. For that reason it’s worth mentioning that this upgrade path is the same whether you are upgrading vanilla MySQL or Percona Server. Further, you may upgrade from Oracle MySQL to Percona Server, for example upgrading Oracle MySQL 5.5 to Percona Server 5.6. Again, the MySQL upgrade path as described would be the same, as Percona Server is a drop-in replacement for Oracle MySQL.

An “SQL dump” is also known as a logical backup. It is safer in the sense that, when restoring, all tables will be created using the format of the new MySQL binaries you’re using, which bypasses compatibility issues in general. Still, for large data sets (terabytes of data), this may be a very time-consuming approach. On the other hand, by dumping/reloading such large data sets, it is possible that you will be able to recover a lot of free space on disk, as the InnoDB tablespaces will be re-created from scratch, thus optimized and defragmented. If the data was often updated/deleted, the benefits may be significant.

Minor version MySQL upgrade within the same major version via in-place upgrade:

This applies to upgrading within the same series, e.g. MySQL 5.5.35 to MySQL 5.5.38 or Percona Server 5.6.14 to the latest Percona Server 5.6.20. This is known as an in-place upgrade, where you just install a newer binary package and then run the mysql_upgrade script, which checks and updates system tables if necessary. Still, with an in-place upgrade we highly recommend checking the release notes for new features, bug fixes, etc.
For Percona Server 5.5 and Percona Server 5.6, release notes can be found here and here respectively. For Percona Server we have additional documents describing details of upgrading Percona Server, with a focus on Percona-specific features; they can be found here and here. These also cover the complete in-place upgrade procedure with the yum/apt package manager. Also, to be on the safe side you can do the upgrade with a logical dump using the procedure described earlier via the mysqldump or mydumper program (mydumper does parallel backup and restore); the logical backup is the safest approach for the upgrade.

MySQL upgrade directly to the latest version, skipping one major version in between:

This includes upgrading from MySQL 5.0 to MySQL 5.5 by skipping version 5.1, or upgrading MySQL 5.1 to MySQL 5.6 by skipping version 5.5. Further, this also includes upgrading to MySQL 5.6 directly from MySQL 5.0, although there should be very few users still on MySQL 5.0. This also applies to Percona Server. For this topic, we will assume upgrading from Oracle MySQL or Percona Server 5.1 directly to version 5.6, skipping one major version (5.5) in between.

Before anything, this is a serious upgrade, and a huge step over one major MySQL version. That is, it’s risky. Upgrading by just updating binaries is not supported and it’s not safe to skip major versions in between, so you should never do this from 5.0->5.5 or 5.1->5.6, and surely not from 5.0->5.6. One problem is that not all changes between MySQL versions are backwards compatible. Some differences were introduced that may affect both how the data is handled and how the server behaves, including the SQL language and the MySQL server and storage engine internals. Another thing is that between MySQL 5.0 and 5.6, a number of default settings were changed, which may result in completely different, unexpected behavior.
For example, since MySQL 5.5 the default storage engine is InnoDB; since MySQL 5.6, InnoDB uses a separate tablespace for each table by default, and GTID replication was also introduced. But there are many more details which I won’t list here. All of those changes are described in the “Upgrading from Previous Series” documentation mentioned above.

It’s worth repeating that upgrading by skipping one major version is highly discouraged. Upgrading from MySQL 5.1 to 5.6 shouldn’t be done in one shot. Instead, I would suggest upgrading from version 5.1 to 5.5, and then from version 5.5 to 5.6, running mysql_upgrade at each step. That will cope with the changes in formats, as explained in the manual.

MySQL upgrade precautions:

MySQL upgrade precautions are an essential part of the upgrade itself. Before you upgrade, make sure you have thoroughly tested all application parts with the desired version of MySQL. This is especially needed for an upgrade between major versions, or if you are upgrading by skipping one major version in between (e.g. an upgrade from MySQL 5.1 to MySQL 5.6).

Make sure you read the release notes carefully and are aware of all the changes. You can find Oracle MySQL 5.5 and 5.6 release notes as follows:

http://dev.mysql.com/doc/relnotes/mysql/5.5/en/
http://dev.mysql.com/doc/relnotes/mysql/5.6/en/

Percona Server specific release notes for the same versions can be found here:

http://www.percona.com/doc/percona-server/5.5/release-notes/release-notes_index.html
http://www.percona.com/doc/percona-server/5.6/release-notes/release-notes_index.html

If you are planning to upgrade to Oracle MySQL 5.6 or Percona Server 5.6, I would recommend first checking for existing critical bugs.
Bugs you should be aware of:

http://bugs.mysql.com/bug.php?id=66546
http://bugs.mysql.com/bug.php?id=68953
http://bugs.mysql.com/bug.php?id=69444
http://bugs.mysql.com/bug.php?id=70922
http://bugs.mysql.com/bug.php?id=72794
http://bugs.mysql.com/bug.php?id=73820

Upgrade hierarchy:

This is yet another important aspect of any MySQL upgrade. You should plan your upgrade along an upgrade hierarchy. This is always recommended: upgrade your dev/QA servers first, then staging servers, before moving to production. In fact, you can keep spare upgraded instances with the desired versions of MySQL and test your application extensively against them.

Once you are happy with the MySQL upgrade on your test and staging servers, you can begin the MySQL upgrade on your production servers. In replication environments we highly recommend upgrading the MySQL slaves first (one by one) and then finally upgrading the MySQL master. In reality, you can upgrade one of the slaves first and run it for a few days to be on the safe side, all the while closely monitoring its performance. If you don’t have a replication setup, it may be worth creating a replica to test the newer version of MySQL on first. Once you are happy with the results, you can upgrade the remaining slaves and finally the master.

How Percona software helps you in a MySQL upgrade:

In any MySQL upgrade, Percona Toolkit comes to the rescue. Percona Toolkit contains a number of tools that help a great deal. pt-upgrade is one such tool. It allows you to test whether the new MySQL instance handles specific queries at least as fast as the old version. There may be some substantial differences, as the MySQL query optimizer has changed a lot between versions 5.1 and 5.6, and data statistics may be refreshed, hence the query plan may change. You can read further about optimizer changes in the manual. pt-query-digest is another great tool that might help you in the upgrade.
You can replay your slow query log against the existing and new desired MySQL versions for before-and-after query performance validation.

You can also benefit from Percona Cloud Tools for MySQL, a hosted service providing access to query performance insights for all MySQL users. You can sign up for free now because this service is in public beta. Percona Cloud Tools, among other things, allows you to visually check your queries’ performance after a MySQL upgrade.

It’s highly recommended to back up your data before your MySQL upgrade. Percona XtraBackup is free and open source (like all Percona software). It’s a hot backup tool which backs up your data online without sacrificing read/write ability on the database, and it will back up your data with minimal impact.

Last but not least, you will find this post pretty useful, too: “Upgrading MySQL.” It’s a few years old but still very relevant. And also take a look at this informative webinar, “Upgrading to MySQL 5.6: Best Practices.” Both are from Percona CEO Peter Zaitsev.

Conclusion: A MySQL upgrade might look like a simple task, but actually it’s not. I’ve tried to cover most of the MySQL upgrade scenarios that you will encounter in this post. Again, I recommend thoroughly testing your application parts before pushing it “live,” otherwise it may break your application or part of it, or degrade performance instead of gaining it. Finally, I recommend having a downgrade plan in place before the MySQL upgrade, just in case something goes wrong. Planning a proper downgrade procedure will minimize your app’s downtime when things go wrong. I’m looking forward to your comments and questions below. The post MySQL upgrade best practices appeared first on MySQL Performance Blog.
Some current MySQL Architecture writings
So, I’ve been looking around for a while (and a few times now) for any good resources that cover a bunch of MySQL architecture and technical details aimed towards the technically proficient but not MySQL literate audience. I haven’t really found anything. I mean, there’s the (huge and very detailed) MySQL manual, there’s the MySQL Internals manual (which is sometimes only 10 years out of date) and there’s various blog entries around the place. So I thought I’d write something explaining roughly how it all fits together and what it does to your system (processes, threads, IO etc).(Basically, I’ve found myself explaining this enough times in the past few years that I should really write it down and just point people to my blog). I’ve linked to things for more reading. You should probably read them at some point.
Years ago, there were many presentations on MySQL Architecture. I went to try and find some on YouTube and couldn’t. We were probably not cool enough for YouTube and the conferences mostly weren’t recorded. So, instead, I’ll just link to Brian on NoSQL – because it’s important to cover NoSQL as well.
So, here is a quick overview of executing a query inside a MySQL Server and all the things that can affect it. This isn’t meant to be complete… just a “brief” overview (of a few thousand words).
MySQL is an open source relational database server, the origins of which date back to 1979 with MySQL 1.0 coming into existence in 1995. It’s code that has some history and sometimes this really does show. For a more complete history, see my talk from linux.conf.au 2014: Past, Present and Future of MySQL (YouTube, Download).
At least as of writing, everything here applies to MariaDB and Percona Server too.
The MySQL Server runs as a daemon (mysqld). Users typically interact with it over a TCP or UNIX domain socket through the MySQL network protocol (of which multiple implementations exist under various licenses). Each connection causes the MySQL Server (mysqld) to spawn a thread to handle the client connection.
There are now several different thread-pool plugins that instead of using one thread per connection, multiplex connections over a set of threads. However, these plugins are not commonly deployed and we can largely ignore them. For all intents and purposes, the MySQL Server spawns one thread per connection and that thread alone performs everything needed to service that connection. Thus, parallelism in the MySQL Server is gotten from executing many concurrent queries rather than one query concurrently.
The MySQL Server will cache threads (the amount is configurable) so that it doesn’t have to have the overhead of pthread_create() for each new connection. This is controlled by the thread_cache_size configuration option. It turns out that although creating threads may be a relatively cheap operation, it’s actually quite time consuming in the scope of many typical MySQL Server connections.
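As a sketch, you can inspect and resize the thread cache at runtime (the value 64 below is arbitrary, not a recommendation):

```sql
-- Check the current thread cache size and how often new threads are being created
SHOW GLOBAL VARIABLES LIKE 'thread_cache_size';
SHOW GLOBAL STATUS LIKE 'Threads_created';

-- If Threads_created grows quickly relative to Connections, the cache is too small
SET GLOBAL thread_cache_size = 64;  -- 64 is arbitrary; tune to your connection churn
```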
Because the MySQL Server is a collection of threads, there’s going to be thread local data (e.g. connection specific) and shared data (e.g. cache of on disk data). This means mutexes and atomic variables. Most of the more advanced ways of doing concurrency haven’t yet made it into MySQL (e.g. RCU hasn’t yet and is pretty much needed to get 1 million TPS), so you’re most likely going to see mutex contention and contention on cache lines for atomic operations and shared data structures.
There are also various worker threads inside the MySQL Server that perform various functions (e.g. replication).
Until sometime in the 2000s, more than one CPU core was really uncommon, so the fact that there were many global mutexes in MySQL wasn’t really an issue. These days we have more reliable async networking and disk IO system calls, but because MySQL has such a long history, there are still global mutexes and there’s no hard and fast rule about how it does IO.
Over the past 10 years of MySQL development, it’s been a fight to remove the reliance on global mutexes and data structures controlled by them to attempt to increase the number of CPU cores a single mysqld could realistically use. The good news is that it’s no longer going to max out on the number of CPU cores you have in your phone.
So, you have a MySQL Client (e.g. the mysql client or something) connecting to the MySQL Server. Now, you want to enter a query. So you do that, say “SELECT 1;”. The query is sent to the server where it is parsed, optimized, executed and the result returns to the client.
Now, you’d expect this whole process to be incredibly clean and modular, like you were taught things happened back in university with several black boxes that take clean input and produce clean output that’s all independent data structures. At least in the case of MySQL, this isn’t really the case. For over a decade there’s been lovely architecture diagrams with clean boxes – the code is not like this at all. But this probably only worries you once you’re delving into the source.
The parser is a standard yacc one – there have been attempts to replace it over the years, none of which have stuck – so we have the butchered yacc one still. With MySQL 5.0, it exploded in size due to the addition of things like SQL2003 stored procedures, and the common opinion is that it’s rather bloated and was better in 4.1 and before for the majority of queries that large scale web peeps execute.
There is also this thing called the Query Cache – protected by a single global mutex. It made sense in 2001 for a single benchmark. It is a simple hash of the SQL statement coming over the wire to the exact result to send(2) over the socket back to a client. On a single CPU system where you ask the exact same query again and again without modifying the data it’s the best thing ever. If this is your production setup, you probably want to think about where you went wrong in your life. On modern systems, enabling the query cache can drop server performance by an order of magnitude. A single global lock is a really bad idea. The query cache should be killed with fire – but at least in the mean time, it can be disabled.
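Disabling it looks something like this (option names as of the MySQL 5.x era; note that query_cache_type may only be changeable with a restart on some versions, so set both in my.cnf as well):

```sql
-- Disable the query cache and free its memory
SET GLOBAL query_cache_size = 0;   -- release the cache memory
SET GLOBAL query_cache_type = 0;   -- stop caching and cache lookups entirely
```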
Normally, you just have the SQL progress through the whole process of parse, optimize, execute, results but the MySQL Server also supports prepared statements. A prepared statement is simply this: “Hi server, please prepare this statement for execution leaving the following values blank: X, Y and Z” followed by “now execute that query with X=foo, Y=bar and Z=42”. You can call execute many times with different values. Due to the previously mentioned not-quite-well-separated parse, optimize, execute steps, prepared statements in MySQL aren’t as awesome as in other relational databases. You basically only save re-parsing the query when you execute it with new parameters. More on prepared statements (from 2006) here. Unless you’re executing the same query many times in a single connection, server side prepared statements aren’t worth the network round trips.
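From the SQL interface, a server side prepared statement looks like this (the users table and columns are hypothetical):

```sql
PREPARE get_user FROM 'SELECT id, name FROM users WHERE id = ?';
SET @uid = 42;
EXECUTE get_user USING @uid;  -- re-executable with different values, skipping the parse step
DEALLOCATE PREPARE get_user;  -- forget this and the server holds the statement until disconnect
```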
The absolute worst thing in the entire world is MySQL server side prepared statements. It moves server memory allocation to be the responsibility of the clients. This is just brain dead stupid and a reason enough to disable prepared statements. In fact, just about every MySQL client library for every programming language ever actually fakes prepared statements in the client rather than trust every $language programmer to remember to close their prepared statements. Open many client connections to a MySQL Server and prepare a lot of statements and watch the OOM killer help you with your DoS attack.
So now that we’ve connected to the server, parsed the query (or done a prepared statement), we’re into the optimizer. The optimizer looks at a data structure describing the query and works out how to execute it. Remember: SQL is declarative, not procedural. The optimizer will access various table and index statistics in order to work out an execution plan. It may not be the best execution plan, but it’s one that can be found within reasonable time. You can find out the query plan for a SELECT statement by prepending it with EXPLAIN.
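For example (table and column names hypothetical):

```sql
-- Show the chosen execution plan instead of executing the query
EXPLAIN SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42;
-- Key columns to read: "type" (access method), "key" (index chosen), "rows" (estimate)
```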
The MySQL optimizer is not the be all and end all of SQL optimizers (far from it). A lot of MySQL performance problems are due to complex SQL queries that don’t play well with the optimizer, and there’s been various tricks over the years to work around deficiencies in it. If there’s one thing the MySQL optimizer does well it’s making quick, pretty good decisions about simple queries. This is why MySQL is so popular – fast execution of simple queries.
To get table and index statistics, the optimizer has to ask the Storage Engine(s). In MySQL, the actual storage of tables (and thus the rows in tables) is (mostly) abstracted away from the upper layers. Much like a VFS layer in an OS kernel, there is (for some definition) an API abstracting away the specifics of storage from the rest of the server. The API is not clean and there are a million and one layering violations and exceptions to every rule. Sorry, not my fault.
Table definitions are in FRM files on disk, entirely managed by MySQL (not the storage engines) and for your own sanity you should not ever look into the actual file format. Table definitions are also cached by MySQL to save having to open and parse a file.
Originally, there was MyISAM (well, and ISAM before it, but that’s irrelevant now). MyISAM was non-transactional but relatively fast, especially for read heavy workloads. It only allowed one writer although there could be many concurrent readers. MyISAM is still there and used for system tables. The current default storage engine is called InnoDB. It’s all the buzzwords like ACID and MVCC. Just about every production environment is going to be using InnoDB. MyISAM is effectively deprecated.
InnoDB originally was its own independent thing and has (to some degree) been maintained as if it kind of was. It is, however, not buildable outside a MySQL Server anymore. It also has its own scalability issues. A recent victory was splitting the kernel_mutex, which was a mutex that protected far too much internal InnoDB state and could be a real bottleneck where NRCPUs > 4.
So, back to query execution. Once the optimizer has worked out how to execute the query, MySQL will start executing it. This probably involves accessing some database tables. These are probably going to be InnoDB tables. So, MySQL (server side) will open the tables, looking up the MySQL Server table definition cache and creating a MySQL Server side table share object which is shared amongst the open table instances for that table. See here for scalability hints on these (from 2009). The opened table objects are also cached – table_open_cache. In MySQL 5.6, there is table_open_cache_instances, which splits the table_open_cache mutex into table_open_cache_instances mutexes to help reduce lock contention on machines with many CPU cores (> 8 or >16 cores, depending on workload).
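A quick sketch of checking how the table open cache is doing on a running server:

```sql
SHOW GLOBAL VARIABLES LIKE 'table_open_cache%';  -- includes table_open_cache_instances on 5.6+
SHOW GLOBAL STATUS LIKE 'Opened_tables';         -- rising quickly suggests the cache is too small
```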
Once tables are opened, there are various access methods that can be employed. Table scans are the worst (start at the start and examine every row). There’s also index scans (often seeking to part of the index first) and key lookups. If your query involves multiple tables, the server (not the storage engine) will have to do a join. Typically, in MySQL, this is a nested loop join. In an ideal world, this would all be really easy to spot when profiling the MySQL server, but in reality, everything has funny names like rnd_next.
As an aside, any memory allocated during query execution is likely done as part of a MEM_ROOT – essentially a pool allocator, likely optimized for some ancient libc on some ancient Linux/Solaris, and it just so happens to still kinda work okay. There are some odd server configuration options for (some of the) MEM_ROOTs that get exactly no airtime on what they mean or what changing them will do.
InnoDB has its own data dictionary (separate to FRM files) which can also be limited in current MySQL (important when you have tens of thousands of tables) – which is separate to the MySQL Server table definitions and table definition cache.
But anyway, you have a number of shared data structures about tables and then a data structure for each open table. To actually read/write things to/from tables, you’re going to have to get some data to/from disk.
InnoDB tables can be stored either in one giant table space or file-per-table. InnoDB database pages are 16KB (although this is now configurable). Database pages are cached in the InnoDB Buffer Pool, and the buffer pool size should typically be about 80% of system memory. InnoDB will use a (configurable) method to flush. Typically, it will all be O_DIRECT (it’s configurable) – which is why “just use XFS” is step 1 in IO optimization – the per-inode mutex in ext3/ext4 just doesn’t make IO scale.
InnoDB will do some of its IO in the thread that is performing the query and some of it in helper threads using native Linux async IO (again, that’s configurable). With luck, all of the data you need to access is in the InnoDB buffer pool – where database pages are cached. There is an innodb_buffer_pool_instances configuration option which will split the buffer pool into several instances to help reduce lock contention on the InnoDB buffer pool mutex.
All InnoDB tables have a clustered index. This is the index by which the rows are physically sorted. If you have an INT PRIMARY KEY on your InnoDB table, then a row with that primary key value of 1 will be physically close to the row with primary key value 2 (and so on). Due to the intricacies of InnoDB page allocation, there may still be disk seeks involved in scanning a table in primary key order.
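As a sketch (table and column names hypothetical), in the table below rows are physically ordered by id, so a range scan over id is mostly sequential, while reading rows in created_at order means bouncing around the clustered index via the secondary index:

```sql
CREATE TABLE events (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- clustered index: rows stored in id order
  created_at DATETIME NOT NULL,
  payload VARCHAR(255),
  KEY idx_created (created_at)                 -- secondary index: leaf entries point back via the PK
) ENGINE=InnoDB;
```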
Every page in InnoDB has a checksum. There was an original algorithm, then there was a “fast” algorithm in some forks and now we’re converging on crc32, mainly because Intel implemented CPU instructions to make that fast. In write heavy workloads, this used to show up pretty heavily in profiles.
InnoDB has both REDO and UNDO logging to keep both crash consistency and provide consistent read views to transactions. These are also stored on disk, the redo logs being in their own files (size and number are configurable). The larger the redo logs, the longer it may take to run recovery after a crash. The smaller the redo logs, the more trouble you’re likely to run into with large or many concurrent transactions.
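The relevant knobs look like this; in MySQL of this era they’re static options, so you set them in my.cnf and they take effect on restart:

```sql
SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';      -- size of each redo log file
SHOW GLOBAL VARIABLES LIKE 'innodb_log_files_in_group'; -- how many redo log files there are
```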
If your query performs writes to database tables, those changes are written to the REDO log and then, in the background, written back into the table space files. There are configuration parameters for how much of the InnoDB buffer pool can be filled with dirty pages before they have to be flushed out to the table space files.
In order to maintain Isolation (I in ACID), InnoDB needs to assign a consistent read view to a new transaction. Transactions are either started explicitly (e.g. with BEGIN) or implicitly (e.g. just running a SELECT statement). There has been a lot of work recently in improving the scalability of creating read views inside InnoDB. A bit further in the past there was a lot of work in scaling InnoDB for greater than 1024 concurrent transactions (limitations in UNDO logging).
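A minimal sketch of explicit transactions and read views, assuming a hypothetical orders table:

```sql
-- With REPEATABLE READ, the read view is assigned at the first consistent read
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT COUNT(*) FROM orders;  -- snapshot taken here...
SELECT COUNT(*) FROM orders;  -- ...so this sees the same data even if others commit in between
COMMIT;

-- Or take the snapshot immediately, before any read:
START TRANSACTION WITH CONSISTENT SNAPSHOT;
```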
Fancy things that make InnoDB generally faster than you’d expect are the Adaptive Hash Index and change buffering. There are, of course, scalability challenges with these too. It’s good to understand the basics of them however and (of course), they are configurable.
If you end up reading or writing rows (quite likely) there will also be a translation between the InnoDB row format(s) and the MySQL Server row format(s). The details of which are not particularly interesting unless you’re delving deep into code or wish to buy me beer to hear about them.
Query execution may need to get many rows from many tables, join them together, sum things together or even sort things. If there’s an index with the sort order, it’s better to use that. MySQL may also need to do a filesort (sort rows, possibly using files on disk) or construct a temporary table in order to execute the query. Temporary tables are either using the MEMORY (formerly HEAP) storage engine or the MyISAM storage engine. Generally, you want to avoid having to use temporary tables – doing IO is never good.
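EXPLAIN’s Extra column is where filesorts and temporary tables show up (table and column names hypothetical):

```sql
EXPLAIN SELECT customer_id, SUM(total)
FROM orders
GROUP BY customer_id
ORDER BY SUM(total) DESC;
-- "Using temporary; Using filesort" in Extra means a temp table plus a sort pass
```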
Once you have the results of a query coming through, you may think that’s it. However, you may also be part of a replication hierarchy. If so, any changes made as part of that transaction will be written to the binary log. This is a log file maintained by the MySQL Server (not the storage engines) of all the changes to tables that have occurred. This log can then be pulled by other MySQL servers and applied, making them replication slaves of the master MySQL Server.
We’ll ignore the differences between statement based replication and row based replication as they’re better discussed elsewhere. Being part of replication means you get a few extra locks and an additional file or two being written. The binary log (binlog for short) is a file on disk that is appended to until it reaches a certain size and is then rotated. Writes to this file vary in size (along with the size of transactions being executed). The writes to the binlog occur as part of committing the transaction (and the crash safety between writing to the binlog and writing to the storage engine are covered elsewhere – basically: you don’t want to know).
If your MySQL Server is a replication slave, then you have a thread reading the binary log files from another MySQL Server and then another thread (or, in newer versions, threads) applying the changes.
If the slow query log or general query log is enabled, they’ll also be written to at various points – and the current code for this is not optimal; there be (yes, you guessed it) global mutexes.
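Both logs can be toggled at runtime; a quick sketch (the 1 second threshold is arbitrary):

```sql
SET GLOBAL slow_query_log = 1;   -- enable the slow query log at runtime
SET GLOBAL long_query_time = 1;  -- log queries taking longer than 1 second
SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';  -- where it's being written
```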
Once the results of a query have been sent back to the client, the MySQL Server cleans things up (frees some memory) and prepares itself for the next query. You probably have many queries being executed simultaneously, and this is (naturally) a good thing.
There… I think that’s a mostly complete overview of all the things that can go on during query execution inside MySQL.