Home News Feeds Planet MySQL
Newsfeeds
Planet MySQL
Planet MySQL - http://www.planetmysql.org/

  • Careful how you monitor MySQL
    I was recently struck by a problem which is unusual. In order to keep an eye on the database server I use nagios, cacti, merlin and some local scripts to monitor the database instance and ensure that it is working properly.  That normally works fine.  The different monitor processes do various things, one of which is to monitor the replication status of a slave, and warn me if the replication is not working or if it’s behind. This is done with the command SHOW SLAVE STATUS. The server I was looking at runs some large local batch jobs aggregating data. Unfortunately, I was experiencing that replication was interfering with these batch jobs so decided to see if things would perform better if I stopped replication and let the batch jobs complete, restarting replication afterwards. So the command STOP SLAVE was sent to the server, and this took some time. In the meantime SHOW SLAVE STATUS hangs. The STOP SLAVE command waits for the SQL replication thread to finish it’s task but that thread was waiting on the tables being used by the batch process, which was already running. The consequence of this was that STOP SLAVE waited, all calls to SHOW SLAVE STATUS blocked and with nagios, cacti, merlin and the local scripts all doing periodic SHOW SLAVE STATUS commands which would hang I ended up with the mysqld running out of user connections.  Nasty! While I see if MySQL can do something about the behaviour of stopping the slave and showing the slave status I’m going to implement grants which limit the monitor users so they are only allowed to have a user concurrent connections open.  This is done using the syntax GRANT whatever ON whereever TO some_user WITH MAX_USER_CONNECTIONS 5; 5 looks like a good number for nagios, merlin and the local scripts and 10 is probably sufficient as I collect a lot of graph data for the mysql server and the cacti requests are often done in parallel. So if you monitor MySQL it may be worth you applying some sort of connection limit on your monitoring scripts, if not on other database users.

  • mycheckpoint (rev. 190): HTTP server; interactive charts
    Revision 190 of mycheckpoint, a MySQL monitoring solution, has been released. New and updated in this revision: HTTP server: mycheckpoint can now act as a web server. Point your browser and start browsing through HTML reports. See mock up demo. Interactive charts: HTML line charts are now interactive, presenting with accurate data as you move over them. See sample. Enhanced auto-deploy: now auto-recognizing failed upgrades. Reduced footprint: much code taken out of the views, leading to faster loading times. Better configuration file use: now supporting all command line options in config file. Remote host monitoring accessibility: now supporting complete configurable accessibility details. Bug fixes: thanks to the bug reporters! mycheckpoint is free, simple, easy to use (now easier with HTTP server) and useful. I encourage you to try it out: even compared with other existing and emerging monitoring tools, I believe you will find it a breeze; it’s low impact and lightness appealing; it’s alerts mechanism assuring; its geeky SQL-based nature with ability to drill down to fine details — geeky-kind-of-attractive. </encouragement> HTTP server You can now run mycheckpoint in http mode: bash$ mycheckpoint http mycheckpoint will listen on port 12306, and will present you with easy browsing through the reports of your mycheckpoint databases. The http server automatically detects those schemata used by mycheckpoint, and utilizes the existing HTML views, integrating them into the greater web framework. While in http mode, mycheckpoint does nothing besides serving web pages. It does not actively exercise monitoring: you must still use the usual cron jobs or other scheduled tasks by which you invoke mycheckpoint for monitoring. The http server is directed at a single MySQL server, as with the following example: bash$ mycheckpoint --host=slave1.localdomain --port=3306 --http-port=12306 http It is assumed that this server has the monitoring schemata. See mock up demo. The demo uses presents with real output from a mycheckpoint HTTP server; I haven’t got the means to put up a live demo. Interactive charts The openark line charts, used in the HTML reports, are now interactive. As you scroll over, the legend presents you with series values. No more “I have this huge spike once every 4 hours, which reduces all other values to something that looks like zero but is actually NOT”. Hover, and see the real values. See sample. Enhanced auto-deploy The idea with mycheckpoint is that it should know how to self upgrade the schema on version upgrade (much like automatic WordPress upgrades). mycheckpoint does bookkeeping of installed versions within the database, and upgrades by simple comparison. It now, following a couple of reported bugs, also recognizes failure of partial, failed upgrades. This adds to the automation of mycheckpoint‘s installation. Reduced footprint Some of mycheckpoint‘s views are complicated, and lead to a large amount of code in view declaration. This leads to increased table definition size (large .frm files). There has been some work to reduce this size where possible. Work is still ongoing, but some 30% has been taken off already. This leads to faster table (view) load time. Better configuration file use Any argument supported on the command line is now also supported in the config style. Much like is handled with MySQL. For example, one can issue: mycheckpoint --monitored-host=sql02.mydb.com  --monitored-user=monitor --monitored-password=123456 But now also: mycheckpoint With the following in /etc/mycheckpoint.cnf: [mycheckpoint] monitored_host = sql02.mydb.com monitored_user = monitor monitored_password = 123456 Rules are: If an option is specified on command line, it takes precedence over anything else. Otherwise, if it’s specified in the configuration file, value is read from file. Otherwise use default value is used. On command line, option format is xxx-yyy-zzz: words split with dash/minus character. On configuration file, option format is xxx_yyy_zzz: words split with underscore. Unlike MySQL configuration format, dashes cannot be used. If an option is specified multiple times on configuration file — well — I have the answer, but I won’t tell. Just don’t do it. It’s bad for your health. Future plans Work is going on. These are the non-scheduled future tasks I see: Monitoring InnoDB Plugin & XtraDB status. A proper man page. Anything else that interests me & the users. Try it out Try out mycheckpoint. It’s a different kind of monitoring solution. Simple monitoring (charting) is immediate. For more interesting results you will need basic SQL skills, and in return you’ll get a lot of power under your hands. Download mycheckpoint here Visit the project’s homepage Browse the documentation Report bugs mycheckpoint is released under the New BSD License.

  • Debugging Communication Link Failure exceptions in Connector/J
    Have you seen error messages similar to the following: Communications link failure – Last packet sent to the server was X ms ago. Judging from the forums, many people have had problems with this.  Here’s a brief overview of the causes, and potential solutions. Generally speaking, this error suggests that the network connection has been closed. There can be several root causes: Firewalls or routers may clamp down on idle connections (the MySQL client/server protocol doesn’t ping). The MySQL Server may be closing idle connections which exceed the wait_timeout or interactive_timeout threshold There’s a couple of useful diagnostic details which can be useful.  For starters, when a recent (5.1.13) version of Connector/J is used, you should see additional details around both the last packet sent and received.  Older versions may simply indicate the last time a packet was sent to the server, which is frequently zero ms ago.  That’s not terribly useful, and it may be that you just sent a packet, but haven’t received a packet from the server for 12 hours.  Knowing how long it’s been since Connector/J last received a packet from the server is useful information, so if you are not seeing this in your exception message, update your driver. The second useful diagnostic detail shows up when Connector/J notices that the time a packet was last sent/received exceeds the wait_timeout or interactive_timeout threshold.  It will attempt to notify you of this in the exception message. The following can be helpful in avoiding such problems, but ultimately network connections can be volatile: Ensure connections are valid when checked out of connection pool (use query which starts with “/* ping */” *exactly* to execute lightweight ping instead of full query) Minimize duration a Connection object is left idle while other application logic is executed Explicitly validate Connection before using after being left idle for extended period of time Ensure wait_timeout and interactive_timeout are set sufficiently high Ensure tcpKeepalive is enabled Ensure that any configurable firewall or router timeout setting accounts for maximum expected idle connection time. I’ve seen exception messages which indicate Connections being used after sitting idle for hours – sometimes days.  If you do this, make sure that you are explicitly testing the connection before using it after lengthy idle periods.  Network connections fail, and applications need to be prepared to handle that.  But expecting connections to survive extended periods where left idle and work magically when used again hours later is just asking for trouble.

  • Whoa! MyQuery 3.3.0 Beta Released at last!
    After a long wait, today I release MyQuery 3.3.0. For you who haven't seen MyQuery before, this is a Windows based interactive query tool for MySQL (sorry no Mac or Linux support this time) . The emphasis is much on script editing and running, and on the needs of the DBA, more than on the end user. Also, for the database designer, there is some cool stuff in here.MyQuery supports running scripts up to a specific point in the script, or starting from a specific point and supports a kind of "edit and continue", so you can develop a script, run it til it breaks, correct the error, and then continue where you left off.That is not all, after all these releases MyQuery has gained quite a few useful features. Script editing is done with the Scintilla color coded editor for examples, and there are tools for managing database objects such as tables, views and events.Every release of MyQuery has a theme, and MyQuery 3.3.0 has a very distinct theme: Extensibility. There is now support for adding your own favoutite SQL SELECT or Scripts to a user defined tools menu. And that is not all, there are many more types of tool that you can define, from opening a web link in a browser to running code in a specified DLL, using the new MyQuery Plugin API.The main new features in summary are:The User defined tools featureThe MyQuery Plugin APIA login dialog which now supports saved login settings for different serversMySQL Cluster / NDB monitoring through User tool pluginInnoDB Lock tree monitoring through a User tool pluginSeveral bugfixesThis is still a beta version. It is hence feature complete, but I know there is a bug or two, in particular the plugin API (which is an advanced feature which required a fair amount of refactoring to work). Feedback is welcome!I will write a few more blogposts here to describe some 3.3 details, such as the InnoDB Lock tree monitor, the NDB Monitor, the Plugin API and things like this, but for now, just download the thing and enjoy! As usual it's available on Sourceforge:http://sourceforge.net/projects/myquery/files/myquery//Karlsson

  • Beware of svctm in Linux’s iostat
    I’ve been studying the source of iostat again and trying to understand whether all of its calculations I explained here are valid and correct. Two of the columns did not seem consistent to me. The await and svctm columns are supposed to measure the average time from beginning to end of requests including device queueing, and actual time to service the request on the device, respectively. But there’s really no instrumentation to support that distinction. The device statistics you can get from the kernel do not provide timing information about device queueing, only a) begin-to-end timing of completed requests and b) the time accumulated by requests that haven’t yet completed. I concluded that the await is correct, but the svctm cannot be. I just looked at the sysstat website, and it has been updated recently to warn about this, too: svctm The average service time (in milliseconds) for I/O requests that were issued to the device. Warning! Do not trust this field any more. This field will be removed in a future sysstat version. Related posts:How Linux iostat computes its resultsHow to find per-process I/O statistics on LinuxSoutheast Linux Fest is around the cornerRecap of Southeast Linux Fest 2009How to auto-mount removable devices in GNU/Linux

Valid XHTML and CSS.