Our Bacula server has started failing when backing up its own database. This is a standard job called BackupCatalog. It showed an error like so:
27-Feb 23:29 bacula-dir JobId 6545: shell command: run BeforeJob "/etc/bacula/scripts/<a href="http://make_catalog_backup.pl" target="_blank">make_<wbr />catalog_backup.pl</a> MyCatalog" 27-Feb 23:35 bacula-dir JobId 6545: BeforeJob: mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table `File` at row: 2164125 27-Feb 23:35 bacula-dir JobId 6545: BeforeJob: mysqldump: Couldn't execute 'show table status like 'FileSet'': MySQL server has gone away (2006) ....
Reading the error, it would appear to be a mysqldump problem, rather than a Bacula configuration problem. The mysqldump process was running out of resources. This was because the mysql server wasn’t configured correctly and the server didn’t have enough memory or cpu. I run Bacula on a virtual server and it was therefore quite easy to give an extra two gigs of memory and some more CPU time.
However, to fix the mysql server configuration, I adjusted the following settings in MySQL. I did this by creating a file in the /etc/mysql/conf.d/ directory called bacula.cnf with the following settings and then restarted the server.
[mysqld] wait_timeout = 86400 max_allowed_packet=32M innodb_buffer_pool_size=1G sudo service mysql restart
Install mysqltuner
You might have different mysql configuration issues and therefore have different configurations to change. I would highly recommend mysqltuner to help you diagnose problem. It is designed to analysis your MySQL server and recommend options which could be changed based on your server usages.
Note: Your server needs to be running for over 24 hours before it’s recommendations are useful. So, if you have just restarted the server, it probably won’t be very useful.
apt-get install mysqltuner
Then run:
mysqltuner
And you get a lovely report like so. Note: The reports shown below is for server that has only been up for 15 minutes and the recommendations can’t really be trusted.
>> MySQLTuner 1.0.1 - Major Hayden <major@mhtx.net> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.5.35-0ubuntu0.12.04.2 [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in InnoDB tables: 413M (Tables: 25) [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17) [!!] Total fragmented tables: 25 -------- Performance Metrics ------------------------------------------------- [--] Up for: 15m 36s (24K q [26.087 qps], 56 conn, TX: 353K, RX: 4M) [--] Reads / Writes: 0% / 100% [--] Total buffers: 192.0M global + 2.7M per thread (151 max threads) [OK] Maximum possible memory usage: 597.8M (10% of installed RAM) [OK] Slow queries: 0% (0/24K) [OK] Highest usage of available connections: 1% (3/151) [OK] Key buffer size / total MyISAM indexes: 16.0M/98.0K [!!] Key buffer hit rate: 80.0% (5 cached / 1 reads) [!!] Query cache efficiency: 17.0% (29 cached / 171 selects) [OK] Query cache prunes per day: 0 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 3 sorts) [OK] Temporary tables created on disk: 19% (54 on disk / 275 total) [OK] Thread cache hit rate: 94% (3 created / 56 connections) [OK] Table cache hit rate: 24% (66 open / 273 opened) [OK] Open file limit used: 4% (48/1K) [OK] Table locks acquired immediately: 100% (24K immediate / 24K locks) [!!] InnoDB data size / buffer pool: 413.7M/128.0M -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries Variables to adjust: query_cache_limit (> 1M, or use smaller result sets) innodb_buffer_pool_size (>= 413M)
Top
I recommend using Top to diagnose the load problems on your server. For me, it was clear from the basic numbers that the server was using alot of swap and lots of cpu.
SAR
Finally, I recommend used the excellent sysstat / sar tools to see the history of the CPU and the iowait on the server.
I hope this helps someone.