Two-node MariaDB-galera-cluster
Caveats
- arbiter - 10.10.10.1
- node1 - 10.10.10.10
- node2 - 10.10.10.11
Installing the nodes
Update /etc/hosts
To make sure all the machines can resolve each other, perform this on all 3 hosts (the 2 active mysql nodes and the machine the arbiter resides on). Update the hostnames and addresses accordingly.
$ cat << EOF >> /etc/hosts
10.10.10.1 arbiter
10.10.10.10 node1
10.10.10.11 node2
EOF
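To quickly confirm that the names resolve on each host you can do a lookup (just one way to check; a plain ping works too). The output should look something like this:
$ getent hosts arbiter node1 node2
10.10.10.1      arbiter
10.10.10.10     node1
10.10.10.11     node2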
Set up APT to use the MariaDB repositories
$ apt-get install python-software-properties
$ apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db
$ cat > /etc/apt/sources.list.d/mariadb.list <<-EOF
deb http://mirror.1000mbps.com/mariadb/mariadb-5.5.34/repo/ubuntu precise main
deb-src http://mirror.1000mbps.com/mariadb/mariadb-5.5.34/repo/ubuntu precise main
EOF
Perform apt-get update on all machines
$ apt-get update
Installing mariadb-galera-cluster on the mysql nodes
On both mysql nodes install mariadb-galera-server and galera. On the machine that's going to be the arbiter only galera is necessary.
$ apt-get install mariadb-galera-server galera
Note: when performing these steps Ubuntu had newer versions of certain packages, which generates installation errors. Get the MariaDB versions and install them explicitly:
$ apt-cache show mysql-common | grep Version
$ apt-cache show libmysqlclient18 | grep Version
$ apt-get install mariadb-galera-server galera mariadb-client-5.5 libmysqlclient18=<version> mysql-common=<version>
When asked for a root password you can use the same password on both nodes (the password from the instance we bootstrap from will be the one that ends up being used).
Verify that mysql started successfully on both machines by performing a ping (for example):
$ mysqladmin --defaults-file=/etc/mysql/debian.cnf ping
mysqld is alive
All good, now shut them down!
$ service mysql stop
If all is well, move on to the next step.
Installing the Galera Arbitrator
On the machine that's going to be the arbiter we install galera from the MariaDB repositories we added earlier.
$ apt-get install galera
Test that garbd is available: run the command and check that it gives a FATAL error (which is normal, because we didn't set any options).
$ garbd
FATAL: Group address not specified: 89 (Destination address required)
at garb/garb_config.cpp:Config():94
All good! Moving on…
Configuring the mysql hosts
Below is an example config for the my.cnf you can use. The values you might want to take a good look at are innodb_buffer_pool_size and innodb_log_file_size.
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld_safe]
nice = 0
socket = /var/run/mysqld/mysqld.sock

[mysqld]
basedir = /usr
bind-address = 0.0.0.0
binlog_format = ROW
character_set_server = utf8
collation_server = utf8_general_ci
datadir = /var/lib/mysql
default-storage-engine = InnoDB
expire_logs_days = 10
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_size = 1G
innodb_log_file_size = 512M
innodb_doublewrite = 1
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_lock_wait_timeout = 60
innodb_locks_unsafe_for_binlog = 1
innodb_stats_on_metadata = 0
key_buffer = 256M
lc-messages-dir = /usr/share/mysql
lock_wait_timeout = 300
max_allowed_packet = 128M
max_binlog_size = 128M
max_connections = 64
myisam-recover = BACKUP
myisam_sort_buffer_size = 64M
net_buffer_length = 8K
open-files-limit = 65535
pid-file = /var/run/mysqld/mysqld.pid
port = 3306
query_cache_limit = 8M
query_cache_size = 16M
read_buffer_size = 8M
read_rnd_buffer_size = 8M
skip-external-locking
socket = /var/run/mysqld/mysqld.sock
sort_buffer_size = 16M
table_cache = 2M
table_definition_cache = 65535
table_open_cache = 65535
thread_cache_size = 8
thread_concurrency = 8
tmpdir = /tmp
user = mysql

[mysqldump]
max_allowed_packet = 16M
quick
quote-names

[mysql]

[isamchk]
key_buffer = 256M
read_buffer = 16M
sort_buffer_size = 256M
write_buffer = 16M

!includedir /etc/mysql/conf.d/
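As a rough starting point for innodb_buffer_pool_size (this rule of thumb is my own suggestion, not part of the original setup): on a dedicated database host you can give InnoDB the bulk of the available RAM. A quick sketch to get a value to start from:
$ awk '/MemTotal/ { printf "suggested innodb_buffer_pool_size = %dM\n", ($2 / 1024) * 0.6 }' /proc/meminfo
Also keep in mind that (at least on 5.5) changing innodb_log_file_size later means stopping mysql and moving the old ib_logfile* files out of the way before starting it again.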
In addition to the normal config we also need a wsrep config. Make sure node1 uses gcomm://node2 and the other way around. Also change the name to something useful that identifies your cluster.
[mysqld]
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://node1?pc.wait_prim=no"
wsrep_sst_method=rsync
wsrep_cluster_name="usefull_name"
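The snippet above points at node1, so it belongs on node2. On node1 the same file simply points back at node2, as mentioned above; everything else stays identical:
[mysqld]
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://node2?pc.wait_prim=no"
wsrep_sst_method=rsync
wsrep_cluster_name="usefull_name"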
Bootstrapping and joining the cluster
Now it's time to initialize the cluster. For this we need one instance that will bootstrap the cluster. We use the mysqld command with an extra option for this. Run it on either of the 2 mysql nodes.
$ mysqld --wsrep_cluster_address=gcomm://
The terminal will be filled with startup output (normally this goes to syslog). Once this is done, go ahead and start mysql on the other node.
$ service mysql start
This time all the output will go to syslog; verify there that everything went well. On the bootstrapping node you should see new output stating that a machine has joined the cluster. Members should be at 2/2 (joined/total):
...
members = 2/2 (joined/total),
...
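Instead of scrolling through syslog you can also grep for the membership lines, or ask the server directly on the bootstrapping node (the exact log wording can differ between versions, so treat the grep pattern as a sketch):
$ grep -i "members" /var/log/syslog | tail -n 2
$ mysql --defaults-file=/etc/mysql/debian.cnf -e "SHOW STATUS LIKE 'wsrep_cluster_size'"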
Now, to test if all is well we need the contents of the debian.cnf file from the bootstrapping node. Perform an scp or copy the contents to the newly added node's debian.cnf:
$ scp root@node1:/etc/mysql/debian.cnf /etc/mysql/debian.cnf
Verify that mysql is working on the newly added node
$ mysql --defaults-file=/etc/mysql/debian.cnf
You should be presented with a nice MariaDB CLI:
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

MariaDB [(none)]>
Also check the wsrep status variables:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | random-generated-uuid                |
| wsrep_protocol_version     | 4                                    |
| wsrep_last_committed       | 5                                    |
| wsrep_replicated           | 1                                    |
| wsrep_replicated_bytes     | 401                                  |
| wsrep_received             | 7                                    |
| wsrep_received_bytes       | 1111                                 |
| wsrep_local_commits        | 0                                    |
| wsrep_local_cert_failures  | 0                                    |
| wsrep_local_bf_aborts      | 0                                    |
| wsrep_local_replays        | 0                                    |
| wsrep_local_send_queue     | 0                                    |
| wsrep_local_send_queue_avg | 0.250000                             |
| wsrep_local_recv_queue     | 0                                    |
| wsrep_local_recv_queue_avg | 0.000000                             |
| wsrep_flow_control_paused  | 0.000000                             |
| wsrep_flow_control_sent    | 0                                    |
| wsrep_flow_control_recv    | 0                                    |
| wsrep_cert_deps_distance   | 1.000000                             |
| wsrep_apply_oooe           | 0.000000                             |
| wsrep_apply_oool           | 0.000000                             |
| wsrep_apply_window         | 1.000000                             |
| wsrep_commit_oooe          | 0.000000                             |
| wsrep_commit_oool          | 0.000000                             |
| wsrep_commit_window        | 1.000000                             |
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
| wsrep_cert_index_size      | 2                                    |
| wsrep_causal_reads         | 0                                    |
| wsrep_incoming_addresses   | ,10.10.10.11:3306,10.10.10.10:3306   |
| wsrep_cluster_conf_id      | 29                                   |
| wsrep_cluster_size         | 2                                    |
| wsrep_cluster_state_uuid   | random-generated-uuid                |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 2                                    |
| wsrep_provider_name        | Galera                               |
| wsrep_provider_vendor      | Codership Oy                         |
| wsrep_provider_version     | 23.2.4(r147)                         |
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+
40 rows in set (0.03 sec)

MariaDB [(none)]>
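The values worth keeping an eye on here are wsrep_cluster_size (2 at this point), wsrep_local_state_comment (should be Synced) and wsrep_ready (should be ON). You can filter for just those instead of reading the whole list:
MariaDB [(none)]> SHOW STATUS WHERE Variable_name IN ('wsrep_cluster_size', 'wsrep_local_state_comment', 'wsrep_ready');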
All good, let's add the arbiter to the mix
On the arbiter node perform the following command (make sure to use the correct cluster name):
$ garbd -a gcomm://10.10.10.10,10.10.10.11?pc.wait_prim=no -g usefull_name
Keep an eye on the syslog from the recently joined mysql node and the bootstrapping node. Total members should jump to 3/3
...
members = 3/3 (joined/total),
...
If all is well, stop the garbd process and edit the garb defaults file
$ cat > /etc/default/garb <<-EOF
# Copyright (C) 2012 Codership Oy
# This config file is to be sourced by garb service script.
# A space-separated list of node addresses (address[:port]) in the cluster
GALERA_NODES="10.10.10.10:4567 10.10.10.11:4567"
# Galera cluster name, should be the same as on the rest of the nodes.
GALERA_GROUP="usefull_name"
# Optional Galera internal options string (e.g. SSL settings)
# see http://www.codership.com/wiki/doku.php?id=galera_parameters
GALERA_OPTIONS="pc.wait_prim=no"
# Log file for garbd. Optional, by default logs to syslog
LOG_FILE="/var/log/garbd.log"
EOF
Now you can start garbd with
$ service garbd start
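To check that garbd actually came up and joined, have a look at the log file we configured above (or just check the process list):
$ ps aux | grep garbd | grep -v grep
$ tail -n 20 /var/log/garbd.log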
That's it! Now stop mysqld on the bootstrapping node. Also make sure that the rsync daemon gets stopped; otherwise stop that as well.
$ service mysql stop
$ ps aux | grep mysql | grep -v grep
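To also confirm that no rsync SST daemon was left behind (the pattern matches the rsync_checks snippet in the troubleshooting section below; adjust the port if yours differs):
$ ps aux | grep "rsync --daemon" | grep -v grep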
If neither command returns anything we are good to go; start mysql the old-fashioned way.
$ service mysql start
In the syslog from the other node (or the log from the arbiter) you should see this node “rejoining”.
Verify that mysql is working by performing the same mysql commands you did earlier on the other node.
Troubleshooting
When mysql won't start, take a look at the following:
- Check syslog for something obvious
- Make sure the mysqld.sock file (/var/run/mysqld/mysqld.sock) is not there; if it is, remove it (see the commands below)
- Make sure the rsync daemon is stopped; mysql will start it itself. If the rsync daemon is already running, stop it
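For those last two points, something along these lines works (a sketch; make sure mysql really is stopped before removing the socket):
$ rm -f /var/run/mysqld/mysqld.sock
$ pkill -f "rsync --daemon --port 4444"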
When your entire cluster goes down, you have to bootstrap it again by performing the same steps in “Bootstrapping and joining the cluster”
Edit (October 09 2013): When mysql crashes, sometimes rsync keeps running, which prevents the node from being started/re-added to the cluster. Adding the following to the /etc/init.d/mysql startup file can solve this:
rsync_checks() {
  # If a previous SST left an rsync daemon behind, kill it before starting mysql
  rsync_pid=$(ps aux | grep "rsync --daemon --port 4444" | grep -v grep | awk '{print $2}')
  if [ ! -z "$rsync_pid" ]; then
    log_warning_msg "rsync daemon is still running, probably because mysql was not shut down nicely, killing $rsync_pid ..."
    kill -9 $rsync_pid
    # kill -0 fails once the pid is gone, so this waits for the process to disappear
    while kill -0 $rsync_pid 2>/dev/null; do echo '.'; done
    log_warning_msg "rsync daemon killed successfully"
  fi
}
...
case "${1:-''}" in
'start')
sanity_checks;
rsync_checks; # Add this function call
....