mysqld_safe Directory ‘/var/run/mysqld’ for UNIX socket file don’t exists.

This is an error I get when starting MySQL on a Galera cluster node after a reboot.

The result is that MySQL fails to start automatically after a reboot.  I have no idea why this is happening, and the error appears to be extremely rare: from what Googling I’ve done so far, there is very little discussion of this issue, except for this guy.  I’ve reached out to him on Twitter; we’ll see what he says.

I would hope that there’s a way to make this fix permanent.  Because of its location, I cannot expect that directory to be persistent across reboots.  I want to say that this is some issue with MySQL not doing what it needs to do to start itself up properly (i.e. making the directory in /var/run that it needs to make), but perhaps the blame is on the OS (or, more likely, me).
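One possible stopgap is to recreate the directory at boot, before MySQL starts.  The sketch below is only that, a sketch: it assumes it runs as root, and the owner name passed in (e.g. `mysql`) is an assumption about the service account, not something confirmed for this setup.

```python
import os
import shutil


def ensure_socket_dir(path="/var/run/mysqld", owner=None, mode=0o755):
    """Create the socket directory if it is missing.

    Returns True if the directory had to be created, False if it already existed.
    """
    created = not os.path.isdir(path)
    os.makedirs(path, mode=mode, exist_ok=True)
    # makedirs honours the umask, so set the mode explicitly afterwards.
    os.chmod(path, mode)
    if owner is not None:
        # Assumption: the caller knows the MySQL service account, e.g. "mysql".
        shutil.chown(path, user=owner, group=owner)
    return created
```

Calling `ensure_socket_dir(owner="mysql")` from a boot-time script would be one way to use it.  It treats the symptom only; it does not explain why the directory is not being created in the first place.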

I don’t know if this is a bug or not, but I know I don’t have this issue with MySQL sans-Galera.

Elsewhere, it’s been suggested that I examine this post for more information on how I might remedy this issue.

I got to learn some (more) High Availability things

In the spirit of continuous improvement, I made another HA setup.

Same basic front-end with two webservers behind one HAProxy load balancer, but with way more fault tolerance behind them.

I used the guide I mentioned in the previous HA post to create this new setup.

Webservers look to a single IP for a database connection.  This IP can exist on one of three ProxySQL servers thanks to keepalived.  keepalived does two things (at least in my setup): it checks to see if the ProxySQL process is running, and it talks to other keepalived servers.  On the keepalived server that is the master, keepalived puts that IP on an interface.  On keepalived slave servers, the IP is not assigned.  On the master, if the ProxySQL process ceases to exist, keepalived removes the IP and one of the slave servers puts it on one of its interfaces.  This happens in a matter of seconds.  The IP goes away on the server that had it almost as soon as I kill ProxySQL and then magically shows up on one of the other ProxySQL servers.  The technical term for this is a “virtual IP” but I think “conditional IP” is more appropriate.  A server’s ability to have that IP is conditional on 1) ProxySQL running and 2) no other server having that IP.  If both of those conditions hold, then that server can have that IP.
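The “conditional IP” idea can be sketched as a toy model.  This is not how keepalived is implemented (the real thing exchanges VRRP advertisements between nodes); it only illustrates the two conditions.  The node names and priorities are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str               # hypothetical host name
    priority: int           # higher wins, as with VRRP priorities
    proxysql_running: bool  # outcome of the health check


def vip_holder(nodes: List[Node]) -> Optional[str]:
    """Return the name of the one node that should hold the virtual IP.

    Only nodes with ProxySQL running are eligible, and exactly one of them
    (the highest priority) gets the IP.  Returns None if none is healthy.
    """
    healthy = [n for n in nodes if n.proxysql_running]
    if not healthy:
        return None
    return max(healthy, key=lambda n: n.priority).name


nodes = [
    Node("proxysql1", 101, True),
    Node("proxysql2", 100, True),
    Node("proxysql3", 99, True),
]
print(vip_holder(nodes))   # proxysql1
nodes[0].proxysql_running = False  # simulate killing ProxySQL on the master
print(vip_holder(nodes))   # proxysql2
```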

Because I’m using Galera instead of master-slave replication, I can distribute all queries from the webservers across all databases.  In production, depending on the application, this may not be appropriate, but as an exercise for me, this is OK.  From the little research I’ve done, it is possible to confuse a Galera cluster, but only under very high load, considerable latency between the nodes, and with conflicting queries.  For my purposes, distributing my dinky queries across all three nodes hasn’t produced any issues (i.e. my test WordPress refreshes just fine).

My next idea is to apply keepalived treatment to the front end.  That is, have two HAProxy load balancers in front of the webservers, and have keepalived pass the public IP back and forth between them as required.

Test wp on the new setup with

Galera test cluster created

Well that was freakishly easy.  Thanks again Digital Ocean.  Seemed easier to set up than master-slave replication and this is way more featureful in that you can write to any node in the cluster, not just the master.

There may be a way to override this, but it’s like every Galera node is the same server.  With a master-slave setup, you have to be careful about individual databases.  I suppose this could be useful if you wanted to have a master put databaseA on slave1, and databaseB on slave2, but from the Digital Ocean tutorial on Galera, this is way simpler.  Really blown away by how easy this was to set up.

I got to learn some High Availability things

As part of a job opportunity, the interviewer assigned some homework.  The expectations were as follows:

  • Redundant webservers
  • Redundant database servers
  • Read-write splitting between the webservers and the database servers

And ~24 hours later, I have achieved this.  This is spread across six virtual machines:

  • HAProxy server
  • Apache server 1
  • Apache server 2
  • ProxySQL server
  • MySQL master
  • MySQL slave

Setting up the webservers was simple enough (apt install apache2).

To set up HAProxy, I followed this guide:

To set up master-slave replication, I used this guide:

To set up ProxySQL load balancing and read/write splitting, I followed this guide: and to a lesser extent

I’m sure there’s still some tuning and ironing out to do to make things super smooth, but the mechanics work as expected. loads a page that shows which webserver you’re using.  If you refresh like crazy and the IP doesn’t change, that’s to be expected.  Session stickiness is implemented, so if you access the page within a certain interval, you’ll continue to talk to the same webserver. has more information about the cookie you’re getting.  Both this and the previous URL are products of the Digital Ocean guide on setting up HAProxy.  It’s set using the “Cookie insert method.” is a vanilla WordPress install sending data through the ProxySQL server to both the master and slave databases.
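For anyone wondering what cookie-insert stickiness amounts to, here is a rough model of the idea.  It is not HAProxy code: the cookie name `SERVERID` and the backend names are assumptions for illustration, and the real cookie also carries whatever options the HAProxy config specifies.

```python
import itertools
from typing import Dict, Iterable, Optional, Tuple

COOKIE_NAME = "SERVERID"  # assumed cookie name


class StickyBalancer:
    """Round-robin for new visitors, sticky for visitors carrying the cookie."""

    def __init__(self, backends: Iterable[str]):
        self.backends = list(backends)
        self._next = itertools.cycle(self.backends)

    def route(self, cookies: Dict[str, str]) -> Tuple[str, Optional[str]]:
        """Return (backend, Set-Cookie value or None if no cookie is needed)."""
        server = cookies.get(COOKIE_NAME)
        if server in self.backends:
            # Returning visitor: keep talking to the same webserver.
            return server, None
        # New visitor (or stale cookie): pick the next backend and insert a cookie.
        server = next(self._next)
        return server, f"{COOKIE_NAME}={server}; path=/"
```

A first request with no cookie gets a backend plus a cookie to send back; every later request carrying that cookie lands on the same backend, which is why refreshing like crazy doesn’t change the IP.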

This screencap shows the unbalanced send/receive traffic going to and from the master and slave MySQL servers:

There were a few F5’s between issuing those two commands.  Queries to the master went up by 14, while queries to the slave went up by 259.

Current limitations:  In this topology, ProxySQL represents a single point of failure.  This guide describes a setup involving multiple ProxySQL servers and a Galera database cluster.  That setup, which was mentioned but not required for my assignment, makes use of virtual IPs (which at the moment I do not fully understand).  The idea is that you can specify a single IP in the web app (i.e. the host that the web application looks to for database services), but depending on what’s happening, that virtual IP could refer to different hosts.  The goal is that, in the event of one ProxySQL server going down, the other one takes over, but the web app continues to have database access like nothing happened.

The other limitation involves the master/slave setup and how my ProxySQL server does query routing.  I’m not sure if this can be conditional (ex. if the slave goes down, use the master for everything).  If either the master or the slave goes down, nothing works (this could be partially mitigated by having multiple slaves, but if the master goes down, I’m screwed).  Galera seems to be an appropriate solution to this problem, though I haven’t played with Galera yet.  In my setup the slave is meant to be mainly read-only, and the master gets the writes.  From what little I’ve read about Galera, every db server in the cluster is a master that could be used for reads and writes.
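The conditional routing described above can be written out as plain logic.  To be clear, this is not ProxySQL configuration and says nothing about whether ProxySQL can do it; it only shows the desired behaviour.  The backend labels and the regular expressions are assumptions for the example.

```python
import re
from typing import Set

# Assumed labels for the two backends.
WRITER, READER = "master", "slave"

_SELECT = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
_FOR_UPDATE = re.compile(r"\bFOR\s+UPDATE\s*;?\s*$", re.IGNORECASE)


def route(query: str, online: Set[str]) -> str:
    """Pick a backend for a query, given the set of backends that are up.

    Plain SELECTs prefer the slave and fall back to the master if the slave
    is down.  Everything else (including SELECT ... FOR UPDATE) needs the master.
    """
    is_read = bool(_SELECT.match(query)) and not _FOR_UPDATE.search(query)
    if is_read and READER in online:
        return READER
    if WRITER in online:
        return WRITER
    raise RuntimeError("no suitable backend online")


print(route("SELECT * FROM wp_posts", {"master", "slave"}))   # slave
print(route("SELECT * FROM wp_posts", {"master"}))            # master
print(route("UPDATE wp_options SET x = 1", {"master", "slave"}))  # master
```

With only the slave online, reads still work but any write raises, which matches the “if the master goes down, I’m screwed” case.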

These are technologies I could use for my own self, but considering that everything I have in Dallas is on the same physical server, investing a ton of time into HA for my own services would not be a very fruitful endeavor.

Replacing VMs with containers would drastically reduce space requirements and allow for easier backup in the event poo does hit the fan.  Downtime for me isn’t a huge deal.  Having to set stuff up again is a bigger deal.  So far so good though.  My ZFS array is healthy and things have been ridiculously stable (knock on wood) so I’m not so worried about lack-of-availability.

Changelog 20170513

MariaDB VM updated

Openfire VM updated

Plex VM updated

Seafile VM updated

TF2 VM updated

Streisand VM updated

WWW VM updated

Zabbix server VM updated

TO-DO: iRedMail (something in SOGo breaks, and I don’t know why), pfSense (which I’m not using, and my password isn’t being accepted? so I’ll mess with that later)

Zabbix Email and Jabber/XMPP Notifications

FINALLY have Zabbix sending alerts.  Should have taken care of this ages ago.  Emails send properly through Gmail, Jabber/XMPP notifications go through my Openfire VM.  Couldn’t use the built-in Jabber functionality in Zabbix.  Had to make a script.  Some issue with TLS handshake somewhere.  sendxmpp works just fine though!
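A wrapper along these lines is one way such a script could look.  It is a guess at the general shape, not the actual script: it assumes Zabbix hands the script the recipient, subject and message as three arguments, that sendxmpp picks up its credentials from its own config file (`~/.sendxmpprc`), and that TLS (`-t`) is wanted.  The flags are worth checking against the sendxmpp man page for the installed version.

```python
import subprocess
from typing import List, Sequence


def build_command(recipient: str, subject: str) -> List[str]:
    """Build the sendxmpp command line; the message body is sent on stdin."""
    # -t requests TLS and -s sets the subject (assumed to suit this server).
    return ["sendxmpp", "-t", "-s", subject, recipient]


def send_alert(recipient: str, subject: str, message: str) -> int:
    """Run sendxmpp and return its exit code."""
    result = subprocess.run(
        build_command(recipient, subject),
        input=message.encode("utf-8"),
    )
    return result.returncode


def main(argv: Sequence[str]) -> int:
    """Entry point: argv is expected to be [script, recipient, subject, message]."""
    if len(argv) != 4:
        print("usage: xmpp_alert.py RECIPIENT SUBJECT MESSAGE")
        return 2
    return send_alert(argv[1], argv[2], argv[3])
```

To use it as an alert script, save it under a name of your choosing (`xmpp_alert.py` here is hypothetical) and end the file with `import sys; sys.exit(main(sys.argv))`.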