OpenHab crashing with Z-Wave FIX IT FIX IT FIX IT FIX IT!!

So, my openhab system periodically decides to leave the building.  It appears that from time to time, when the z-wave binding loses communication with the z-wave stick, it gets upset and tells openhab to take a hike.

This is bad, mostly because it exposed something I missed in my fault tolerance.  I had compensated for network issues and full machine failover.  But the actual process going belly up... oops.  My bad.

Soooo I saw it crash while I was at the gym today, and the only thing in my head was...

[image]

So I appear to have done that.

Let me bring you up to speed on the current state of my home automation.  After the great NAS failing of 2015 I was forced to reduce some of my virtual environment.   I have not brought my secondary HA controller back online yet.  However, it appears that still using keepalived I am able to help address this random problem.

I have added in a new option in my keepalived.conf

 


vrrp_script chk_hahealth {
    script "/usr/local/sbin/healthcheck.sh"
    interval 10 # check every 10 seconds
    fall 2 # require 2 failures for KO
    rise 2 # require 2 successes for OK
}

vrrp_instance VI_1 {
   state MASTER
   interface eth0
   virtual_router_id 220
   priority 150
   notify /usr/local/sbin/notify-keepalived.sh
   advert_int 1
   authentication {
        auth_type PASS
        auth_pass fakepass
   }

   virtual_ipaddress {
      192.168.2.90
   }
   track_script {
     chk_hahealth
   }
}

So what this does is add a keepalived health check.  Every 10 seconds keepalived runs the script /usr/local/sbin/healthcheck.sh and looks at its exit code: 0 if all is good, 1 if the world fell apart.


The code for this script is


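#!/bin/sh
# Health check for keepalived: exit 0 if openhab is running, 1 if not.
SERVICE=openhab

# "grep -v grep" drops the grep process itself from the match.
if ps ax | grep -v grep | grep "$SERVICE" > /dev/null
then
 echo "$SERVICE service running, everything is fine"
 /usr/bin/logger "$SERVICE service running, everything is fine"
 exit 0
else
 echo "$SERVICE is not running"
 /usr/bin/logger "$SERVICE is not running"
 # Try to bring the service back before reporting the failure.
 /etc/init.d/openhab restart
 exit 1
fi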


Explanation:

So this script just checks to see if the openhab process is running.  If it’s good, exit 0.  If it’s not, exit 1, but go ahead and try to restart openhab first.  When keepalived gets the exit 1 code it keeps track of it.  You will see in the config that there is a fall 2 line.  That means that after 2 exit 1 statuses in a row keepalived will go into a failed state.  When the second HA box is back online this will force openhab to move over to the other one.  However, I have not seen this happen so far: openhab loads pretty quickly, and since there are 10 seconds between the checks, the second check comes back with an exit 0 and resets the fall count.
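
To make the fall/rise counting a bit more concrete, here is a rough sketch of how I understand it. This is NOT keepalived's code, just a simplified model: simulate_checks is a made-up helper that takes the fall and rise values followed by a list of exit codes, and prints the state after each check.

```shell
#!/bin/sh
# Simplified model of fall/rise counting for a vrrp_script (an assumption
# about the behaviour, not keepalived's actual implementation).
# Usage: simulate_checks FALL RISE code [code ...]
simulate_checks() {
    fall=$1; rise=$2; shift 2
    state=OK; fails=0; passes=0
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            passes=$((passes + 1)); fails=0
            # Only a run of RISE successes brings a failed check back to OK.
            if [ "$state" = FAULT ] && [ "$passes" -ge "$rise" ]; then
                state=OK
            fi
        else
            fails=$((fails + 1)); passes=0
            # Only a run of FALL failures marks the check as failed.
            if [ "$state" = OK ] && [ "$fails" -ge "$fall" ]; then
                state=FAULT
            fi
        fi
        echo "exit $code -> $state"
    done
}

# One failed check followed by a successful restart never reaches FAULT.
simulate_checks 2 2 0 1 0 0
```

With fall 2, a single exit 1 followed by an exit 0 stays in OK the whole time, which matches what I have seen when the restart works.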

 

 

 

 

Home Automation Quick Update

So since I made the home automation system fail over it’s been great!... except I kind of would like to know which box it’s running on.  So I made a quick change: a new Item on my dashboard.

String  Server "Server [%s]" {exec="<[/bin/cat@@/etc/hostname:60000:]"}

I have that in my items file.  Then in my sitemap I added

Text label="Currently Running on [%s]" item=Server

I have that at the bottom of my sitemap.


Home Automation Move (part 2)

So, with a Friday and a Saturday worth of work on my home automation move, here is what is complete.

  1. install server os on vm
  2. install openhab and all bindings currently in use
  3. move openhab configs over to new vm
  4. shut down old openhab-pi
  5. configure raspberry pi with virtualhere server to share usb
  6. configure new vm server to connect to raspberry pi to communicate with z-wave stick
  7. install keepalived on new server
  8. configure virtual ip as my new primary ip for openhab access
  9. configure scripts to run to start openhab and connect to shared usb
  10. clone server to secondary vm for failover
  11. reconfigure keepalived to make second box slave
  12. test failover

So let me show you my keepalived settings and my scripts.

/etc/keepalived/keepalived.conf

vrrp_instance VI_1 {
     state MASTER
     interface eth0
     virtual_router_id 220
     priority 150
     notify /usr/local/sbin/notify-keepalived.sh
     advert_int 1
     authentication {
           auth_type PASS
           auth_pass fakepass
     }
     virtual_ipaddress {
         192.168.2.90
     }
}

 

See the “notify” line?    That script is pretty simple.

#!/bin/bash
# keepalived calls this script with: $1 = type, $2 = name, $3 = new state
TYPE=$1
NAME=$2
STATE=$3
case "$STATE" in
        "MASTER") sleep 30   # needed because this also fires at boot time
                  /usr/local/sbin/usb-connect.sh
                  /usr/sbin/service openhab start;;
        "BACKUP") /usr/sbin/service openhab stop
                  /usr/local/sbin/usb-disconnect.sh;;
        "FAULT")  /usr/sbin/service openhab stop
                  /usr/local/sbin/usb-disconnect.sh
                  exit 0
                  ;;
        *)        /usr/bin/logger "unknown state"
                  exit 1
                  ;;
esac

 

So what that does is: every time there is a keepalived state change, it notifies that script.  That script then runs additional scripts based on the state.  So when it goes to “MASTER”, or at boot time (which is why I had to put that sleep statement in there), it runs usb-connect.sh, which just has a few commands.

/sbin/vhclient &
sleep 10
/sbin/vhclient -t "USE,4294967409"

I’ll walk you through this one.

  1. runs the virtualhere usb client.
  2. waits a few seconds so the client can detect shared usb on the network
  3. sends a command to the running client “-t = command” specifying to “USE” the device with the id “4294967409”
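
The fixed sleep works for me, but if the wait ever turns out to be too short or too long, polling is one alternative. Below is a minimal sketch, not something I run in production. wait_for and device_visible are made-up helper names, and the idea of checking the output of vhclient -t "LIST" for a pattern is an assumption on my part, so check what your client actually prints before relying on it.

```shell
#!/bin/sh
# wait_for TRIES DELAY command [args...]
# Runs the command up to TRIES times, sleeping DELAY seconds between
# attempts. Returns 0 as soon as the command succeeds, 1 if it never does.
wait_for() {
    tries=$1; delay=$2; shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still coming.
        if [ "$attempt" -lt "$tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical probe: succeeds once the client's device list contains the
# given pattern. Both the LIST query and the pattern are assumptions.
device_visible() {
    /sbin/vhclient -t "LIST" 2>/dev/null | grep -q "$1"
}

# Possible use inside usb-connect.sh (not run here; the pattern is made up):
#   /sbin/vhclient &
#   wait_for 10 1 device_visible "my-zwave-stick" \
#       && /sbin/vhclient -t "USE,4294967409"
```

The upside is that the script moves on as soon as the device shows up and gives up cleanly if it never does, instead of always waiting the full 10 seconds.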

The usb-disconnect.sh is a single line!

pkill vhclient

That’s it.  Just shut down the client.  So now when the box boots up, openhab1 becomes “MASTER” for openhab.  It then executes the scripts to connect to the shared USB, then starts openhab.

Once the second box is in place, all the same scripts will be put in place on it, with 1 single change.

 

vrrp_instance VI_1 {
     state MASTER
     interface eth0
     virtual_router_id 220
     priority 200
     notify /usr/local/sbin/notify-keepalived.sh
     advert_int 1
     authentication {
           auth_type PASS
           auth_pass fakepass
     }
     virtual_ipaddress {
         192.168.2.90
     }
}

Notice line 5, the priority is a higher number than in openhab1.  This means that when the boxes communicate they will negotiate who gets to be master.   Then either box can start openhab and whoever is running openhab gets the USB z-wave stick.
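
As a rough mental model of that negotiation, the box advertising the highest priority wins. The sketch below is a simplification (real VRRP also deals with ties, preemption settings and timers), elect_master is a made-up helper, and "openhab2" is just my placeholder name for the second box.

```shell
#!/bin/sh
# elect_master name:priority [name:priority ...]
# Prints the name of the node with the highest priority among the nodes
# that are currently alive (simplified model, not the VRRP protocol).
elect_master() {
    best_name=""; best_prio=-1
    for node in "$@"; do
        name=${node%%:*}    # text before the colon
        prio=${node##*:}    # text after the colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name; best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master openhab1:150 openhab2:200   # both boxes up: prints openhab2
elect_master openhab1:150                # second box down: prints openhab1
```

So with both boxes up the priority 200 box holds the virtual IP, and if it drops off the network the priority 150 box takes over.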

Home Automation Move (part 1)

I have been running openhab for over a year now on a raspberry pi. I also run mosquitto mqtt broker with a great piece of software called mqttwarn.   On my phone I run owntracks.


So let’s do a brief summary of what these pieces of software do for me and what I currently have configured.

Openhab – “a vendor and technology agnostic open source automation software for your home.”   That is what the site says openhab is, and they are right.   Openhab is a core home automation system that has been designed to work with MANY different vendors and systems to make your home smart.   I have friends that got stuck with z-wave because they spent a bunch of money on z-wave and don’t want to replace it.  But there is also Belkin stuff, wifi modules, and home built devices (this is a BIG problem for third party controllers).   Later on in this post you will see that I have used this agnostic approach to home automation to my benefit.

Mosquitto – “is an open source (BSD licensed) message broker that implements the MQ Telemetry Transport protocol versions 3.1 and 3.1.1.”  Okay, not quite as cut and dried as openhab.  So I’ll see if I can help.   MQTT (MQ Telemetry Transport) is a system in which devices and services can connect to a central system and communicate via very small, very efficient messages back and forth.  This efficiency helps with speed and bandwidth.  A device can connect and just wait for commands; a service can send a message to the mqtt server, which in turn immediately passes it to the connected device.  MQTT can do MUCH more than that, but that is a simple paraphrased description of it.

MQTTWarn – “a pluggable MQTT notifier.”   Hmm, not as helpful, but now that you know what mqtt is you probably understand this a little better.   Jan-Piet Mens, the creator of mqttwarn and a very nice guy (I have personally had experience working with him in trying to accomplish various things), created this wonderful middleman piece of software.  It’s almost like grand central station, or your telephone switch board, or the traffic cop.   It’s a beautiful thing.  I use it personally to do the following.

  • update dashboards in my office with current battery power on a couple devices using owntracks to get the battery data and pushing info to dashing dashboard.
  • Show current bandwidth usage from my router (python service I wrote to query snmp data from my router and publish via mqtt and then pushing info to dashing)
  • Pushing various alerts to prowl/growl
  • Push notices of events to my kodi installations

OwnTracks – “Your location companion.”   OwnTracks is an application that runs on Android and iOS devices, uses the internal GPS info, and reports back to your mqtt server.  This allows for a little better sense of privacy about our tracking info, but let’s be honest, if you have your phone on you, you are probably being tracked.  But that doesn’t mean we have to give our info to everyone, so we use our own systems to track us.  It also reports battery info with the location data it sends back to the system.  Why would you want this?  Well, I use it for presence detection, to know when I am home.  I also use it to have my home automation system know when I am leaving the office each day.

 

For devices I have integrated, I have..

  • a couple z-wave devices
  • 2 Philips Hue lights
  • 2 Belkin WeMo switches
  • Logitech Media Server (squeezebox server) for media
  • 2 Max2Play raspberry pi setups.
  • 3 Kodi installs
  • 2 mobile devices via owntracks

 

So what am I doing moving this?!?!   Well, this is a good question.  The system does work in its current setup.  However, as with every project, there has to be a very high WAF (Wife Acceptance Factor) if the project is ever to be allowed to leave your workspace...  Since I am pushing very hard to start putting some bigger pieces in the rest of the house (home built IR blasters, wall panels, in-ceiling speakers for voice notifications) I needed to increase the WAF.  So, I have found that there are 2 things that make this much easier.

  1. Must be easy to use.  UI is key here, if others in the house can’t use it, it’s junk.
  2. Must work.   If the wife can’t turn on a light because something is down, it’s junk.

Those 2 key requirements really help.  UI I believe I already have covered, with a nice easy touch interface available on EVERY device in this house, including some tangible remotes.   So number 2 is the obstacle to tackle.  It must work.  So redundancy/fault tolerance is key.  I have 2 separate ESXi environments in my setup here at home, so I am going to place 2 openhab systems in a virtual environment with heartbeat, failover and a virtual IP.  To share the physical z-wave stick I am taking the raspberry pi and using virtualhere to share the 1 USB device with both controllers.  This does still present a single point of failure for z-wave stuff.  But z-wave is not the whole system, so it’s better.  If anyone has any thoughts on how to add redundancy to that, I’m all ears.

So here is what my goal is for my controller setup.

[Diagram: Openhab layout]