See RHBZ
1311005 and
1247303. In short: sometimes when a controller
node gets fenced, rabbitmq is unable to rejoin the cluster. To fix this
we need two steps:
1) The fix for the RA in BZ
1247303
2) Add notify=true to the meta parameters of the rabbitmq resource on
fresh installs and updates
Note that if this change is applied on systems that do not
have the fix for the rabbitmq resource agent, no action is taken.
So when the resource agent will be updated, the notify
operation will start to work as soon as the first monitor
action will take place.
Fixes RH Bug #
1311005
Change-Id: I513daf6d45e1a13d43d3c404cfd6e49d64e51d5a
# mongod start timeout is higher, setting only stop timeout
pcs -f $pacemaker_dumpfile resource update mongod op start timeout=370s op stop timeout=200s
+ echo "Making sure rabbitmq has the notify=true meta parameter"
+ pcs -f $pacemaker_dumpfile resource update rabbitmq meta notify=true
+
echo "Applying new Pacemaker config"
if ! pcs cluster cib-push $pacemaker_dumpfile; then
echo "ERROR failed to apply new pacemaker config"
ocf_agent_name => 'heartbeat:rabbitmq-cluster',
resource_params => 'set_policy=\'ha-all ^(?!amq\.).* {"ha-mode":"all"}\'',
clone_params => 'ordered=true interleave=true',
+ meta_params => 'notify=true',
require => Class['::rabbitmq'],
}