In previous blogs, I talked about some basis and presented some possible architectures for Alfresco and I talked about the Clustering setup for the Alfresco Repository. In this one, I will work on the Alfresco Share layer. Therefore, if you are using another client like a CMIS/REST client or an ADF Application, it won’t work that way, but you might or might not need Clustering at that layer, it depends how the Application is working.

The Alfresco Share Clustering is used only for the caches, so you could technically have multiple Share nodes working with a single Repository or a Repository Cluster without the Share Clustering. For that, you could disable the caches on the Share layer because if you kept it enabled, you would have, eventually, faced issues. Alfresco introduced a Share Clustering which is used to keep the caches in sync, so you don’t have to disable it anymore. When needed, cache invalidation messages are sent from one Share node to all others, that include runtime application properties changes as well as new/existing site/user dashboards changes.

Just like for the Repository part, it’s really easy to setup the Share Clustering so there is really no reasons not to. It’s also using Hazelcast but it’s not based on properties that you need to configure in the alfresco-global.properties (because it’s a Share configuration), this one must be done in an XML file and there is no possibilities to do that in the Alfresco Admin Console, obviously.

All Share configuration/customization are put in the “$CATALINA_HOME/shared/classes/alfresco/web-extension” folder, this one is no exception. There are two possibilities for the Share Clustering communications:

  • Multicast
  • Unicast (TCP-IP in Hazelcast)

 

I. Multicast

If you do not know how many nodes will participate in your Share Cluster or if you want to be able to add more nodes in the future without having to change the previous nodes’ configuration, then you probably want to check and opt for the Multicast option. Just create a new file “$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml” and put this content inside it:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans 
       xmlns_xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns_hz="http://www.hazelcast.com/schema/spring"
       xsi_schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="true" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="false">
            <hz:members></hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

In the above configuration, be sure to set a topic name (matching the hazelcastTopicName’s value) as well as a group password that is specific to this environment, so you don’t end-up with a single Cluster with members coming from different environments. For the Share layer, it’s less of an issue than for the Repository layer but still. Be sure also to use a network port that isn’t in use, it will be the port that Hazelcast will bind itself to in the local host. For Alfresco Clustering, we used 5701 so here it’s 5801 for example.

Not much more to say about this configuration, we just enabled the multicast with an IP and a port to be used and we disabled the tcp-ip one.

The interfaces is disabled by default but you can enable it, if you want to. If it’s disabled, Hazelcast will list all local interfaces (127.0.0.1, local_IP1, local_IP2, …) and it will choose one in this list. If you want to force Hazelcast to use a specific local network interface, then enable this section and add that here. In can use the following nomenclature (IP only!):

  • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
  • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 5 IPs assigned to the local host and you don’t want Hazelcast to use 3 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the custom-slingshot-application-context.xml on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
  • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above

 

For most cases, keeping the interfaces disabled is sufficient since it will just pick one available. You might think that Hazelcast may bind itself to 127.0.0.1, technically it’s possible since it’s a local network interface but I have never seen it do so, so I assume that there is some kind of preferred order if another IP is available.

Membership in Hazelcast is based on “age”, meaning that the oldest member will be the one to lead. There is no predefined Master or Slave members, they are all equal, but the oldest/first member is the one that will check if new members are allowed to join (correct config) and if so, it will send the information to all other members that joined already so they are all aligned. If multicast is enabled, a multicast listener is started to listen for new membership requests.

 

II. Unicast

If you already know how many nodes will participate in your Share Cluster or if you prefer to avoid Multicast messages (there is no real need to overload your network with such things…), then it’s preferable to use Unicast messaging. For that purpose, just create the same file as above (“$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml“) but instead, use the tcp-ip section:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans 
       xmlns_xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns_hz="http://www.hazelcast.com/schema/spring"
       xsi_schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="false" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="true">
            <hz:members>share_n1.domain,share_n2.domain</hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

The description is basically the same as for the Multicast part. The main difference is that the multicast was disabled, the tcp-ip was enabled and there is therefore a list of members that needs to be set. This is a comma separated list of hostname or IPs that the Hazelcast will try to contact when it starts. Membership in case of Unicast is managed in the same way except that the oldest/first member will listen for new membership requests on the TCP-IP. Therefore, it’s the same principle, it’s just done differently.

Starting the first Share node in the Cluster will display the following information on the logs:

Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n1.domain' to address(es): [127.0.0.1, 10.10.10.10]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n2.domain' to address(es): [10.10.10.11]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [share_n1.domain/10.10.10.10, share_n2.domain/10.10.10.11, share_n1.domain/127.0.0.1]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Prefer IPv4 stack is true.
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Picked Address[share_n1.domain]:5801, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5801], bind any local is true
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Hazelcast Community Edition 2.4 (20121017) starting at Address[share_n1.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Copyright (C) 2008-2012 Hazelcast.com
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTING
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot] Connecting to possible member: Address[share_n2.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot]

Members [1] {
        Member [share_n1.domain]:5801 this
}

Jul 28, 2019 11:45:37 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTED
2019-07-28 11:45:37,164  INFO  [web.site.ClusterTopicService] [localhost-startStop-1] Init complete for Hazelcast cluster - listening on topic: share_hz_test

 

Then starting a second node of the Share Cluster will display the following (still on the node1 logs):

Jul 28, 2019 11:48:31 AM com.hazelcast.nio.SocketAcceptor
INFO: [share_n1.domain]:5801 [slingshot] 5801 is accepting socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:31 AM com.hazelcast.nio.ConnectionManager
INFO: [share_n1.domain]:5801 [slingshot] 5801 accepted socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:38 AM com.hazelcast.cluster.ClusterManager
INFO: [share_n1.domain]:5801 [slingshot]

Members [2] {
        Member [share_n1.domain]:5801 this
        Member [share_n2.domain]:5801
}

 

 

Other posts of this series on Alfresco HA/Clustering: