Oracle MVA

Tales from a Jack of all trades


On Exalogic, OTD and Multicast


Oracle Traffic Director (OTD) is Oracle’s software load-balancing product that you can use on Exalogic. When you deploy OTD on Exalogic, you can choose to configure high availability. How this works is fully described in the manuals and typically works just fine when you try it on your local test systems (e.g. in VirtualBox). Additional quirks that you have to be aware of are also described elsewhere, e.g. on Donald Forbes’ blog here and here. I encourage you to read all of that.

However, when deploying such a configuration I kept running into issues with my active/passive failover groups. To describe the issue in somewhat more detail, let me first show you what a typical architecture looks like. A typical setup with OTD and an application is depicted in the image below:
OTD HA

There is a public network, in this case colored green. The public network runs on a bonded network interface, identified by 1. This is the network that your clients use to access the environment. Secondly, there is an internal network that is non-routable and only available within the Exalogic. This network is colored red and runs via the bonded interface identified as 2. OTD sits in the middle and basically proxies traffic coming in on 1, forwarding it, non-transparently for the client, via interface 2 to the backend WebLogic servers.

When you set up an active/passive failover group, the VIP you want to run is mounted on interface 1 (the public network); again, see Donald Forbes’ blog for the implementation details. If you create such a configuration via tadm (or in the GUI), what happens under the covers is that keepalived is configured to use VRRP. You can find this configuration in the keepalived.conf configuration file that is stored with the instance.

This configuration looks something like this:

vrrp_instance otd-vrrp-router-1 {
        priority 250
        interface bond1
        virtual_ipaddress {
                XXX.XXX.XXX.XXX/XX
        }
        virtual_router_id 33
}

On the second OTD node you would see the same configuration, except that the priority will be different. Based on the priority, the VIP is mounted on one OTD node or the other.
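
For illustration, the second node’s keepalived.conf might look like the sketch below. The priority value of 100 is just my assumption (any value lower than the 250 on the first node will do); what must be identical on both nodes is the virtual_router_id.

vrrp_instance otd-vrrp-router-1 {
#       Lower priority than the first node, so this node becomes the backup
        priority 100
        interface bond1
        virtual_ipaddress {
                XXX.XXX.XXX.XXX/XX
        }
#       Must match the virtual_router_id on the first node
        virtual_router_id 33
}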

As you can see in this configuration file, only interface 1 is in play currently. This means that all traffic regarding OTD is sent over interface 1, which is the public network. The problem with this is two-fold:

  1. Multicast over the public network doesn’t always work
  2. Sending cluster traffic over the public network is a bad idea from a security perspective, especially since OTD’s VRRP configuration does not require authentication

When I look at the architecture picture, I prefer to send cluster traffic over the private network (via interface 2) instead of over the public one. In my last endeavor the external switches didn’t allow any multicast traffic, so the OTD nodes weren’t able to find each other and both mounted the VIP. I found that the multicast traffic was dropped by performing a tcpdump on the network interface (no multicast packets from other hosts arrived). Since tcpdump puts the network interface in promiscuous mode, I get a call from the security team every time I perform a tcpdump. Therefore I typically stay away from tcpdump and simply read the keepalived output in /var/log/messages when both OTD nodes are up. If you can see that one node is running as backup and one as master, you are okay. You can also see this by checking the network interfaces: if the VIP is mounted on both nodes, you are in trouble.
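
A quick way to check both, as a sketch: I am assuming the VIP lives on bond1 as in the configuration above, and the exact log format differs per keepalived version.

# Check which role keepalived negotiated (expect one MASTER and one BACKUP across the two nodes)
grep -i keepalived /var/log/messages | grep -E 'MASTER|BACKUP'

# Check whether the VIP is currently mounted on this node's public interface
ip addr show bond1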

The latter was the case for me: trouble. The VIP was mounted on both OTD nodes. Somehow this did not lead to IP conflicts; however, when the second OTD node was stopped, the ARP table was not updated and hence traffic was not forwarded to the remaining OTD node.
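
One way to spot the stale entry (assuming you can log in to a client or router on the public network; VIP masked as before) is to compare the MAC address cached for the VIP with the MAC of bond1 on the OTD node that should own it:

# On a client on the public network: which MAC is cached for the VIP?
ip neigh show | grep 'XXX.XXX.XXX.XXX'

# On the OTD node that should own the VIP: the MAC of the public interface
ip link show bond1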

After a long search on Google, My Oracle Support and all kinds of other sources I almost started crying: no documentation on how to configure this was to be found. Therefore I started fiddling with the configuration, just to see if I could fix it. Here’s what I found:

The interface directive in keepalived.conf sets the interface that is used for cluster communication. However, you can run a VIP on any interface by adding the dev keyword to the virtual_ipaddress configuration. So here’s my corrected configuration:

vrrp_instance otd-vrrp-router-1 {
#   Specify the default network interface, used for cluster traffic
    interface bond2
#   The virtual router ID must be unique to each VRRP instance that you define
    virtual_router_id 33
    priority 250
    virtual_ipaddress {
       # add dev to route traffic via a non-default interface
       XXX.XXX.XXX.XXX/XX dev bond1
    }
}

So what this does is send all keepalived traffic (meaning: cluster traffic) via bond2, while the VIP is mounted on bond1. If you also want to introduce authentication, the authentication block is your new best friend (advert_int simply sets the advertisement interval). Example snippet to add to keepalived.conf within the otd-vrrp-router-1 vrrp_instance configuration:

    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1066
    }
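
To make the placement explicit: putting it all together, the complete vrrp_instance block on the first node would then look something like this sketch (VIP and subnet masked as before; the second node gets the same block with a lower priority):

vrrp_instance otd-vrrp-router-1 {
#   Cluster (VRRP) traffic runs over the internal network
    interface bond2
    virtual_router_id 33
    priority 250
#   Advertisement interval in seconds
    advert_int 1
#   Simple password authentication between the OTD nodes
    authentication {
        auth_type PASS
        auth_pass 1066
    }
    virtual_ipaddress {
       # dev mounts the VIP on the public interface
       XXX.XXX.XXX.XXX/XX dev bond1
    }
}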

Hope this helps.


Written by Jacco H. Landlust

June 6, 2016 at 9:29 am