Oracle MVA

Tales from a Jack of all trades

Archive for the ‘High Availability’ Category

On Exalogic, OTD and Multicast

with 2 comments

Oracle Traffic Director is Oracle’s software load-balancing product that you can use on Exalogic. When you deploy OTD on Exalogic, you can choose to configure high availability. How this works is fully described in the manuals and typically works just fine when you try it on your local test systems (e.g. in VirtualBox). Additional quirks that you have to be aware of are described as well, e.g. on Donald Forbes’ blog here and here. I encourage you to read all of that.

However, when deploying such a configuration I kept running into issues with my active/passive failover groups. To describe the issue in a bit more detail, let me first show what a typical setup with OTD and an application looks like, as depicted in the image below:
OTD HA

There is a public network, colored green in this case. The public network runs on a bonded network interface, identified by 1. This is the network that your clients use to access the environment. Secondly, there is an internal network that is non-routable and only available within the Exalogic. This network is colored red and runs over the bonded interface identified as 2. OTD sits in the middle and basically proxies traffic coming in on interface 1, forwarding it (non-transparently to the client) via interface 2 to the backend WebLogic servers.

When you set up an active/passive failover group, the VIP you want to run is mounted on interface 1 (the public network). Again, see Donald Forbes’ blog for the implementation details. If you create such a configuration via tadm (or in the GUI), what happens under the covers is that keepalived is configured to use VRRP. You can find this configuration in the keepalived.conf file that is stored with the instance.

This configuration looks something like this:

vrrp_instance otd-vrrp-router-1 {
        priority 250
        interface bond1
        virtual_ipaddress {
                XXX.XXX.XXX.XXX/XX
        }
        virtual_router_id 33
}

On the second OTD node you would see the same configuration, except that the priority is different. Based on priority, the VIP is mounted on one OTD node or the other.

As you can see in this configuration file, only interface 1 is in play at the moment. This means that all traffic regarding OTD is sent over interface 1, which is the public network. The problem with this is two-fold:

  1. Multicast over the public network doesn’t always work
  2. Sending cluster traffic over the public network is a bad idea from a security perspective, especially since OTD’s VRRP configuration does not require authentication

When I look at the architecture picture, I prefer to send cluster traffic over the private network (via interface 2) instead of over the public one. In my last endeavor the external switches didn’t allow any multicast traffic, so the OTD nodes weren’t able to find each other and both mounted the VIP. I found that the multicast traffic was dropped by running a tcpdump on the network interface (no multicast packets from other hosts arrived). Since tcpdump puts the network interface in promiscuous mode, I get called by the security team every time I run a tcpdump. Therefore I typically stay away from tcpdump and simply read the keepalived output in /var/log/messages when both OTD nodes are up. If you see that one node is running as backup and one as master, you are okay. You can also tell by checking the network interfaces: if the VIP is mounted on both nodes, you are in trouble.
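For reference, this is the kind of check I mean. A minimal sketch, to be run on both OTD nodes; the VIP 192.0.2.10 and interface bond1 are placeholders for your own values:

# one node should log a transition to MASTER state, the other to BACKUP state
grep -i 'VRRP_Instance' /var/log/messages | tail -5

# check whether the VIP is mounted here; if it shows up on both nodes,
# the nodes are not receiving each other's VRRP advertisements
ip addr show bond1 | grep '192.0.2.10'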

The latter was the case for me: trouble. The VIP was mounted on both OTD nodes. Somehow this did not lead to IP conflicts, but when the second OTD node was stopped the ARP table was not updated and hence traffic was not forwarded to the remaining OTD node.

After a long search on Google, My Oracle Support and all kinds of other sources I almost started crying: no documentation on how to configure this was to be found. So I started fiddling with the configuration myself, just to see if I could fix it. Here’s what I found:

The interface directive in keepalived.conf is the interface used for cluster communication. However, you can run the VIP on any interface by adding a dev directive to the virtual_ipaddress configuration. So here’s my corrected configuration:

vrrp_instance otd-vrrp-router-1 {
#   Specify the default network interface, used for cluster traffic
    interface bond2
#   The virtual router ID must be unique to each VRRP instance that you define
    virtual_router_id 33
    priority 250
    virtual_ipaddress {
       # add dev to route traffic via a non-default interface
       XXX.XXX.XXX.XXX/XX dev bond1
    }
}

So what this does is send all keepalived traffic (meaning: cluster traffic) via bond2, while the VIP is mounted on bond1. If you also want to introduce authentication, an authentication block is your new best friend (advert_int simply sets the advertisement interval in seconds). Example snippet to add to keepalived.conf within the otd-vrrp-router-1 configuration:

    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1066
    }

Hope this helps.

Written by Jacco H. Landlust

June 6, 2016 at 9:29 am

new SOA HA paper

leave a comment »

Today I was pointed at a brand new SOA HA paper on OTN (thanks Simon!). Although I didn’t give any direct input for the paper, it discusses the architecture I designed for my largest customer. I am very happy that Oracle recognizes that customers rely on active/active configurations.

Written by Jacco H. Landlust

August 26, 2013 at 10:09 pm

Active Data Guard & Fusion Middleware Repositories.

with one comment

Last year, while working on a POC, Rob den Braber noticed the following in Disaster Recovery for Oracle Elastic Cloud with Oracle Exadata Database Machine, on page 13:

Currently, Oracle Fusion Middleware does not support configuring Oracle Active Data Guard for the database repositories that are a part of the Fusion Middleware topology. However, Active Data Guard can be configured if your custom applications are designed to leverage the technology.
Today this came up in a discussion with Simon Haslam, and he had not heard of this support issue before. So it seems it is not that well known that Active Data Guard and Oracle Fusion Middleware are not a supported combination.
This makes this blog post a reminder of what is already in the documentation (unless someone can comment and tell me that the ‘currently’ in the quote is no longer current).
Hope this helps.
UPDATE:
While reading this brand new SOA HA paper I found this quote today:

The Oracle Active Data Guard Option available with Oracle Database 11g Enterprise Edition enables you to open a physical standby database for read-only access for reporting, for simple or complex queries, or sorting while Redo Apply continues to apply changes from the production database. Oracle Fusion Middleware SOA does not support Oracle Active Data Guard because the SOA components execute and update information regarding SOA composite instances in the database as soon as they are started.

Written by Jacco H. Landlust

April 26, 2013 at 4:43 pm

setting up EDG style HA administration server with corosync and pacemaker

with 5 comments

Most of the Enterprise Deployment Guides (EDGs) for Fusion Middleware products cover setting up WebLogic’s Administration Server in a highly available way. All of these EDGs describe a manual failover, and none of my clients find that a satisfactory solution. Usually I advise my clients to use Oracle’s clustering software (Grid Infrastructure / Cluster Ready Services) to automate failover. This works fine if your local DBA is managing the WebLogic layer, although the overhead is large for something “simple” like an HA administration server. It also requires the failover node to run in the same subnet (network-wise) as the primary node. All this led me to investigate other options. One of the viable options I POC’d is Linux clustering with Corosync and Pacemaker, which I considered because this seems to be the Red Hat standard for clustering nowadays.

This example was configured on OEL 5.8. It is not production-ready; please don’t install this in production without thorough testing (and some more of the usual disclaimers 🙂 ). I will assume basic knowledge of clustering and Linux for this post; not all details will be configured in great depth.

First you need to understand a little bit about my topology. I have a small Linux server running a software load balancer (Oracle Traffic Director) that also functions as an NFS server. When configuring this for an enterprise, these components will most likely be provided for you (F5s or Cisco with some NetApp or alike). In this specific configuration the VIP for the administration server runs on the load balancer. The NFS server on the load balancer host provides the shared storage that hosts the domain home, and this NFS share is mounted on both servers that will run my administration server.
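For illustration, the mount on the two WebLogic nodes could look like this in /etc/fstab; the host name lb1 and export path /exports/domains are made up for this example:

# /etc/fstab on wls1 and wls2: shared domain homes exported by the load balancer host
lb1:/exports/domains   /domains   nfs   rw,hard,nointr,vers=3,tcp   0 0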

Back to the cluster. To install Corosync and Pacemaker, first install the EPEL repository (for packages that don’t exist in vanilla Red Hat/CentOS) and add the ClusterLabs repository.

rpm -ivh http://mirror.iprimus.com.au/epel/5/x86_64/epel-release-5-4.noarch.rpm
wget -O /etc/yum.repos.d/pacemaker.repo http://clusterlabs.org/rpm/epel-5/clusterlabs.repo

Then install Pacemaker 1.0+ and Corosync 1.2+ via yum:

yum install -y pacemaker.$(uname -i) corosync.$(uname -i)

When all software and dependencies are installed, you can configure Corosync. My configuration file is rather straightforward; I run the cluster over network 10.0.0.0:

cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
	version: 2
	secauth: on
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 10.0.0.0
		mcastaddr: 226.94.1.1
		mcastport: 5405
	}
}

logging {
	fileline: off
	to_stderr: no
	to_logfile: yes
	to_syslog: yes
	logfile: /var/log/cluster/corosync.log
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
	}
}

amf {
	mode: disabled
}

quorum {
           provider: corosync_votequorum
           expected_votes: 2
}

aisexec {
        # Run as root - this is necessary to be able to manage resources with Pacemaker
        user:        root
        group:       root
}

service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 0
}

Now you can start Corosync and check the configuration of the cluster:

service corosync start
corosync-cfgtool -s
Printing ring status.
Local node ID 335544330
RING ID 0
	id	= 10.0.0.20
	status	= ring 0 active with no faults


crm status
============
Last updated: Sun Mar  3 21:30:42 2013
Stack: openais
Current DC: wls1.area51.local - partition with quorum
Version: 1.0.12-unknown
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ wls2.area51.local wls1.area51.local ]

For production usage you should configure stonith, but that is beyond the scope of this example, so for testing purposes I disabled stonith:

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore

Also, I configure resources not to fail back when the node that was running the resource comes back online:

crm configure rsc_defaults resource-stickiness=100

Now your cluster is ready, although it doesn’t run WebLogic yet. There is no WebLogic cluster resource agent, so I wrote one myself. To keep it separated from other cluster resources I set up my own OCF resource provider tree (just mkdir and you are done). An OCF resource agent requires certain functions to be present in the script; The OCF Resource Agent Developer’s Guide can help you with that one.

Here’s my example WebLogic cluster resource:

cat /usr/lib/ocf/resource.d/area51/weblogic 
#!/bin/bash
#
# Description:  Manages a WebLogic Administration Server as an OCF High-Availability
#               resource under Heartbeat/LinuxHA control
# Author:	Jacco H. Landlust <jacco.landlust@idba.nl>
# 		Inspired on the heartbeat/tomcat OCF resource
# Version:	1.0
#

OCF_ROOT=/usr/lib/ocf

. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
#RESOURCE_STATUSURL="http://127.0.0.1:7001/console"

usage()
{
	echo "$0 [start|stop|status|monitor|migrate_to|migrate_from]"
	return ${OCF_NOT_RUNNING}
}

isrunning_weblogic()
{
        if ! have_binary wget; then
		ocf_log err "Monitoring not supported by ${OCF_RESOURCE_INSTANCE}"
		ocf_log info "Please make sure that wget is available"
		return ${OCF_ERR_CONFIGURED}
        fi
        wget -O /dev/null ${RESOURCE_STATUSURL} >/dev/null 2>&1
}

isalive_weblogic()
{
        if ! have_binary pgrep; then
                ocf_log err "Monitoring not supported by ${OCF_RESOURCE_INSTANCE}"
                ocf_log info "Please make sure that pgrep is available"
                return ${OCF_ERR_CONFIGURED}
        fi
        pgrep -f weblogic.Name > /dev/null
}

monitor_weblogic()
{
        isalive_weblogic || return ${OCF_NOT_RUNNING}
        isrunning_weblogic || return ${OCF_NOT_RUNNING}
        return ${OCF_SUCCESS}
}

start_weblogic()
{
	if [ -f ${DOMAIN_HOME}/servers/AdminServer/logs/AdminServer.out ]; then
		su - ${WEBLOGIC_USER} --command "mv ${DOMAIN_HOME}/servers/AdminServer/logs/AdminServer.out ${DOMAIN_HOME}/servers/AdminServer/logs/AdminServer.out.`date +%Y-%M-%d-%H%m`"
	fi
	monitor_weblogic
	if [ $? = ${OCF_NOT_RUNNING} ]; then
		ocf_log debug "start_weblogic"
		su - ${WEBLOGIC_USER} --command "nohup ${DOMAIN_HOME}/bin/startWebLogic.sh > ${DOMAIN_HOME}/servers/AdminServer/logs/AdminServer.out 2>&1 &"
		sleep 60
		touch ${OCF_RESKEY_state}
	fi
	monitor_weblogic
	if [ $? =  ${OCF_SUCCESS} ]; then
		return ${OCF_SUCCESS}
	fi
}

stop_weblogic()
{
#	monitor_weblogic
#	if [ $? =  $OCF_SUCCESS ]; then
		ocf_log debug "stop_weblogic"
		pkill -KILL -f startWebLogic.sh
		pkill -KILL -f weblogic.Name
		rm ${OCF_RESKEY_state}
#	fi
	return $OCF_SUCCESS
}

meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="weblogic" version="0.9">
	<version>1.0</version>
	<longdesc lang="en"> This is a WebLogic Resource Agent </longdesc>
	<shortdesc lang="en">WebLogic resource agent</shortdesc>
        
	<parameters>
		<parameter name="state" unique="1">
			<longdesc lang="en">Location to store the resource state in.</longdesc>
			<shortdesc lang="en">State file</shortdesc>
			<content type="string" default="${HA_VARRUN}{OCF_RESOURCE_INSTANCE}.state" />
		</parameter>
		<parameter name="statusurl" unique="1">
			<longdesc lang="en">URL for state confirmation.</longdesc>
			<shortdesc>URL for state confirmation</shortdesc>
			<content type="string" default="" />
		</parameter>
		<parameter name="domain_home" unique="1">
			<longdesc lang="en">PATH to the domain_home. Should be a full path</longdesc>
			<shortdesc lang="en">PATH to the domain.</shortdesc>
			<content type="string" default="" required="1" />
		</parameter>
		<parameter name="weblogic_user" unique="1">
			<longdesc lang="en">The user that starts WebLogic</longdesc>
			<shortdesc lang="en">The user that starts WebLogic</shortdesc>
			<content type="string" default="oracle" />
		</parameter>
	</parameters>   
        
	<actions>
		<action name="start"        timeout="90" />
		<action name="stop"         timeout="90" />
		<action name="monitor"      timeout="20" interval="10" depth="0" start-delay="0" />
		<action name="migrate_to"   timeout="90" />
		<action name="migrate_from" timeout="90" />
		<action name="meta-data"    timeout="5" />
	</actions>      
</resource-agent>
END
}

# Make the resource globally unique
: ${OCF_RESKEY_CRM_meta_interval=0}
: ${OCF_RESKEY_CRM_meta_globally_unique:="true"}

if [ "x${OCF_RESKEY_state}" = "x" ]; then
        if [ ${OCF_RESKEY_CRM_meta_globally_unique} = "false" ]; then
                state="${HA_VARRUN}${OCF_RESOURCE_INSTANCE}.state"
                
                # Strip off the trailing clone marker
                OCF_RESKEY_state=`echo $state | sed s/:[0-9][0-9]*\.state/.state/`
        else
                OCF_RESKEY_state="${HA_VARRUN}${OCF_RESOURCE_INSTANCE}.state"
        fi
fi

# Set some defaults
RESOURCE_STATUSURL="${OCF_RESKEY_statusurl-http://127.0.0.1:7001/console}"
DOMAIN_HOME="${OCF_RESKEY_domain_home}"
WEBLOGIC_USER="${OCF_RESKEY_weblogic_user-oracle}"

# MAIN
case $__OCF_ACTION in
	meta-data)      meta_data
       	         exit ${OCF_SUCCESS}
	                ;;
	start)          start_weblogic;;
	stop)           stop_weblogic;;
	status)		monitor_weblogic;;
	monitor)        monitor_weblogic;;
	migrate_to)     ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrate_to}."
			stop_weblogic
			;;
	migrate_from)   ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrated_from}."
			start_weblogic
			;;
	usage|help)     usage
			exit ${OCF_SUCCESS}
	       	         ;;
	*)		usage
			exit ${OCF_ERR_UNIMPLEMENTED}
			;;
esac
rc=$?

# Finish the script
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"

exit $rc

Please mind that you would have to copy the script to both nodes of the cluster.
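Getting the agent in place is nothing more than creating the provider directory, copying the script and making it executable on each node, for example:

# on wls1 and wls2
mkdir -p /usr/lib/ocf/resource.d/area51
cp weblogic /usr/lib/ocf/resource.d/area51/weblogic
chmod 755 /usr/lib/ocf/resource.d/area51/weblogic
# sanity check (if your crm shell supports it): list the agents of the new provider
crm ra list ocf area51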

Next up is configuring the WebLogic resource in the cluster. In my example the NFS share with the domain homes is mounted on /domains and the domain is called ha-adminserver. WebLogic runs as oracle and the administration server listens on all addresses on port 7001. Therefore the parameter weblogic_user is left at its default (oracle), and the statusurl used to check whether the administration server is running is left at its default too (http://127.0.0.1:7001/console). Only the domain home is passed to the cluster:

crm configure primitive weblogic ocf:area51:weblogic params domain_home="/domains/ha-adminserver" op start interval="0" timeout="90s" op monitor interval="30s"

When the resource is added, the cluster starts it automatically. Please keep in mind that this takes some time, so you might not see results instantly. Also, the script has a sleep configured; if your administration server takes longer to boot, you might want to fiddle with that value. For an example like in this blog post it works.

Next you can check the status of your resource:

crm status
============
Last updated: Sun Mar  3 21:35:47 2013
Stack: openais
Current DC: wls1.area51.local - partition with quorum
Version: 1.0.12-unknown
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ wls2.area51.local wls1.area51.local ]

 weblogic	(ocf::area51:weblogic):	Started wls2.area51.local

To check the configuration of your resource, run the configure show command

crm configure show
node wls1.area51.local
node wls2.area51.local
primitive weblogic ocf:area51:weblogic \
	params domain_home="/domains/ha-adminserver" \
	op start interval="0" timeout="90s" \
	op monitor interval="30s" \
	meta target-role="Started"
property $id="cib-bootstrap-options" \
	dc-version="1.0.12-unknown" \
	cluster-infrastructure="openais" \
	expected-quorum-votes="2" \
	stonith-enabled="false" \
	no-quorum-policy="ignore" \
	last-lrm-refresh="1362338797"
rsc_defaults $id="rsc-options" \
	resource-stickiness="100"

You can test failover by stopping Corosync, killing the Linux node, and so on.

Other useful commands:

# start resource
crm resource start weblogic

# stop resource
crm resource stop weblogic

# Cleanup errors
crm_resource --resource weblogic -C

# Move the resource to another node. Mind you: this pins the resource by adding a location constraint, taking control away from the cluster
crm resource move weblogic wls1.area51.local
# Give authority over the resource back to the cluster
crm resource unmove weblogic

# delete cluster resource
crm configure delete weblogic

If you want your WebLogic administration server resource to be bound to a VIP, just google for setting up an HA Apache on Pacemaker. There is plenty of information about that on the web, e.g. this site, which also helped me set up this cluster.
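For completeness, a rough, untested sketch of what that could look like with the standard IPaddr2 resource agent; the address 10.0.0.100 is just an example:

# hypothetical VIP for the administration server
crm configure primitive admin-vip ocf:heartbeat:IPaddr2 params ip="10.0.0.100" cidr_netmask="24" op monitor interval="30s"
# keep the VIP and the administration server on the same node, and start the VIP first
crm configure colocation weblogic-with-vip inf: weblogic admin-vip
crm configure order vip-before-weblogic inf: admin-vip weblogic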

Well, I hope this helps anyone who is trying to set up an HA administration server.

Written by Jacco H. Landlust

March 3, 2013 at 11:39 pm

BEA-000362, incomplete error

with 4 comments

While setting up Service Migration in a small test setup on my laptop, I ran into this error:

<BEA-000362> <Server failed. Reason: There is either a problem contacting the database or there is another instance of ManagedServer_2 running>

It took me some time to figure out what the exact problem was. Problem solving would have been easier if the message had been complete, like this:

<BEA-000362> <Server failed. Reason: There is either a problem contacting the database or there is another instance of ManagedServer_2 running or the leasing table is missing from the database>

You can find the DDL for the default leasing table, called active, in a file called leasing.ddl, which is located at $MW_HOME/wlserver_10.3/db/oracle/920. If you happen to have changed the name of the leasing table, you obviously have to modify the leasing.ddl script accordingly.
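If the table is indeed missing, creating it is simply a matter of running that DDL as the schema owner used by the leasing data source, something along these lines (user name and connect string are made up):

sqlplus leasing/leasing_password@//dbhost:1521/orcl @$MW_HOME/wlserver_10.3/db/oracle/920/leasing.ddl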

Hope this helps.

Written by Jacco H. Landlust

January 9, 2013 at 1:20 am

Configuring Fusion Middleware JDBC Data Sources Correctly

leave a comment »

The out-of-the-box JDBC settings for a data source of any Fusion Middleware product (SOA, WebCenter, OIM, etc.; they are all alike) contain guesses about your environment and usage. The same goes for the settings required by RCU when installing a repository.

For a customer I recently wrote a document explaining which settings to set on the database and in WebLogic when configuring data sources for a Fusion Middleware product for production usage while connected to a RAC database.

The document assumes you are running an 11.2 RAC and WebLogic 10.3.4 or newer. Here’s the document:

Configure JDBC data sources for RAC
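To give a flavor: for 11.2 RAC such a data source typically points at the SCAN and a dedicated database service instead of an individual instance. A hypothetical long-format connect string (SCAN name and service name are made up):

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac-scan.example.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=fmw_prod)))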

Hope this helps.

BTW: if you already downloaded the document, please download it again. It seems I made an error in the distributed-lock area.

Written by Jacco H. Landlust

November 17, 2012 at 1:13 am

UCM, mod_wl_ohs and http response

with one comment

Some extensive testing, “maybe” some code decompiling, and some talking to a great ACS consultant about nice error pages for UCM when using an HTTP front-end ended in this statement:

If we add “HttpSevereErrorFirstLine=HTTP/1.1 400 Bad Request” in the config.cfg and restart the Content Server, the actual error message is seen instead of the bridge error. This undocumented parameter overrides the default 503 response sent by the Content Server in case of an error to 400.

Apache complains about the bridge error when a 503 response is sent, and doesn’t when it’s something like HTTP/1.1 400 Bad Request.
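In other words, the change is a one-line addition to the Content Server configuration followed by a restart; the path below assumes a default intradoc directory layout:

# <IntradocDir>/config/config.cfg
HttpSevereErrorFirstLine=HTTP/1.1 400 Bad Request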

This feature was tested on Universal Content Manager (UCM) 11.1.1.4.

hope this helps 🙂

Written by Jacco H. Landlust

August 22, 2011 at 10:52 pm

WLS, nodemanager and startup.properties

with 2 comments

It’s been a while since I blogged; I’ve been way too busy working on a couple of production systems. Anyway, while running an SR with Oracle about the node manager and some crash recovery issues (a blog post will follow as soon as a solution is found), I ran into yet another documentation “feature”.

The Fusion Middleware documentation contains lots of “practices” (I wouldn’t call them best 🙂 ) which have little to do with the technical functioning of the product and everything to do with personal preferences (i.e. “it worked for me”). Some engineer setting up a Fusion Middleware environment for some customer and promoting his personal notes to best practices is not the type of “Best Practice” or manual I would like to see from Oracle. A population of one (1) is not a valid sample for a “Best Practice”.

As an example, this part of documentation says:

Step 7: Define the Administration Server Address Make sure that a listen address is defined for each Administration Server that will connect to the Node Manager process. If the listen address for an Administration Server is not defined, when Node Manager starts a Managed Server it will direct the Managed Server to contact localhost for its configuration information.

I think this is incorrect, because the node manager checks a file when it starts up a managed server. This file can be found at $DOMAIN_HOME/servers/$SERVER_NAME/data/nodemanager/startup.properties. An example of this file from one of my test servers:

#Server startup properties
#Sat Feb 05 10:41:39 CET 2011
Arguments=-Djava.net.preferIPv4Stack\=true -Dsb.transports.mq.IgnoreReplyToQM\=true -Xmanagement\:ssl\=false,authenticate\=false,port\=7091 -Djavax.management.builder.initial\=weblogic.management.jmx.mbeanserver.WLSMBeanServerBuilder -Djava.security.egd\=file\:/dev/./urandom -Djava.security.jps.config\=/u01/app/oracle/user_projects/domains/base_domain/config/fmwconfig/jps-config.xml -Xms5g -Xmx5g -XXtlaSize\:min\=2k,preferred\=512k -XXcompaction\:percentage\=20
SSLArguments=-Dweblogic.security.SSL.ignoreHostnameVerification\=true -Dweblogic.ReverseDNSAllowed\=false
RestartMax=2
RestartDelaySeconds=0
RestartInterval=3600
AdminURL=http\://192.168.6.1\:7001
AutoRestart=true
AutoKillIfFailed=false

It contains the AdminURL (192.168.6.1 resolves to the AdminServer of my test setup). This property file is created upon the first startup of the managed server. When you boot the managed server, this leads to the following startup parameter for the JVM (found in the .out file of the managed server):
-Dweblogic.management.server=http://192.168.6.1:7001

So I don’t agree that the managed server contacts localhost if the AdminServer has no listen address. I think that line in the docs should be corrected as a documentation error (at best it’s incomplete).

Once you know about startup.properties, you also know that the statement that you always need to use the startWebLogic.sh script to start the AdminServer after domain creation is false. Yes, you get an error when you start the AdminServer from the node manager the very first time you boot that AdminServer, but if you manually create the startup.properties file, and optionally boot.properties (if you run in production mode), you can start the AdminServer from WLST (which helps when you script your deployments).
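A minimal WLST sketch of that flow, wrapped in a shell here-document; the credentials and node manager port are made up, the domain path matches my test setup:

$MW_HOME/wlserver_10.3/common/bin/wlst.sh <<'EOF'
# connect to the node manager on the machine that should run the AdminServer
nmConnect('weblogic', 'welcome1', '192.168.6.1', '5556', 'base_domain', '/u01/app/oracle/user_projects/domains/base_domain', 'ssl')
nmStart('AdminServer')
nmServerStatus('AdminServer')
EOF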

hope this helps.

Written by Jacco H. Landlust

February 7, 2011 at 11:23 pm

when ACFS crashes….

leave a comment »

Today three out of five nodes of a cluster crashed while a load test was running on two of the nodes of this cluster. The cluster is an ACFS cluster with OSB and SOA products on top of it. It uses the ACFS disk for logging and configuration; all binaries are on local disk. The GI version used is 11.2.0.1, running on 64-bit OEL 5.5.

This blog post is mostly a note to myself, but it might help some other people as well.

While looking in the CRS log files for the cause of the node failure, I found this error:


view /u01/app/grid/log/some_server/agent/crsd/oraagent_oracle/oraagent_oracle.l01


2010-10-12 10:21:05.060: [ora.DGGRID.dg][1536899392] [check] InstConnection::connectInt (2) Exception OCIException
2010-10-12 10:21:05.060: [ora.DGGRID.dg][1536899392] [check] Exception type=2 string=ORA-01034: ORACLE not available
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 1
Additional information: 491521
Additional information: 8
Process ID: 0
Session ID: 0 Serial number: 0

The ASM instance had 1 GB set for both memory_target and memory_max_target, so apparently the ASM instance uses more memory when ACFS is under heavy load. I am not aware of any formulas or best practices to calculate the memory_target for an ASM instance that is just running ACFS; the 1 GB was a guesstimate based on 11.1 knowledge. If anyone has some pointers for me regarding memory settings for ASM with just ACFS, please comment on this blog post.
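For reference, if you do decide to raise the values, the change itself is straightforward; a sketch against the ASM instance, where 1536M is just an example and not a recommendation, followed by a (rolling) restart of the ASM instances:

sqlplus / as sysasm
SQL> show parameter memory
SQL> alter system set memory_max_target=1536M scope=spfile sid='*';
SQL> alter system set memory_target=1536M scope=spfile sid='*';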

Some more checking, in this case at the Linux (OEL 5) level, showed some more:


dmesg


[Oracle ACFS] FSCK-NEEDED set for volume /dev/asm/v_disk-170 . Internal ACFS Location: 916 .
[Oracle ACFS] A problem has been detected with
[Oracle ACFS] the file system metadata in /dev/asm/v_disk-170 .
[Oracle ACFS] Normal operation can continue, but it is advisable
[Oracle ACFS] to run fsck on the file system as soon as it is
[Oracle ACFS] feasible to do so.  See the Storage Admin
[Oracle ACFS] Guide for more information about FSCK-NEEDED.

Now this seemed like trouble, so I stopped all nodes of the cluster (*YIKES*) and started an fsck. This ran for ages, apparently just doing this (strace output):


lseek(4, 9781714944, SEEK_SET) = 9781714944
read(4, "\202\1\6P\17dG\26o\315\324_\3262\363 \tG\2"..., 4096) = 4096
lseek(4, 9781706752, SEEK_SET) = 9781706752
read(4, "\202\1\6P\17dG\26o\315\324(\6\363\23\tG\2"..., 4096) = 4096
lseek(4, 9781649408, SEEK_SET) = 9781649408
read(4, "\202\1\6P\17dG\26o\315\324\262\226\352\20 \10G\2"..., 4096) = 4096
lseek(4, 9781739520, SEEK_SET) = 9781739520
read(4, "\202\1\5P\17dG\26o\315\324\257\27=\233\200\tG\2"..., 4096) = 4096
lseek(4, 2315993088, SEEK_SET) = 2315993088
read(4, "\202\1\5P\17dG\26o\315\324\340O\10\310@\v\212"..., 4096) = 4096

Now I’m no C programmer, nor a filesystem specialist, so I don’t know exactly what’s going on (yet). After 4 hours I decided that waiting any longer was futile; it’s just a freaking 10 GB disk!

I started fsck again, only this time with some extra parameters:


$ fsck -a -v -y -t acfs /dev/asm/v_disk-170


OfsCheckOnDiskGBM entered
fsck.acfs: OfsReadMeta at offset: 67112960 (0x4001000)    size: 327680 (0x50000)
OfsCheckFileEntry entered for:
ACFS Internal File: [ACFS Snap Map]
fenum: 19 (0x13)   disk offset: 79360 (0x13600)


fsck.acfs: OfsReadMeta at offset: 79360 (0x13600)    size: 512 (0x200)
OfsCheckFileExtents entered for:
ACFS Internal File: [ACFS Snap Map]
fenum: 19 (0x13)   disk offset: 79360 (0x13600)


fsck.acfs: OfsReadMeta at offset: 67440640 (0x4051000)    size: 512 (0x200)


Checking if any files are orphaned...


Phase 1 Orphan check...


fsck.acfs: OfsReadMeta at offset: 81920 (0x14000)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 82432 (0x14200)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 82944 (0x14400)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 83456 (0x14600)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 83968 (0x14800)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 84480 (0x14a00)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 84992 (0x14c00)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 85504 (0x14e00)    size: 512 (0x200)


Phase 2 Orphan check...


fsck.acfs: OfsReadMeta at offset: 81920 (0x14000)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 82432 (0x14200)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 82944 (0x14400)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 83456 (0x14600)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 83968 (0x14800)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 84480 (0x14a00)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 84992 (0x14c00)    size: 512 (0x200)
fsck.acfs: OfsReadMeta at offset: 85504 (0x14e00)    size: 512 (0x200)


0 orphans found


fsck.acfs: fsck.acfs: Checker completed with the following results:
File System Errors:   2
Fixed:            2
Not Fixed:        0

This caused fsck to finish in a couple of minutes, after which I could mount the ACFS disk on the cluster again.

Written by Jacco H. Landlust

October 12, 2010 at 4:49 pm

Creating a failover disk using ascrs

leave a comment »

I certainly hope that installing CRS is no longer magic for most DBAs. If it is, please refer to Tim Hall’s website and follow the guide. Certain things in the CRS installation are less well documented, though:

  1. X-forwarding has been a nuisance for a lot of DBAs. Refer to this post for some more information about X-forwarding.
  2. I notice that most guides on VMware ask you to reboot after adding disks. Refer to this post to see how to rescan your bus without rebooting.
  3. Shared disks and VMware Workstation are a pain in the behind. Obviously someone else felt the same pain too.

I installed CRS on my laptop, running VMware Workstation. The machines are called wls1 and wls2 (guess what this will become when I’m done 😉 ). After installing CRS, I installed ascrs. ascrs is delivered on the Companion CD of Oracle Fusion Middleware 11g. It installs by just unzipping the ascrs.zip file into your CRS tree; next, simply call the configure script in the $CRS_HOME/ascrs/bin directory. When you want to use ascrs on all nodes of the cluster, you need to unzip the file on all nodes.
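In shell terms that boils down to something like this on every node; the staging path is made up and the exact script name may differ per ascrs release, so treat it as a sketch:

# unzip ascrs into the CRS tree and run its configuration script
cd $CRS_HOME
unzip /stage/ascrs.zip
$CRS_HOME/ascrs/bin/configure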

Read the rest of this entry »

Written by Jacco H. Landlust

August 18, 2009 at 3:36 pm