Monday, December 27, 2010

Concurrent Manager Setup (PCP) in 11i(RAC) Environment

PURPOSE
-----------------------------
Configuring parallel concurrent processing allows you to distribute concurrent
managers and workload across multiple nodes in a cluster or networked environment.
PCP can also be implemented in a RAC environment to provide automated failover of
workload should the primary (source) or secondary (target) concurrent processing
nodes, or RAC instances, fail. There are several different failure scenarios that
can occur depending on the type of Applications Technology Stack implementation
that is performed.

The basic failure scenarios are:

1. The database instance that supports the CP, Applications, and Middle-Tier
processes such as Forms, or iAS can fail.
2. The Database node server that supports the CP, Applications, and Middle-Tier
processes such as Forms, or iAS can fail.
3. The Applications/Middle-Tier server that supports the CP (and Applications)
base can fail.

The concurrent processing tier can reside on the Applications, Middle-Tier, or
Database Tier nodes. In a single-tier, non-PCP configuration, a node failure will
impact Concurrent Processing operations under any of these failure conditions. In a
multi-node configuration the impact of any of these failures will depend on the type
of failure experienced and on how concurrent processing is distributed among the
nodes in the configuration. Parallel Concurrent Processing provides seamless failover
for a Concurrent Processing environment in the event that any of these types of
failures takes place.

In an Applications environment, whether or not the database tier utilizes Listener
(server) load balancing, changes must be made to the default configuration generated
by AutoConfig so that CP initialization, processing, and PCP functionality are
initiated properly on their respective/assigned nodes. These changes are
described in the next section - Concurrent Manager Setup and Configuration
Requirements in an 11i RAC Environment.

The current Concurrent Processing architecture with Global Service Management
consists of the following processes and communication model, where each process
is responsible for performing a specific set of routines and communicating with
parent and dependent processes.

Internal Concurrent Manager (FNDLIBR process) - Communicates with the Service
Manager.

The Internal Concurrent Manager (ICM) starts, sets the number of active processes for,
monitors, and terminates all other concurrent processes through requests made to
the Service Manager, including restarting any failed processes. The ICM also
starts, stops, and restarts the Service Manager for each node. The ICM will
perform process migration during an instance or node failure. The ICM will be
active on a single node. This is also true in a PCP environment, where the ICM
will be active on at least one node at all times.
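
As a quick check of where the ICM is currently active, the internal manager queue
(FNDICM) can be queried from fnd_concurrent_queues. This is a minimal sketch; column
names are from the standard 11i data model and should be verified against your release:

SQL> select concurrent_queue_name, node_name, node_name2, target_node
  2  from fnd_concurrent_queues
  3  where concurrent_queue_name = 'FNDICM';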

Service Manager (FNDSM process) - Communicates with the Internal Concurrent Manager,
Concurrent Manager, and non-Manager Service processes.

The Service Manager (SM) spawns and terminates manager and service processes (these
could be Forms or Apache Listeners, Metrics or Reports Server, and any other process
controlled through Generic Service Management). When the ICM terminates, the SM that
resides on the same node as the ICM will also terminate. The SM is 'chained' to
the ICM. The SM will only reinitialize after termination when there is a function it
needs to perform (start or stop a process), so there may be periods of time when the
SM is not active; this is normal. All processes initialized by the SM inherit the same
environment as the SM. The SM's environment is set by the APPSORA.env file and the
gsmstart.sh script. The TWO_TASK used by the SM to connect to a RAC instance must
match the instance_name from GV$INSTANCE. The APPS_<SID> listener must be active on
each CP node to support the SM connection to the local instance. There should be a
Service Manager active on each node where a Concurrent or non-Manager service process
will reside.
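
A simple way to confirm this on a given CP node is to check that an FNDSM process is
running and that the local TWO_TASK alias resolves. These are illustrative commands
only; APRA5 is the example instance name used later in this article:

> ps -ef | grep FNDSM | grep -v grep     # a Service Manager should be running locally
> echo $TWO_TASK                         # should return the local instance name, e.g. APRA5
> tnsping APRA5                          # the alias must resolve through the 8.0.6 tnsnames.ora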

Internal Monitor (FNDIMON process) - Communicates with the Internal Concurrent
Manager.

The Internal Monitor (IM) monitors the Internal Concurrent Manager and restarts any
failed ICM on the local node. During a node failure in a PCP environment the IM will
restart the ICM on a surviving node (multiple ICMs may be started on multiple nodes,
but only the first ICM started will eventually remain active; all others will
gracefully terminate). There should be an Internal Monitor defined on each node
to which the ICM may migrate.
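
To review which nodes have an Internal Monitor defined, a query such as the following
can be used. This is a sketch only; it assumes the seeded 'Internal Monitor' naming as
seen in fnd_concurrent_queues_vl:

SQL> select user_concurrent_queue_name, node_name, node_name2
  2  from fnd_concurrent_queues_vl
  3  where user_concurrent_queue_name like 'Internal Monitor%';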

Standard Manager (FNDLIBR process) - Communicates with the Service Manager and any
client application process.

The Standard Manager is a worker process that initiates and executes client requests
on behalf of Applications batch and OLTP clients.

Transaction Manager - Communicates with the Service Manager and any user process
initiated on behalf of a Forms or Standard Manager request. See Note 240818.1
regarding Transaction Manager communication and setup requirements for RAC.

SCOPE & APPLICATION
-----------------------------
This article is provided for Applications development, product management, system
architects, and system administrators involved in deploying and configuring Oracle
Applications in a RAC environment. This document will also be useful to field
engineers and consulting organizations to facilitate installations and configuration
requirements of Applications 11i in a RAC environment.

Concurrent Manager Setup and Configuration Requirements in an 11i RAC Environment
-----------------------------
To set up Parallel Concurrent Processing using AutoConfig with GSM, follow the
instructions in the 11.5.8 Oracle Applications System Administrator's Guide under
Implementing Parallel Concurrent Processing, using the following steps:

1. Applications 11.5.8 and higher is configured to use GSM. Verify the configuration
on each node (see WebIV Note 165041.1).
2. On each cluster node edit the Applications Context file (the .xml file that resides
in APPL_TOP/admin) to set the variable APPLDCP (oa_var s_appldcp) to ON.
It is normally set to OFF. This change should be performed using the Context
Editor.
3. Prior to regenerating the configuration, copy the existing tnsnames.ora,
listener.ora and sqlnet.ora files, where they exist, under the 8.0.6 and iAS
ORACLE_HOME locations on each node to preserve the files (i.e.
<installation directory>/ora/$ORACLE_HOME/network/admin/<SID>/tnsnames.ora). If any of
the Applications startup scripts that reside in COMMON_TOP/admin/scripts/
have been modified, also copy these to preserve the files.
4. Regenerate the configuration by running adautocfg.sh on each cluster node as
outlined in Note 165195.1.
5. After regenerating the configuration, merge any changes back into the tnsnames.ora,
listener.ora and sqlnet.ora files in the network directories, and into the startup
scripts in the COMMON_TOP/admin/scripts/ directory. Each node's tnsnames.ora
file must contain the aliases that exist on all other nodes in the cluster. When
merging tnsnames.ora files ensure that each node contains all other nodes'
tnsnames.ora entries. This includes TNS entries for any Applications tier nodes
where a concurrent request could be initiated, or request output viewed.
6. In the tnsnames.ora file on each Concurrent Processing node ensure that there is
an alias that matches the instance name from GV$INSTANCE of each Oracle instance
on each RAC node in the cluster. This is required in order for the SM to establish
connectivity to the local node during startup. The entry for the local node will
be the entry that is used for the TWO_TASK in APPSORA.env (also in the
APPS_<SID>.env file referenced in the Applications Listener [APPS_<SID>]
listener.ora file entry "envs='MYAPPSORA=<APPL_TOP>/APPS_<SID>.env'")
on each node in the cluster (this is modified in step 12).
7. Verify that the FNDSM_<SID> entry has been added to the listener.ora file under
the 8.0.6 ORACLE_HOME/network/admin/<SID> directory. See WebIV Note 165041.1 for
instructions on configuring this entry. NOTE: With the implementation of
GSM, the 8.0.6 Applications and 9.2.0 Database listeners must be active on all PCP
nodes in the cluster during normal operations.
8. AutoConfig will update the database profiles and reset them for the node from
which it was last run. If necessary reset the database profiles back to their
original settings.
9. Ensure that the Applications Listener is active on each node in the cluster where
Concurrent, or Service processes will execute. On each node start the database
and Forms Server processes as required by the configuration that has been
implemented.
10. Navigate to Install > Nodes and ensure that each node is registered. Use the node
name as it appears when executing 'uname -n' from the Unix prompt on the server.
GSM will add the appropriate services for each node at startup.
11. Navigate to Concurrent > Manager > Define, and set up the primary and secondary
node names for all the concurrent managers according to the desired configuration
for each node's workload. The Internal Concurrent Manager should be defined on the
primary PCP node only. When defining the Internal Monitor for the secondary
(target) node(s), make the primary node (local node) assignment, assign a
secondary node designation to the Internal Monitor, and assign a standard work
shift with one process.
12. Prior to starting the Manager processes it is necessary to edit the APPSORA.env
file on each node in order to specify a TWO_TASK entry that contains the
INSTANCE_NAME parameter for the local node's Oracle instance, in order to bind
each Manager to the local instance. This should be done regardless of whether
Listener load balancing is configured, as it ensures the configuration
conforms to the required standard of having the TWO_TASK set to the instance
name of each node as specified in GV$INSTANCE. This is the environment that the
Service Manager passes on to each process that it initializes on behalf of the
Internal Concurrent Manager. Make the same update to the file referenced by the
Applications Listener APPS_<SID> in the listener.ora entry "envs='MYAPPSORA=
<APPL_TOP>/APPS_<SID>.env'" on each node, then start the Concurrent Manager
processes on their primary node(s).
13. Navigate to Concurrent > Manager > Administer and verify that the Service Manager
and Internal Monitor are activated on the secondary node, and on any other
additional nodes in the cluster. The Internal Monitor should not be active on
the primary cluster node.
14. Stop and restart the Concurrent Manager processes on their primary node(s), and
verify that the managers are starting on their appropriate nodes (a verification
query sketch follows this list). On the target (secondary) node, in addition to
any defined managers, you will see an FNDSM process (the Service Manager) along
with the FNDIMON process (Internal Monitor).
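
The primary/secondary node assignments and the node each manager is currently
targeted to can also be checked from SQL. This is a sketch only, using column names
from the standard 11i fnd_concurrent_queues table:

SQL> select concurrent_queue_name, node_name primary_node, node_name2 secondary_node,
  2         target_node, max_processes, running_processes
  3  from fnd_concurrent_queues
  4  where enabled_flag = 'Y'
  5  order by concurrent_queue_name;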

Failover Considerations
-----------------------------
In order to have log and output files available to each node during an extended node
failure, each log and out directory needs to be made accessible to all other CP nodes
in the cluster (placed on shared disk).
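
One common way to satisfy this is to point the concurrent file variables at a mount
that is visible to all CP nodes. The following is a sketch only; the /shared path is
illustrative, and the change should be made through the Context Editor so that
AutoConfig retains it:

# Illustrative values only - APPLCSF must reside on storage shared by all CP nodes
APPLCSF=/shared/apracomn/admin   # common top-level directory for concurrent files
APPLLOG=log                      # log files are written to $APPLCSF/$APPLLOG
APPLOUT=out                      # output files are written to $APPLCSF/$APPLOUT
export APPLCSF APPLLOG APPLOUT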

In order to view log and output files from a failed node during an outage/node
failure, the FNDFS_<machine_name> entry in tnsnames.ora under the 8.0.6 ORACLE_HOME
location on each node should be configured using an ADDRESS_LIST, following the
example below, to provide connect time failover. This entry should be placed in the
tnsnames.ora file on each node that supports Concurrent Processing, with the local
node first in the ADDRESS_LIST entry.

FNDFS_coe-svr5-pc=(DESCRIPTION=
 (ADDRESS_LIST=
  (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr5-pc)(PORT=1231))
  (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr7-pc)(PORT=1232)))
 (CONNECT_DATA=(SID=FNDFS))
)
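
Once the alias is in place on a node, a quick check that it resolves and that a
listener answers (an illustrative command only) is:

> tnsping FNDFS_coe-svr5-pc      # confirms the alias resolves and a listener responds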

See Bug 3259441 for issues related to the TNS alias length exceeding 255 characters.

Without connect time failover configured through the ADDRESS_LIST entry in the 8.0.6
ORACLE_HOME tnsnames.ora, the only alternative is to perform a manual update of the
fnd_concurrent_requests table for each request, in order to change outfile_node_name
and logfile_node_name from the failed node to the surviving node.

Determine the failed node name, and update the out and log entry to the
surviving node:

SQL> select outfile_node_name from fnd_concurrent_requests
  2  where request_id = 166273;

COE-SVR7-PC

SQL> select logfile_node_name from fnd_concurrent_requests
  2  where request_id = 166273;

COE-SVR7-PC

Update both outfile_node_name and logfile_node_name from the failed to the surviving node:

SQL> update fnd_concurrent_requests set outfile_node_name = 'COE-SVR5-PC'
  2  where request_id = 166273;

SQL> update fnd_concurrent_requests set logfile_node_name = 'COE-SVR5-PC'
  2  where request_id = 166273;

SQL> commit;

Using the ADDRESS_LIST rather than updating fnd_concurrent_requests is the recommended
method to establish failover access for Concurrent Manager log and output files.


Configuration Examples:
-----------------------------
Finding the instance name -

column host_name format a20;
select host_name, instance_name from gv$instance;

HOST_NAME INSTANCE_NAME
----------- ----------------
coe-svr7-pc APRA7
coe-svr5-pc APRA5

Modifying the APPSORA.env & APPS_<SID>.env files for node coe-svr5-pc -

> cd $APPL_TOP
> cat APPSORA.env
:
# $Header: APPSORA_ux.env 115.4 2003/03/01 01:02:35 wdgreene ship $
# =============================================================================
# NAME
# APPSORA.env
#
# DESCRIPTION
# Execute environment for Oracle and APPL_TOP
#
# NOTES
#
# HISTORY
#
# =============================================================================
#
# ###############################################################
#
# This file is automatically generated by AutoConfig. It will be read and overwritten.
# If you were instructed to edit this file, or if you are not able to use the settings
# created by AutoConfig, refer to Metalink document 165195.1 for assistance.
#
# ###############################################################
#
. /oralocal/apraora/8.0.6/APRA.env
. /oralocal/apraappl/APRA.env
TWO_TASK=APRA5
export TWO_TASK

TNS alias definition in tnsnames.ora on each node -

cd $TNS_ADMIN
pwd
/oralocal/apraora/8.0.6/network/admin/APRA

APRA5 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr5-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA5)(SERVICE_NAME=apradb))
)
APRA7 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr7-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA7)(SERVICE_NAME=apradb))
)

FNDFS entry in listener.ora on each node -

cd $TNS_ADMIN
pwd
/oralocal/apraora/8.0.6/network/admin/APRA

( SID_DESC = ( SID_NAME = FNDFS )( ORACLE_HOME = /oralocal/apraora/8.0.6 )
  ( PROGRAM = /oralocal/apraappl/fnd/11.5.0/bin/FNDFS )
  ( envs='EPC_DISABLED=TRUE,NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1,LD_LIBRARY_PATH=/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib,SHLIB_PATH=/usr/lib:/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib,LIBPATH=/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib' )
)