Search this Blog

Thursday, January 20, 2011

Copying Archived Logfile from ASM to a file system

Following OTN tread http://forums.oracle.com/forums/thread.jspa?threadID=2154801&tstart=0 about copying archived logfiles from ASM to a UNIX filesystem, I am posting a script we used in the past.
Nowadays we use RMAN to get the files from ASM directly into a TSM diskpool, but back then we used another method for putting the archived logfiles on backup, which resulted in the need for this script:


# Script to copy archived logfiles from ASM to a file system on UNIX
# In this case all archived logfiles found in v$archived_log are copied to a staging mount point /opt/asmprd/ar01
# The username/password used to access the database need to be set before use of this script.
# For sake of security I removed these lines from this script
#
# The script needs to be run as owner of the ASM instance
#
#
export ORACLE_SID=WCSPRD
export FSFULL=FALSE
export ORACLE_BASE=/opt/$LOGNAME/ora
export CRS_HOME=/opt/crsprd/ora/11.1.0
export ORA_CRS_HOME=$CRS_HOME
export ASM_HOME=/opt/asmprd/ora/11.1.0
export ORA_ASM_HOME=$ASM_HOME
export ORACLE_HOME=$ORACLE_BASE/11.1.0
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$CRS_HOME/lib:$ORACLE_HOME/lib32:$CRS_HOME/lib32
export LIBPATH=$LD_LIBRARY_PATH
export PATH=$ORACLE_HOME/bin:$CRS_HOME/bin:$PATH
export TNS_ADMIN=/opt/oraadmin/network/admin
export ORACLE_SID=+ASM
export HOST=`hostname`
export LOG_FILE=/var/adm/log/StageArchives_${HOST}.log
export A_FILE_COPIED=FALSE
#
# Optionally turn Debug on to see what the script does
export Debug=false
#
# Functions
#
Log() {
echo "`date +%Y/%m/%d` (`date +%H:%M:%S`) : $1" | tee -a $LOG_FILE
} #End Log
#
#
Continue()
{
echo "Continue....? [n]\c"; read ans
case $ans in
y|Y) echo "\n\nContinuing...."
break;;
*) exit 0
break;;
esac
unset ans
}
#
#
GetArchiveList()
{
echo "set termout off heading off pages 9999 lines 70 feedback off" > /tmp/getArchives.$$.sql
echo "column name format A70;" >> /tmp/getArchives.$$.sql
echo "spool /tmp/archivelist.lst;" >> /tmp/getArchives.$$.sql
echo "select trim(name) from v\$archived_log where deleted='NO' and archived='YES';" >> /tmp/getArchives.$$.sql
echo "quit;" >> /tmp/getArchives.$$.sql
$ORACLE_HOME/bin/sqlplus -S ${WCSPRD_SYS}@WCSPRD as sysdba @/tmp/getArchives.$$.sql

if [ "$Debug" = "true" ]
then
cat /tmp/getArchives.$$.sql
Continue
fi
rm -f /tmp/getArchives.$$.sql
}


StoreArchivesInFS()
{
ArchName=$1
ArchShortName=`basename $1`
ArchDest=/opt/asmprd/ar01/${ArchShortName}
AvailFS=`df -gv /opt/asmprd/ar01 | grep -v Free| awk '{print $5}' | cut -d'%' -f1`
Log "Archiving $ArchDest"

if [ "$Debug" = "true" ]
then
echo "AvailFS = $AvailFS"
echo "Commando = $ASM_HOME/bin/asmcmd cp $ArchName $ArchDest"
Continue
fi
if [ $AvailFS -lt 90 ]
then
echo "$ASM_HOME/bin/asmcmd cp $ArchName $ArchDest"
$ASM_HOME/bin/asmcmd cp $ArchName $ArchDest >/dev/null 2>&1

if [ -f "$ArchDest" ]
then
export A_FILE_COPIED=TRUE
$ASM_HOME/bin/asmcmd rm $ArchName >/dev/null 2>&1
Log "$ArchName has been deleted from ASM."
else
echo "ArchDest=$ArchDest"
ls -l $ArchDest
fi
else
export FSFULL=TRUE
fi
}


RmanCrosscheck()
{
# Crosscheck and clean repository
#Log "Crosscheck and clean repository"
$ORACLE_HOME/bin/rman target ${WCSPRD_SYS}@WCSPRD rcvcat ${TSMORA_TSMCAT}@tsmora cmdfile StageArchives.cmd
d
}


# Main Program
PROG_PID=$$
RUNNING=`ps -ef| grep StageArchives.sh | grep -v ${PROG_PID} | grep -v grep | awk '{print $2}'`
if [ -n "$RUNNING" ]
then
Log "Old Running session(s) found with PID(s): $RUNNING "
ps -ef| grep ${RUNNING} | grep -v grep >> $LOG_FILE
for tpid in $RUNNING
{
kill $tpid
}
fi
GetArchiveList
cat /tmp/archivelist.lst | while read ARCHFILE
do
if [ ! "$ARCHFILE" = "" ]
then
StoreArchivesInFS $ARCHFILE
if [ "$FSFULL" = "TRUE" ]
then
#
# Insert some action here to clean up the destination file system
fi
fi
done
#
#
#

#
# Use Rman to update the recovery catalog
RmanCrosscheck

unset FSFULL ARCHFILE
#rm -f /tmp/archivelist.lst


The script contains a call to 'rman' using the command file 'StageArchives.cmd'
This file looks like:

allocate channel for maintenance type disk;
crosscheck archivelog all;
delete noprompt expired archivelog all;
quit;


Please note that I am not able to give any support to anyone who wants to use this script.
You are allowed to copy and use it, but for the remainder you're on your own.


Success!

Tuesday, January 11, 2011

Service XDB is down alert in Grid Control

Problem:
GridControl is reporting the XDB service as being down for RAC database.
In this thread it is investigated why this alert is thrown. We are using a RAC database called RACTST to show all settings and SQL queries.
It must be said that we haven't seen this alert being thrown for single-server databases. The Metric isn't even available in Grid Control there.


Background:
When creating a database with 'dbca' one can choose whether to use shared server setup. Even when shared server is disabled during database creation, afterwards the parameter 'dispatchers' is still set.
For anyone who is unfamiliar with the shared server term, consult the Oracle Database Administrators Guide at: http://download.oracle.com/docs/cd/B28359_01/server.111/b28310/manproc003.htm#i1010000


What do we see:
The default value for the dispatchers parameter looks like:
dispatchers='(PROTOCOL=TCP) (SERVICE=RACTSTXDB)'

In this parameter the service 'RACTSTXDB' is configured. This service then is automatically made available. The service availability can be checked by querying the gv$services table:

SQL> select inst_id,name from gv$services where name='RACTSTXDB';
INST_ID NAME
---------- ----------------------------------------------------------------
1 RACTSTXDB
2 RACTSTXDB

Because this is a 2-node RAC it is clearly shown that the service is available on both nodes.


When does the alert occur ?
Most DBA's disable the 'dispatchers' parameter whenever they choose to not use shared servers. Disabling the dispatcher is easily done by resetting the parameter from the spfile:

SQL> alter system reset dispatchers scope=spfile sid='*';

System altered.

It is also possible to turn of the dispatchers dynamically afterwards, by per using:

SQL> alter system reset dispatchers scope=memory sid='';

One can also restart the instance(s) to turn of the parameter after the spfile change.


Now wait for approx. 5 minutes, and voila, there is the alert:






Clicking on the alert shows a bit more detail, about when the alert occured:












A check in gv$services confirms the service being 'down' ( in fact it is gone):

SQL> select inst_id,name from gv$services where name='RACTSTXDB';

no rows selected.


The service is however still available in the database. A select from dba_services, still shows is as being known:

SQL> select name from dba_services where name='RACTSTXDB';
NAME
----------------------------------------------------------------
RACTSTXDB
1 row selected.


Solution:
Disabling the alert in GridControl would be a permanent "solution".
But that's more like a non-DBA workaround. You wouldn't be warned about any other real service being down.

There are three solutions to get rid of the alert:
- Either re-enable the dispatcher, or
- Remove the service from the database, or
- Learn to live with the alert being there.

The first is easy. Put back the 'dispatchers' parameter in the spfile with 'scope=both', and you're done. But stubborn as we are, DBA's never give up.

Well the second step, as said in the 'Problem' section, the alert is only thrown for RAC databases. For single instances, the Metric isn't even available in Grid Control.
In the RAC-FAQ on Oracle Support it says:

When I look at ALL_SERVICES view in my database I see services I did not create, what are they for?
You will always see a default database service that is the same name as your database. This service is available on all instances in the cluster. You will also see two services used by the database SYS$BACKGROUND (for background processes) and SYS$USERS (users who connect via BEQ or without using a service_name). You may also see services that end with XDB which are created for the XML DB feature and you will not be able to manage these services.


The meaning of the last phrase is that this service is not manageable with srvctl. It is unknown as a cluster resource. Is also says that belongs to the XML DB feature, so before proceeding with this thread consult MOS Note 416312.1 and the Oracle Forums thread: http://forums.oracle.com/forums/thread.jspa?threadID=410714&tstart=0


If you're sure you're not using XML-Db features, then you are OK to drop the XDB service.
Use the following command to do so:

SQL> exec dbms_service.delete_service('RACTSTXDB');

PL/SQL procedure successfully completed.

This will drop the service. When done the alert in Grid Control will subsequently be cleared.
And ofcourse as a real DBA we first test this on our TEST systems, don't we ??


The last option 'Learn to live with it' is valid for those users that do use the XML-Db feature and are not able to drop the service. For them I would almost recommend to follow solution 1. It is easier to cope with a running but unused dispatcher, then with an Grid Control alert that sits in your face every day.



Disclaimer:
I nowhere found a reason why we should not be able to drop this service if we are not using XML-Db features. Therefore the real impact is not completely known. If anyone can elaborate about this subject, you're most welcome to do so.
And eh... if you mess up your production system by following the actions in this thread, start blaming yourself, not me. All above is done on a TEST server.