Search this Blog

Friday, November 5, 2021

CIO mount on AIX and Oracle 19c

Today we faced a really strange issue. 
We are in the midst of upgrading all our databases to the Oracle 19c release, or patching existing installations to the latest patch level.

After working through a lot of databases on Redhat 7.x without any problems, the first database to be upgraded on IBM AIX7.2 was next.
Release installation went fine, patching to the latest level also went without issues.

However when creating an empty CDB to house/upgrade the 12.1 PDB, we ran into strange problems. DBCA threw an error ORA-00205: error identifying control file:


Looking at the file system location where the databases was planned to be, control files were present as expected. 
Oracle support suggested to run a DBverify on the file, but that also showed no errors:







$ORACLE_HOME/bin/dbv file=/opt/hisd01/db01/control01.ctl blocksize=16384

DBVERIFY: Release 19.0.0.0.0 - Production on Fri Nov 5 08:54:12 2021

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /opt/hisd01/db01/control01.ctl

DBVERIFY - Verification complete

Total Pages Examined         : 880
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 0
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 880
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0

DBverify did not show any problems with the file. All pages were empty, but that is explained by the fact that the control files were newly created.

So, then what is the problem?
Why does the same action on Redhat 7.2 work like a charm, but throws this strange error on IBM AIX ?

For the answer we need to go back in time:

With the older Oracle releases running on IBM AIX, it was common to mount the file systems that contained database files and/or redo-log files with the CIO option. This enabled Concurrent I/O for the Oracle database, bypassing the file system cache.
As of Oracle 10g this requirement has changed.
I must admit that we never spend much attention on this change, as our UNIX system-admins were still delivering us systems with file systems that were CIO-mounted and performance was OK on all servers.

A very detailed explanation about CIO on AIX can be found here:


The first time we hit a problem regarding CIO mounted file systems, was when we copied an existing database-server running on Oracle 11.2.0.4 to a new IBM Storage System (FS7200). As this was a full clone, were nothing on server side changed, we were puzzled by the fact that we could not start the database anymore, after the copy.
Similar errors appeared about the control files that suddenly would be corrupted. 
Because it was a full clone, we never suspected the database itself. It had to be in the move to the new storage system. We then decided to remove the CIO option from the mount-point which solved the problem.

The whole problem is related to the fact that the default value of the filesystemio_options parameter has changed with the release of 19c. In Oracle 12c ( 12.1.0.2) the value of this parameter was default set to "async", which enables asynchronous I/O where possible.

In release 19c the value of the filesystemio_options parameter is default set to "none" (on  Linux), which disables all I/O performance enhancements.
When upgrading our first databases from Oracle 12c to 19c on Redhat 7.x, we ran into a huge degrade of I/O performance, because of this. The databases suddenly went from Asynchronous I/O to Synchronous I/O.
Investigating the issue, it revealed that this parameter had changed with the new release. We then decided to set it to the recommended value of "setall", which enables Direct I/O and Asynchronous I/O where possible. 
I/O speed on Redhat immediately increased back to what we expected, resulting in the fact that we decided to make this value the default for all our databases.


And there the problem with IBM AIX was originating. As the database want to open a file using Direct I/O and the file system is mounted with CIO, the files can not be read. 

Conclusion:
- When using Oracle set the filesystemio_options parameter to "setall"
- When using Oracle on AIX, do not mount the file systems with the CIO option (anymore)