As announced, I gave a talk to Credit Suisse employees last week. The objective was to give the database project and team leaders the key arguments for going, or not going, to RAC.
First: what is RAC?
Oracle Real Application Clusters.
You have shared storage, and two (or more) instances accessing the same database.
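As a minimal illustration (assuming you are connected to any instance of such a cluster), a query like the following lists the instances that open the same database; the columns are standard in GV$INSTANCE:
select inst_id, instance_name, host_name, status from gv$instance;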
The challenges:
In a single-instance model, there is read-write consistency. This must not change in RAC: if one session reads a table while, at the same time, another session writes to it, RAC must offer the same level of consistency.
In single instance, if the instance dies (shutdown abort / kill -9 / server crash), there is an instance recovery which reads the committed and uncommitted transactions from the redo logs. In RAC, each instance has its own UNDO tablespace and a separate thread of redo. So if one instance crashes (server crash), one of the other instances will do the crash recovery. If all instances crash at the same time (ex: disaster), the one that restarts will have to recover all transactions.
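As a small sketch of these separate redo threads (using the standard V$LOG view, nothing specific to a particular setup), you can see which redo log groups belong to which instance's thread:
select thread#, group#, sequence#, status from v$log order by thread#, group#;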
You have more operating systems and more network traffic, but you should achieve a performance comparable to a multi-processor server. Some Oracle marketing slides show a 2-CPU, 4-node cluster performing better than an 8-CPU server. This is a challenge.
This is mostly a Linux topic. If you have Sparc/Solaris, you can scale on a single server to more than 100 CPUs and 1TB of memory. But if you have Linux, RAC offers additional scalability. However, there is no guarantee that your cluster will scale beyond 4 nodes.
This is THE challenge. It is also the main reason to go to RAC. If you cannot afford the downtime of a FAILOVER (a few minutes, typically 1 to 30), you probably need RAC.
It must not be more difficult to maintain RAC than a single-instance database. This is a very difficult question. On a failover cluster, there are lots of in-house scripts to do the failover, a big mess most of the time. In RAC, there is no failover. If one instance crashes, it just needs to be restarted one day, that’s it! However, the configuration of the cluster is twice as difficult, because you have not 1 but 2 instances to configure. In case of tuning / troubleshooting, you have 2 instances to tune, but also the interconnect and cache fusion; I talk about the latter below. If you have an SLA (service level agreement) per database, going to RAC is a pain. If you have an SLA per instance, you will have more instances to maintain, so you may increase your productivity.
You have (typically) two servers (or more). On each server you have memory (buffer cache / shared pool / PGA), CPU (load balancing / parallelism) and processes (pmon / dbwr / dispatchers / arch). If the servers have different sizes, you can have different settings. One instance can have a bigger cache, more CPUs used for parallel query, more dispatchers, more db writers.
The spfile has the capability of configuring multiple instances in a single shared parameter file.
alter system set shared_pool_size=400M sid='INST01';
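And, for comparison, the standard wildcard syntax sets the same value for all instances at once:
alter system set shared_pool_size=400M sid='*';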
The dictionary has views to let you monitor either a single instance (V$SESSION) or the whole cluster (GV$SESSION).
select inst_id,sid,serial# from gv$session;
Note that if you want to kill a session, you must be connected as a privileged user on that instance, or wait for the next release of RAC.
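For example, once connected to the instance reported by inst_id above, you kill the session with the usual syntax (the sid and serial# below are only placeholders):
alter system kill session '123,45';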
There are also performance views about cache fusion and the interconnect.
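As a sketch, you can compare the global cache statistics across instances with GV$SYSSTAT; the exact statistic names vary between releases, so the LIKE pattern below is only an assumption:
select inst_id, name, value from gv$sysstat where name like 'gc%block%' order by inst_id, name;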
The cache fusion. In a single instance, you can read data from your cache safely. In RAC, if you have a block in the cache, another instance may update this block, so there must be a mechanism to ensure cache invalidation. This mechanism already existed in Oracle 6 Parallel Server (OPS) and later. Here is the principle: one instance is the lock master, the first one that started, and this is not configurable. Any time an instance reads or writes a block, the block can be cached. If one instance has the block in its cache, another instance can read it from there.
Access time:
local cache << remote cache << disk
Let’s take an example:
INSTA: hello lock master, I want to read block 1008
LOCKM: nobody has this block, please read from disk
LOCKM updates its table: it knows INSTA has the block
INSTA reads the block
INSTB: hello lock master, I want to read block 1008
LOCKM: wait, INSTA has the block, I will tell INSTA to send you that block
LOCKM knows both INSTA and INSTB have the block
INSTA sends the block to INSTB
INSTB: hello lock master, I want to update block 1008
LOCKM informs INSTA that its version of block 1008 is now invalid
LOCKM knows INSTB has block 1008 in cache (valid)
LOCKM: please do your update
This is cache fusion. If the lock master crashes, a new global cache table is built on another instance, which is elected as the new lock master.
This mechanism has a performance impact. You need more CPU. If you are at 100% CPU usage (ex: batch job), it will have a dramatic performance impact. So you need sufficient CPU (cost of licenses :twisted:). Running at 100% CPU in RAC is not good.
Also, if you have a bigger buffer cache (OLTP), the probability of reading a block over the network increases, so it will generate network traffic. Having network traffic is good, because the interconnect is faster than disk. But having a network bottleneck is terrible.
If you have more nodes, you will also have more traffic. If you have latch contention on a single instance, going to RAC will increase that contention. RAC will not solve your single-instance performance problems. If you have a packaged application, be sure it supports RAC and read the benchmarks for that particular application.
I had a lot of questions about planned downtime. Here are some facts:
If you have a standard OS upgrade/patch, you may achieve zero downtime with RAC. However your application must support connection failover (a challenge 8) ).
If you have an interim patch, like a Critical Patch Update, RAC may offer zero downtime by applying the patch node by node, in a rolling manner. Not all versions. Not all interim patches.
If you have a patchset (10.2.0.2 to 10.2.0.3) or a major version upgrade (10gR1 to 10gR2), you need a logical standby database to achieve a downtime smaller than one minute. RAC will not help. On the contrary, you have more servers and more Oracle Homes to patch, so it may even increase the downtime in comparison to single instance.
The cost of the infrastructure was also of great interest. If you go RAC, you may need a different cluster interconnect (ex: Infiniband) and a more efficient protocol than Ethernet over IP (ex: Reliable Datagram Sockets). It is way more expensive than a single-instance failover cluster. And the more nodes you add, the faster your network must perform.
One student asked: “Why invest time and money in RAC when the application / web server is not redundant?” This is a good question. You need to offer HA across the whole service, from the DBA to the client. If one server crashes, you need another server. If one network crashes, you need another network. If one application server crashes, you need another application server. If one operator is ill, you need another operator to do his job. Oracle Application Server offers HA with farms of J2EE services. If your application does not support TAF (Transparent Application Failover), you need to change your application first!
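As a quick check, and only as a sketch, the failover columns of GV$SESSION show whether the connected sessions are really TAF-enabled:
select inst_id, username, failover_type, failover_method, failed_over from gv$session where username is not null;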
Finally, about the skills of the RAC DBA. Lots of tasks are almost as easy, or even easier, than on a failover cluster. runInstaller / opatch / Database Configuration Assistant (dbca) / Enterprise Manager are cluster-aware. Installing an additional node with OEM or ORACLE_HOME/oui/bin/add_node.sh and adding an additional instance with dbca or OEM is simple. However, the DBA must have some cluster skills, and it is less easy, at least in Switzerland, to find a DBA with RAC experience than a “normal” DBA.