Category Archives: RAC

SPARC Supercluster

"Oracle buys Sun" was an exciting announcement 20 months ago.

What has changed in the Solaris/Oracle Database world?

First, Oracle delivered Exadata on Sun Hardware (x86_64).
Second, they delivered Exadata on Sun Solaris Operating System (x86_64).

But now, they have announced a combination of software and hardware that will run the Oracle database faster than anything before.

I am happy to read that Oracle is still investing in R&D for the SPARC server line!

RAC exam

I attended the RAC beta exam this afternoon. 181 questions in 3.5 hours, that is a lot of questions! I prepared by reading the 2-day DBA RAC document. There were also some questions about Maximum Availability Architecture (MAA) with physical and logical standby, and some questions that are Linux specific.

I was short on time: the questions were very long to read, and with 181 questions in 210 minutes, I did not have the time to review them all.

Now I must wait about 10 weeks to get the result, but I am quite optimistic 8-)

RAC workshop

As announced, I spoke to Credit Suisse employees last week. The objective was to give the database project and team leaders the key arguments for going or not going to RAC.

First: what is RAC?
Oracle Real Application Clusters.
You have shared storage, and two (or more) instances accessing the same database.

The challenges:

    The consistency

In a single instance model, there is read-write consistency. This must not change in RAC. If one session reads a table while another session writes to it, RAC must offer the same level of consistency.

    Crash recovery

In single instance, if the instance dies (shutdown abort / kill -9 / server crash), there is an instance recovery at the next startup, which reads the committed and uncommitted transactions from the redo logs. In RAC, each instance has its own UNDO tablespace and a separate thread of redo. So if one instance crashes (server crash), the other instances will do the crash recovery. If all instances crash at the same time (ex: disaster), the first one to restart will have to recover all transactions.

    Performance

You have more operating systems and more network traffic, but you should achieve performance comparable to a single server with the same number of processors. Some Oracle Marketing slides show a 4-node cluster with 2 CPUs per node performing better than an 8-CPU server. This is a challenge.

    Scalability

This is rather related to Linux. If you have SPARC/Solaris, you can scale on a single server to more than 100 CPUs and 1 TB of memory. But if you have Linux, RAC offers additional scalability. However, it is not guaranteed that your cluster will scale beyond 4 nodes.

    High Availability (HA)

This is THE challenge. It is also the main reason to go RAC. If you cannot afford the downtime of a FAILOVER (a few minutes, 1-30), you probably need RAC.

    Maintainability

It must not be more difficult to maintain RAC than a single instance database. This is a very difficult question. On a failover cluster, there are lots of in-house scripts to do the failover. A big mess most of the time. In RAC, there is no failover. If one instance crashes, it just needs to be restarted one day, that's it! However, the configuration of the cluster is twice as difficult, because you have not 1 but 2 instances to configure. In case of tuning / troubleshooting, you also have 2 instances to tune, plus the interconnect and the cache fusion; I talk about the latter below. If you have an SLA (service level agreement) per database, going to RAC is a pain. If you have an SLA per instance, you will have more instances to maintain, so you may increase your productivity.

You have (typically) two servers (or more). On each server you have memory (buffer cache / shared pool / pga), cpu (load balance / parallelism) and processes (pmon / dbwr / dispatchers / arch). If the servers have different sizes, you can have different settings. One instance can have a bigger cache, more CPUs used for parallel query, more dispatchers, more db writers.

The spfile can hold the configuration of multiple instances in a single shared parameter file.

alter system set shared_pool_size=400M sid='INST01';

The dictionary has views that let you monitor either a single instance (V$SESSION) or all instances globally (GV$SESSION).
select inst_id,sid,serial# from gv$session;
Note that if you want to kill a session, you must be connected as a privileged user on that instance, or wait for the next release of RAC.
There are also performance views about cache fusion and the interconnect.

The cache fusion. In a single instance, you can read data from your cache safely. In RAC, if you have a block in your cache, another instance may update this block, so there must be a mechanism to ensure cache invalidation. This mechanism already existed in Oracle 6 Parallel Server (OPS) and later. Here is the principle: one instance is the lock master, namely the first one that started. This is not configurable. Any time an instance reads or writes a block, the block can be cached. If one instance has the block in its cache, another instance can read it from there.
Access time :
local cache << remote cache << disk
Let’s take an example:

INSTA: hello lock master, I want to read block 1008
LOCKM: nobody has this block, please read from disk
LOCKM updates his table, he knows INSTA has the block
INSTA reads the block
INSTB: hello lock master, I want to read block 1008
LOCKM: wait, INSTA has the block, I tell INSTA to send you that block
LOCKM knows both INSTA and INSTB have the block
INSTA sends the block to INSTB
INSTB: hello lock master, I want to update block 1008
LOCKM informs INSTA that his version of the block 1008 is now invalid
LOCKM knows INSTB has the block 1008 in cache (valid)
LOCKM: please do your update

This is cache fusion. If the lock master crashes, a new global cache table is built on another instance, which is elected as the new lock master.
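The dialogue above can be replayed with a small toy model. This is only a sketch of the principle as described in this post, not Oracle's implementation: the `LockMaster` class, its `read` and `update` methods and the `holders` table are names invented for the illustration.

```python
class LockMaster:
    """Toy directory: which instances hold a valid copy of each block.

    Hypothetical model of the dialogue above, not Oracle's real data structures.
    """

    def __init__(self):
        # block number -> set of instance names holding a valid copy
        self.holders = {}

    def read(self, instance, block):
        """Tell the instance where to get the block from, then record it as a holder."""
        holders = self.holders.setdefault(block, set())
        if instance in holders:
            source = "local cache"
        elif holders:
            # Another instance ships the block over the interconnect.
            source = sorted(holders)[0]
        else:
            source = "disk"
        holders.add(instance)
        return source

    def update(self, instance, block):
        """Invalidate every other copy and return the invalidated instances."""
        holders = self.holders.setdefault(block, set())
        invalidated = sorted(holders - {instance})
        self.holders[block] = {instance}
        return invalidated


if __name__ == "__main__":
    lockm = LockMaster()
    print("INSTA reads 1008 from:", lockm.read("INSTA", 1008))
    print("INSTB reads 1008 from:", lockm.read("INSTB", 1008))
    print("INSTB updates 1008, invalidated:", lockm.update("INSTB", 1008))
    print("valid holders:", sorted(lockm.holders[1008]))
```

The three possible sources returned by `read` correspond to the access-time ordering given earlier: local cache, then remote cache, then disk.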

This mechanism has a performance impact. You need more CPU. If you have 100% cpu usage (ex: batch job), it will have a dramatic performance impact. So you need sufficient CPU (cost of license :twisted:). Having 100% cpu in RAC is not good.

Also, if you have a bigger buffer cache (OLTP), the probability of reading a block over the network increases, so it will generate network traffic. Having network traffic is good, because the interconnect is faster than disk. But having a network bottleneck is terrible.

If you have more nodes, you will also have more traffic. If you have some latch contention on a single instance, going to RAC will increase your contention. RAC will not solve your single instance performance problems. If you have a packaged application, be sure it supports RAC and read the benchmarks for that particular application.

I had a lot of questions about planned downtime. Here are some facts:
If you have a standard OS upgrade/patch, you may achieve 0 downtime with RAC. However, your application must support connection failover (a challenge 8) ).
If you have an interim patch, like a Critical Patch Update, RAC may offer 0 downtime by applying the patch locally, node by node, in a rolling upgrade manner. Not for all versions. Not for all interim patches.
If you have a patchset (10.2.0.2 to 10.2.0.3) or a major version upgrade (10gR1 to 10gR2), you need a logical standby database to achieve a downtime smaller than one minute. RAC will not help. On the contrary, you have more servers and more Oracle homes to patch, so it may even increase the downtime in comparison to single instance.

The cost of the infrastructure was also of great interest. If you have RAC, you may need a different cluster infrastructure (ex: InfiniBand) and a more efficient protocol than IP over Ethernet (ex: Reliable Datagram Sockets). It is way more expensive than single instance failover. And the more nodes you add, the faster your network must perform.

One student asked: “Why invest time and money in RAC when the application / webserver is not redundant?”. This is a good question. You need to offer HA all over the service, from DBA to client. If one server crashes, you need another server. If one network crashes, you need another network. If one application server crashes, you need another application server. If one operator is ill, you need another operator to do his job. Oracle Application Server offers HA with farms of J2EE services. If your application does not support TAF (transparent application failover), you need to change your application first!

Finally, about the skills of the RAC DBA. Lots of tasks are almost as easy as, or even easier than, on a failover cluster. runInstaller / opatch / database configuration assistant (dbca) / enterprise manager are cluster aware. Installing an additional node with OEM or $ORACLE_HOME/oui/bin/addNode.sh and adding an additional instance with dbca or OEM is simple. However, the DBA must have some cluster skills, and it is less easy, at least in Switzerland, to find a DBA with RAC experience than a “normal” DBA :mrgreen:

Fit for RAC

I will be speaking tomorrow in German and on Wednesday in English at Credit Suisse.

The workshop is about the differences between single instance and RAC

Some of them: architecture, availability, scalability, manageability, skills required, price. I will also talk about the operational aspects: upgrade, dictionary, parameter file.

rac automation

Werner Puschitz talked yesterday about Automated Oracle Real Application Clusters Deployment: How Dell Does IT.
The result of the automation is that a 6-node cluster gets installed in 30 minutes, which is quite impressive.

Werner divided the automation into three RPM packages.

1) configure the host: set up the private and virtual interfaces (according to a naming convention, for example private is hostname-priv, vip is hostname-vip). Set up the system configuration (shared memory, etc). Set up the SAN and the shared devices. Set up ssh. Define a cronjob to check whether all nodes are configured and accessible. Send an email to the sysadmin when all nodes are ready.

2) install the clusterware. The RPM is built with a response file + the oracle software. The response file is recorded with ./runInstaller -record.

3) install the database software. The RPM is built with a response file too.
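The naming convention from step 1 is easy to derive in a script. The sketch below is only an illustration: the function name `cluster_names` and the hostnames `rac1` and `rac2` are made up, and this is not Werner's actual packaging code.

```python
def cluster_names(hostnames):
    """Map each public hostname to its private interconnect name and its VIP name.

    Follows the convention quoted above: hostname-priv and hostname-vip.
    """
    return {
        host: {"private": f"{host}-priv", "vip": f"{host}-vip"}
        for host in hostnames
    }


if __name__ == "__main__":
    # rac1 and rac2 are hypothetical node names.
    for host, names in cluster_names(["rac1", "rac2"]).items():
        print(host, names["private"], names["vip"])
```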

Werner said it is mostly worth building those packages in large companies, where you have many RAC installations. In my opinion, even if you have only one or two RAC clusters, you can benefit from this. It provides you with an easy way to reinstall the system, upgrade it to a new version, or create two systems (dev+prod) with the same configuration.

Thanks Werner for this very good one!

Suse10 10gR2 Laptop RAC

Hi,
[EDIT]
I read the Dizwell article about Linux 10g Laptop RAC. The article is no longer available.
[/EDIT]

I have just installed RAC 10gR2 on Suse10.

The document above guided me through the steps.

Specific Suse10 :
– raw devices are configured by editing /etc/raw with values like raw1:loop1 and must be activated with rcraw start (+ chkconfig -a raw for persistence)

To take care :
– read point 8 to survive reboot…

Specific 10gR2 :
In 10gR2, MAXINSTANCES does not need to be specified in CREATE CONTROLFILE, because the controlfile can grow dynamically. Even though I created my controlfile with maxinstances=1, I can do alter database add logfile instance 'RAC2' ; alter database add logfile instance 'RAC2' ; alter database enable instance 'RAC2';, which is simply great! CREATE CONTROLFILE just belongs to the past!

Installation:
I had to run vipca manually as root at the end of the root script of the CRS installation. I also removed my existing Oracle homes and rebooted before installing the database software again. The VIP interface must be configured in /etc/hosts and be DOWN (ifconfig eth0:2 down).

Conclusion:
The doc written by Amit Poddar on Howard's site is useful. Enjoy rac’ing on your pocket desk calculator!

rac again

I remember the time in 9iR2 when load balancing was just rolling dice between the instances. In 10gR2, I noticed that load balancing is much better. I did a test and started 28 sessions from an external client: 12 went to node 1, 12 to node 2. Very impressive.

In 10gR2 RAC, the listener gets statistics collected by MMON, and balances load much more efficiently…
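To illustrate the difference between rolling dice and using load statistics, here is a toy comparison. It is a sketch under stated assumptions: the function names, the node names and the "one unit of load per session" rule are invented, and this is not the algorithm the Oracle listener actually uses.

```python
import random


def random_balance(n_sessions, nodes, rng):
    """Dice-style balancing: every new session goes to a randomly chosen node."""
    counts = {node: 0 for node in nodes}
    for _ in range(n_sessions):
        counts[rng.choice(nodes)] += 1
    return counts


def load_aware_balance(n_sessions, load):
    """Send every new session to the node that currently reports the lowest load.

    `load` maps node name to a load figure; each new session adds 1 (toy assumption).
    """
    load = dict(load)  # work on a copy, keep the caller's statistics untouched
    counts = {node: 0 for node in load}
    for _ in range(n_sessions):
        target = min(load, key=load.get)
        counts[target] += 1
        load[target] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that the run is repeatable
    print("random    :", random_balance(24, ["node1", "node2"], rng))
    print("load aware:", load_aware_balance(24, {"node1": 0, "node2": 0}))
    print("node1 busy:", load_aware_balance(24, {"node1": 4, "node2": 0}))
```

With equal starting load the load-aware version splits the sessions evenly, and when one node is already busy it sends the first sessions to the idle node.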

The failover is also working correctly. I just powered off Doug's PC (we work as a team in this lab), and after the second try it worked. The TNSNAMES must be configured correctly, with FAILOVER and the virtual IPs.
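As a rough sketch of what such a TNSNAMES entry can look like, the helper below renders one with FAILOVER and one address per virtual IP. The alias, the VIP hostnames, the service name and the helper `tns_entry` are hypothetical, and the FAILOVER_MODE line (TAF with TYPE = SELECT, METHOD = BASIC) is an assumption of mine, not the configuration used in the lab.

```python
def tns_entry(alias, vip_hosts, service_name, port=1521):
    """Render a tnsnames.ora entry with connect-time failover across the given VIPs."""
    addresses = "\n".join(
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in vip_hosts
    )
    return (
        f"{alias} =\n"
        "  (DESCRIPTION =\n"
        "    (ADDRESS_LIST =\n"
        "      (FAILOVER = on)\n"
        f"{addresses}\n"
        "    )\n"
        "    (CONNECT_DATA =\n"
        f"      (SERVICE_NAME = {service_name})\n"
        # Assumption: TAF settings, not taken from the original lab setup.
        "      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))\n"
        "    )\n"
        "  )\n"
    )


if __name__ == "__main__":
    # rac1-vip, rac2-vip and the RAC alias/service are made-up example values.
    print(tns_entry("RAC", ["rac1-vip", "rac2-vip"], "RAC"))
```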

I also tried to install RAC/Linux/vmware on my notebook, but I stupidly removed the Linux GRUB boot partition and I cannot boot anymore… I need a doctor soon :-)