Accelerating Oracle with Symantec Storage Foundation

Andreas Almroth


An Oracle database can almost never get enough IOPS or low enough latency, so it is important to address these factors when designing the storage for an Oracle database. However, providing high-throughput storage in large heterogeneous data centres does pose some problems. This article therefore looks at how to maximize throughput using Symantec Storage Foundation in large shared storage area networks (SANs).

Introduction

In order to achieve the best possible performance for an Oracle database, many factors need to be addressed, among them: computing power (such as processor speed), high-bandwidth memory, network bandwidth, and storage. This article will focus on the storage.

Performance considerations

When designing storage for an Oracle database, it is important to understand the type of load the storage must be able to meet. There are many types of load, and no two applications have the same load pattern, but we can at least differentiate two typical loads: online transaction processing (OLTP) and decision support systems (DSS).

Typically, OLTP systems rely on fast indexes to achieve fast results for the many concurrent SQL queries. Thus, the primary goal is to use the fastest possible storage for the indexes. It is important to understand how many indexes there are, and how large, so the storage can accommodate them.

DSS systems typically handle large quantities of data, and the SQL queries are often dynamic and not known in advance. Reliance on indexes is therefore less important, as many SQL queries will end up doing sequential table scans. DSS storage is thus typically configured to sustain long-running scans.

So we effectively have two completely different requirements: OLTP needs a small quantity of very fast storage for the indexes, with slower storage for the actual data, whereas DSS typically only needs a large quantity of slower-performing storage.

In both cases, however, Symantec Storage Foundation will increase the aggregated throughput through optimized I/O policies over multiple paths (VxDMP), an extent-based file system (VxFS), support for Oracle Disk Manager (ODM), and caching (SmartIO and memory cache).

Another factor with a significant impact on general performance is Oracle's requirement of atomicity, consistency, isolation, and durability (ACID). This means that each committed transaction must be acknowledged through the whole chain: database writer, ODM, file system, HBA, and disk storage system. ACID imposes a considerable performance penalty in many cases, primarily on write operations, as caches cannot be used to hide the latency.

Storage management

Storage management consists of many components, such as disk storage systems, the SAN infrastructure, OS disk devices used for either raw or formatted access, and file systems.

Tiering

Today a typical enterprise disk storage system consists of multiple disks with different performance characteristics. We usually categorize them into tiers:

Tier 0 – The fastest disk available. This could be SSDs in shared storage systems, or local SSDs in the host systems. This tier can also be flash memory cards connected directly to the I/O bus of the host systems.
Tier 1 – The fastest spinning disks available; today typically SAS 15K or FC 15K disks.
Tier 2 – Fast spinning disks; typically SAS 10K or FC 10K disks.
Tier 3 – Nearline disks, typically SATA 7.2K disks. Used primarily for large data quantities, such as disk-based backup data or archived data.

Table 1 – Tier levels

Many disk storage systems use tier 2 disks instead of tier 1 disks, combining them with global cache in the storage controllers. The latency will be higher than with tier 1 disks, as not all data can be cached, but in many cases the lower price of tier 2 wins over tier 1.

The combination of tier 0 with tier 1 or 2 is gaining ground, as the aggregated throughput reaches an acceptable level when you carefully match the provisioned storage type to the Oracle databases. This is where Symantec Storage Foundation really shows its benefits, as it simplifies the storage management.

For automatic tiering on the host, Symantec Storage Foundation provides two technologies: SmartTier and SmartIO.

SmartTier

SmartTier is a VxFS feature that enables you to allocate file storage space from different storage tiers. It leverages two technologies: multi-volume file systems and automatic policy-based placement of files. A multi-volume file system occupies two or more virtual storage volumes of different tiers and thereby enables a single file system to span across multiple, possibly heterogeneous, physical storage devices.
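As a minimal sketch of how this looks in practice (all disk group, volume, and policy file names below are hypothetical, and exact options vary by Storage Foundation release), a two-tier multi-volume file system is built from tagged volumes and governed by an XML placement policy:

    # Tag each volume with a placement class (class names are examples)
    vxassist -g oradg settag datavol1 vxfs.placement_class.tier1
    vxassist -g oradg settag datavol2 vxfs.placement_class.tier2

    # Combine the volumes into a volume set and create a multi-volume VxFS
    vxvset -g oradg make oravset datavol1
    vxvset -g oradg addvol oravset datavol2
    mkfs -t vxfs /dev/vx/rdsk/oradg/oravset
    mount -t vxfs /dev/vx/dsk/oradg/oravset /oradata

    # Assign and enforce a file placement policy (policy file is hypothetical)
    fsppadm assign /oradata /etc/vx/oradata_policy.xml
    fsppadm enforce /oradata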

SmartIO

SmartIO is a newer feature that enables efficient use of SSDs or PCIe flash cards through data caching. SmartIO improves I/O performance by reducing the number of I/O operations going to the back-end storage. It supports read and write caching for VxFS file systems. SmartIO also provides a database I/O caching template framework that can be used to further improve the performance of an Oracle database.
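For orientation, a cache area is managed with the sfcache utility; the device name below is a placeholder, and the exact syntax should be checked against the release documentation:

    # Create a cache area on a local SSD (device name is an example)
    sfcache create ssd0_0

    # Enable SmartIO caching for a mounted VxFS file system and check status
    sfcache enable /oradata
    sfcache list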

RAID protection

Another important factor is the type of RAID used to protect the data. RAID provides redundancy should a disk fail. There are many RAID levels and RAID combinations; the RAID levels relevant for Oracle are listed below:

RAID0 – Striped block layout with no redundancy. The data blocks are distributed over multiple disks in chunks. RAID0 is used to achieve better performance, as more disk spindles are in use.
RAID1 – Mirrored block layout. All data blocks are mirrored to one or more disks for redundancy. RAID1 adds some overhead for write operations, as at least two copies of each block are written. Read operations can however perform better, as some storage systems can load-balance reads between the disks in the mirror.
RAID0+1 – Combination of striping and mirroring. Writes are first striped over multiple disks, then mirrored to the other set of striped disks. Reconstruction is more difficult if a disk fails.
RAID1+0 – Combination of mirroring and striping. Writes are first mirrored, then striped. RAID1+0 is easier to reconstruct and is usually the preferred combination, as it gives good redundancy, better performance, and simpler rebuilds.
RAID5 – Block-level striping with a distributed parity block. Provides redundancy for a single drive failure and good read performance, but with higher write latency.
RAID50 – RAID0 striped over two or more RAID5 groups. Used to combine multiple RAID5 groups into large logical volumes.
RAID6 – Block-level striping with two distributed parity blocks. Provides redundancy for dual drive failures. Slower write performance than RAID5 due to the second parity calculation.
RAID60 – RAID0 striped over two or more RAID6 groups. Used to combine multiple RAID6 groups into large logical volumes.

Table 2 – RAID levels

Today, most storage vendors recommend RAID60, as very large pools of RAID6 groups can be combined, enabling provisioning of very large logical volumes to the hosts. Many enterprise storage vendors also support automatic storage tier pools, meaning that the most frequently accessed data is moved up from slower tiers to the fastest available.

Disk devices

Once the logical disks have been provisioned to the host, they show up as disk devices in the OS layer. If multi-pathing is used, the same disks will show up multiple times, leading to more complex management.

To handle this, Symantec Storage Foundation provides the Dynamic Multi-Pathing option (VxDMP), which supports most disk storage systems currently available on the market. By providing support for specific systems, VxDMP optimizes I/O access in accordance with the storage vendor's recommendations.
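As a rough sketch (the enclosure name emc_clariion0 is an example), inspecting what VxDMP has claimed and adjusting the I/O policy per enclosure looks like this:

    # List the enclosures (arrays) claimed by VxDMP
    vxdmpadm listenclosure all

    # Show the paths behind a specific DMP node
    vxdmpadm getsubpaths dmpnodename=emc_clariion0_0

    # Change the I/O policy for an enclosure, e.g. queue-length balancing
    vxdmpadm setattr enclosure emc_clariion0 iopolicy=minimumq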

Once the disks have been claimed by VxDMP, the logical DMP disk devices can be put into disk groups in Veritas Volume Manager (VxVM). If the disk system does not provide RAID protection, VxVM can provide host-based RAID 0, 1, and 5. Most storage systems are, however, already configured with RAID, so the most common use is to concatenate multiple disks (RAID0 concat) in order to build larger logical volumes. With VxVM, file systems can be extended on the fly by simply adding a disk to the VxVM disk group, so database downtime can be avoided.
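To illustrate that workflow under assumed device and disk group names, a volume can be created and later grown online roughly as follows:

    # Initialize a disk group from DMP devices and create a concatenated volume
    vxdg init oradg disk01=emc_clariion0_0 disk02=emc_clariion0_1
    vxassist -g oradg make datavol 200g layout=concat

    # Host-based RAID1+0 is also possible if the array is unprotected:
    # vxassist -g oradg make fastvol 100g layout=stripe-mirror

    # Create and mount a VxFS file system on the volume
    mkfs -t vxfs /dev/vx/rdsk/oradg/datavol
    mount -t vxfs /dev/vx/dsk/oradg/datavol /oradata

    # Later, add a disk and grow the volume and file system online
    vxdg -g oradg adddisk disk03=emc_clariion0_2
    vxresize -g oradg datavol +100g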

Oracle Automatic Storage Management

Automatic Storage Management (ASM) is Oracle's own implementation of a volume manager and file system. ASM supports both single-instance databases and Oracle Real Application Clusters (RAC).

ASM can only be used with raw devices, and therefore removes the need for third-party file systems, as ASM provides an internal Oracle file system distributed across all the disks available in the ASM disk groups.

Symantec Storage Foundation fully supports ASM, as ASM can use VxVM managed volumes, and thus the Oracle databases can achieve better performance as VxVM and VxDMP optimizes the throughput.
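A hedged sketch of that combination (volume and disk group names are examples): a VxVM volume is exposed to ASM via its raw device path, with ownership set so the Oracle user can open it:

    # Create a volume for ASM and set persistent ownership for the Oracle user
    vxassist -g oradg make asmvol1 100g
    vxedit -g oradg set user=oracle group=dba mode=660 asmvol1

    # In the ASM instance, point the disk string at the VxVM raw devices:
    #   ALTER SYSTEM SET asm_diskstring = '/dev/vx/rdsk/oradg/*';
    #   CREATE DISKGROUP data EXTERNAL REDUNDANCY
    #     DISK '/dev/vx/rdsk/oradg/asmvol1';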

Although Oracle promotes the use of ASM, there are still OS-related issues, such as disk track alignment, and limited flexibility once the database is in production. No two disk systems have the same track alignment offset, and finding the offset manually tends to be difficult. With VxVM, track alignment is managed automatically, as VxVM uses its database of all supported disk systems to select the appropriate alignment.

VxFS file system

Symantec Storage Foundation provides an extent-based file system (VxFS) which is highly optimized for many types of I/O load.

With its integration with Oracle Disk Manager, Oracle I/O throughput can be greatly improved. VxFS also provides caching, which can be combined with ODM.

The benefits of using Oracle Disk Manager are many: asynchronous I/O, an efficient file system layout built from contiguous files, no double buffering, and direct I/O support, to mention a few. By using ODM and VxFS, ACID compliance can be assured even though the I/O as such is asynchronous, which leads to much better performance.
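For reference, enabling the Veritas ODM library for an Oracle 11g home follows the documented pattern of replacing Oracle's stub library with a link to the Veritas one; the paths below assume a 64-bit Linux installation and should be verified against the release notes:

    # Verify the ODM pseudo-device file system is mounted
    mount | grep /dev/odm

    # Replace Oracle's ODM stub with a link to the Veritas ODM library
    mv $ORACLE_HOME/lib/libodm11.so $ORACLE_HOME/lib/libodm11.so.orig
    ln -s /opt/VRTSodm/lib64/libodm64.so $ORACLE_HOME/lib/libodm11.so

    # After restarting the instance, the alert log should report that the
    # Veritas ODM library has been loaded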

Putting it all together…

So with that bit of technical background, we are ready to look at the best possible designs for Oracle databases.

OLTP

As we know, OLTP systems require very fast access to the indexes, whereas the actual data can reside on slower tiers. In this case we would use tier 0 disks for indexes, and tier 1 for control files, redo logs, and rollback/undo data files. The general recommendation is to use the fastest RAID level, and here RAID1+0 is most likely the best fit.

Local SSDs or flash cards do not necessarily require RAID protection if they are used only as a read cache with SmartIO, since the back-end data remains on disk and will be used if the local cache fails. However, if we use local SSDs without SmartIO, it is highly recommended to build RAID1+0.

So far the decisions have been fairly simple. But when looking at data files, temporary files, and archived redo log files, it gets more complex. If we use SmartTier or the disk storage system's own automatic storage tiering, the block locations will be managed automatically. But if we want to control placement more statically, we need to choose the tiering level, while still keeping the RAID levels in mind.

Type of data – Tier/RAID level
Control files, redo logs, rollback/undo logs – Tier 1 with RAID1+0
Indexes – Tier 0 with RAID0
Data files (active partitions) – Tier 1 with RAID1+0
Temporary files – Tier 1 with RAID1+0
Data files (read-only/archived partitions) – Tier 2-3 with RAID6

Table 3 – OLTP preferred tier and RAID levels

If we can control which RAID level is used in the shared disk storage system, we can allocate disks from RAID1+0 disk pools. More typically, though, this is not the case: the host is one of many, and the storage system typically uses RAID50 or RAID60. In the latter case we can usually only choose the tiering level, so tier 1 is preferred if possible. This gives us low latency, although performance is worse than with RAID1+0.

If the OLTP database contains a lot of static data that is not accessed very often, then tier 2, or even tier 3, can become relevant in the design as well, as the lower throughput and higher latency can most likely be accepted.

DSS

As mentioned previously, DSS does not benefit as much from fast indexes as an OLTP database would, and we also expect long sequential table scans. Although we can generally use slower tier levels, we should use tier 1 for control files, redo logs, indexes, and rollback/undo data files. The general recommendation is again to use the fastest RAID level, and here RAID1+0 is most likely the best. The indexes do not have to be on tier 0 storage.

Type of data – Tier/RAID level
Control files, redo logs, rollback/undo logs – Tier 1 with RAID1+0
Indexes – Tier 1 with RAID1+0
Data files (active partitions) – Tier 1 with RAID6
Temporary files – Tier 1 with RAID6
Data files (read-only partitions) – Tier 2 with RAID6
Data files (archived partitions) – Tier 3 with RAID6

Table 4 – DSS preferred tier and RAID levels

As DSS databases typically are much larger in size, we need to be more relaxed about the tier and RAID levels. As a rule of thumb: active partitions on tier 1 with RAID6, and read-only or archived partitions on tier 2 and 3 respectively, using RAID6.

ASM or VxFS

Choosing ASM over VxFS is typically a policy decision rather than a performance decision. Oracle of course recommends the use of ASM, and it can be managed with Oracle tools and Oracle Enterprise Manager. VxFS fully supports ODM and provides much better throughput, as shown below. Symantec Storage Foundation also supports the use of Oracle Enterprise Manager for visibility and reporting on storage utilization.

The perfect design

Our best design is thus a combination of Oracle Database and Symantec Storage Foundation, where ODM uses the VxFS plugin to optimize throughput, with VxVM for disk management, VxDMP for multi-pathing, and optionally SmartIO for caching.

These components, tied together with the preferred storage tier levels and RAID protection, will raise throughput and lower latency, which leads to much faster transaction handling in the Oracle database instance.

Performance evaluation

So does Storage Foundation actually make a difference for your Oracle database? The short answer is yes. Comparing the various alternatives, we can clearly see that Storage Foundation adds value, as general performance is several times better than with Oracle ASM and the native OS I/O kernel modules.

The performance tests were conducted on an Intel-based HP DL380G8 server at 3GHz with 512GB memory, running Red Hat Enterprise Linux 6.3 and Oracle 11gR2. The underlying storage is provisioned as LUNs from tier 2 logical disks distributed over many RAID6 groups. These tests thus reflect a DSS database rather than a typical OLTP one, although in many cases OLTP would also be using RAID6. The local flash cache is a PCIe XtremSF card from EMC.

There are four types of tests:

  1. Oracle ASM with native Linux multipathd
  2. Oracle ASM with VxDMP
  3. Oracle ODM with VxFS (implicit VxDMP)
  4. Oracle ODM with VxFS and SmartIO (implicit VxDMP)

The four types of tests use two I/O page sizes: 8K (the normal Oracle page size) and 1M (big pages). All tests are read-only I/O operations.

To simulate Oracle ASM load, Oracle's ORION test tool was used. To simulate an Oracle-type load on file systems, the FIO tool was used. The support for ODM in VxFS was enabled.
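The exact parameters used in these tests were not published, so the invocations below are only indicative of how ORION and FIO can generate 8K random-read load of this kind:

    # ORION: 8K random small I/O against the candidate LUNs
    # (LUN paths are listed one per line in oratest.lun)
    ./orion -run advanced -testname oratest -num_disks 8 \
            -size_small 8 -type rand

    # FIO: 8K random reads with direct I/O against a file on VxFS
    fio --name=oravxfs --filename=/oradata/testfile --rw=randread \
        --bs=8k --direct=1 --ioengine=libaio --iodepth=32 --size=100g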

Throughput

[Figures 1 & 2 – Storage Foundation throughput, 8K and 1M page sizes]

Looking at the 8K results, we can already see the benefits of Storage Foundation: from ~27MB/s using Oracle ASM with Linux multipathing, we get almost three times better throughput with ASM/VxDMP. Moving to VxFS, we see a further boost to 364-392MB/s.

The 1M results are more evenly distributed, except when using SmartIO. With SmartIO, we get more than three times better throughput, as all the blocks are read from the local PCIe flash card.

Latency

Latency is very important, as Oracle requires ACID-committed operations: a read/write operation must be committed through the whole chain before the next operation is allowed. So obviously we look for the lowest possible latencies. With many disk spindles, though, latency is seldom a problem.

[Figures 3 & 4 – Storage Foundation latency, 8K and 1M page sizes]

Just as with throughput, we can see some gains from using Storage Foundation. However, as the latencies measured are so low, the difference is quite marginal, the exception being the I/O from the PCIe flash. But even a difference as small as 250µs would most likely give better overall throughput.

 

Conclusions

Although we have only looked at accelerating an Oracle database through storage tuning, we can clearly see a major performance boost from using Storage Foundation with VxVM, VxFS, VxDMP, and SmartIO. The disks used in the tests are provisioned from a very large, enterprise-class shared disk storage system that utilizes hundreds of disks to build its disk pools and services I/O to several hundred hosts in a heterogeneous data centre.

By using Storage Foundation, we significantly improve throughput and latency figures. The main reason is that Storage Foundation is written to provide the best possible I/O performance, with multipathing I/O policies and modules for many storage vendors, leading to optimized performance. Storage Foundation also provides integration into Oracle Disk Manager for improved performance, as well as Oracle Enterprise Manager for visibility and reporting on storage utilization.

Furthermore, by using the SmartIO feature with local SSD or PCIe flash, we can minimize the number of I/O operations that have to travel all the way to the shared storage systems, leading to higher throughput, lower latencies, and less reliance on the shared storage system.


One thought on “Accelerating Oracle with Symantec Storage Foundation”

  1. Hi, thanks for an interesting article. SmartIO is definitely cool technology, as caching is best done closest to the host CPUs.

    Regarding your statements, can you please elaborate on the sentence “By using ODM and VxFS, the ACID compliance can be assured, even though the I/O as such is asynchronous”? I assume this applies to read/write workloads, not only read-only, is that correct? If so, is it because of journaling on the volume and file system levels, or why?

    Why do you think there is such a big step in performance in Figure 1 (synchronous vs. asynchronous in a read-only test)? How big was the active dataset on which the benchmark was run (RAM on the host was 512GB)? How big was the cache on the disk array from which the dataset was served? Did you clean the buffer cache on the host between benchmark runs (for example by rebooting)? Are the latencies in Figures 3 and 4 the best values observed, or do they represent some percentile, and if so, which percentile?

    I suspect the whole dataset was served straight from the disk array's cache, as in Figure 3 the latency is under a millisecond in all tests, and in a basic random-read test these 8K pages couldn't all have been served from disk (the fastest non-SSD disk has a latency of about 4 milliseconds), or at most a minimum of them.

    I can imagine a slightly better benchmark to really show SmartIO's value (a read & write workload, a bigger active dataset, maybe 100 times bigger than the host's RAM or the disk array's cache, and two flash devices in RAID1 configured for SmartIO).

    Anyway, thanks again. I like SFHA and always enjoy thinking about performance and availability problems.