June 16, 2008

All about storage: SAN, RAID, DISKS – SQL Server DBA

One of the easiest ways to improve the lifetime performance of a SQL Server database is proper setup of the physical and logical drives. While it's an easy technique, proper disk subsystem configuration is often overlooked or handled by a member of the IT staff other than a SQL Server DBA.

Storage specific terms that everyone should know:
.RAID – Redundant Array of Inexpensive Disks, also known as Redundant Array of Independent Disks.
.Disk subsystem – A general term that refers to the disks on the server.
.Spindle – Spindles are another way to refer to the physical disk drives that make up the RAID array.
· I/O Ops – Input/Output operations, usually measured per second.
· Queuing – Number of I/O Ops that are pending completion by the disk subsystem.
· SAN – Storage area networks are collections of storage devices and fibre switches connected together along with the servers that access the storage on the device. SAN has also become a generic term, which refers to the physical storage drives such as EMC, 3PAR and Hitachi.
· LUN – Logical Unit Number – This is the identification number assigned to a volume when created on a SAN device.
· Physical drive – How Windows sees any RAID array, single drive or LUN that is attached to the server.
· Logical drive – How Windows presents drives to the user (C:, D:, E:, etc.).
· Block size – The amount of data read from the spindles in a single read operation. This size varies per vendor from 8 KB to 256 MB.
· Hardware array – A RAID array created using a physical RAID controller.
· Software array – A RAID array created within Windows using the computer management snap-in.
· Hot spare – A spindle that sits in the drive cage and is added to the array automatically in the event of a drive failure. While this does not increase capacity, it does reduce the amount of time that the array is susceptible to data loss because of a second failed drive.
· Recovery time – Amount of time needed for the RAID array to become fully redundant after a failed drive has been replaced, either manually or automatically via a hot spare.

RAID system:
RAID 10 – Mirrored Strip Sets. A RAID 10 array is most useful for high read or high write operations. RAID 10 is extremely fast; however, it is also extremely expensive (compared to the other RAID levels available). In basic terms, a RAID 10 array is several RAID 1 arrays stripped together for performance. As with a RAID 1 array, as data is written to the active drive in the pair, it is also written to the secondary drive in the pair. A RAID 10 array can survive several drive failures so long as no two drives in a single pair are lost.

RAID Level 0: "Disk Striping" RAID 0: Also known as "Disk Striping", is technically not a RAID level since it provides no fault tolerance. (Parity) Data is written in blocks across multiple drives, so one drive can be writing or reading a block while the next is seeking the next block. The advantages of RAID 0 is it’s high level of I/O performance because the I/Os are spread across multiple channels and drives. Because RAID 0 offers no fault tolerance, if one drive should fail, this will result a loss of all your data. This should never be used in a environment where data is mission critical. RAID 1 – Mirror. A RAID 1 array is most useful for high write files, such as the page file, transaction logs and tempdb database. A RAID 1 array takes two physical disks and creates an exact duplicate of the primary drive on the backup drive. There is no performance gain or loss when using a RAID 1 array. This array can survive a single drive failure without incurring any data loss.

SAN considerations for your SQL Server environment
· The SANs sole function is to store data and offer high reliability and high performance access to this data. It is a network which provides high-speed, highly reliable transportation of data for multiple servers which generally connect through a high speed optical network called fibre. A SAN consists of many high performance hard drives (typically several hundred) aggregated together with high performance controllers and caching; these hard drives are virtualized so that the consumer does not know which hard drives a SQL Server or other device connected to the SAN will access. A SAN administrator will present blocks of storage to servers using the SAN and these blocks of storage can consist of a single hard drive, multiple hard drives or portions of hard drives in a logical unit called a LUN (Logical Unit Number).

SANS have several advantages over locally attached storage:
· There is a distance limitation for SCSI connections which the SAN fibre network overcomes.
· Most SANs provide features which allow you to clone, snapshot, or rapidly move data (replicate) from one location to another. File copies or bcp over your network simply are not scalable. This increases their usefulness for disaster recovery.
· SANs play well with clusters. Clusters share resources between the nodes that form the cluster.
· SANs will allow a cluster or a server to boot off a SAN.
· SANs offer increased utilization of storage. With locally attached storage large amounts of disk space can be wasted. With a SAN you can expand or contract the amount of disk space a server or cluster can access.
· SANs will also offload some of the processing from the host system to the SAN itself.
Benefits of a SAN:
The primary benefits of a SAN are:
Availability: A single copy of data is accessible to any and all hosts via multiple paths.
Reliability: Dependable data transportation ensures a low error rate, and an ability to recover from failures is provided.
Scalability: Servers and storage devices may be added independently of one another, and do not depend on proprietary systems.
Performance: Fibre Channel (the standard method for SAN interconnectivity) has a 100MB/sec bandwidth and low overhead, and it separates storage and network I/O.
Manageability: Emerging software and standards for both FC-AL and Fibre Channel fabric allow single centralized management and proactive error detection and correction.
Return On Information Management: Due to increased redundancy and superior manageability, as well as the ability to add storage and servers independently of one another SANs provide a lower cost of ownership and a Higher Return On Information Management (ROIM).

When requesting storage from your SAN admin or Windows admin, tell them what RAID level and storage tier you need. That's important because the RAID level determines just how much read and write I/O you need, as well as the amount of redundancy you require. It's important that a DBA is involved in the decision process for RAID level volumes -- after all, you know the data within the system much better than any storage engineer does.

No comments: