Monday, March 17, 2008

Setting the Alignment Offset on ESX Server and a (Virtual) Windows Server



To add another layer of confusion, we must discuss what needs to be done when assigning a LUN to an ESX Server and then creating the (virtual) disk that will be assigned to the (Virtual) Windows Server.

As stated in the previous blog titled Disk Alignment, we must set the alignment before any data is written to the LUN itself. We align the LUN on the ESX Server because of the way a Clariion formats the disks, in elements of 128 blocks per disk (64 KB chunks), and because of the metadata the ESX Server writes to the LUN. It is my understanding, though, that ESX Server v3.5 takes care of the initial offset setting of 128.

The following are the steps to align a VMFS partition on a LUN for Linux/ESX Server:
1. On service console, execute “fdisk /dev/sd”, where sd is the device on which you would like to create the VMFS
2. Type “n” to create a new partition
3. Type “p” to create a primary partition
4. Type “1” to create partition #1
5. Select the defaults to use the complete disk
6. Type “x” to get into expert mode
7. Type “b” to specify the starting block for partitions
8. Type “1” to select partition #1
9. Type “128” to align partition #1 on a 64 KB boundary
10. Type “r” to return to main menu
11. Type “t” to change partition type
12. Type “1” to select partition 1
13. Type “fb” to set type to fb (VMFS volume)
14. Type “w” to write label and the partition information to disk
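
The arithmetic behind step 9 can be sketched in a few lines. This is a minimal Python illustration, not part of the fdisk procedure. It assumes 512-byte blocks and the 128-block (64 KB) element size described above, and the helper names are made up for the example.

```python
SECTOR_BYTES = 512        # assumed block size
ELEMENT_SECTORS = 128     # blocks per disk per element (64 KB), per the post

def start_offset_bytes(start_sector):
    """Byte offset at which a partition starting at start_sector begins."""
    return start_sector * SECTOR_BYTES

def is_aligned(start_sector):
    """True when the partition starts exactly on a 64 KB element boundary."""
    return start_sector % ELEMENT_SECTORS == 0

# 63 is the classic default starting sector; 128 is the value typed in step 9.
for sector in (63, 128):
    print(sector, start_offset_bytes(sector), is_aligned(sector))
```

A starting sector of 63 works out to 32256 bytes, which is not a multiple of 64 KB, while 128 works out to 65536 bytes, which is exactly one 64 KB element.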

Now that the ESX Server has aligned its disk, when the cache on the Clariion starts writing data to the disk, it will start at the first block on the second disk, or block number 128. And because the Clariion formats the disks in 64 KB chunks, it will write one chunk of data to a single disk.

If we create a (Virtual) Windows Server on the ESX Server, we must take into account that when Windows is assigned a LUN, it will also want to write a signature to the disk. We know that it is a Virtual Machine, but Windows doesn’t know that; it believes it is a real server. So, when Windows grabs the LUN, it will write its signature to the disk (see the blog titled Disk Alignment). Again, the problem is that the Windows signature takes up 63 blocks. Starting at the first block (block 128) on the second disk in the RAID Group, the signature will be written halfway across the second disk. When cache begins to write data out to disk, it will write to the next available block, which is the 64th block on the second disk. In the top illustration, we can see that a 64 KB data chunk written out to disk as one operation will now span two disks: a Disk Cross. From here on out for that LUN, we will keep seeing Disk Crosses, because no offset was set on the (Virtual) Windows Server.
In the bottom example, the offset was set on the ESX Server and also on the (Virtual) Windows Server, so cache will now write out to a single disk in 64 KB data chunks, limiting the number of Disk Crosses.
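
To make the two cases concrete, here is a small Python sketch that maps one 64 KB write onto 128-block elements. It is a simplified model, not the Clariion's actual layout logic. It assumes 512-byte blocks, a VMFS partition that starts at block 128, and a virtual disk that begins on an element boundary inside it. The function name is hypothetical.

```python
ELEMENT_SECTORS = 128   # 64 KB element in 512-byte blocks
VMFS_START = 128        # offset set on the ESX Server (assumed for this sketch)

def elements_touched(guest_start, io_sectors=128):
    """Return the element indexes hit by the first 64 KB write to the guest
    partition. Index 0 is the first disk, 1 the second, 2 the third."""
    first_block = VMFS_START + guest_start
    last_block = first_block + io_sectors - 1
    first = first_block // ELEMENT_SECTORS
    last = last_block // ELEMENT_SECTORS
    return list(range(first, last + 1))

print(elements_touched(63))    # no offset set in the guest
print(elements_touched(128))   # guest offset of 128
```

With a guest start of 63 the write touches two elements (the second and third disks), which is the Disk Cross. With a guest start of 128 it stays entirely on the third disk.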

Again, from the (Virtual) Windows Server we can set the offset for the LUNs using either Diskpart or Diskpar.

To set the alignment using Diskpart, see the earlier Blog titled Setting the Alignment Offset for 2003 Windows Servers(sp1).

To set the alignment using Diskpar:
C:\> diskpar -s 1
Set partition can only be done on a raw drive.
You can use Disk Manager to delete all existing partitions
Are you sure drive 1 is a raw device without any partition? (Y/N) y
----Drive 1 Geometry Information ----
Cylinders = 1174
TracksPerCylinder = 255
SectorsPerTrack = 63
BytesPerSector = 512
DiskSize = 9656478720 (Bytes) = 9209 (MB)
We are going to set the new disk partition.
All data on this drive will be lost. Continue (Y/N) ? Y
Please specify the starting offset (in sectors) : 128
Please specify the partition length (in MB) (Max = 9209) : 5120
Done setting partition
---- New Partition information ----
StatringOffset = 65536
PartitionLength = 5368709120
HiddenSectors = 128
PartitionNumber = 1
PartitionType = 7
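
The figures in the diskpar output can be cross-checked with a little arithmetic. The Python sketch below is only a sanity check on the numbers shown above; the variable names are made up for the example.

```python
# Geometry reported by diskpar for drive 1
cylinders = 1174
tracks_per_cylinder = 255
sectors_per_track = 63
bytes_per_sector = 512

# Total size is the product of the four geometry values
disk_size = cylinders * tracks_per_cylinder * sectors_per_track * bytes_per_sector
disk_size_mb = disk_size // (1024 * 1024)

# Values typed at the two prompts
starting_offset = 128 * bytes_per_sector     # 128 sectors, in bytes
partition_length = 5120 * 1024 * 1024        # 5120 MB, in bytes

print(disk_size, disk_size_mb)
print(starting_offset, partition_length)
```

The results agree with the output: a 128-sector offset is 65536 bytes (64 KB), and a 5120 MB partition is 5368709120 bytes.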

As the bottom illustration above shows, the ESX Server has set an offset, and the (Virtual) Windows Machine has written its signature and set the offset so that it starts writing data at the first block on the third disk in the RAID Group.

17 comments:

TheNCHeil said...

SAN Guy,
I read this post and have 2 questions/comments.
My understanding is if you create your ESX VMFS datastore via Virtual Center, it takes care of the alignment offset automatically since at least version 2.0.
Can you clarify your second point about setting the offset for the virtual machine. Are you referring to RDM only? It is my understanding from EMC World sessions last year, if your VM guest is created within a VMFS datastore (not RDM) there is no need to align within the guest OS. Can you clarify this point?
Six of us VMware/Clariion users are debating this point. Thanks,
Mike

san guy said...

This is from a white paper titled "VMware ESX Server 3.0 Recommendations for Aligning VMFS Partitions"

Instructions for Guest File System Alignment:

Once you have aligned your VMware VMFS partitions, you also need to align the file system partitions within your virtual machines. The following sections discuss how to align guest operating system partitions in Linux and Windows environments.

It goes on to discuss how to set the offset using fdisk, and diskpar for a virtual windows machine.

Giacomo Italy said...

Hi San guy! Great articles in your blog :). One question: I have the Clariion CX300 and VMware ESX 3.5.
Now I want to create a new LUN, and I see the "Alignment Offset (LBA)" is 0. Is it correct to put 128?
Thanks in advance

san guy said...

giacomo italy...if you are talking about the alignment offset being set to "0" at the LUN properties box in Navisphere, the answer is to leave it set to "0". You should not set the alignment offset via navisphere. It should be set through the host based utility, ie diskpart for windows 2003, etc...

Anonymous said...

Thanks :)

Giacomo Italy said...

Yes, I mean in navisphere...thanks for your reply!

TheNCHeil said...

SAN guy,
Thank you for the white paper reference "VMware ESX Server 3.0 Recommendations for Aligning VMFS Partitions". Question for you: this document does not mention RDM drives at all. Can you confirm that if you use an RDM drive you should NOT perform the alignment offset?
We have an open ticket with VMware on this topic and they have yet to produce a statement stating either way. The engineer "thinks" you should not align an RDM but it is not backed up with anything.
Thanks,
Mike

Anonymous said...

RDM's should be aligned according to the OS being installed on it. Windows will need alignment to the first 64k block.

Anonymous said...

I am soo bloody confused ! Nobody can give me a proper answer to this.
Ok so I'm playing with esxi on a CX300 and according to the techpaper if you created the datastore via VI then you're already aligned from the esx side of life.
You still need to align partitions inside the VM. From the paper:
"Note: Aligning the boot disk in the virtual machine is neither recommended nor required. Align only the data disks in the virtual machine."
Q1: By boot disk are they referring to "/boot" or the entire VM ?
Q2: By data disks are they referring to the fact that I might assign a new virtual disk from another datastore for fast i/o ?
In other words what do I do with "/" f.e. Does that need to be aligned or not ?

Alex said...

More questions I'd love to have answers for..

1. Why is it that a "windows signature" or "linux partition" causes an offset but the signature that LVM2 writes when you do a "pvcreate" on a lun does not ?
Aren't there plenty of other tools that write signatures to disks before using them ? (F.e oracle ASM does that too). What's so special about this one ?

2. Is a windows signature really the exact same size as a linux partition table ?

3. Do Clariions and DMX's really stripe luns the same exact way ? If not why am I told to align partitions to 128 blocks instead of 63 for both arrays ?

4. Do we not have this issue with Netapp luns since they're fake and really files on a WAFL filesystem and WAFL writes stuff all over the place out of our control anyway ?

Alex said...

Wait...you can't align "/" anyways since it'd destroy the data on it - right ?
Can someone please tell me that I don't need to worry about that anymore ?
If I was to assign another virtual disk to my VM for an application I assume I have to align that if I use a straight partition but not if I use LVM2 instead right ? Or does that only apply to an RDM ?
I thought if I create a datastore in VI it takes care of the alignment - is that for the ESX side only ?
Still very confused...

SAN Guy - HELP...

Jonathan said...

This command from the esx server side of things helped find the lun sd#

esxcfg-vmhbadevs