Monday, March 17, 2008

Setting the Alignment Offset on ESX Server and a (Virtual) Windows Server






To add another layer of confusion, we need to discuss what must be done when assigning a LUN to an ESX Server and then creating the (virtual) disk that will be assigned to the (Virtual) Windows Server.

As stated in the previous blog titled Disk Alignment, we must align the data on the disks before any data is written to the LUN itself. We align the LUN on the ESX Server because of the way a Clariion formats the disks, in 128 blocks per disk (a 64 KB Chunk), and because of the metadata the ESX Server writes to the LUN. It is my understanding, though, that ESX Server v.3.5 takes care of the initial offset setting of 128.
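
To make the numbers concrete, here is a minimal Python sketch of the arithmetic. It assumes 512-byte blocks (the BytesPerSector value that diskpar reports later in this post); the function names are my own and are not part of any EMC or VMware tool.

```python
SECTOR_BYTES = 512       # assumed block size (matches BytesPerSector in the diskpar output)
ELEMENT_SECTORS = 128    # blocks the Clariion writes to one disk before moving to the next

def element_bytes():
    """Size in bytes of the chunk written to a single disk."""
    return SECTOR_BYTES * ELEMENT_SECTORS

def is_aligned(start_sector):
    """True if a partition starting at start_sector begins on a chunk boundary."""
    return start_sector % ELEMENT_SECTORS == 0

print(element_bytes())   # 128 blocks * 512 bytes
print(is_aligned(63))    # start sector left behind by a 63-block signature
print(is_aligned(128))   # start sector after setting the offset to 128
```

A start sector of 63 fails the check and a start sector of 128 passes, which is the whole point of setting the offset.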

Execute the following steps to align a LUN (VMFS) for Linux/ESX Server:
1. On the service console, execute “fdisk /dev/sd”, where sd is the device on which you would like to create the VMFS
2. Type “n” to create a new partition
3. Type “p” to create a primary partition
4. Type “1” to create partition #1
5. Select the defaults to use the complete disk
6. Type “x” to get into expert mode
7. Type “b” to specify the starting block for partitions
8. Type “1” to select partition #1
9. Type “128” to align partition #1 on a 64 KB boundary
10. Type “r” to return to main menu
11. Type “t” to change partition type
12. Type “1” to select partition 1
13. Type “fb” to set type to fb (VMFS volume)
14. Type “w” to write label and the partition information to disk
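
The steps above can be written down as a keystroke sequence. The Python sketch below only builds and prints that sequence; it does not run fdisk. The function name is hypothetical, and it is an assumption on my part that step 5 amounts to pressing Enter twice (default first and last cylinder). Whether a given fdisk build prompts in exactly this order should be checked interactively before scripting anything against a real device.

```python
def fdisk_keystrokes(start_sector=128, partition_type="fb"):
    """Return the answers typed in steps 2-14, one per line (hypothetical helper)."""
    keys = [
        "n",                # 2. new partition
        "p",                # 3. primary
        "1",                # 4. partition number 1
        "",                 # 5. default first cylinder (assumed: one Enter)
        "",                 # 5. default last cylinder (assumed: one Enter)
        "x",                # 6. expert mode
        "b",                # 7. move beginning of data in a partition
        "1",                # 8. partition 1
        str(start_sector),  # 9. new starting block
        "r",                # 10. return to main menu
        "t",                # 11. change partition type
        "1",                # 12. partition 1
        partition_type,     # 13. fb = VMFS volume
        "w",                # 14. write the table and exit
    ]
    return "\n".join(keys) + "\n"

print(fdisk_keystrokes())
```

Keeping the starting block as a parameter makes it easy to see that 128 is the only value in the whole sequence that affects alignment.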

Now that the ESX Server has aligned its disk, when the cache on the Clariion starts writing data to the disk, it will start at the first block on the second disk, or block number 128. And because the Clariion formats the disks in 64 KB Chunks, it will write one Chunk of data to a single disk.

If we create a (Virtual) Windows Server on the ESX Server, we must take into account that when Windows is assigned a LUN, it will also want to write a signature to the disk. We know that it is a Virtual Machine, but Windows doesn’t know that; it believes it is a real server. So, when Windows grabs the LUN, it will write its signature to the disk (see the blog titled DISK ALIGNMENT). Again, the problem is that the Windows Signature takes up 63 blocks. Starting at the first block (Block # 128) on the second disk in the RAID Group, the Signature will extend roughly halfway across the second disk’s 128-block Chunk. When Cache begins to write the data out to disk, it will write to the next available block, which is the 64th block on the second disk. In the top illustration, we can see that a 64 KB Data Chunk written out to disk as one operation will now span two disks, a Disk Cross. And from here on out for that LUN, we will see a Disk Cross because no offset was set on the (Virtual) Windows Server.
In the bottom example, the offset was set on the ESX Server and also on the (Virtual) Windows Server, so Cache now writes out to a single disk in 64 KB Data Chunks, therefore limiting the number of Disk Crosses.
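
The difference between the two illustrations can be counted with a short Python sketch. This is a simplified model of my own, not anything the array reports: it looks at one layer only, assumes every write is exactly 64 KB (128 blocks) issued back to back from the start of the partition, and ignores VMFS metadata.

```python
ELEMENT_SECTORS = 128   # 64 KB written to each disk before moving to the next
IO_SECTORS = 128        # one 64 KB write, in 512-byte blocks

def disks_touched(start_sector, length=IO_SECTORS):
    """Number of disks a single write spans in this simplified model."""
    first = start_sector // ELEMENT_SECTORS
    last = (start_sector + length - 1) // ELEMENT_SECTORS
    return last - first + 1

def count_crosses(partition_offset, n_ios):
    """Count the writes that span more than one disk (Disk Crosses)."""
    crosses = 0
    for i in range(n_ios):
        start = partition_offset + i * IO_SECTORS
        if disks_touched(start) > 1:
            crosses += 1
    return crosses

print(count_crosses(63, 1000))    # partition starts right after a 63-block signature
print(count_crosses(128, 1000))   # partition starts at the 128-block offset
```

With an offset of 63, every one of the 1000 writes in this model spans two disks; with an offset of 128, none do.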

Again, from the (Virtual) Windows Server we can set the offset for the LUNs using either Diskpart or Diskpar.

To set the alignment using Diskpart, see the earlier Blog titled Setting the Alignment Offset for 2003 Windows Servers(sp1).

To set the alignment using Diskpar:
C:\ diskpar -s 1
Set partition can only be done on a raw drive.
You can use Disk Manager to delete all existing partitions
Are you sure drive 1 is a raw device without any partition? (Y/N) y
----Drive 1 Geometry Information ----
Cylinders = 1174
TracksPerCylinder = 255
SectorsPerTrack = 63
BytesPerSector = 512
DiskSize = 9656478720 (Bytes) = 9209 (MB)
We are going to set the new disk partition.
All data on this drive will be lost. Continue (Y/N) ? Y
Please specify the starting offset (in sectors) : 128
Please specify the partition length (in MB) (Max = 9209) : 5120
Done setting partition
---- New Partition information ----
StatringOffset = 65536
PartitionLength = 5368709120
HiddenSectors = 128
PartitionNumber = 1
PartitionType = 7
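
As a sanity check, the figures in this output can be reproduced from the geometry values and the two numbers typed at the prompts. The Python sketch below uses only values shown above; the variable names are my own.

```python
# Geometry values reported by diskpar for drive 1
cylinders = 1174
tracks_per_cylinder = 255
sectors_per_track = 63
bytes_per_sector = 512

# Values entered at the prompts
offset_sectors = 128
partition_mb = 5120

disk_bytes = cylinders * tracks_per_cylinder * sectors_per_track * bytes_per_sector
disk_mb = disk_bytes // (1024 * 1024)               # whole megabytes
starting_offset = offset_sectors * bytes_per_sector
partition_length = partition_mb * 1024 * 1024

print(disk_bytes, disk_mb)      # compare with the DiskSize line
print(starting_offset)          # compare with StatringOffset
print(partition_length)         # compare with PartitionLength
```

The starting offset works out to 65536 bytes, which is exactly one 64 KB Chunk, so the partition begins on a Chunk boundary.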

As the bottom illustration above shows, the ESX server has set an offset, and the (Virtual) Windows Machine has written its signature and set the offset so that data starts at the first block on the third disk in the Raid Group.

25 comments:

TheNCHeil said...

SAN Guy,
I read this post and have 2 questions/comments.
My understanding is if you create your ESX VMFS datastore via Virtual Center, it takes care of the alignment offset automatically since at least version 2.0.
Can you clarify your second point about setting the offset for the virtual machine? Are you referring to RDM only? It is my understanding from EMC World sessions last year that if your VM guest is created within a VMFS datastore (not RDM), there is no need to align within the guest OS. Can you clarify this point?
6 of us VMware/Clariion users are debating this point. Thanks,
Mike

san guy said...

This is from a white paper titled "VMware ESX Server 3.0 Recommendations for Aligning VMFS Partitions"

Instructions for Guest File System Alignment:

Once you have aligned your VMware VMFS partitions, you also need to align the file system partitions within your virtual machines. The following sections discuss how to align guest operating system partitions in Linux and Windows environments.

It goes on to discuss how to set the offset using fdisk, and diskpar for a virtual windows machine.

Anonymous said...

Hi San guy! Great articles in your blog :). One question: I have a Clariion CX300 and VMware ESX 3.5.
Now I want to create a new LUN, and I see the "Alignment offset (LBA)" is 0. Is it correct to put 128?
Thanks in advance

san guy said...

giacomo italy...if you are talking about the alignment offset being set to "0" in the LUN properties box in Navisphere, the answer is to leave it set to "0". You should not set the alignment offset via Navisphere. It should be set through the host-based utility, i.e. diskpart for Windows 2003, etc...

Anonymous said...

Thanks :)

Anonymous said...

Yes, I mean in navisphere...thanks for your reply!

TheNCHeil said...

SAN guy,
Thank you for the white paper reference "VMware ESX Server 3.0 Recommendations for Aligning VMFS Partitions". Question for you: this document does not mention RDM drives at all. Can you confirm that if you use an RDM drive you should NOT perform the alignment offset?
We have an open ticket with VMware on this topic and they have yet to produce a statement stating either way. The engineer "thinks" you should not align an RDM but it is not backed up with anything.
Thanks,
Mike

Anonymous said...

RDMs should be aligned according to the OS being installed on them. Windows will need alignment to the first 64k block.

Anonymous said...

I am so bloody confused! Nobody can give me a proper answer to this.
OK, so I'm playing with ESXi on a CX300, and according to the tech paper, if you created the datastore via VI then you're already aligned from the ESX side of life.
You still need to align partitions inside the VM. From the paper:
"Note: Aligning the boot disk in the virtual machine is neither recommended nor required. Align only the data disks in the virtual machine."
Q1: By boot disk, are they referring to "/boot" or the entire VM?
Q2: By data disks, are they referring to the fact that I might assign a new virtual disk from another datastore for fast I/O?
In other words, what do I do with "/", for example? Does that need to be aligned or not?

stucky said...

More questions I'd love to have answers for..

1. Why is it that a "windows signature" or "linux partition" causes an offset, but the signature that LVM2 writes when you do a "pvcreate" on a LUN does not?
Aren't there plenty of other tools that write signatures to disks before using them? (For example, Oracle ASM does that too.) What's so special about this one?

2. Is a windows signature really the exact same size as a linux partition table ?

3. Do Clariions and DMX's really stripe luns the same exact way ? If not why am I told to align partitions to 128 blocks instead of 63 for both arrays ?

4. Do we not have this issue with Netapp luns since they're fake and really files on a WAFL filesystem and WAFL writes stuff all over the place out of our control anyway ?

stucky said...

Wait...you can't align "/" anyways since it'd destroy the data on it - right ?
Can someone please tell me that I don't need to worry about that anymore ?
If I was to assign another virtual disk to my VM for an application I assume I have to align that if I use a straight partition but not if I use LVM2 instead right ? Or does that only apply to an RDM ?
I thought if I create a datastore in VI it takes care of the alignment - is that for the ESX side only ?
Still very confused...

SAN Guy - HELP...

Jonathan said...

This command from the esx server side of things helped find the lun sd#

esxcfg-vmhbadevs
