All of the Advanced Options for HA
We managed to gather all of the HA advanced options. With ESX 3.5 Update 2, VMware added a couple of extra advanced options; this is the complete list:
• das.failuredetectiontime - Number of milliseconds; the timeout before the isolation response action is triggered (default: 15000 milliseconds).
• das.isolationaddress[x] - IP address the ESX host pings to check for network isolation when heartbeats stop, where [x] = 1-10. The default gateway is used by default.
• das.usedefaultisolationaddress - Value can be true or false; set it to false when the default gateway, which is the default isolation address, should not be used for this purpose.
• das.poweroffonisolation - Value can be false or true; this sets the isolation response. By default a VM will be powered off.
• das.vmMemoryMinMB - Higher values reserve more memory capacity for failovers.
• das.vmCpuMinMHz - Higher values reserve more CPU capacity for failovers.
• das.defaultfailoverhost - Value is a hostname; this host will be the primary failover host.
The new ones:
• das.failuredetectioninterval - Changes the heartbeat interval among HA hosts. By default, this occurs every second (1000 milliseconds).
• das.allowVmotionNetworks - Allows a NIC that is used for VMotion networks to be considered for VMware HA usage. This permits a host to have only one NIC configured for management and VMotion combined.
• das.allowNetwork[x] - Enables the use of port group names to control the networks used for VMware HA, where [x] = 0 - ?. You can set the value to be Service Console 2 or Management Network to use (only) the networks associated with those port group names in the networking configuration.
• das.isolationShutdownTimeout - Shutdown timeout for the isolation response “Shutdown VM”; the default is 300 seconds. In other words, if a VM hasn’t shut down cleanly once the isolation response occurs, it is powered off after 300 seconds.
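These options are entered as name/value pairs under Cluster Settings > VMware HA > Advanced Options in the VI Client. A minimal sketch (the values below are illustrative, not recommendations):
das.failuredetectiontime = 60000
das.isolationaddress1 = 192.168.1.1
das.usedefaultisolationaddress = false
Changed options may require HA to be reconfigured on the cluster (disable and re-enable it) before they take effect.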
VMFS Deep Dive
Agenda
VMFS Deep Dive
- ESX Storage Stack and VMFS
- VMFS vs. RDM
- SCSI reservation conflicts
- Multipathing
- Snapshot LUNs and resignaturing
The Storage Stack in VI3
VMFS – A Clustered filesystem for today’s dynamic IT world
Ø Built-In VMFS Cluster File System
Ø Simplifies VM provisioning
Ø Enables independent VMotion and HA restart of VMs in a common LUN
Ø File-level locking protects virtual disks
Ø Separates VM and storage administration
Ø Use RDMs for access to SAN features
Raw Disk Mapping (RDM)
Mapping files in a VMFS volume
Ø Presented as virtual SCSI device
Ø Key contents of the metadata include the location and locking of the mapped device
Ø Used when a virtual machine must interact with a real disk on the SAN
Ø Used by Microsoft Cluster Service (MSCS) configurations
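As a sketch of how an RDM is created from the service console (device and datastore paths below are hypothetical): vmkfstools writes the mapping file into a VMFS volume, with -r for virtual compatibility mode and -z for physical compatibility mode (the mode MSCS shared disks need):
# virtual compatibility mode mapping file (paths are examples)
vmkfstools -r /vmfs/devices/disks/vmhba1:0:6:0 /vmfs/volumes/datastore1/vm1/vm1-rdm.vmdk
# physical compatibility mode, e.g. for MSCS shared disks
vmkfstools -z /vmfs/devices/disks/vmhba1:0:6:0 /vmfs/volumes/datastore1/vm1/vm1-rdmp.vmdk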
Storage – VMFS vs. RDM
RAW (RDM)
- May give better performance
- Means more LUNs, hence more provisioning time
- Advanced SAN features still work
VMFS
- Leverage templates and quick provisioning
- Fewer LUNs means you don’t have to watch heap usage
- Scales better with Consolidated Backup
- Preferred method
Skeleton of a VMFS
A VMFS holds files and has its own metadata
Metadata gets updated through
- Creating a file
- Changing a file’s attributes
- Powering on a VM
- Powering off a VM
- Growing a file
- When metadata is updated, the VMkernel places a non-persistent SCSI reservation on the entire VMFS volume
- Lock held on volume for the duration of the operation
- Other VMkernels are prevented from doing metadata updates
VMFS 3 & SCSI Reservations
- Concurrent-access filesystem
- Most I/O happens simultaneously from all hosts
- Filesystem metadata updates are atomic and performed by the requesting host
- Locking a file for read/write (e.g. vmdk when powering on VM)
- Creating a new directory or file
- Growing a file etc.
For the time needed by the locking operation (NOT the metadata update itself), the LUN is reserved (= locked for access) by a single host
SCSI Reservation Conflict – What it is
What happens if we try to perform I/O to a LUN that’s already reserved?
- A retry counter is decreased and the I/O operation is retried
- The retry is scheduled with a pseudo-random algorithm
- If the counter reaches 0, we have a SCSI reservation conflict
SCSI: 6630: Partition table read from device vmhba1:0:6 failed: SCSI reservation conflict (0xbad0022)
SCSI: vm 1033: 5531: Sync CR at 64
SCSI: vm 1033: 5531: Sync CR at 48
SCSI: vm 1033: 5531: Sync CR at 32
SCSI: vm 1033: 5531: Sync CR at 16
SCSI: vm 1033: 5531: Sync CR at 0
WARNING: SCSI: 5541: Failing I/O due to too many reservation conflicts
WARNING: SCSI: 5637: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:6. residual R 919, CR 0, ER 3
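A quick way to gauge how often this is happening is to search the VMkernel log (the same log quoted above; see also the HP note later in this document):
# count reservation conflict messages logged so far
grep -c "reservation conflict" /var/log/vmkernel
# watch for new conflicts live while reproducing the issue
tail -f /var/log/vmkernel | grep -i reservation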
Who’s holding a SCSI Reservation?
One ESX host (persistent reservation)
- vmkfstools -L reserve : This should NEVER EVER be done
- Interaction with installed third-party management agents
Multiple ESX hosts, in alternation
- High latency / slow SAN
o Critical lock-passing between ESX hosts during VMotion
- SAN firmware slow in honoring SCSI reserve/release
o Synchronously mirrored LUNs
One non-ESX host
- LUN erroneously mapped to e.g. a Windows host
No host
- Persistent reservation held by the SAN
- Needs investigation by the SAN vendor
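When a reservation is confirmed stuck, the lock options of vmkfstools can clear it from the service console. These are disruptive, last-resort commands (device path below is an example):
# release a reservation held by this host
vmkfstools -L release /vmfs/devices/disks/vmhba1:0:6:0
# reset the LUN to clear a reservation held by another initiator
vmkfstools -L lunreset /vmfs/devices/disks/vmhba1:0:6:0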
ESX Server Multipathing
Multipathing – vmhbaN:T:L:P notation
Determined at boot, install, or rescan:
- N = adapter number
- T = target number (generally 1 SP = 1 target)
Determined by the SAN
- L = LUN ID
- SCSI identifier of the LUN (not shown here)
Determined at datastore or extent creation
- P = partition number (if 0 or absent = whole disk)
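The full path list, with the vmhbaN:T:L:P names and each path's state, can be dumped from the service console:
# list all LUNs and their paths (ESX 3.x)
esxcfg-mpath -l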
Per-LUN Multipathing Failover Policy
VMware supports using only one path at a time
- MRU = Most Recently Used
- Fixed = choose a preferred path & failback to it
- With multiple ESX hosts or multiple LUNs, Fixed allows for manual load balancing between SPs
Never set up a Fixed policy with an active/passive SAN! Why?
Path Thrashing
Ø Only possible on active/passive SANs
Ø Host 1 needs access to the LUN through SP1
Ø Host 2 needs access to the LUN through SP2
Ø The LUN keeps being trespassed between SPs and it’s never available for I/O
Multipathing
Active/Active
- LUNs presented on multiple Storage Processors
- Fixed path policy
Failover on NO_CONNECT
Preferred path policy
Failback to preferred path if it recovers
Active/Passive
- LUNs presented on a single Storage Processor
- MRU (Most Recently Used) path policy
Failover on NOT_READY, ILLEGAL_REQUEST or NO_CONNECT
No preferred path policy, no failback to preferred path
Load Balancing
- Fixed (Preferred Path)
1st active path discovered or user configured.
Active/Active arrays only
- Most recently used (MRU)
Active/Active arrays
Active/Passive arrays
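The path policy is normally changed per LUN in the VI Client (LUN properties > Manage Paths). It can also be set from the service console with esxcfg-mpath; the flag syntax below is an assumption from memory of ESX 3.x and should be verified against the command's help output:
# set MRU on a LUN behind an active/passive array (flag names are an assumption)
esxcfg-mpath --policy=mru --lun=vmhba1:0:6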
Snapshot LUNs and Resignaturing
How VMware ESX Identifies Disks
Ø Each LUN has a SCSI identifier string provided by the SAN vendor
Ø The SCSI ID stays the same amongst different paths
Ø The vmkernel identifies disks with a combination of LUN ID, SCSI ID and part of the model string
# ls -l /vmfs/devices/disks/
total 179129968
-rwxrwxrwx 1 root root 72833679360 Nov 13 12:16 vmhba0:0:0:0
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:0:0 -> vml.020000000060060160432017002a547c3e7893dc11524149442035
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:1:0 -> vml.02000100006006016043201700a99d1c3bb9c5dc11524149442035
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:10:0 -> vml.02000a000060060160432017000db2f61d17d3dc11524149442035
(...)
Snapshot LUNs & Resignaturing – Key Facts
Ø ESX identifies objects in a VMFS datastore by path e.g. /vmfs/volumes/
Ø The VMFS UUID (aka signature) is generated at VMFS creation
Ø The VMFS header includes hashed information about the disk where it’s been created
The Check for Snapshot LUNs
- VMFS relies on SCSI reservations to acquire on-disk locks, which in turn enforce atomicity of filesystem metadata updates
- SCSI reservations don’t work across mirrored LUNs
- To avoid corruption, we need to prevent mounting a datastore and a copy of it at the same time
Ø On rescan, the information about the disk in the VMFS header metadata (m/d) is checked against the actual values
Ø If any field doesn’t match, the VMFS is not mounted and ESX complains that it’s a snapshot LUN
LVM: 5739: Device vmhba1:0:1:1 is a snapshot:
LVM: 5745: disk ID:
LVM: 5747: m/d disk ID:
ALERT: LVM: 4903: vmhba1:0:1:1 may be snapshot: disabling access. See resignaturing section in SAN config guide.
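Both the log message and the current LVM settings can be checked from the service console:
# find LUNs the VMkernel flagged as snapshots
grep "is a snapshot" /var/log/vmkernel
# show the current snapshot-handling settings (0/1)
esxcfg-advcfg -g /LVM/DisallowSnapshotLun
esxcfg-advcfg -g /LVM/EnableResignature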
LUNs Detected as Snapshots – Causes
Ø LUN ID mismatch
Ø SCSI ID change (e.g. LUN copied to a new SAN)
Ø They are effectively snapshots (e.g. DR site)
LUNs Detected as Snapshots – How to Fix
Are they mirrored/snapshot LUNs?
- If yes: will the ESX host(s) ever see both original and copy at the same time?
Yes – resignature
No – either allow snapshots or resignature
- If no: do multiple ESX hosts see the same LUN with different IDs?
Yes – fix the SAN config; if not possible allow snapshots
No – IDs permanently changed: either allow snapshots or resignature (see the sketch below)
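The two outcomes map to the LVM advanced settings, set either in the VI Client (Configuration > Advanced Settings > LVM) or from the service console. A minimal sketch; pick one of the two options, and consider reverting EnableResignature to 0 once the volumes have been resignatured:
# option A: allow the snapshot LUN to be mounted with its existing signature
esxcfg-advcfg -s 0 /LVM/DisallowSnapshotLun
# option B: write a new signature on the next rescan
esxcfg-advcfg -s 1 /LVM/EnableResignature
# rescan so the setting is acted upon (adapter name is an example)
esxcfg-rescan vmhba1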
Resignaturing Issues
Never ever resignature while the VMs are running
- resignaturing implies changing UUID and datastore name
- All paths to filesystem objects (vmdks, VMs) will become invalid!
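Because the resignatured volume comes back under a new UUID and name (typically snap-XXXXXXXX-<oldname>), the VMs on it have to be re-registered. A hypothetical sketch, with illustrative paths:
# the volume reappears under its new name
ls /vmfs/volumes/
# re-register a VM from the new path
vmware-cmd -s register /vmfs/volumes/snap-00000001-datastore1/vm1/vm1.vmx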
How to Change the Polling Interval of the cmafcad Fiber Channel Agent
Information
***********
The HP Management Agents for VMware ESX Server 3.x include a Fibre Channel Agent (FCA agent) called cmafcad. If SCSI reservation conflicts on the ESX host are resulting in failed I/O or performance issues, it can be necessary to increase the polling interval of the Fibre Channel Agent. This can reduce the number of SCSI reservation conflicts that are typical during peak business hours, VM startup, and VMotion operations.
Details
******
The following can be seen in the /var/log/vmkernel file:
WARNING: SCSI: 5446: Failing I/O due to too many reservation conflicts
WARNING: SCSI: 5541: status 0xbad0022, rstatus 0xc0de01 for vmhba1:0:0. residual R 919, CR 0, ER 3
WARNING: Fil3: 1538: Failed to reserve volume
NOTE: 0xbad0022 translates to VMK_RESERVATION_CONFLICT per vmkerrcode.
Although reservation conflicts are not always an indication of a problem, large numbers of reservation conflicts resulting in failed I/O are, and should be addressed. Many things can contribute to reservation conflicts in a Virtual Infrastructure environment. Be advised that the following is only one possible solution to this issue; other possible causes should be investigated as well.
Increasing the polling interval of the FCA agent can reduce SCSI reservation conflicts on the host by decreasing the number of reservations issued against a given LUN.
The following steps show the procedure for increasing the polling interval:
1. Log in to the ESX server from an SSH client or from iLO.
2. Using an editor such as nano or vi, open the file /opt/compaq/storage/etc/cmafcad.
3. Change the polling interval from 15 seconds to a larger span, such as 60 seconds.
Look for the variable PFLAGS. By default, it looks like this: PFLAGS="-p 15 -s OK"
Change it to the desired value: PFLAGS="-p 60 -s OK"
4. Save the file, and exit from the editor.
5. Restart the management agents on the host. The following shows how this is done with the 8.0.0 Management Agents. See the appropriate documentation or man pages for later agents.
# service hpasm stop
# service hpsmhd restart
# service hpasm start
After the restart, a ps listing should show the new setting:
# ps -auxwww | grep cmafcad
root 14557 0.0 0.9 14676 2452 pts/1 S 18:31 0:00 cmafcad -p 60 -s OK
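For scripted rollouts across many hosts, steps 2 through 4 can be collapsed into a one-liner. A sketch that assumes the default PFLAGS line shown above; back the file up first:
# back up, then raise the cmafcad polling interval from 15 to 60 seconds
cp -p /opt/compaq/storage/etc/cmafcad /opt/compaq/storage/etc/cmafcad.bak
sed -i 's/-p 15/-p 60/' /opt/compaq/storage/etc/cmafcad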