Some of the VMs hosted on VMware ESXi 5.5 (DELL PowerEdge T310 host) occasionally become unresponsive.
All attempts to log in to the host via VMware vSphere Client fail with the following error:
Could Not Connect
vSphere Client could not connect to "10.5.5.25". An unknown connection error occurred. (The request failed because the remote server took too long to respond. (The operation has timed out))
The VMware console vmkernel system logs (ALT + F12) display the following errors:
mptscsih: ioc0: task abort: FAILED (sc=0x412e8182ecc0)
WARNING: LinScsi: SCSILinuxAbortCommands: 1843: Failed, Driver MPT SAS Host, for vmhba2
WARNING: ScsiPath: 6292: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status
mptscsih: ioc0: attempting task abort! (sc=0x412e81718200)
vmhba2 refers to the RAID-1 SATA array that hosted all of the affected virtual machines.
Rebooting the VMware host server fixed the problem temporarily, but the same issue would recur a few days or a few weeks later.
DELL RAID controller and SATA hard drives didn't report any issues. Upgrading DELL motherboard BIOS and RAID controller firmware didn't help. Installing the latest VMware ESXi patch (U3b) also didn't make any difference.
After some research I found that similar issues can be caused by the Interrupt Remapping feature used by VMware ESXi. In theory this feature should improve performance, but it can be incompatible with certain server hardware configurations. To fix the issue I had to disable Interrupt Remapping:
- Download and install VMware vSphere PowerCLI
- Connect to the server: Connect-VIServer -Server 10.5.5.25 -User root -Password ********
- Expose the ESXCLI functionality: $myesxcli = Get-EsxCli -VMHost 10.5.5.25
- Check the existing value of iovDisableIR: $myesxcli.system.settings.kernel.list() | More
The existing value should be False
- Now disable Interrupt Remapping: $myesxcli.system.settings.kernel.set("iovDisableIR","TRUE")
- Check that the iovDisableIR value is now set to True: $myesxcli.system.settings.kernel.list() | More
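The steps above can be collected into one PowerCLI session. This is a minimal sketch, not a tested procedure: it assumes PowerCLI is already installed, it prompts for credentials with Get-Credential rather than typing the password on the command line, and it assumes that each object returned by kernel.list() has a Name property to filter on. The variable $esxHost is introduced here for illustration only.

```powershell
# Sketch only: assumes VMware vSphere PowerCLI is installed and the host is reachable.
$esxHost = "10.5.5.25"   # host address used in the steps above

# Prompt for the root credentials instead of passing the password in clear text.
Connect-VIServer -Server $esxHost -Credential (Get-Credential)

# Expose the ESXCLI functionality for this host.
$myesxcli = Get-EsxCli -VMHost $esxHost

# Show only the iovDisableIR setting instead of paging through the full list.
# Assumption: the returned objects expose a Name property.
$myesxcli.system.settings.kernel.list() | Where-Object { $_.Name -eq "iovDisableIR" }

# Disable Interrupt Remapping.
$myesxcli.system.settings.kernel.set("iovDisableIR", "TRUE")

# Confirm that the setting now shows True.
$myesxcli.system.settings.kernel.list() | Where-Object { $_.Name -eq "iovDisableIR" }

# Close the session without a confirmation prompt.
Disconnect-VIServer -Server $esxHost -Confirm:$false
```

Kernel settings like this one are normally applied at boot, so the host probably needs a restart before the change takes effect. The original steps do not mention a reboot, so treat this as an assumption and check the setting again after restarting.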