Hyper-V R2 Core host networking problem in VMM 2008 R2

This Friday i helped a customer with a little problem, they have a Hyper-V cluster with 4 nodes, after a switch firmware upgrade they experienced some networking instability and in their search of failure they accidentally unchecked the Host access checkbox in the networking properties on the net, as their networking configuration did not allow a separate nic for the management we had to set it up with this enabled. Of course when they applied this change the host lost connection.

I instructed them to use the 59manager to configure the host locally and set the host access on this virtual switch, this did not help cause the server would still not respond. I went over to help them on site, the first thing we did was to remove the host from the cluster and then start to test and when we removed the virtual switch the nic started to respond to ping but as soon as we re-added it to the virtual switch it stopped working. We also tried to remove the nic config and set it to dhcp and then add it to the virtual switch with the host access enabled, which did not work either. After this when we tried to set the IP in SCONFIG we got an error that stated that there was an error and the address could not be set. I thought it might be some bug in the team networking sw or drivers so we updated those as well, but no more luck there either..

Then we found the following site that described the exact same problem, Microsoft Enterprise Networking Team , in this blog they referred to a script that clean out the whole hosts virtual networking config, “nvspcrub.js” , with the /p option. Well we had to try something so we ran the script and cleared the hosts all virtual switches.  Then we added a new virtual switch and checked the Host access and tried to set the IP in SCONFIG, Still same error with “Can not set IP address”, After this we where almost on our way to give up and reinstall the host, then we thought of one last chance to set the IP through netsh (this after reading about a bug with SCONFIG) so with the command

netsh int ip set add "Local area connection 3" static 192.168.8.139 255.255.255.0 192.168.8.254

It actually worked and the host started to respond to ping 🙂 , quite frustrating that we removed all the config and then find out it was a bug in sconfig that was the causing the error.

Now we had to re-add all virtual networking switches, this was of course a perfect job for powershell, so i wrote a script that took one of the other hosts config and created the same virtual switches on the failed host, also connecting to the NIC corresponding to the right net and vlan

# Create Virtual Networks on Host
#
# Niklas Akerlund /RTS

# Take ref nic from another host

Add-PSSnapin Microsoft.SystemCenter.VirtualMachineManager
$VMMserver = Get-VMMServer sbgvmm01

$Networks = Get-VirtualNetwork | where {$_.VMHost -eq "HYP04.desso.se" -and $_.HostBoundVlanId -ne "3750"}

$NICs = Get-VMHostNetworkAdapter | where {$_.VMHost -eq "HYP01.desso.se"}

$VMHost = Get-VMhost -ComputerName "HYP01.desso.se"

foreach($Network in $Networks){
$split = $Network.Name -split ' '
if ($split[1] -eq "1"){
$Name = $split[0] + " " + $split[1]
$match = $split[2] -match "\d+"
$vlanid = $Matches[0]
$vlanid = [int]$vlanid
}elseif ($split[1] -like "VLAN*"){
$Name = $split[0]
$match = $split[1] -match "\d+"
$vlanid = $Matches[0]
$vlanid = [int]$vlanid
}else{
$Name = $split[0] + " " + $split[1]
$match = $split[2] -match "\d+"
$vlanid = $Matches[0]
$vlanid = [int]$vlanid
}

$HostNIC = Get-VMHostNetworkAdapter -VMHost $VMHost | where {$_.ConnectionName -eq $Name}

if ($HostNIC -ne $null){

New-VirtualNetwork -Name $Network.Name -VMHost $VMHost -VMHostNetworkAdapters $HostNIC -BoundToVMHost $FALSE
Set-VMHostNetworkAdapter -VMHostNetworkAdapter $HostNIC -VLANEnabled $TRUE -VLANMode "Trunk" -VLANTrunkID $vlanid
write-host $HostNIC.ConnectionName
Write-Host $vlanid
}
}

After i ran this and the host got all it´s virtual networks back i could add the host back to the cluster again. Instead of some typo errors with manually entering all the virtual switches, with some powershell we could be sure that we got the same config as the other host already in the cluster!

One thing that i first missed was the -BoundToVMHost $FALSE in the New-VirtualNetwork which resulted that all my virtual networks had the Host Access checkbox marked and i had one NIC for each of them on my host, this of course was not what i wanted, one could think that this would be false as default but for some reason MS and the VMM team thought different, well no worries i created a small script to just update my virtual networks with that option (the script above is corrected after my mistake), so i ran:

# Update networks with BoundtoHost $false
#
# Niklas Akerlund /RTS

$Networks = Get-VirtualNetwork | where {$_.VMHost -eq "HYP01.desso.se" -and $_.HostBoundVlanId -ne "3750"}

foreach ($Network in $Networks){
Set-VirtualNetwork -VirtualNetwork $Network -BoundToVMHost $false
}

Where the network with the VLAN 3750 was the one i wanted the host access to stay because it was the management nic of the host.

Hot-add CPU and Memory in a Win 2008 Datacenter with SQL in vSphere

I have today tested how it works to hot-add both memory and vCPU to a virtual machine running Windows 2008 R2 Datacenter Edition, this machine also has SQL 2008 R2 Enterprise edition installed.

First i had to enable hot-plug in the virtual machine, there is of course two ways to do this, either via the GUI or

the powerCLI way:

$vmConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
$mem = New-Object VMware.Vim.optionvalue
$mem.Key="mem.hotadd"
$mem.Value="true"
$vmConfigSpec.extraconfig += $mem
$cpu = New-Object VMware.Vim.optionvalue
$cpu.Key="vcpu.hotadd"
$cpu.Value="true"
$vmConfigSpec.extraconfig += $cpu

$vm = Get-VM TempNHot | Get-View
$vm.ReconfigVM_Task($vmConfigSpec)

when this has been executed on the particular VM i can then when it is running use powerCLI again and add resources to it.

Get-VM TempNHot | Set-VM -MemoryMB 3072 -NumCpu 4 -Confirm:$false

According to Microsoft documentation it is only supported to do hot-add of memory and CPU in Datacenter edition of Windows, Also regarding SQL you have to use either Enterprise or Datacenter edition to also get it working into the application.

To check that the new resources where used i tested with the SQLstress application to get some load on the SQL server and check the taskmgr, but it did not show that the load where spreading to the new added vCPU, after some research i found out that the SQL server do not start to use new hardware right away, it need to be reconfigured to schedule load on additional CPU´s, so there is a little manual intervention but no downtime on the server!

before:

So i started a query in SQL Management Studio and wrote

RECONFIGURE;
GO

After :

In the taskmanager i could now see that all four vCPU where equally loaded by the SQL server.

Maybe it is not so common to have to do this but if you set up a large Tier-1 SQL server in a virtual world you surely want to be able to hot-add resources when it is loaded. Think of the advantages that this brings when you actually both can add memory and cpu resources without any downtime!

In a virtualization world we always recommend our customers to buy Windows Datacenter licenses on their hosts so using it on a VM will not add any extra cost. The SQL server is of course quite a price jump from standard to enterprise but if your big SQL server uses more than 64 GB ram you will still need to use Enterprise licenses 🙂

 

TEC 2011 Successfully implement and Transition into Hyper-V Session

Some summary based on my own session i held at The Experts Conference 2011 in Frankfurt yesterday. I think it was about 40 people in the crowd. The TEC is about 350 attendees total.

When i have checked with Quest if i can put my whole presentation here i will do an updated posting, i will put some points what i think is crucial when setting up a new vitualization platform.

  • Assessment and Consolidation Planning
  • Design and Testing
  • Migration and Optimization
  • Capacity Planning and Performance follow up

When deciding for a new virtualization platform, no matter if it is the first or you are going to exchange an existing, there are some steps that need to be considered. First you have to know what you are running in your datacenter, what kind of operating systems and what kind of applications, also you must get a workload profile for those servers to know what their demands are. If you do not do your homework and plan for the load you will surely get some beating from your organization when you have virtualized the servers and they run like crap. As tools you can use the Microsoft Assessment and Planning Toolkit or if you already use SCCM/SCOM you will both have inventory and performance data. Another thing to consider when planning is licensing, in a big consolidation you can save quite a much money when using Datacenter licensing on the hosts.

Design your platform to be modular and easily expanded. Do serious deciding on what your boundaries are both technically and financially this should be done in a workshop with application owners and management and then documented. Do not forget about managing and monitoring software. Another thing to rethink is how to take backup in your platform, with Hyper-V integration tools you can take consistent backups with VSS snapshot support into the VMs, we recommend our customers to take backup on host level for quicker RTO. When you have decided for a platform you can do a PoC to test your decisions and see that it works as expected, Many hardware manufacturers do lend out hw for you to test for a limited time to evaluate.

When the platform is set up and correctly configured you want to do some hardware and load testing, there are several tools for this. Memtest, IOmeter, SQLiosim, Exchange jetstress, Exchange Load Gen and others. The most important thing to consider here is that you want to check that your new platform can handle the load you did measure and predict for in the analysis and design. Also test fail-over functionality so that all hardware and software works as expected when a PSU or a network cable brakes. After all testing has successfully been made you want to document this for later so you have a validation document signed.

After you have a platform set up and tested you want to start migrate and optimize workloads into this. There are some tools that can be used for this, SCVMM, disk2vhd and Quest vConverter for example. One thing to consider when doing migration is to look back at the analysis of the workloads to set the right amount of resources, both virtual processors/vm ram and vhd disk files with the partitions inside (for best performance we would like to use fixed or pass-through disks). When optimizing after migration you want to clean out hidden devices and services/software that was used for the machine in the physical world but do not have a purpose anymore!

When all your machines are migrated we want to continuously check the performance and capacity so you can prepare and implement additional host resources before it runs out. You can use SCOM/SCVMM if you have it in your environment, another great performance tool is the PAL (Performance Analysis of Logs ) that you can use in conjuction with performance counters and logman to schedule datasets on your Hyper-V Core host servers, also there is a product from Quest, vFoglight.

the last slide i had a strip from Dilbert that i find quite funny, statement though: WE DO NOT LEAVE OUR CUSTOMERS AS DOGBERT DOES AFTER A virtualization Project 😛

links

MAP 6.0

Memtest

PAL

Performance tuning win 2008 R2 SP1

 

Configure VM settings and vmdk´s with powerCLI

I want to share my latest automation scripting, i am in a project where we are in-sourcing from a hosting company. We have connected the hosts to the outsourcers NFS share, of course with powerCLI, when doing it this way i get the datastores on all servers in our cluster, without the risk of differences between the hosts datastores.

# Create NFS shares on all hosts
#
# Niklas Åkerlund /RTS
$NFSdatas = Import-Csv -Path "nfsdatastores.csv" -Delimiter ";"
$VIHosts = get-cluster -Name Cluster1 | get-vmhost | where {$_.ConnectionState -eq "Connected"}
foreach ($VIHost in $VIHosts){
foreach ($NFSdata in $NFSdatas){
$NFSHost = $NFSdata.Host
$NFSshare = $NFSdata.Share
$NFSShareName = $NFSdata.ShareName
if (($VIHost | Get-Datastore | where {$_.Name -eq $NFSShareName -and $_.type -eq "NFS"}-ErrorAction SilentlyContinue) -eq $null){
Write-Host "Monterar NFSstore $($NFSShareName) på $($VIHost)"
New-Datastore -Nfs -VMHost $VIHost -Name $NFSShareName -Path $NFSshare -NfsHost $NFSHost
}
}
}

Now when we have this in place, during the transitions the hosting company shut down the VMs on their hosts that we are going to take over.  And we add the VM to the inventory on our vCenter, when doing this the vmdk got a different datastore id in the config, also some settings should be updated to the corporate standard for the virtualization platform at the customer.

# Script to update VM with vmdk and right settings
#
# Argument in is VM name
# Niklas Akerlund / RTS AB 2011

$VMname = $args[0]
if ($VMname -ne $null){
$VM = Get-VM $VMname
$Datastore = Get-Datastore -VM $VM
$HDDs = Get-Harddisk -VM $VM

# Remove incorrect hdd referenes

Remove-HardDisk -HardDisk $HDDs -Confirm:$false

foreach ($HDD in $HDDs){
$HDDname = $HDD.Filename
$HDDsNames = $HDDname.Split("/")
$count = $HDDsNames.count
$VMdkName = $HDDsNames[$count-1]
#Write-Host $VMdkName
$diskpath = "[" + $Datastore.Name + "] " + $VM.Name + "/" + $VMdkName

#Write-Host $diskpath

New-HardDisk -VM $VM -DiskPath $diskpath
}

# Reconfigure VM Settings

$spec = new-object VMware.Vim.VirtualMachineConfigSpec
$spec.MemoryAllocation = New-Object VMware.Vim.ResourceAllocationInfo
$spec.MemoryAllocation.Limit = -1
$spec.CpuAllocation = New-Object VMware.Vim.ResourceAllocationInfo
$spec.CpuAllocation.Limit = -1
$spec.tools = New-Object VMware.Vim.ToolsConfigInfo
$spec.tools.toolsUpgradePolicy = "manual"
$spec.swapPlacement = "inherit"

$VM = $VM | get-view
$VM.ReconfigVM_Task($spec)
}

After this we start up the VM and later we do a storage vMotion of the VM to the customers FC-SAN

Get-VM theVM | Move-VM -Datastore fcdatastore1

Host Profiles and vmkernel ports with Jumbo Frames MTU 9000

Today i have found a limitation using host profiles and this together with a vmkernel port that has Mtu 9000 activated. maybe it has not been a requirement when designing the host profiles?

We set up the reference host with 4 vmkernel ports, one for management, one for vmotion, one for FT and one for NFS. The port that we wanted to use Jumbo frames for was the vmotion port.

As i wrote in an earlier post, I used the powerCLI to configure the Mtu for the actual vmkernel port Get-VmhostNetworkAdapter -Name vmk1 | Set-VmhostNetworkAdapter -Mtu 9000.

Then i add this host as an reference host in the Host Profiles and attach it to the cluster. Adding a new host and then Apply profile, creates all our vmkernel ports correctly but when checking what Mtu the vmotion vmkernel port got, it is created with the default Mtu of 1500. This is not so good because i do not want to use several different ways to configure and i want to be able to trust the Host Profiles solution. The only vmkernel port that was created before applying host profiles was the management port so it has nothing to do with editing exisiting. So the result is that i need to after applying a host profile, run a powerCLI command to edit the Mtu.

Strangely no matter if the Mtu is 9000 or 1500 the hosts are compliant in the GUI..

This applies to vSphere 4.1 u1 (i do not know how this behaves in vSphere 5)

Conclusion of this is that I have to think a bit more about using the Host profiles. If it is not fully implemented then it is not usable to get uniform hosts.