Hyper-V R2 Core host networking problem in VMM 2008 R2

This Friday I helped a customer with a small problem. They have a Hyper-V cluster with 4 nodes, and after a switch firmware upgrade they experienced some networking instability. While troubleshooting, they accidentally unchecked the Host access checkbox in the networking properties of the virtual network. Their networking configuration did not allow a separate NIC for management, so the host had been set up with this option enabled. Of course, when they applied the change, the host lost its connection.

I instructed them to use 59manager to configure the host locally and enable host access on the virtual switch. This did not help, because the server still would not respond. I went on site to help them. The first thing we did was remove the host from the cluster and start testing. When we removed the virtual switch, the NIC started to respond to ping, but as soon as we re-added it to the virtual switch it stopped working. We also tried removing the NIC configuration, setting it to DHCP and then adding it to the virtual switch with host access enabled, which did not work either. After this, when we tried to set the IP in SCONFIG, we got an error stating that the address could not be set. I thought it might be a bug in the NIC teaming software or drivers, so we updated those as well, but had no luck there either.

Then we found a post on the Microsoft Enterprise Networking Team blog that described the exact same problem. The post referred to a script, “nvspscrub.js”, that cleans out the host's whole virtual networking config when run with the /p option. Well, we had to try something, so we ran the script and cleared all the virtual switches on the host. Then we added a new virtual switch, checked Host access and tried to set the IP in SCONFIG. Still the same error, “Can not set IP address”. After this we were almost ready to give up and reinstall the host. Then we thought of one last chance: setting the IP through netsh (this after reading about a bug in SCONFIG), with the command

netsh int ip set add "Local area connection 3" static 192.168.8.139 255.255.255.0 192.168.8.254

It actually worked and the host started to respond to ping 🙂. Quite frustrating that we had removed all the config only to find out that a bug in SCONFIG was causing the error.

Now we had to re-add all the virtual network switches. This was of course a perfect job for PowerShell, so I wrote a script that took the config from one of the other hosts and created the same virtual switches on the failed host, connecting each one to the NIC corresponding to the right network and VLAN.

# Create Virtual Networks on Host
#
# Niklas Akerlund /RTS

# Take ref nic from another host

Add-PSSnapin Microsoft.SystemCenter.VirtualMachineManager
$VMMserver = Get-VMMServer sbgvmm01

$Networks = Get-VirtualNetwork | where {$_.VMHost -eq "HYP04.desso.se" -and $_.HostBoundVlanId -ne "3750"}

$NICs = Get-VMHostNetworkAdapter | where {$_.VMHost -eq "HYP01.desso.se"}

$VMHost = Get-VMhost -ComputerName "HYP01.desso.se"

foreach($Network in $Networks){
    # Derive the NIC connection name and the VLAN id from the network name
    $split = $Network.Name -split ' '
    if ($split[1] -eq "1"){
        $Name = $split[0] + " " + $split[1]
        $match = $split[2] -match "\d+"
        $vlanid = [int]$Matches[0]
    }elseif ($split[1] -like "VLAN*"){
        $Name = $split[0]
        $match = $split[1] -match "\d+"
        $vlanid = [int]$Matches[0]
    }else{
        $Name = $split[0] + " " + $split[1]
        $match = $split[2] -match "\d+"
        $vlanid = [int]$Matches[0]
    }

    # Find the NIC with the same connection name on the failed host
    $HostNIC = Get-VMHostNetworkAdapter -VMHost $VMHost | where {$_.ConnectionName -eq $Name}

    if ($null -ne $HostNIC){
        # Create the virtual network without host access and set the VLAN trunk on the NIC
        New-VirtualNetwork -Name $Network.Name -VMHost $VMHost -VMHostNetworkAdapters $HostNIC -BoundToVMHost $FALSE
        Set-VMHostNetworkAdapter -VMHostNetworkAdapter $HostNIC -VLANEnabled $TRUE -VLANMode "Trunk" -VLANTrunkID $vlanid
        Write-Host $HostNIC.ConnectionName
        Write-Host $vlanid
    }
}

After I ran this and the host got all its virtual networks back, I could add the host to the cluster again. Instead of risking typos by entering all the virtual switches manually, with some PowerShell we could be sure that we got the same config as the other host already in the cluster!
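
The name parsing in the script above can be tried out without a VMM server. The following is only a sketch: the function name Split-NetworkName and the sample network names are my own assumptions for illustration, and unlike the original script it returns $null when no VLAN number can be found instead of reusing an old match.

```powershell
# Hypothetical helper that mirrors the name parsing in the script above.
# Function name and sample names are assumptions, not part of VMM.
function Split-NetworkName {
    param([Parameter(Mandatory = $true)][string]$NetworkName)

    $parts = $NetworkName -split ' '
    if ($parts.Count -ge 2 -and $parts[1] -like 'VLAN*') {
        # "<nic> VLAN<id>": the NIC name is the first word
        $nicName  = $parts[0]
        $vlanPart = $parts[1]
    } elseif ($parts.Count -ge 3) {
        # "<nic> <number> VLAN<id>": the NIC name is the first two words
        $nicName  = $parts[0] + ' ' + $parts[1]
        $vlanPart = $parts[2]
    } else {
        return $null
    }

    if ($vlanPart -match '\d+') {
        return New-Object PSObject -Property @{
            ConnectionName = $nicName
            VlanId         = [int]$Matches[0]
        }
    }
    return $null
}

Split-NetworkName 'Team 1 VLAN3750'
Split-NetworkName 'Prod VLAN120'
```

Running the parsing on its own like this makes it easy to check a list of network names before any virtual switch is created.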

One thing I missed at first was the -BoundToVMHost $FALSE in New-VirtualNetwork. As a result, all my virtual networks had the Host Access checkbox marked and I got one virtual NIC for each of them on my host, which of course was not what I wanted. One could think that this would be false by default, but for some reason MS and the VMM team thought differently. No worries, I created a small script to update my virtual networks with that option (the script above is corrected after my mistake), so I ran:

# Update networks with BoundtoHost $false
#
# Niklas Akerlund /RTS

$Networks = Get-VirtualNetwork | where {$_.VMHost -eq "HYP01.desso.se" -and $_.HostBoundVlanId -ne "3750"}

foreach ($Network in $Networks){
    Set-VirtualNetwork -VirtualNetwork $Network -BoundToVMHost $false
}

The network with VLAN 3750 was the one where I wanted host access to stay enabled, because it was the management NIC of the host.

Hot-add CPU and Memory in a Win 2008 Datacenter with SQL in vSphere

Today I tested how it works to hot-add both memory and vCPUs to a virtual machine running Windows 2008 R2 Datacenter Edition. This machine also has SQL 2008 R2 Enterprise Edition installed.

First I had to enable hot-plug in the virtual machine. There are of course two ways to do this, either via the GUI or

the PowerCLI way:

$vmConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
$mem = New-Object VMware.Vim.optionvalue
$mem.Key="mem.hotadd"
$mem.Value="true"
$vmConfigSpec.extraconfig += $mem
$cpu = New-Object VMware.Vim.optionvalue
$cpu.Key="vcpu.hotadd"
$cpu.Value="true"
$vmConfigSpec.extraconfig += $cpu

$vm = Get-VM TempNHot | Get-View
$vm.ReconfigVM_Task($vmConfigSpec)

Once this has been executed on the particular VM, I can use PowerCLI again while it is running and add resources to it.
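
Before adding resources it can be worth checking that the hot-add settings really took effect. This is a minimal sketch, assuming an existing Connect-VIServer session and the same VM name as above; it reads the MemoryHotAddEnabled and CpuHotAddEnabled properties from the VM's config through Get-View.

```powershell
# Sketch only: requires PowerCLI and a connected vCenter/ESX session.
$view = Get-VM TempNHot | Get-View

# Both values should be True before hot-adding memory or vCPUs
$view.Config.MemoryHotAddEnabled
$view.Config.CpuHotAddEnabled
```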

Get-VM TempNHot | Set-VM -MemoryMB 3072 -NumCpu 4 -Confirm:$false

According to Microsoft's documentation, hot-add of memory and CPU is only supported in the Datacenter edition of Windows. For SQL Server you also have to use either Enterprise or Datacenter edition for the application to make use of it.

To check that the new resources were used, I put some load on the SQL server with the SQLstress application and watched Task Manager, but it did not show the load spreading to the newly added vCPUs. After some research I found out that SQL Server does not start using new hardware right away. It needs to be reconfigured to schedule load on the additional CPUs, so there is a little manual intervention but no downtime on the server!

before:

So I started a query in SQL Management Studio and wrote

RECONFIGURE;
GO

After:

In Task Manager I could now see that all four vCPUs were equally loaded by the SQL server.
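
Besides Task Manager, SQL Server itself can be asked which schedulers it has online. The following is a sketch, assuming the Invoke-Sqlcmd cmdlet from the SQL Server PowerShell tools is available and that the instance runs locally as the default instance; the server name is a placeholder.

```powershell
# Sketch only: 'localhost' is a placeholder for the SQL Server instance.
# scheduler_id below 255 filters out the internal schedulers on SQL 2008 R2.
$query = @"
SELECT scheduler_id, cpu_id, status, is_online
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;
"@

Invoke-Sqlcmd -ServerInstance 'localhost' -Query $query | Format-Table -AutoSize
```

Running it before and after RECONFIGURE should show whether the hot-added vCPUs have been picked up by the scheduler.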

Maybe it is not so common to have to do this, but if you set up a large Tier-1 SQL server in a virtual world you surely want to be able to hot-add resources when it is under load. Think of the advantage of being able to add both memory and CPU resources without any downtime!

In a virtualization world we always recommend that our customers buy Windows Datacenter licenses for their hosts, so using it in a VM will not add any extra cost. SQL Server is of course quite a price jump from Standard to Enterprise, but if your big SQL server uses more than 64 GB RAM you will need Enterprise licenses anyway 🙂

 

TEC 2011 Successfully implement and Transition into Hyper-V Session

A short summary of the session I held at The Experts Conference 2011 in Frankfurt yesterday. I think there were about 40 people in the audience. TEC has about 350 attendees in total.

Once I have checked with Quest whether I can publish my whole presentation here I will do an updated post. For now I will list some points that I think are crucial when setting up a new virtualization platform.

  • Assessment and Consolidation Planning
  • Design and Testing
  • Migration and Optimization
  • Capacity Planning and Performance follow up

When deciding on a new virtualization platform, no matter if it is your first or you are replacing an existing one, there are some steps that need to be considered. First you have to know what you are running in your datacenter: what kind of operating systems and what kind of applications. You must also get a workload profile for those servers to know what their demands are. If you do not do your homework and plan for the load, you will surely get a beating from your organization when you have virtualized the servers and they run like crap. As tools you can use the Microsoft Assessment and Planning Toolkit, or if you already use SCCM/SCOM you will have both inventory and performance data. Another thing to consider when planning is licensing. In a big consolidation you can save quite a lot of money by using Datacenter licensing on the hosts.

Design your platform to be modular and easy to expand. Decide carefully what your boundaries are, both technically and financially. This should be done in a workshop with application owners and management and then documented. Do not forget about management and monitoring software. Another thing to rethink is how to take backups in your platform. With the Hyper-V integration tools you can take consistent backups with VSS snapshot support inside the VMs. We recommend that our customers take backups at host level for a quicker RTO. When you have decided on a platform you can do a PoC to test your decisions and see that it works as expected. Many hardware manufacturers lend out hardware for a limited time so you can evaluate it.

When the platform is set up and correctly configured you want to do some hardware and load testing. There are several tools for this: Memtest, IOmeter, SQLiosim, Exchange Jetstress, Exchange Load Gen and others. The most important thing here is to check that your new platform can handle the load you measured and predicted in the analysis and design. Also test the fail-over functionality so that all hardware and software works as expected when a PSU or a network cable breaks. After all testing has been completed successfully you want to document it, so that you have a signed validation document for later.

After you have a platform set up and tested you want to start migrating workloads into it and optimizing them. There are some tools that can be used for this, for example SCVMM, disk2vhd and Quest vConverter. One thing to consider during migration is to look back at the analysis of the workloads to set the right amount of resources: virtual processors, VM RAM, and the VHD disk files with the partitions inside (for best performance we like to use fixed or pass-through disks). When optimizing after migration you want to clean out hidden devices and services/software that were used by the machine in the physical world but no longer have a purpose!

When all your machines are migrated you want to continuously check performance and capacity, so you can prepare and add host resources before they run out. You can use SCOM/SCVMM if you have them in your environment. Another great performance tool is PAL (Performance Analysis of Logs), which you can use in conjunction with performance counters and logman to schedule data collector sets on your Hyper-V Core host servers. There is also a product from Quest, vFoglight.

On the last slide I had a Dilbert strip that I find quite funny. One statement though: WE DO NOT LEAVE OUR CUSTOMERS AS DOGBERT DOES AFTER A VIRTUALIZATION PROJECT 😛
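
As an illustration of the logman part, here is a sketch of how a counter log could be scheduled on a Hyper-V Core host and later fed to PAL. The collector name, counter selection, sample interval and output path are my own assumptions, not a recommended baseline.

```powershell
# Sketch only: collector name, counters, interval and path are assumptions.
$counters = @(
    '\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time',
    '\Memory\Available MBytes',
    '\PhysicalDisk(*)\Avg. Disk sec/Read',
    '\PhysicalDisk(*)\Avg. Disk sec/Write'
)

# Create a circular binary counter log sampled every 15 seconds, max 500 MB
logman create counter HyperVHostPerf -c $counters -si 00:00:15 -f bincirc -max 500 -o 'C:\PerfLogs\HyperVHostPerf'

# Start collecting; stop later with: logman stop HyperVHostPerf
logman start HyperVHostPerf
```

The resulting .blg file can then be copied from the Core host and analyzed with PAL on a workstation.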

links

MAP 6.0

Memtest

PAL

Performance tuning win 2008 R2 SP1

 

Configure VM settings and vmdk´s with powerCLI

I want to share my latest automation scripting. I am in a project where we are in-sourcing from a hosting company. We have connected the hosts to the outsourcer's NFS shares, of course with PowerCLI. Doing it this way I get the datastores on all servers in our cluster, without the risk of differences between the hosts' datastores.

# Create NFS shares on all hosts
#
# Niklas Åkerlund /RTS
$NFSdatas = Import-Csv -Path "nfsdatastores.csv" -Delimiter ";"
$VIHosts = get-cluster -Name Cluster1 | get-vmhost | where {$_.ConnectionState -eq "Connected"}
foreach ($VIHost in $VIHosts){
    foreach ($NFSdata in $NFSdatas){
        $NFSHost = $NFSdata.Host
        $NFSshare = $NFSdata.Share
        $NFSShareName = $NFSdata.ShareName
        # Only mount the datastore if the host does not already have it
        if (($VIHost | Get-Datastore | where {$_.Name -eq $NFSShareName -and $_.type -eq "NFS"} -ErrorAction SilentlyContinue) -eq $null){
            Write-Host "Mounting NFS store $($NFSShareName) on $($VIHost)"
            New-Datastore -Nfs -VMHost $VIHost -Name $NFSShareName -Path $NFSshare -NfsHost $NFSHost
        }
    }
}
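
The script above expects a semicolon-delimited nfsdatastores.csv with the columns Host, Share and ShareName. A small sketch of what such a file could look like, with made-up server and share names, and how it is parsed:

```powershell
# Hypothetical contents of nfsdatastores.csv; host and share names are made up.
$csvText = @"
Host;Share;ShareName
nfs01.example.local;/vol/vmstore01;NFS-VMStore01
nfs01.example.local;/vol/vmstore02;NFS-VMStore02
"@

# Same parsing as Import-Csv -Delimiter ";" but from a string instead of a file
$NFSdatas = $csvText | ConvertFrom-Csv -Delimiter ';'

foreach ($NFSdata in $NFSdatas) {
    '{0} -> {1}:{2}' -f $NFSdata.ShareName, $NFSdata.Host, $NFSdata.Share
}
```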

With this in place, during the transition the hosting company shuts down the VMs on their hosts that we are going to take over, and we add each VM to the inventory in our vCenter. When doing this, the vmdk gets a different datastore ID in the config, and some settings should also be updated to the customer's corporate standard for the virtualization platform.

# Script to update VM with vmdk and right settings
#
# Argument in is VM name
# Niklas Akerlund / RTS AB 2011

$VMname = $args[0]
if ($VMname -ne $null){
    $VM = Get-VM $VMname
    $Datastore = Get-Datastore -VM $VM
    $HDDs = Get-Harddisk -VM $VM

    # Remove incorrect hdd references

    Remove-HardDisk -HardDisk $HDDs -Confirm:$false

    # Re-add every disk using the datastore name as seen from our vCenter
    foreach ($HDD in $HDDs){
        $HDDname = $HDD.Filename
        $HDDsNames = $HDDname.Split("/")
        $count = $HDDsNames.count
        $VMdkName = $HDDsNames[$count-1]
        #Write-Host $VMdkName
        $diskpath = "[" + $Datastore.Name + "] " + $VM.Name + "/" + $VMdkName

        #Write-Host $diskpath

        New-HardDisk -VM $VM -DiskPath $diskpath
    }

    # Reconfigure VM Settings

    $spec = new-object VMware.Vim.VirtualMachineConfigSpec
    $spec.MemoryAllocation = New-Object VMware.Vim.ResourceAllocationInfo
    $spec.MemoryAllocation.Limit = -1
    $spec.CpuAllocation = New-Object VMware.Vim.ResourceAllocationInfo
    $spec.CpuAllocation.Limit = -1
    $spec.tools = New-Object VMware.Vim.ToolsConfigInfo
    $spec.tools.toolsUpgradePolicy = "manual"
    $spec.swapPlacement = "inherit"

    $VM = $VM | get-view
    $VM.ReconfigVM_Task($spec)
}
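
The disk path handling in the script above is plain string work, so it can be tested without a vCenter. A minimal sketch, where the function name Get-NewDiskPath and the sample datastore and VM names are assumptions for illustration:

```powershell
# Hypothetical helper: rebuilds a "[datastore] vm/disk.vmdk" path the same
# way the script above does, keeping only the vmdk file name from the old path.
function Get-NewDiskPath {
    param(
        [string]$OldFilename,
        [string]$DatastoreName,
        [string]$VMName
    )
    $vmdkName = $OldFilename.Split('/')[-1]
    return '[' + $DatastoreName + '] ' + $VMName + '/' + $vmdkName
}

Get-NewDiskPath -OldFilename '[old-nfs] web01/web01_1.vmdk' -DatastoreName 'NFS-VMStore01' -VMName 'web01'
```

Note that this, like the original script, assumes the vmdk files sit in a folder named after the VM.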

After this we start up the VM, and later we do a storage vMotion of the VM to the customer's FC-SAN

Get-VM theVM | Move-VM -Datastore fcdatastore1

Host Profiles and vmkernel ports with Jumbo Frames MTU 9000

Today I found a limitation when using host profiles together with a vmkernel port that has MTU 9000 activated. Maybe it was not a requirement when the host profiles were designed?

We set up the reference host with 4 vmkernel ports: one for management, one for vMotion, one for FT and one for NFS. The port that we wanted to use jumbo frames for was the vMotion port.

As I wrote in an earlier post, I used PowerCLI to configure the MTU for the actual vmkernel port: Get-VmhostNetworkAdapter -Name vmk1 | Set-VmhostNetworkAdapter -Mtu 9000

Then I added this host as a reference host in Host Profiles and attached the profile to the cluster. Adding a new host and then applying the profile creates all our vmkernel ports correctly, but when checking what MTU the vMotion vmkernel port got, it turns out it is created with the default MTU of 1500. This is not so good, because I do not want to use several different ways to configure hosts and I want to be able to trust the Host Profiles solution. The only vmkernel port that existed before applying the host profile was the management port, so it has nothing to do with editing existing ports. The result is that after applying a host profile I need to run a PowerCLI command to edit the MTU.
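
Since the host profile does not cover the MTU, a quick PowerCLI report can show which hosts still have the default value. This is only a sketch, assuming a connected vCenter session; the cluster name Cluster1 is a placeholder.

```powershell
# Sketch only: 'Cluster1' is a placeholder cluster name.
# Lists the MTU of every vmkernel port on every host in the cluster.
Get-Cluster -Name 'Cluster1' | Get-VMHost |
    Get-VMHostNetworkAdapter -VMKernel |
    Select-Object VMHost, Name, PortGroupName, Mtu |
    Sort-Object VMHost, Name |
    Format-Table -AutoSize
```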

Strangely, no matter if the MTU is 9000 or 1500, the hosts show as compliant in the GUI.

This applies to vSphere 4.1 U1 (I do not know how this behaves in vSphere 5).

The conclusion is that I have to think a bit more about using Host Profiles. If it is not fully implemented, then it cannot be used to get uniform hosts.

Edit vmkernel port MTU on distributed switches – using PowerCLI

According to KB 1038827, “Enabling Jumbo Frames for VMkernel ports in a virtual distributed switch”, VMware says that you have to recreate the vmkernel port to set the MTU for jumbo frames. This is not true if you use PowerCLI. I do not know exactly how it is done under the hood, but it is very easy to configure with just a few lines of scripting. By the way, there is no way to edit this in the GUI.

$cred = Get-Credential
Connect-VIServer ESXhost.test.loc -credential $cred

Get-VMHostNetworkadapter -name vmk2 | Set-VMHostnetworkadapter -Mtu 9000

Get-VMHostnetworkadapter -name vmk2 | ft Mtu

Setting the MTU on the vmkernel port is basically no different whether you use a standard vSwitch or a distributed vSwitch.

Of course you can connect to a vCenter and add a foreach loop to set the MTU on the vmkernel port of more than one host.
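
Such a loop could look like the sketch below. The vCenter name and cluster name are placeholders, and it assumes that the vmkernel port is called vmk2 on every host, which should be verified first.

```powershell
# Sketch only: server and cluster names are placeholders.
$cred = Get-Credential
Connect-VIServer vcenter.test.loc -Credential $cred

foreach ($VIHost in (Get-Cluster -Name 'Cluster1' | Get-VMHost)) {
    # Assumes the jumbo frame vmkernel port is vmk2 on every host
    Get-VMHostNetworkAdapter -VMHost $VIHost -VMKernel -Name vmk2 |
        Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false
}
```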

VMware distributed switches and PowerCLI/Onyx

I have had the opportunity to do some PowerCLI scripting on an installation where we have a vDS (virtual Distributed Switch). PowerCLI does not have many cmdlets for distributed switches, which is kind of awkward as there are so many cmdlets for everything else. Luckily LucD had made some nice functions for me to use when creating the port groups.

I used his function for creating port groups. As the customer had about 20 VLANs that needed to be added, it was a perfect match for PowerCLI, because setting this up manually is boring! So I had a csv file with the name and VLAN id, which I ran through in a foreach loop, and then all was done.


# Create Distributed virtual portgroups for each VLAN
# Niklas Åkerlund / RTS AB 2011-09-09
#

$Datacenter = "datacenter"
$vDSName = "dvswitch01"
$vDSPortGroupPorts = 128

# Call Functions from motherscript
. .\Set-vDS-Porgroup-functions2.ps1

$vDS = Get-dvSwitch -DataCenterName $Datacenter -dvSwitchName $vDSName
#Write-Host $vDS
$vlans = Import-Csv vlan.csv -Delimiter ";"

foreach ($vlan in $vlans){
    $name = $vlan.Name
    $vlanid = $vlan.VLAN
    if ($name -ne ""){
        Write-Host $name
        New-dvSwPortgroup $vDS $name -PgNumberPorts $vDSPortGroupPorts -PgVlanType "VLAN" -PgVlanId $vlanid
    }
}
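
The vlan.csv file used above has the columns Name and VLAN, separated by semicolons. Before creating some 20 port groups it can be useful to filter out bad rows first. A sketch with made-up sample rows, assuming that usable VLAN ids are 1 to 4094:

```powershell
# Hypothetical contents of vlan.csv; names and ids are made up.
$csvText = @"
Name;VLAN
VLAN100-Servers;100
VLAN200-Clients;200
;300
Broken;5000
"@

$vlans = $csvText | ConvertFrom-Csv -Delimiter ';'

# Keep only rows with a name and a numeric VLAN id in the range 1-4094
$valid = @($vlans | Where-Object {
    -not [string]::IsNullOrEmpty($_.Name) -and
    $_.VLAN -match '^\d+$' -and
    [int]$_.VLAN -ge 1 -and [int]$_.VLAN -le 4094
})

$valid | ForEach-Object { '{0} (VLAN {1})' -f $_.Name, $_.VLAN }
```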

But then we realized that we needed to change some settings for both security and load balancing, so I had to remove all my port groups and start over. I did not want to remove them manually, and the PowerCLI cmdlet that removes standard port groups could not be used on a vDS. I did not find any code on LucD's blog to remove a vDS port group, so I came up with the brilliant idea to use Onyx. It is a tool from VMware Labs that intercepts the traffic between the vSphere Client and vCenter and transforms it into PowerCLI code, .NET, SOAP or JavaScript.

After starting this tool I connected to my vCenter and removed a vDS port group through the vSphere Client, and I got the PowerCLI code (which I probably could have figured out without Onyx if I had been a bit smarter in PowerShell/PowerCLI, but I am not :-P). So I wrote a small script to find all my vDS port groups and remove them. Note that I cannot remove a vDS port group that still has VMs connected to it.


# Remove vds port groups
#
# Niklas Åkerlund / Real Time Services AB

$vlans = Import-Csv vlan.csv -Delimiter ";"

$PGs = Get-VirtualPortGroup

foreach ($vlan in $vlans){
    foreach ($PG in $PGs){
        if ($vlan.Name -eq $PG.Name){
            $destroy = $PG.Id
            #Write-Host $destroy
            $pek = Get-View -Id $destroy
            $pek.Destroy_Task()
        }
    }
}

And now I could run the add script again with the added parameters for more security and load balancing.

New-dvSwPortgroup $vDS $name -PgNumberPorts $vDSPortGroupPorts `
-PgVlanType "VLAN" -PgVlanId $vlanid -SecPolMacChanges:$false `
-SecPolForgedTransmits:$false -TeamingPolicy "loadbalance_loadbased"

Win 8 Server dev preview and Hyper-V NIC team

There is quite a buzz on Twitter and blogs about the new features that have come to Windows 8 and the new Hyper-V version. I want to give you a little heads-up about how it works to create a network team with NICs (yes, it works with different NICs, in my case an Intel and a Broadcom).

I have now installed the server on my test machine in our office and was eager to test the NIC teaming. At first I did not understand how it worked and tried to bind two NICs together in the Network Connections window in the Control Panel. As I later realized, and read on Aidan Finn's blog, it is done through LBFOAdmin.exe (this is opened when you click NIC Teaming Enabled/Disabled).

There you have to highlight your server to configure it, as the new Server Manager can handle remote servers. You can configure several workloads at the same time and you do not have to log in to each server to administer it.

I have named my team NET2000 and added the two NICs. I have also set it to be switch independent (I have actually connected it to a simple 5-port switch). You can also choose LACP or Static Teaming. For Load Distribution mode you can choose Address Hash or Hyper-V Port (right now I am sharing the team between the management and a Hyper-V switch, so I am using Address Hash).
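
For reference, the same team can be scripted. The sketch below uses the NetLbfo cmdlets as they exist in the released Windows Server 2012; I have not verified that they behave the same in the developer preview. The adapter names and the VLAN id are hypothetical placeholders, and as far as I know the TransportPorts algorithm corresponds to Address Hash in the GUI.

```powershell
# Sketch only: adapter names and VLAN id are hypothetical placeholders.
New-NetLbfoTeam -Name 'NET2000' -TeamMembers 'Ethernet', 'Ethernet 2' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm TransportPorts

# Add an extra team interface (virtual NIC) for a specific VLAN
Add-NetLbfoTeamNic -Team 'NET2000' -VlanID 10

# Show the resulting team and its status
Get-NetLbfoTeam -Name 'NET2000'
```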

As you can see, I can then add several virtual NICs with different VLAN ids. I really hope that they fix one issue though. As you can see here, I have a virtual NIC interface called VMnet, but when I want to add it in Hyper-V Manager it has a different name, as you can see in the next screenshot. It would have been wonderful to be able to see the name in the Virtual Switch Manager as well.

Before, I had to use the same network cards from the same manufacturer and use their teaming software, so the built-in teaming functions in Win 8 are a giant step forward. One thing to test later, when I get my hands on a NIC that can handle SR-IOV, is how that feature works with a team, but that is another blog post!

 

Novell Platespin Forge upgrade

The past two days I have been upgrading a Platespin Forge from 2.5 to 3.1 on a Forge 510 Appliance, which runs VMware VI 3.5 Update 4.

I think that the Forge appliance is a really good product for companies that need a Disaster Recovery solution. If you want to read more about it click here.

The customer bought the appliance two years ago and has not had any time to set it up and start replicating workloads.

The appliance is customized Dell hardware with a custom VI 3.5 installation. We could not upgrade it to vSphere; the only update on the Novell site is VI 3.5 U5. We tried to upgrade via the vSphere Client Host Update Utility but got a failure. We also tried the hostupgrade.sh script, which failed as well. We have opened a support case asking Novell how to do it, and I will update the blog when I get the right procedure.

The next trouble we ran into was when we tried to upgrade the Forge Management VM software from 2.5 to 3.1. The installation succeeds, but when we check the GUI we do not have a protect container, which is kind of vital because without it we cannot start any protection of workloads. When we checked with the Platespin browser executable we could see it there, but not in the web GUI. The not-so-obvious solution was to do a two-part upgrade: first install all Windows patches, then upgrade to version 3.0.2 and verify that the container for protection was still there and working. After that we could proceed with the upgrade to Forge 3.1 (which as of today is the latest version), and after this the protect container was there and refreshed correctly. Thank God for the VM snapshots that we took after each step, so we could easily go back after each failed step!

Although the upgrade steps in the documentation did not work for us, I can recommend it, because Platespin has always done a good job of writing and explaining things in their product documents.

Some strange issues remain when we add the Management VM to their domain and install AV, but that is another support case.

 

VMware vCenter and VMware vCenter Update Manager

After the vacation this summer I have had much to do and no time for blogging. I will try to behave better and keep you readers updated on my findings.

I just want to clarify something for those of you who run several vCenter installations for your different virtualization platforms and use vCenter Update Manager for updating your hosts.

When you install vCenter Update Manager you can only add one vCenter, and there is no support for using the same Update Manager for several vCenter instances. From a management point of view it would have been a nice feature to be able to use the same vCenter Update Manager for several vCenter instances in linked mode, as you would only have one to handle.

The Update Manager documentation clearly says: “The Update Manager installation requires a connection with a single vCenter Server instance.” The link to the vSphere 5.0 VUM installation documentation is here. This is not new for 5.0 and is also the case for earlier versions of vCenter and VUM.