We have been working on the process for performing the upgrade on the full UCS system to go from 1.0 to 1.1 code, which will support the VIC interface card (Palo Card).
Our approach has been to take things one step at a time and to understand the process as we move forward. We downloaded the 1.1(1l) bin file and release notes and have gone through them. The notes are pretty basic, but they step you through each part of the process well and provide enough info to do it. However, since we are in production we are being cautious; I do not want any surprises.
Steps in full UCS Code Upgrade:
1. Server Interface firmware upgrade, requires a reboot.
2. Server BMC (KVM) firmware upgrade, reboot is non-disruptive.
3. Chassis FEX (fabric extender), requires a reboot.
4. UCS Manager, requires a reboot.
5. Fabric Interconnect (6120), requires a reboot.
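For planning purposes, the list above can be written down as data. This is only a sketch: the class and field names are made up for illustration, and the only facts in it are the order and the reboot notes from the list. It is not anything UCS Manager exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpgradeStep:
    order: int
    component: str
    # True only where the list explicitly calls the reboot non-disruptive.
    non_disruptive: bool = False


# Order and reboot notes copied from the list above; names are illustrative.
UPGRADE_STEPS = [
    UpgradeStep(1, "Server interface card"),
    UpgradeStep(2, "Server BMC (KVM)", non_disruptive=True),
    UpgradeStep(3, "Chassis FEX (fabric extender)"),
    UpgradeStep(4, "UCS Manager"),
    UpgradeStep(5, "Fabric Interconnect (6120)"),
]


def steps_to_schedule(steps):
    """Return the components whose reboot is not marked non-disruptive."""
    ordered = sorted(steps, key=lambda s: s.order)
    return [s.component for s in ordered if not s.non_disruptive]


if __name__ == "__main__":
    for name in steps_to_schedule(UPGRADE_STEPS):
        print("plan a reboot window for:", name)
```

Everything except the BMC ends up on the "plan a window" list, which is why the rest of the process is built around spare blades and maintenance mode.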
The first step was to open a ticket with Cisco. This lets us document everything in the ticket, lets Cisco use it to pull in the correct resources, and gives us a single point of contact. Next we requested a conference call with TAC and key engineers at Cisco to talk through the process, things to check beforehand, and what has worked in their labs and for other customers.
The items covered included making sure your northbound Ethernet connections (in our case a 6500) have spanning-tree portfast enabled, confirming your servers use some type of NIC teaming or have failover enabled on the vNIC in UCSM, and understanding the difference in UCSM between “update firmware” (moves the new code to the device and stages it) and “activate firmware” (moves the new code into the startup position so it takes effect on the next reboot). A special note: when you activate firmware, if you do NOT want the item to reboot automatically you need to check the box for “Set Startup Version Only”. This moves the new firmware to the “startup” position but does not perform the reboot. When the activation is complete the item will be in “Active Status: pending-next-reboot”. However, you do not have this option when doing the BMC (KVM) update. The BMC reboot does not affect the server system, so the BMC reboots as soon as its code is activated.
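To keep the update/activate distinction straight in my own head, here is a toy model of it in Python. It is a sketch of the behavior described above, not the UCSM API: the `Component` class, its field names, and the "ready" status label are all assumptions made for illustration. Only the "pending-next-reboot" status and the 1.0 / 1.1(1l) version strings come from the upgrade itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    running: str                    # version currently executing
    staged: Optional[str] = None    # image copied over by "update firmware"
    startup: Optional[str] = None   # version that runs after the next reboot
    status: str = "ready"           # placeholder label, not a UCSM string
    is_bmc: bool = False

    def __post_init__(self) -> None:
        if self.startup is None:
            self.startup = self.running

    def update(self, version: str) -> None:
        """'Update firmware': stage the image; nothing changes yet."""
        self.staged = version

    def activate(self, version: str, set_startup_only: bool = False) -> None:
        """'Activate firmware': move the staged image to the startup slot."""
        if self.staged != version:
            raise ValueError(f"{version} has not been staged on {self.name}")
        if set_startup_only and self.is_bmc:
            raise ValueError("the BMC has no 'Set Startup Version Only' option")
        self.startup = version
        if set_startup_only:
            self.status = "pending-next-reboot"
        else:
            self.reboot()  # without the checkbox, activation reboots right away

    def reboot(self) -> None:
        self.running = self.startup
        self.status = "ready"


adapter = Component("adapter", running="1.0")
adapter.update("1.1(1l)")
adapter.activate("1.1(1l)", set_startup_only=True)
print(adapter.running, adapter.status)  # 1.0 pending-next-reboot
```

The point of the model is that staging and activating with the box checked never change what is running; only the reboot does, and that is the step we get to schedule.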
Most of this information can be found in the release notes and I am sure you don’t want me to just reiterate it to you. We went through all of the upgrade steps the first week UCS arrived, so I am pretty confident the process will work. The challenge now is how to do it with a production system. I am fortunate to have only 17 blades in production. This leaves me with 1 spare blade with an Emulex mezz. card and 6 spare blades with the VIC mezz. cards, which gives me a test pool of blades (Note: these are only “spare” until we get through the upgrade process, then they will go into the VMware cluster and be put to use).
Goal: Upgrade UCS as Close to Non-Disruptive as Possible:
Our concern is whether a blade will function normally running 1.1 code when all the other components are still on 1.0 code. I suspect it will. If it does work, the plan is to take our time and update all of the blades over a week or so, following the steps below.
This week we updated and activated the firmware on the Interface Cards for all 7 spare blades. Next we did the same thing for the BMC firmware for the 7 spare blades.
The next step is to put an ESX host into maintenance mode and move the Service Profile to the blade running 1.1 code with an Emulex mezz. card. We can confirm the ESX host functions and then move some test, dev, and then low-end prod servers to this ESX host. This will allow us to develop a comfort level with the new code. This step should fit into our schedule early next week. If we see no issues we will be able to proceed with putting additional ESX hosts into Maintenance Mode, then in UCSM updating the firmware, activating the firmware, and rebooting the blade. This process will allow the ESX cluster blades to get fully updated with no impact on the applications.
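The rolling part of that plan is really just a loop that handles one host at a time and stops at the first sign of trouble. A minimal sketch, assuming four hypothetical callables (`enter_maintenance`, `upgrade_blade`, `exit_maintenance`, `healthy`) that stand in for the manual vCenter and UCSM steps; none of them are real API calls:

```python
def rolling_upgrade(hosts, enter_maintenance, upgrade_blade, exit_maintenance, healthy):
    """Upgrade hosts one at a time; stop at the first host that looks unhealthy.

    Returns (upgraded_hosts, failed_host). failed_host is None if all went well.
    """
    upgraded = []
    for host in hosts:
        enter_maintenance(host)   # guests are moved off before anything is touched
        upgrade_blade(host)       # update firmware, activate firmware, reboot
        exit_maintenance(host)
        if not healthy(host):
            return upgraded, host  # leave the remaining hosts on the old code
        upgraded.append(host)
    return upgraded, None


if __name__ == "__main__":
    log = []
    done, failed = rolling_upgrade(
        ["esx01", "esx02"],  # made-up host names
        enter_maintenance=lambda h: log.append(("enter", h)),
        upgrade_blade=lambda h: log.append(("upgrade", h)),
        exit_maintenance=lambda h: log.append(("exit", h)),
        healthy=lambda h: True,
    )
    print(done, failed)
```

Doing one host at a time keeps the rest of the cluster on known-good code, so a problem with the new firmware costs us one host rather than the whole cluster.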
For our W2K8 server running vCenter we will perform a backup and schedule a 15-minute downtime to reboot it and move the Service Profile to our spare blade running 1.1 code with the Emulex mezz. card. We can then confirm functionality, etc. If there are issues we can just move the Service Profile back to the server running 1.0 code and we are back in service. This same process will be repeated for the two W2K3 servers running on UCS blades.
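The fallback logic for the physical Windows servers is simple enough to write down. Again this is only a sketch: `associate` and `verify` are hypothetical stand-ins for the manual steps of moving a Service Profile in UCSM and checking that the server is working, and the profile and blade names in the example are made up.

```python
def move_with_rollback(profile, old_blade, new_blade, associate, verify):
    """Move a Service Profile to new_blade; fall back to old_blade if checks fail.

    Returns the blade the profile ends up on.
    """
    associate(profile, new_blade)
    if verify(profile):
        return new_blade
    # Something is wrong on the new code: put the profile back where it was.
    associate(profile, old_blade)
    return old_blade


if __name__ == "__main__":
    moves = []
    final = move_with_rollback(
        "vcenter-profile",           # made-up names for illustration
        old_blade="blade-1.0",
        new_blade="blade-1.1",
        associate=lambda p, b: moves.append((p, b)),
        verify=lambda p: False,      # pretend the functional check failed
    )
    print(final, moves)
```

Because the Service Profile carries the server's identity, the rollback is the same operation as the move, just pointed at the original blade.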
By using the flexibility built into VMware that we all use and love (vMotion, DRS, etc.) and the hardware abstraction provided by the UCS Service Profiles, the process of updating the blade interface and BMC firmware should be straightforward, with minimal impact on the end users. I will give an update next week . . .
Well, the day finally came when I received my first group of UCS blades with the new UCS M81KR Virtual Interface Card (VIC), or what has been known as the Palo Card. This is the cool CNA built by Cisco specifically to add a great deal of flexibility to the I/O needs of virtual host servers (ok, mainly focused on VMware ESX 4.x, where all the cool virtualization is happening!).
I should have taken a picture of it! Gone is the Emulex or QLogic name stamped on the mezzanine card. The VIC provides all of the I/O functions for the server blade. It is a single card with two 10 Gb FCoE ports to the northbound switches and up to 128 virtual I/O interfaces facing the server/host side.
To be able to manage and build your own customized I/O world for an ESX host or guest machines you have to perform a code upgrade on your UCS system. Once that code upgrade is complete, you will see that Cisco has added an additional tab in the UCS Manager for configuring the new virtual I/O functions. Note that I have not seen this new tab yet; we are currently planning our code upgrade process. I am interested to see how it goes upgrading the firmware, etc. on a production UCS system. I am sure I will blog about it!
So what does my world currently look like? I have two 6120 Fabric Interconnects, 3 chassis and 25 B200 M1 blade servers (yes, I need to get a 4th chassis to house my 25th blade). 19 of the B200 blades contain the Emulex CNA and 6 contain the new VIC CNA. I currently have my new “VIC” blades in the chassis but not in use. The UCS manager sees the new blades, can tell me about the VIC, and displays the interfaces differently (no vNICs or vHBAs have been created yet).
Stay tuned for an update on the code upgrade and screen shots of the new Virtual Tab, etc.
Here is the link at Cisco for details: