
Wednesday, February 03, 2010

My Cisco UCS system in the lab can't talk to anything

If you are lucky enough to get a Cisco UCS system for your lab, you might get a bit confused if you run it up without connecting it to any upstream switches; that is, you just try to use the blades to talk to each other.

The reason is that the default mode for the Fabric Interconnect is End Host Mode, and by default the uplink fail action is link-down, so the NICs on your blades show as down.

I have seen people hit this problem, so I thought I would quickly write something up.

Let's start with a refresher on how the switching in the Fabric Interconnect works.
Your UCS Fabric Interconnects (F-I) can work in End Host Mode (EHM), which is the recommended setting, or in Switch Mode. In EHM the F-I "forwarding is based on server-to-uplink pinning. A given server interface uses a given uplink regardless of the destination it’s trying to reach. Therefore, fabric interconnects don’t learn MAC addresses from external LAN switches, they learn MACs from servers inside the chassis only. The address table is managed so that it only contains MAC addresses of stations connected to Server Ports. Addresses are not learned on frames from network ports; and frames from Server Ports are allowed to be forwarded only when their source addresses have been learned into the switch forwarding table. Frames sourced from stations inside UCS take optimal paths to all destinations (unicast or multicast) inside. If these frames need to leave UCS, they only exit on their pinned network port. Frames received on network ports are filtered, based on various checks, with an overriding requirement that any frame received from outside UCS must not be forwarded back out of UCS. However fabric interconnects do perform local switching for server to server traffic. This is required because a LAN switch will by default never forward traffic back out the interface it came in on." (source)
So local traffic between the blades stays inside, and everything else is thrown northbound to your main switches, which in this case don't exist. (Can you tell I am not a networking guy?) Based on this, it sounds like your blades will have no trouble talking to each other, right? Wrong.

Normally your F-I is going to be connected northbound; there is little sense in being isolated. But remember, there are two fabrics in your UCS environment for redundancy (or there should be). You are going to have two F-Is, an A side and a B side, and each of these will be connected northbound.

For a picture of this, see my previous schematic.

Now here is the kicker that causes the lab scenario problem. In the normal world, what would you want to happen if your F-I lost northbound connectivity and became isolated from the rest of the world? Your blades are going to keep sending traffic, and it is going to be dropped. Yet you have another F-I, and because you have everything nice and redundant, the traffic can probably go that way. So you probably want the vNICs to go down so the system knows to send its traffic out the other fabric.

So, in UCS there is an uplink-fail-action setting on the Network Control Policy. This policy defines what should happen when your uplinks fail (or, in your lab case, were never connected). By default the action is link-down, which takes the operational state of the vNICs on your blade down in order to facilitate fabric failover for the vNICs. The alternative is warning, which leaves them active. The setting is done through the CLI and can be found in the documentation here. So in your lab, change the policy to warning and things should start working.
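For reference, here is a rough sketch of what that change looks like in the UCS Manager CLI, assuming you are modifying the default Network Control Policy in the root org (if your vNICs reference a custom policy, scope to that one by name instead):

    UCS-A# scope org /
    UCS-A /org # scope nw-ctrl-policy default
    UCS-A /org/nw-ctrl-policy # set uplink-fail-action warning
    UCS-A /org/nw-ctrl-policy # commit-buffer

Once committed, the change applies to the vNICs that use the policy, and their operational state should come back up even with no uplinks connected.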

Of course in your lab you could just change the F-I to Switch Mode (a rough sketch of that follows below as well), but that would be no fun at all. Hopefully this saves someone from banging their head against the wall for longer than they otherwise might.
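For completeness, if you really did want to flip to Switch Mode, my recollection is it goes something like this in the CLI (check the configuration guide for your version, and be warned that changing the switching mode reboots the F-I):

    UCS-A# scope eth-uplink
    UCS-A /eth-uplink # set mode switch
    UCS-A /eth-uplink # commit-buffer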

Rodos
