Error on Cisco UCS pinning training
A quick heads up: Cisco has been teaching everyone the wrong thing about UCS pinning.
A lot of people have been going on the Cisco Unified Computing System Bootcamps, such as myself, Scott Lowe, Rich Brambley and Brian Knudtson.
One of the important things they teach you is the pinning; a big deal is made about it in the class (well, mine anyway). Here is the page from the course notes, sorry about my scribbles.
The most common implementation will be 2 links from the IOM to the F-I. This week I have been testing the pinning in the lab: how to re-pin and how long it takes. I was planning on writing it all up. However, one thing I have noticed is that the pinning is all backwards.
Beware, it's the opposite of what they teach.
When you have 2 uplinks it actually works like this:
Port 1 on the IOM goes to the ODD (not even) blades, that's 1,3,5,7.
Port 2 on the IOM goes to the EVEN (not odd) blades, that's 2,4,6,8.
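For anyone who prefers it spelled out, here is a minimal Python sketch of the two-uplink mapping described above. The function name is made up for illustration, and it only covers the 8-slot, two-uplink case I tested; it is not anything that runs on UCS itself.

```python
# Blade-to-uplink pinning for a two-uplink IOM, as observed in the lab.
# pinned_uplink_two_links is a hypothetical helper name, not a UCS API.

def pinned_uplink_two_links(blade_slot: int) -> int:
    """Return the IOM uplink (1 or 2) that a blade slot (1-8) is pinned to."""
    if not 1 <= blade_slot <= 8:
        raise ValueError("blade slot must be between 1 and 8")
    # Odd slots land on uplink 1, even slots on uplink 2.
    return 1 if blade_slot % 2 == 1 else 2

for uplink in (1, 2):
    blades = [slot for slot in range(1, 9) if pinned_uplink_two_links(slot) == uplink]
    print(f"Uplink {uplink}: blades {blades}")
```

This prints blades 1, 3, 5, 7 for uplink 1 and blades 2, 4, 6, 8 for uplink 2.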
When doing your design for load balancing and failure procedures, things like this do matter, and I am a little annoyed that Cisco could get this wrong.
I thought it may have been a problem with the early course notes, but a few minutes ago I contacted Brian Knudtson via Twitter, who just happened to be sitting in a bootcamp as I type. He checked the current notes: still wrong.
It's obviously an error in the technical editing, but I can't believe it has not been picked up. Cisco, please update your documentation and train people right. I will also go through some official channels to get it fixed.
More details on the pinning to come when I finish my testing and analysis. Lucky I don't believe what I read!
[UPDATE] Note that page 166 of the "Project California" book has a table that gets this right; it's in the section "Redwood IO_MUX" (thanks to David Chapman for pointing this out). There is also a very interesting statement: "In future releases the configuration of slot pinning to an uplink will be a user configurable feature." Sounds interesting. Now that I know a lot more about UCS I may go back and read the whole black book again. My first reading was days after the book was released, so I may pick up a pile of new things this time round.
Rodos
P.S. Sorry if I sound annoyed. I have been passing this info on to many people, and I don't like having to go back on my statements or look stupid. I now need to go and update all of the UCS diagrams I have been spreading around (which detailed the pinning). Like finding the boot order bug, this is why we do testing.
You are everyone's hero here, Rodos. Thanks for identifying the minutiae in the technotes. We all appreciate it.
Good catch. In the classes I teach, I don't spend much time on the pinning. For one, there's nothing you can do about it (it is what it is), and if you're architecting your solutions around which node is pinned to which uplink, you're missing out on a critical piece of the UCS puzzle: mobility, which to me means not caring which node my profile is running on. There are of course scenarios where you *will* care and *will* want to control those things, so it's good to have the correct data.
David, I totally agree with your statement that with stateless computing you should not really care. Yet as you say, there are some circumstances where you *will* care.
The more critical thing is when a tech pulls a cable because he thinks it's only going to affect subset A, and instead it affects subset B; that causes problems. That's how your 99.999 SLA drops to something much lower.
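To make that concrete, here is a rough Python sketch of the cable-pull surprise with two uplinks. The two tables are simply the mapping from the course notes (reversed) and the one seen in the lab; the names are made up for illustration.

```python
# Which blades are pinned to the IOM port whose cable gets pulled (two-uplink case).
# COURSE_NOTES is the reversed mapping taught in class; OBSERVED is the lab result.
COURSE_NOTES = {1: [2, 4, 6, 8], 2: [1, 3, 5, 7]}
OBSERVED = {1: [1, 3, 5, 7], 2: [2, 4, 6, 8]}

def blades_hit_by_pull(port: int, pinning: dict) -> list:
    """Return the blades pinned to the IOM port whose cable is pulled."""
    return pinning[port]

port = 1
expected = blades_hit_by_pull(port, COURSE_NOTES)
actual = blades_hit_by_pull(port, OBSERVED)
print(f"Pull port {port}: tech expects {expected}, actually hits {actual}")
print("Wrong subset affected" if expected != actual else "As planned")
```

With the course-notes table the tech expects the even blades to be affected, while the odd blades are the ones that actually are.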
Thanks for commenting.
Rodos,
Thanks for this catch!
I have to say I agree with Dave's comments about the stateless profiles and not really having to care about odd/even pinning. The bigger point is to make sure you have 2 FEX and 2 FI to have the hardware HA.
Then again, what did you expect from a guy doing a "for Dummies" approach anyways?! :)
I believe there is some confusion around the port number and link number scheme for the connectivity between the IOM (aka FEX) and the FIC (aka Fabric Interconnect “switch”).
To be precise, replace the words Port 1, Port 2, etc. in the table above with Link 1, Link 2, etc. Then it becomes technically correct ;> I have seen this mix-up so far only in the UCS training book they hand out in the class.
There is no “Hard Pinning” of blades to specific ports on the IOM. To explain what I mean, you could connect two links to the first two ports of the IOM. In that case, servers 1,3,5,7 are mapped to Link-1, which happens to be using Port 1 on the IOM. Similarly, servers 2,4,6,8 are mapped to Link-2, which is plugged into Port 2. Now consider another two-link scenario, where Link-1 is plugged into Port-3 of the IOM and Link-2 is plugged into Port-4 of the IOM. This is also a valid configuration. I don’t think there is any spec which says that if you are using two links, they should be using specific ports on the IOM. In the latter case, servers 1,3,5,7 are still mapped to Link-1, which happens to be plugged into Port-3 of the IOM.
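A rough Python sketch of that idea, assuming only the two-link case: blades follow the link number, and the link-to-port cabling is passed in as input. The function name and the dictionary layout are made up for illustration, not taken from any Cisco tool.

```python
def blades_by_iom_port(link_to_port, slots=8):
    """Map each cabled IOM port to the blades whose link lands on it (two links only)."""
    if sorted(link_to_port) != [1, 2]:
        raise ValueError("this sketch only covers the two-link case")
    result = {}
    for link, port in link_to_port.items():
        # Odd slots follow Link-1 and even slots follow Link-2, whatever the port.
        result[port] = [s for s in range(1, slots + 1) if (s % 2 == 1) == (link == 1)]
    return result

print(blades_by_iom_port({1: 1, 2: 2}))  # links cabled to IOM ports 1 and 2
print(blades_by_iom_port({1: 3, 2: 4}))  # links cabled to IOM ports 3 and 4
```

In both cablings the odd blades stay with Link-1 and the even blades with Link-2; only the physical port they land on changes.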
Rodos: Probably you can test the alternative 2-links theory in the lab pretty quick.
In fact, during a failure scenario, when stepping down from a 4-link to a 2-link configuration due to a link failure, which of the three surviving links will act as Link-1 and Link-2 is determined at “random”.
In short, pinning of servers to ports on IOM is dynamic and it changes depending on how many active links you have between IOM and FIC.
My 2-cents ;>
If you preset 2 uplinks per FEX, you may experience downtime when you want to increase to 4 uplinks per FEX. Did you test this as well?
Thanks for letting all of us know. Good work.