Sunday, October 25, 2009

What comes after virtual server sprawl?

Do you think virtual server sprawl is real? Is it a problem in your environment? This last week I did some thinking about virtual sprawl. Have we removed physical server sprawl only to replace it with virtual server cluster sprawl? Where are the new areas of inefficiency, and how can they be addressed?

Here is a summary of where I got to.
  • Many organizations adopt virtualisation to reap large savings in rack space, networking, power and cooling.
  • In addition, we are now able to virtualise far more workloads than was possible in the past. Releases like VMware vSphere allow the move towards 90% virtualised environments.
  • However, without continuing along the “virtualisation maturity model” and introducing virtual server life cycle management, organizations become faced with virtual server sprawl greater than their previous physical server sprawl. It is just too convenient to create a new machine, or duplicate an existing one. Early on, the impacts of this sprawl are hidden.
  • Just as data life cycle management has become important to handle the growth in data, virtual machine life cycle management has become important to handle the growth in virtual machines (a rough sketch of what such a policy check could look like follows this list).
  • Yet for the large Enterprise the problem does not end there.
  • In large Enterprises which have standardized on virtualisation alongside a “virtual first” policy, I am seeing that they are required to create islands of virtual clusters, each dedicated to particular workload purposes. An Enterprise will have silo’d clusters of virtual server farms dedicated to different development or testing groups, departments, production areas or security zones. Each of these farms often has different build characteristics for memory, networking capacity and cooling.
  • What is required is for the great benefits that abstraction from the hardware brought to the servers to be applied to the physical layers of the virtual server farms themselves.
  • This is why the Enterprise is looking at virtualisation of the remaining physical layers of storage and networking. As an example, converged fabrics such as Cisco UCS can virtualise the remaining physical server stack. By abstracting all of the networking and server personalities, the remaining server silos are broken down even further, driving further savings and standardization.
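To make the life cycle point above a little more concrete, here is a minimal sketch of the kind of policy check a virtual machine life cycle process might run. It is illustrative only; the record fields, dates and the 90 day threshold are my assumptions, not from any particular product or API.

```python
from datetime import date, timedelta

# Hypothetical inventory records. In practice these would come from your
# virtualisation management API, a CMDB, or even a simple export.
vms = [
    {"name": "dev-web-01", "owner": "dev-team-a",
     "last_powered_on": date(2009, 3, 2), "expiry": date(2009, 9, 1)},
    {"name": "prod-db-01", "owner": "ops",
     "last_powered_on": date(2009, 10, 20), "expiry": None},
]

IDLE_THRESHOLD = timedelta(days=90)   # assumed policy: review VMs idle for 90+ days
TODAY = date(2009, 10, 25)

def lifecycle_review(vm):
    """Return the reasons this VM should be reviewed for reclamation."""
    reasons = []
    if vm["expiry"] and vm["expiry"] < TODAY:
        reasons.append("past its agreed expiry date")
    if TODAY - vm["last_powered_on"] > IDLE_THRESHOLD:
        reasons.append("not powered on in over %d days" % IDLE_THRESHOLD.days)
    return reasons

for vm in vms:
    for reason in lifecycle_review(vm):
        print("%s (owner %s): %s" % (vm["name"], vm["owner"], reason))
```

The point is not the code, it is that without an owner, an expiry date and a regular review attached to every virtual machine, the convenience of "just clone another one" wins every time.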
Food for thought.

Rodos

Thursday, October 15, 2009

Cloudy discussions

Even though everyone is getting into the cloud, the community of people really getting into it (and talking and announcing) is not that large; one keeps seeing the same names. Well, from what I can see anyway.

So it was interesting to see that some of the people I am hooked up with at NaviSite have announced their cloud offering.
NaviSite Managed Cloud Services (NaviCloud), based on VMware vSphere™ 4 and VMware vCloud™ and the Cisco Unified Computing System, is an innovative, enterprise-class infrastructure platform that provides high-performance, scalable, on-demand, usage-billed IT infrastructure.
No surprises there: VMware, vCloud, UCS. I have been sharing some UCS implementation insights with these guys as we are both early adopters of the technology. It's great to be able to share with those outside your geography, because they are not your competitors.

But the point of this post is an interesting detail that I noticed about the Savvis Cloud, which was posted at Channel Register: Savvis picks Compellent for public cloud service.
Compellent arrays track block-level activity and automatically move data blocks across tiers of storage, from fast and expensive solid state drives (SSD) to slower and capacious SATA hard disk drives across intervening faster but lesser capacity HDDs. This means that expensive SSD is used to optimum effect without placing excess data there, and that data movement does not need manual intervention.

Savvis chief technology officer Bryan Doerr said: "Compellent will help us drive down the cost of lifecycle storage while preserving application performance for our customers' applications.”

Todd Loeppke, the technical VP of storage architecture in Doerr's office, added: "It allows us to take cost out of storage... (We) write data to the right tier for performance reasons and then it will waterfall down the tiers as it ages." Doing it at the LUN or sub-LUN level "is not granular enough".
This is great stuff. I have really been pushing that storage for the cloud has to be super automated, especially in the performance tiering. So it's really interesting to see these comments and drivers from Savvis which support that. Not sure if Savvis are thinking of playing games with their customers and moving them to a lower tier of storage if they don't have the demand, whilst having them pay for a higher tier. I doubt it.

Yet is the driver performance? I don't think so. To me one key driver for this type of technology is reducing your operational costs. It's the operational support costs that will kill you and lose all your profit in Cloud. So if you can use a technology to self-tune the storage across your performance tiers you may just be able to stop many support calls of "My IO is slow, what's wrong?". These calls are going to suck up all your profit from that customer for the month. I know my experience with EMC Mozy is an example of that, which is probably why its support was so bad. So just as you might move some storage down the tiers, you may move it up for a while as well, even if the customer is not paying for it. It may just be cheaper than the support call.
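To illustrate why that automation matters, below is a toy sketch of sub-LUN style tiering logic: each block carries an access counter and gets promoted or demoted between tiers on a schedule. To be clear, this is not Compellent's (or anyone's) actual algorithm; the tier names, thresholds and single counter are all assumptions to show the shape of the idea.

```python
# Toy model of automated block tiering, fastest/most expensive tier first.
TIERS = ["SSD", "FC", "SATA"]
PROMOTE_AT = 100   # assumed: accesses per period that push a block up a tier
DEMOTE_AT = 10     # assumed: accesses per period below which a block drops a tier

def retier(block):
    """Move a block up or down one tier based on how hot it was this period."""
    idx = TIERS.index(block["tier"])
    if block["accesses"] >= PROMOTE_AT and idx > 0:
        block["tier"] = TIERS[idx - 1]        # promote towards SSD
    elif block["accesses"] <= DEMOTE_AT and idx < len(TIERS) - 1:
        block["tier"] = TIERS[idx + 1]        # waterfall down towards SATA as it ages
    block["accesses"] = 0                      # reset the counter for the next period
    return block

blocks = [
    {"id": 1, "tier": "SATA", "accesses": 250},   # suddenly hot, moves up
    {"id": 2, "tier": "SSD", "accesses": 3},      # gone cold, moves down
]
for b in blocks:
    print(retier(b))
```

Run something like that on every block on a schedule and nobody ever has to raise a ticket to move a hot workload onto faster disk.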

Of course Compellent are not the only players in this space. Anyone in the field knows that others, such as EMC, have this. For EMC it's their coming FAST technology (see 1 and 2).

This is why people really need to be particular when they are architecting Clouds, especially large ones. Don't let people sell you the same old things, storage being one example, without describing what's different about it for Cloud. Are you just purchasing large scale storage, or are you purchasing storage that has characteristics and functions to support the Cloud use case?

Also interesting to see that Savvis mention SSD in there; I think we are going to see that become common in the Cloud space. When you are paying by the GB, SSD can be very attractive.

Rodos

Wednesday, October 14, 2009

Cisco view of Virtualization link to Cloud

Are Cloud and virtualization (specifically the server type) the same thing? It's a question often asked. In the following video, Glenn Dasmalchi, technical chief of staff in the office of the CTO at Cisco, provides a summary of how cloud computing and virtualization are related.



Glenn states that Cloud is IT services which are on demand and elastic. Server virtualisation brings economics (savings through better utilisation) and flexibility (on-demand deployment, along with movement within or across data centers) to the use case for the Enterprise.

Rodos

Twitter shirt

Sometimes we just embarrass ourselves; today was one of those days.

As I got dressed this morning I figured I would try out my new t-shirt ordered from http://www.customtees.com.au/. All I can say is my wife burst out laughing, rolled over in bed and continued laughing. All day I have been getting weird looks from people in the office.



I don't know what's wrong with everyone. I think it's cool, which makes me sad I know.

You know you want one, you just can't admit it.

Rodos

Saturday, October 10, 2009

UCS Palo and C-Series

The Register has published some of the information regarding the awaited Palo adapter, along with the C-Series rack servers for Cisco Unified Computing System (UCS). Now that some of it is public, even though many already know about it, we can start talking more outside of our NDAs.

Have a read of the article, but here are the highlights and new things.

Some of the C-Series models will start shipping this year.
  • C200-M1, two-socket, 1U rack box, up to 96 GB of main memory, two PCI-Express 2.0 slots and up to four 3.5-inch SAS or SATA drives. Ships November.
  • C210-M1, two-socket, 2U rack box, up to 96 GB of main memory, five PCI-Express slots and up to sixteen 2.5-inch SAS or SATA drives. Ships November.
  • C250-M1, two-socket, 2U rack box, up to 384 GB of main memory with the Catalina memory technology and up to eight 2.5-inch SAS or SATA drives. Expected to ship in December.
Starting to ship any day is the full-width B250-M1 blade. This model has the Catalina memory technology to go to 384 GB of main memory. It also has 2 Mezz cards so it can provide 40Gb of bandwidth, 20Gb to each fabric (F-I A/B).

The article also gives a production name to the long awaited Palo card, being the Virtual Interface Card (VIC). The VIC is a CNA that in theory "supports up to 128 virtual network interfaces (vNICs) on the C-Series version of the card, which plugs into a PCI-Express 2.0 x16 slot, and up to 64 vNICs on the mezzanine card that plugs into the B-Series blades". The PCI-Express version of the VIC will ship in December.

In order to run the VIC (Palo) with VMware you will need to upgrade your vSphere to the next version, vSphere 4.0 Update 1 (4.0u1), which is not released yet. Given that these cards are going to start appearing soon, you would expect that Update 1 may be coming soon! I certainly won't be saying when in this post!

Lastly, here are details of something that I think a lot of people don't realise about the C-Series rack servers:
it is not possible to use the C-Series rack servers in conjunction with the UCS box, which has the system and network management software converged into the UCS 6100 switch. [...] But sometime in the first half of 2010, Cisco is going to allow the C-Series racks to plug into the UCS system.

Until then, customers have to use C-Series racks servers as they would any other such machine, using a variety of in-band and out-of-band system management tools and KVM switches, and perhaps plugging them into Nexus 5000 switches to at least converge network and storage links into the server.
If you would like some more details on a few of these items, I have detailed the extended memory technology called Catalina, posted videos of the B250-M1 extended memory blade along with a VIC (Palo) adapter, and a lame unprepared video of a C-Series.

[Update: Here is a video from Cisco revealing many of the details. http://www.cisco.com/en/US/products/ps10493/index.html]

Rodos

Wednesday, October 07, 2009

Downloading software for Cisco UCS

Wondering where to go to download software/firmware for your Cisco Unified Computing System (UCS)? It can be hard to find; there are so many different locations for UCS info.

Here is the link, http://tools.cisco.com/support/downloads/go/Redirect.x?mdfid=282558030. You will require a CCO login to get access.

Once logged in this is what you will see.



I hear there is a new release of UCSM in the wings, which is required for the Palo card. So you will want to keep this link handy.

Of course, I have updated my UCS Resources page with the link. The page is getting messy and I must restructure and review it, but everything important is there.

Rodos

Error on Cisco UCS pinning training

A quick heads up: Cisco have been teaching everyone the wrong thing on the pinning of UCS.

A lot of people have been going on the Cisco Unified Computing System Bootcamps, such as myself, Scott Lowe, Rich Brambley and Brian Knudtson.

One of the important things they teach you about is the pinning; a big deal is made about it in the class (well, mine anyway). Here is the page from the course notes, sorry about my scribbles.



The most common implementation will be 2 links from the IOM to the F-I. This week I have been testing the pinning in the lab: how to re-pin, how long it takes. I was planning on writing it all up. However, one thing I have noticed is that the pinning is all backwards.

Beware, it's the opposite of what they teach.

When you have 2 uplinks it actually works like this:

Port 1 on the IOM goes to the ODD (not even) blades, that's 1, 3, 5 and 7.
Port 2 on the IOM goes to the EVEN (not odd) blades, that's 2, 4, 6 and 8.
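For the record, here is the corrected two-uplink mapping as a trivial function, so you can sanity check your own diagrams against it. This only covers the 2 uplink case described above; for the 1 and 4 uplink cases see the table in the "Project California" book mentioned in the update below.

```python
def iom_port_for_blade(blade_slot, uplinks=2):
    """Which IOM uplink a blade slot is pinned to, for the common 2-uplink case.

    Corrected behaviour as observed in the lab: odd slots pin to port 1,
    even slots pin to port 2 (the opposite of the bootcamp notes).
    """
    if uplinks != 2:
        raise NotImplementedError("only the 2-uplink case is sketched here")
    return 1 if blade_slot % 2 == 1 else 2

for slot in range(1, 9):
    print("blade %d -> IOM port %d" % (slot, iom_port_for_blade(slot)))
```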

When doing your design for load balancing and failure procedures, things like this do matter, and I am a little annoyed that Cisco could get this wrong.

I thought it may have been a problem with the early course notes, but I contacted Brian Knudtson via twitter a few minutes ago, who just happened to be sitting in a bootcamp as I type; he checked the current notes, still wrong.

It's obviously an error in the technical editing, but I can't believe it has not been picked up. Cisco, please update your documentation and train people right. I will also go through some official channels to get it fixed.

More details on the pinning to come, when I finish my testing and analysis. Lucky I don't just believe what I read!

[UPDATE] Note that page 166 of the "Project California" book has a table that gets this right; it's in the section "Redwood IO_MUX" (thanks to David Chapman for pointing this out). There is also a very interesting statement: "In future releases the configuration of slot pinning to an uplink will be a user configurable feature." Sounds interesting. Now that I know a lot more about UCS I may go back and read the whole black book again; my first reading was days after the book was released, so I may pick up a pile of new things this time round.

Rodos

P.S. Sorry if I sound annoyed; I have been passing this info on to many people, and I don't like having to go back on my statements or look stupid. I now need to go and update all of the UCS diagrams I have been spreading around (which detailed the pinning). Like finding the boot order bug, this is why we do testing.

Tuesday, October 06, 2009

The challenge for VMware ahead

What do you think is the challenge ahead for VMware? Microsoft, the continued push of Citrix to maintain its desktop space, commoditisation of the hypervisor, moving from a product based sales organisation to a services one (vCloud)? Whilst all of those may be valid, I think the merger and growth of SpringSource is a major challenge and opportunity for VMware.

SpringSource is something that I think a lot of people in the infrastructure space just don't get, and the VMware user base has typically come from the data center infrastructure crowd. You could see this at VMworld this year. Some of the most exciting demonstrations in the keynotes were those performed by SpringSource, yet the majority of attendees (IMHO) gave it a big yawn. Sure, there was bad timing in the first keynote and people were leaving to get to their sessions without being late, but if the demo is compelling and interesting it should win out over attending just another session.

I have been musing over this ever since VMworld. In an interview I did with John Troyer I mentioned that I was interested to see how the SpringSource integration panned out (starting at 9:00 minutes in, if you want to watch it). It sparked my interest again when I noticed that The Hoff finally had the light bulb go on about the disappearance of the OS and mentions SpringSource in his post "Incomplete Thought: Virtual Machines Are the Problem, Not the Solution…". If /Hoff is getting his head around this, then there must be a lot more to it.

I won't go into where PaaS, the disappearance of the OS and such things play; that's a much longer and more considered post. What I am interested in is just what VMware are going to do with SpringSource?

Will it be kept as a separate community? Something for the coders? That is going to have to occur, as the coding community is the consumer and mind share that Spring has been able to capture to date and must maintain. Without developers coding to the platform there is not much value in the environment.

However, is Spring also going to be integrated into the ever growing VMware family of management products for the data center community? You bet. It can be seen already. You just have to look at the details of Cloud Foundry. Developed for Amazon Web Services, it's not staying there. VMware state:
During the coming months, SpringSource will extend Cloud Foundry’s capabilities with enhanced cloud management features and other services. SpringSource will bring Cloud Foundry’s capabilities to Amazon Web Services as well as VMware’s vCloud service provider partners and internal VMware vSphere environments–providing infrastructure choice, deployment flexibility, and enterprise services. [emphasis mine]

That's right, internal vSphere environments. If you thought Spring was something that may not enter your domain, VMware may have other intentions.

If you think I am getting all weird here, just ask yourself how well VMware have gone with ThinApp. Do you think that the VMware community gets and understands ThinApp? What's the track record like? The feature rich versions of VMware View come with ThinApp whether you want it or not. Some have felt it has been forced in to increase adoption; after all, if it's included you might as well use it. Are we going to see Spring and Cloud Foundry bundled in the same way, inside vSphere? We don't know yet if that might be a great thing, or a not so great thing; time will tell.

So is SpringSource going to be another Thinstall? I don't think so. Spring was a great purchase on so many levels; what will make it sink or swim is what VMware do with it from now on. It's going to be very interesting to watch over the coming year, and I know I am going to have my eye on it closely.

Rodos

Saturday, October 03, 2009

Cisco Networkers Brisbane 2009 Customer Appreciation Party

Here is the video from the Cisco Networkers Brisbane 2009 Customer Appreciation Party



Thankfully that's it; my great Flip camera and I are all video'd out!

Rodos

Friday, October 02, 2009

Cisco TAC support for Unified Computing System (UCS)

Whilst at Cisco Networkers in Brisbane this week I caught up with Robert Burns, who leads the support team for Server Virtualization & Data Center Networking at the Sydney TAC. Cisco run a follow-the-sun program, so Rob and his team cover UCS support for the globe at certain times of the day. As the TAC team get first access to the hardware, they also give great feedback to the BU on the technology.

The video below is an interview I did with Rob on the role the TAC plays for UCS support and what he thinks of the technology.



It's great to know that there is a capable bunch of people who really know the kit well and can support any issues that may arise. After chatting to Rob I can tell you, he knows his UCS.

Rodos

48 facts or tips on Cisco Unified Computing System (UCS)

Whilst at Cisco Networkers in Brisbane 2009 I prepared a blitz of social media. One of the things I did was use twitter to get the message out on Cisco UCS. The conference ran for 3 days, and every 30 minutes between 9:00am and 5:00pm I sent a new UCS fact or tip. The purpose was to get discussion and interest going on this great technology. Hopefully it would get a few people coming to my employer's stand to talk UCS with me (which it did).

People who were not at the show found it helpful as well, there were many retweets.

Here is the list in all its glory. It's hard to say much in 140 characters! (A small design sanity-check sketch based on a few of these tips follows the list.)

  1. Unless they are manually pinned, FLOGIs are assigned the FC uplinks in the appropriate VSAN in a round-robin fashion.
  2. Menlo is the code name for one of the Mezz adapters. A CNA with 2 10GE (with failover) and 2 FC ports.
  3. Oplin is the code name for one of the Mezz adapters. 2 10GE ports only, no failover functionality.
  4. Palo is the code name for one of the futr Mezz adapters. Provides multiple Eth vNICs or FC vHBAs (limits apply)
  5. The min between rails for chassis mounting is 74cm. Overall depth of chassis is 91cm if you include power cbls.
  6. Understand those UCS acronyms in the UCS Dictionary of terms. http://rodos.haywood.org/2009/08/cisco-ucs-dictionary.html
  7. If an F-I fails the dataplane failover depends on how you set up HA. Control plane (UCSM) takes approx 100 sec.
  8. The IOM multiplexes IO ports & BMC from blades along with CMS and CMC to the 10Gb ports (1, 2 or 4) going to the F-I.
  9. For only one Fabric-Interconnect (lab use maybe) you must place the IOM in the left slot, which is Fabric A.
  10. Allow 2kW per chassis of 8 blades. My testing shows @ 50% CPU load 1600 watts consumed.
  11. Only the first 8 ports of the Fabric-Interconnect are licensed up front. Add port licenses for greater ports.
  12. Create Pin Groups & apply to multiple service profiles to do manual pinning to North uplinks, else round robin.
  13. The half width B200-M1 has 12 DIMM slots, the full width B250-M1 has 48 (but it's not avail yet).
  14. @stevie_chambers writes great information on operational practices with UCS. http://viewyonder.com/
  15. Default F-I mode is end-host, North traffic is not switched rather each vNic is pinned to a uplink port or port channel.
  16. If you need grid redundancy for your power ensure you order 4 PSUs as 3 only provides N+1 redundancy.
  17. Smart Call Home is valid for Support Service or Mission Critical but NOT Warranty or Warranty Plus contracts.
  18. You can not use 3 uplinks from an IOM to its Fabric-Interconnect, only 1,2 or 4.
  19. LDAP for RBAC uses the mgmt port IPs on the F-I as the source of reqsts, NOT the shared virtual IP address.
  20. Server pools can auto populate based on qualification of mezz adapter, RAM, CPU or disk.
  21. A helpful list of UCS links and resources can be found at http://haywood.org/ucs/
  22. The F-I's store their data on 256Gb of internal flash. Backup can be done from GUI or CLI to a remote sftp loc.
  23. Templates can be either initial or updating. Modifying an updating template updates existing instances too.
  24. In UCS maximum 242 VLANs are supported. Remember that VLANs 3968 to 4048 are reserved and can not be used.
  25. Within RBAC the privileges are for updating; everyone can view the UCSM configurations.
  26. Serial EEPROM contained in the chassis mid-plane helps resolve F-I split brain, each half maintained by an IOM.
  27. Warning. Even though the 61x0 Fabric-Interconnects are based on the Nexus 5000 they R not the same so don't compare btwn.
  28. All uplinks from an IOM must go to the same Fabric-Interconnect.
  29. Only the Menlo card does internal failover 4 Eth when an IOM loses an uplink. All others reqr host multipathing software.
  30. KVM virtual media travels over the CMS network inside the IOM and therefore only runs at 100Mb.
  31. There is a limit of 48 local users within UCSM, for more interface to RADIUS, LDAP or TACACS+.
  32. UCSuOS - UCS Utility Operating System is "pre-OS configuration agent" for the blade, previously named PNuOS.
  33. Fabric-Interconnect backup can be performed to either FTP, TFTP, SCP or SFTP destinations.
  34. The CLI is organized into a hierarchy of command modes, use "scope" and mode name to move down modes.
  35. @bradhedlund writes great technical information on UCS. http://www.internetworkexpert.org/
  36. Each blade and chassis contains a locator beacon which flashes blue when enabled via the GUI, CLI or manually.
  37. The F-I runs in NPV end-host not switch mode. You must connect to external FC storage via the expansion modules with FC.
  38. UCSM split brains may be due to a partition in space or a partition in time.
  39. An amber power light on the blade indicates standby state, green means powered on so check before removing it!
  40. If there is a "*" next to the end of the scope in the CLI don't forget to execute "commit-buffer"!
  41. Removing a blade will generate an event and set a presence of 'missing'. The blade needs to be decommissioned from the inventory.
  42. UCSM lets you cfg >2 vNics/vHBAs. Atmpt to associate it and receive a major fault due to insuff'nt resc. Wait for Palo.
  43. The "show tech-support" command details the config and state of your environment. Use liberally.
  44. UCSM can pull stats at a collection interval of 30 sec, 1, 2 or 5 minutes. Modify via the collection policy.
  45. Connect each Chassis IOM to its F-I via low cost Copper Twinax up to 5m, otherwise Fiber with apprt SFP+ trancvr.
  46. Visio icons for UCS can be downloaded from http://www.cisco.com/en/US/products/prod_visio_icon_list.html
  47. Cisco NetPro Forums has a Unified Computing section so learn, share, support. http://short.to/rv07
  48. For further insights into UCS after Networkers follow my UCS feed for updates http://rodos.haywood.org/search/label/UCS
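As a small follow-up, a few of these tips (numbers 10, 18 and 24) lend themselves to quick design sanity checks. The sketch below is mine, not Cisco's; the only facts in it are the ones from the tweets above, and the example design at the bottom is entirely made up.

```python
RESERVED_VLANS = range(3968, 4049)   # tip 24: VLANs 3968-4048 are reserved in UCS
VALID_IOM_UPLINKS = (1, 2, 4)        # tip 18: 3 uplinks from an IOM to its F-I is not allowed
WATTS_PER_CHASSIS = 2000             # tip 10: allow 2kW per chassis of 8 blades

def check_design(vlan_ids, iom_uplinks, chassis_count, available_watts):
    """Return a list of problems with a proposed UCS design."""
    problems = []
    for vlan in vlan_ids:
        if vlan in RESERVED_VLANS:
            problems.append("VLAN %d falls in the reserved 3968-4048 range" % vlan)
    if iom_uplinks not in VALID_IOM_UPLINKS:
        problems.append("IOM uplinks must be 1, 2 or 4, not %d" % iom_uplinks)
    required = chassis_count * WATTS_PER_CHASSIS
    if required > available_watts:
        problems.append("power budget: %d chassis need %dW but only %dW is available"
                        % (chassis_count, required, available_watts))
    return problems

# Hypothetical design: 4 chassis, 3 uplinks per IOM, one VLAN in the reserved range.
for problem in check_design(vlan_ids=[100, 4000], iom_uplinks=3,
                            chassis_count=4, available_watts=6000):
    print(problem)
```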


Rodos

Brad Wong talks about FCoE

Whilst at Cisco Networkers I caught up with Brad Wong. Brad is the Product Manager for Nexus and the Unified Computing System (UCS) in the Server Access Virtualisation Business Unit (SAVBU) at Cisco Systems. I have met with Brad a few times before, and he was very gracious to give his time for me to ask him a few questions around FCoE.

I certainly think that FCoE is important for Data Centers over the next few years, yet there is confusion around how to use it today and where it's going. So I was very keen to get Brad's take on it; after all, he drives the products where most of this lives.



Given a bit of time I will post up some deeper details of some of the things that Brad mentions along with a series of links.

Cheers

Rodos

Thursday, October 01, 2009

Tommi Salli talks about Cisco UCS

At the Customer Appreciation Party at Cisco Networkers 2009 in Brisbane Australia I was fortunate enough to be introduced to Tommi Salli (thanks Andrew White from Cisco).

Tommi is a Senior Technical Marketing Engineer for the Unified Computing System (UCS) within the Server Access Virtualisation Business Unit (SAVBU) at Cisco Systems. Tommi was one of the co-authors of the original UCS book "Project California: a Data Center Virtualization Server - UCS (Unified Computing System)" by Silvano Gai, Tommi Salli and Roger Andersson, which can be purchased through Lulu. I ordered my copy within hours of it becoming available and it is now dog-eared and covered in highlighter. The book is an introduction to the technology, so I don't really use it any more unless I am after some great words when writing up prose on a particular topic.

We discussed lots of areas of UCS together and I thought it would be good to do a quick video, which Tommi was gracious enough to do. Thanks mate! Hope you enjoy watching it as much as I enjoyed doing it.



Rodos