Cisco UCS for beginners – an end-user’s overview

Update: At the time I wrote this post (February 2014), I was not a Cisco employee. Since then (as of June 2014) I have gone to work for Cisco. This shouldn’t change anything about the post, and it is still just me and not an official publication, but since the original disclaimer below is not currently accurate, I thought I would clarify that.

I’ve been working on a series of posts about upgrading an integrated UCS environment, and realized about halfway through that a summary/overview would make sense as a starting point.

I recommend a refreshing beverage, as this is longer than I’d expected it to be.

I will note up front that this does not represent the official presentation of UCS by Cisco, and will have errors and omissions. It does reflect my understanding and positioning of the platform, based on two years and change of immersive experience. It is also focused on C-Series (rack-mount servers), not B-Series (blade servers and chassis), as I have been 100% on the C-Series side of the platform, although I try to share a reasonable level of detail that’s applicable to both. And I expect it will provide a good starting point for understanding the Unified Computing System from Cisco.

Unified Computing System – Wait, What?
UCS, or Unified Computing System, is Cisco’s foray into the server market, with integrated network, storage, management, and of course server platforms. As a server admin primarily, I think of it as a utility computing platform, similar to the utility storage concept that 3PAR introduced in the early 2000s. You have a management infrastructure that simplifies structured deployment, monitoring, and operation of your servers, reducing the number of inflection points (when deployed properly) to coordinate firmware, provisioning, hardware maintenance, and server identity.
[Image: UCS rack layout]
UCS includes two types of servers. The original rollout in 2009 included a blade server platform, generally known as B-Series or Chassis servers. I would guess that 9 out of 10 people you talk to about UCS think B-Series blades when you say UCS. Converged networking happens inside the blade chassis on an I/O Module, or IOM, also known as a Fabric Extender, or FEX. Local storage lives on the blades if needed, with up to four 2.5″ drives available on full-width blades (two drives on half-width), and a mezzanine card slot for a converged network adapter and/or a solid state device.
At some point along the way, it seems customers wanted more storage than a blade provides, and more I/O expansion capacity, so Cisco rolled out a rack-mount product line, the C-Series “pizza box” servers. These provide familiar PCI-e slots, at least twice the drive bays (eight 2.5″ or four 3.5″ on the lowest storage density C200/C220 models), and an access convergence layer outside the server in the form of a Fabric Extender (FEX), here a Nexus 2200-series switch.
Both platforms are designed to go upstream to a Fabric Interconnect, or FI, in the form of a UCS 6100 or 6200 series device. The FI is the UCS environment’s egress point; all servers (blade and/or rack-mount) in a single UCS domain or “pod” will connect to each other and the outside world through the FI. Storage networking to FCoE and iSCSI storage devices happens at this level, as does conventional Ethernet uplink.

So far it sounds pretty normal. Isn’t it?

You can use Cisco UCS C-Series rack-mount servers independently without a FI, in the same way you might use a Dell PowerEdge R-series or HP ProLiant DL-series server. They work in standalone mode with a robust integrated management controller (CIMC) that is analogous to iDRAC or iLO, and they present as industry standard servers. The fully-featured CIMC functionality is included in the server (no add-on licensing, even for virtual media), and there’s even a potent XML API for standalone mode.
Many of the largest deployments of Cisco UCS C-Series servers work this way, and in the early days of my deployment, it was actually the only option (so we had standalone servers running bare metal OSes managed on a per-server basis). And for storage-dense environments, this method does have its charm.
The real power of the UCS environment, however, comes out when you put the servers under UCS Manager, or UCSM. This is what’s called an “integrated” environment, as opposed to a “standalone” environment where you manage through the individual CIMC on each server.
[Image: UCS model-based framework]
UCSM lives inside the Fabric Interconnect, and is at its core a database of system elements and states called the Data Management Engine or DME. The DME uses Application Gateways to talk to the managed physical aspects of the system: server baseboard (think IPMI), supported controllers (CNAs and disk controllers), I/O subsystem (IOM/FEX), and the FI itself.
UCSM is both this management infrastructure, and the common Java GUI used to interact with its XML API. While many people do use the UCSM Java layer to monitor and manage the platform, you can use a CLI (via ssh to the FI), or write your own API clients. There are also standard offerings to use PowerShell on Windows or a Python shell on UNIX to manage via the API.
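To make the "write your own API clients" idea a little more concrete, here is a minimal sketch of what the request and response bodies look like. It assumes the method and attribute names as I understand them from Cisco's published XML API documentation (aaaLogin, configResolveClass, and the computeRackUnit class for rack servers), and that requests are POSTed to the /nuova path on the UCSM address. The helper names and the sample response are made up for illustration, and nothing here talks to a real system, so check the programmer's guide before relying on any of it.

```python
import xml.etree.ElementTree as ET


def build_login(name: str, password: str) -> str:
    """Build the body of a login request (hypothetical helper)."""
    elem = ET.Element("aaaLogin", {"inName": name, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def parse_cookie(response_xml: str) -> str:
    """Pull the session cookie out of a login response."""
    return ET.fromstring(response_xml).attrib["outCookie"]


def build_resolve_class(cookie: str, class_id: str) -> str:
    """Build a query for every managed object of one class."""
    elem = ET.Element(
        "configResolveClass",
        {"cookie": cookie, "classId": class_id, "inHierarchical": "false"},
    )
    return ET.tostring(elem, encoding="unicode")


# Made-up response for illustration; a real one carries more attributes.
SAMPLE_LOGIN_RESPONSE = '<aaaLogin response="yes" outCookie="example-cookie" />'

if __name__ == "__main__":
    # Assumption: each body would be POSTed to https://<ucsm-address>/nuova
    print(build_login("admin", "secret"))
    cookie = parse_cookie(SAMPLE_LOGIN_RESPONSE)
    print(build_resolve_class(cookie, "computeRackUnit"))
```

Every call after the login carries the cookie, so a client is mostly a loop of "build XML, POST, parse XML". The PowerShell and Python offerings wrap this same exchange.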

What’s this profile stuff all about?

A key part of UCS’s benefit is the set of concepts called policies, profiles, and templates.
A policy is a standard definition of an aspect of a server. For example, there are BIOS policies (defining how the BIOS is set up, including C-state handling and power management), firmware policies (setting a package of firmware levels for system BIOS, CIMC, and supported I/O controllers), and disk configuration policies (providing initial RAID configuration for storage).
[Image: UCS service profile]
A Service Profile (SP) contains all the policies and data points that define a “server” in the role sense. If you remember Sun servers with the configuration smart card, that card (when implemented) would contain the profile for that server. In UCS-land, this would include BIOS, firmware, disk configuration, network identity (MAC addresses, VLANs, WWNs, etc.) and other specific information that gives a server instance its identity. If you don’t have local storage, and you had to swap out a server for another piece of bare metal and have it come up as the previous server, the profile has all the information that makes that happen.
A Service Profile Template provides a pattern for creating service profiles as needed, providing consistency across server provisioning and redeployment.
There are also templates for things like network interfaces (vNIC, vHBA, and iSCSI templates) which become elements of a Service Profile or a SP Template. You might have a basic profile that covers, say, your web server design. You could have separate SP templates for Production (prod VLANs, SAN configuration) and Test (QA VLANs, local disk boot), sharing the same base hardware policies.
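The relationship between policies, templates, and profiles can be sketched as a toy data model. This is only an illustration of the idea, not how UCSM stores anything: the class names, field names, policy names, and MAC addresses below are all invented for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HardwarePolicies:
    """Shared base policies (hypothetical names)."""
    bios: str
    firmware: str


@dataclass(frozen=True)
class ServiceProfileTemplate:
    name: str
    hardware: HardwarePolicies
    vlans: Tuple[str, ...]
    boot: str


@dataclass(frozen=True)
class ServiceProfile:
    name: str
    hardware: HardwarePolicies
    vlans: Tuple[str, ...]
    boot: str
    mac: str  # per-server identity, not part of the template


def instantiate(template: ServiceProfileTemplate, index: int, mac: str) -> ServiceProfile:
    """Stamp out one profile from a template, adding its unique identity."""
    return ServiceProfile(
        name=f"{template.name}-{index:02d}",
        hardware=template.hardware,
        vlans=template.vlans,
        boot=template.boot,
        mac=mac,
    )


# One web server design, two templates sharing the same base hardware policies.
WEB_HARDWARE = HardwarePolicies(bios="bios-web", firmware="fw-web")
PROD = ServiceProfileTemplate("web-prod", WEB_HARDWARE, ("prod-web",), "san")
TEST = ServiceProfileTemplate("web-test", WEB_HARDWARE, ("qa-web",), "local-disk")

if __name__ == "__main__":
    print(instantiate(PROD, 1, "00:25:B5:00:00:01"))
    print(instantiate(TEST, 1, "00:25:B5:00:01:01"))
```

The point of the split is that changing WEB_HARDWARE in one place changes the design for both Production and Test, while the VLANs and boot method stay specific to each template.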
And there are server pools, which define a class of servers based on various characteristics (e.g. all 96GB dual socket servers, or all 1U servers with 8 local disks, or all servers you manually add to the pool). You can then associate that pool with a SP template, so that when a matching server is discovered in your UCS environment, it gets assigned to an appropriate template and can be automatically provisioned on power-up.
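Pool qualification is easy to picture as a matching rule. The sketch below is a simplified stand-in for what UCSM does on discovery, with invented class names, field names, and template names. A qualification lists only the characteristics it cares about, and the first pool that matches a discovered server decides which template it gets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    serial: str
    memory_gb: int
    sockets: int
    rack_units: int
    local_disks: int


@dataclass(frozen=True)
class PoolQualification:
    """A pool rule; None means "don't care" (hypothetical model)."""
    name: str
    template: str
    memory_gb: Optional[int] = None
    sockets: Optional[int] = None
    rack_units: Optional[int] = None
    local_disks: Optional[int] = None

    def matches(self, server: Server) -> bool:
        wanted = {
            "memory_gb": self.memory_gb,
            "sockets": self.sockets,
            "rack_units": self.rack_units,
            "local_disks": self.local_disks,
        }
        return all(v is None or getattr(server, k) == v for k, v in wanted.items())


def assign_template(server: Server, pools: List[PoolQualification]) -> Optional[str]:
    """Return the template of the first pool the server qualifies for."""
    for pool in pools:
        if pool.matches(server):
            return pool.template
    return None


POOLS = [
    PoolQualification("big-memory", "web-prod", memory_gb=96, sockets=2),
    PoolQualification("storage-1u", "web-test", rack_units=1, local_disks=8),
]

if __name__ == "__main__":
    discovered = Server("SER0001", memory_gb=96, sockets=2, rack_units=2, local_disks=4)
    print(assign_template(discovered, POOLS))
```

A server that matches no pool comes back as None here, which corresponds to a server that sits unassociated until you add it to a pool by hand.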
There are a lot more features you can take advantage of, from logging and alerting to call-home support features, to almost-one-click firmware upgrades across a domain, but that’s beyond the scope of this post.

I hear you can only have 160 servers though.

This is true, in a sense, much like you can only have 4 people in a car (but you can have multiple cars). A single UCS Manager can handle 160 servers between B-Series and C-Series. This is probably five dense datacenter racks’ worth of servers, or 20 blade chassis, or some mix thereof (e.g. 10 chassis of 8 B-Series blades each, plus 80 rack-mount C-Series servers). But that’s not as bad a limitation as some vendors make it out to be.
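The arithmetic behind those mixes is simple enough to check in a few lines. The 160-server limit and 8 blades per chassis come from the paragraph above; the function names are just for illustration.

```python
import math

MAX_SERVERS_PER_DOMAIN = 160  # per-UCSM limit quoted above
BLADES_PER_CHASSIS = 8        # as in the 10-chassis example above


def domain_load(chassis: int, rack_servers: int) -> int:
    """Total servers in one domain for a given blade/rack mix."""
    return chassis * BLADES_PER_CHASSIS + rack_servers


def fits_in_one_domain(chassis: int, rack_servers: int) -> bool:
    return domain_load(chassis, rack_servers) <= MAX_SERVERS_PER_DOMAIN


def domains_needed(total_servers: int) -> int:
    """How many "cars" you need for a given number of passengers."""
    return math.ceil(total_servers / MAX_SERVERS_PER_DOMAIN)


if __name__ == "__main__":
    print(domain_load(20, 0))    # all blades
    print(domain_load(10, 80))   # the mixed example
    print(domains_needed(500))
```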
You can address the XML API on multiple UCS Manager instances. A management tool might check inventory on all of your UCSM domains to find the element (server, policy, profile) that you want to manage, and then act on it by talking to that specific UCSM domain. DevOps powers, activate! This will get confusing, though, if you create policies/profiles/templates at different times (e.g. while you’re waiting for your tools team to write a management tool).
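Here is a rough sketch of what such a tool does once it has pulled inventory from each domain. The inventories are plain in-memory dictionaries with made-up domain names, policy names, and definitions; a real tool would fill them in through each domain's XML API. The second function shows my reading of the "confusing" part: the same policy name meaning different things in different domains.

```python
from typing import Dict, List

# domain name -> {element name -> definition}; all values are invented.
Inventory = Dict[str, Dict[str, str]]

INVENTORIES: Inventory = {
    "ucsm-east": {"bios-web": "c-states=off", "fw-web": "2.2"},
    "ucsm-west": {"bios-web": "c-states=on", "fw-web": "2.2"},
    "ucsm-lab": {"bios-lab": "c-states=on"},
}


def locate(inventories: Inventory, element: str) -> List[str]:
    """Return every domain that knows about the named element."""
    return [domain for domain, items in inventories.items() if element in items]


def conflicting_policies(inventories: Inventory) -> Dict[str, Dict[str, str]]:
    """Find names defined differently in different domains."""
    seen: Dict[str, Dict[str, str]] = {}
    for domain, items in inventories.items():
        for name, definition in items.items():
            seen.setdefault(name, {})[domain] = definition
    return {
        name: per_domain
        for name, per_domain in seen.items()
        if len(set(per_domain.values())) > 1
    }


if __name__ == "__main__":
    print(locate(INVENTORIES, "bios-lab"))
    print(conflicting_policies(INVENTORIES))
```

The lookup is the easy half. Keeping the definitions from drifting apart is the part that gets harder with every domain you add by hand.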

But there’s something easier.

UCS Central is a Cisco-provided layer above the UCSM instances that gives you central management of all aspects of UCS Manager across multiple domains. It’s a “write once, apply everywhere” model of policies and templates that allows central monitoring and management of your environment across domains and datacenters.
UCS Central is an add-on product that may incur additional charges, especially if you have more than five UCS domains to manage. Support is not included with the base product. But when you get anywhere close to that scale, it may well be worth it. Oh, and in case you didn’t see this coming, there’s an XML API to UCS Central as well.

I don’t have a six figure budget to try this out. What can I do?

I’m glad you asked. Cisco makes a free “Platform Emulator” available. It’s a VM commonly referred to as UCSPE, downloadable from Cisco and run under the virtualization platform of your choice (including VMware Player, Fusion, Workstation, or others).
Chris Wahl has a video demonstrating the download process and a series introducing the Cisco UCS Platform Emulator on YouTube. You can get the actual downloads at Cisco’s Communities Site and bring the emulator up on your own computer.
[Image: Chris Wahl UCSPE screenshot]
The UCSPE should let you get a feel for how UCSM and server management works, and as of the 2.2 release lets you try out firmware updates as well (with some slightly dehydrated versions of the firmware packages).
It obviously won’t let you run OSes on the emulated servers, and it’s not a replacement for an actual UCS server environment, but it will get you started.
If you have access to a real UCS environment, you can back up that physical environment’s config and load it into the UCSPE system. This will let you experiment with real world configurations (including scripting/tools development) without taking your production environment down.

Is Cisco UCS the right solution to everything?

[Image: Grumpy Cat]

Grumpy cat says “No.” And I just heard my Cisco friends’ hearts drop. But hear me out, folks.
To be completely honest, the sweet spot for UCS is a utility computing design. If you have standard server designs that are fairly homogeneous, this is a very good fit. If your environment is based around some combination of Ethernet, iSCSI, and FCoE, you’re covered. If your snowflake servers are running under a standard virtualization platform, you’re probably covered as well.
On the other hand, if you build a 12GB server here, a 27.5GB server there, a 66GB server with FCoTR and a USB ballerina over there, it’s not a good fit. If you really need to run 32-bit operating systems on bare metal, you’re also going to run up against some challenges. Official driver support is limited to 21st Century 64-bit operating systems.
If you have a requirement for enormous local storage (more than, say, 24-48TB of disk), there are some better choices as well; the largest currently available UCS server holds either 12 3.5″ or 24 2.5″ drives. If you need a wide range of varied network and storage adapters beyond what’s supported under UCS (direct attach fibre channel, OC3/OC12 cards, modems, etc.), you might consider another platform that’s more generic.
Service profiles let you replace a server without reconfiguring your environment, but if every server is different, you’re not going to be able to use service profiles effectively. You can, of course, run UCS C-Series systems in standalone mode, with bare metal OSes or hypervisors, and they’ll work fine (with the 32-bit OS caveat above), and many companies do this in substantial volume, but you will lose some (not all) of the differentiation between Cisco UCS and other platforms.

Disclaimers:

I’ve worked with Cisco UCS as part of my day job for about two years. I don’t work for Cisco, and I’m not posting this as a representative of my employer or of Cisco. Any errors, omissions, confusion, or mislaid plans of mice and men gone astray are mine alone.

More details:

Images other than Grumpy Cat above borrowed under belief of fair use from the Cisco UCS Manager Architecture paper, the Understanding Cisco Unified Computing System Service Profiles paper, and the fine work of Chris Wahl of WahlNetwork.com.

I’ve been called certifiable before – a sysadmin’s developing thoughts on certification

I’ve been a system administrator in some form or another since, I suppose, Summer 1988 when I provided ad hoc support for the RSTS/11 system at my college. I made a few bucks doing it as a lab assistant for two years, but I was probably too much of a proto-BOFH to stay on the payroll. I still fielded more questions than most of the lab assistants, and it prepared me moderately well for the following 25 years of user, system, and platform support.

One thing I’ve rarely ever done is get formally trained, or even less often, certified in a technology. I was three classes short of a computer science undergrad major just for fun, which should tell you I’m certifiable (didn’t take RPG, COBOL, or Calculus, but I did a bit of recreational Discrete Mathematics and two doses of Machine Structures).

Around the turn of the century, I took the Legato Certified Administrator (Data Protection) class and exam, and got certified on a technology I’d been deploying and managing for a few years at the time. In 2010 I took the Cloudera Hadoop Administrator course. I almost passed the certification exam then, but didn’t have time to go back and retake it before the retake offer expired. And that’s the extent of my formal training to date.

So what’s changed now?

Having been welcomed into the communities around Cisco’s datacenter technology and VMware’s virtualization platforms, I’m feeling an unnatural desire to work toward certifications in both of those areas. I have the 200-120 box set for CCNA Routing & Switching, although I’ve been leaning toward the datacenter path. I’m still trying to figure out what path to take with VMware, but we’ll have to see.

I was reading the Cisco Learning Network post “6 Reasons Employers Value Cisco Certifications” and it made me think about my aversion to certification over the last few years. So what’s wrong with certification, and what might be right about it?

What could possibly go wrong?

For one, some people collect certifications the way I collect old computers and SOHO routers. The cert may show that someone can complete a vendor’s exam, but may not reflect feet-on-the-ground (or hands-on-the-keyboard) skills, much less big picture architectural thinking. We saw this often when we spent a full year searching for a network admin at one job a few years back. No matter how many network certs you have, if you can’t at least give a shot to explaining subnetting, you’re probably not ready for the real world.

Another issue is that most certifications are vendor-specific, and may impart an undue bias toward that vendor over others. I’d like to think this isn’t the case, and a truly good network administrator/architect would know a broad swath of the market and be able to fit technology to an identified and triaged problem/business need, rather than trying to squeeze the business need into a given technology.

But what’s right?

For one, there are different skill levels and foci, and tiered/niched certifications can give a hint as to what level someone is. If I come in to an interview with a CCNA R&S, for example, I probably won’t be asked to provide in-depth explanations of SS7 or 802.11ac. There will always be bad interviewers, like the guy a few years ago who wanted me to explain in depth how BGP worked, after I had said twice that I wasn’t a network engineer and had only worked on LANs. So this isn’t foolproof on either end.

More important to me, now that I’m thinking about the process, is that pursuing a certification gives you a roadmap to study and prepare, and a somewhat finite goal to achieve. I never learned Perl because I didn’t really have a scope or a fixed goal. Making a personal goal to “learn me some networking,” alas, probably won’t get me anywhere.

Having a goal to, say, “take the CCNA DC exam at Cisco Live in May” gives me a framework and a finite goal. I can set aside time every week, study some of the Cisco Learning Network materials, watch some Pluralsight programs with Chris Wahl, and have a fixed time frame for preparation for the exam.

So where do we go from here?

For one, I think that box set of the 200-120 CCNA R&S library will probably sit in the closet for a few more months. It was on sale with an extra coupon at Barnes and Noble last summer, so I don’t feel too bad about it.

I will be plotting out my Cisco Certification Written Exam at Cisco Live in May, as hinted above. I blew off the free exam last year, which was probably good considering I’d had Tech Field Day 9 the week before (Tech Field Day events are great for scrambling the brain, and the 90-100F temperatures were leaning toward poaching my brain along with it).

I’m going to get more involved with Cisco Learning Network, as I’m sure Matt Saunders won’t let me slip on this. Hopefully some of my fellow Cisco Champions will cheer, jeer, prod, or otherwise support me on the journey as well.

And I’ll be sure to share my adventure with you fine readers… feel free to poke at me here if you have suggestions or haven’t heard from me on the certification path in a while.

Do share any certification feedback, suggestions for me, or warnings for other readers… in the comments below. 

Found Out There – 2013-02-03

I’m going to try to push myself toward blogging more by adding an every-two-weeks (or every week) list of interesting links I came across over that period.

I don’t have ads on my site, partly because nobody’s begged me to take their money, and partly because I think the clean look is more enjoyable and requires less maintenance and tracking.

But if you’d like to help my geek gadget (and caffeine) habit, feel free to shop through my Amazon.com affiliate link or use the links on the lower right side of this page. I also occasionally put things in my Amazon aStore, and link to them here. Buying those things (or anything through those links) also gets me a few bucks that can go toward a pound of loose-leaf tea or a new hybrid hard drive. Thanks in advance for those of you who choose to do this.

So here are my interesting links for the week ending 2013-02-03.

Gratuitous self-links:

Other things I’ve found of interest:

What did you find this week, that you think I should be reading?