Indyramp is old enough to drink now!

http://www.indyramp.com as of December 1996 (earliest archive.org entry)

Eighteen years ago today, my registration for indyramp.com went through.

Creation date: 03 Jul 1995 04:00:00

The next day, as I recall, the Network Solutions $100 registration charge for two years of service went into place. It later decreased, and my domains moved from Network Solutions to Joker.com (requiring a fax to Germany to confirm), to a friend’s OpenSRS/Tucows RSP, to GoDaddy, and finally to its current resting place at Namecheap.

Indyramp Consulting had been around a bit before that, originally designed to be a third Internet Service Provider in Indianapolis, Indiana. In those days, it still wasn’t completely insane to seriously consider introducing an ISP that didn’t offer SLIP or PPP connections. Anyway, that plan didn’t quite work out, and I took a job at one of the other two ISPs, IQuest, which is now called LightBound.

But I kept the Indyramp name around, registered the domain, and set up a couple of web sites on it–including my earliest claim to fame, the pop music page (and first Internet fan page of any sort) for Alyssa Milano.

In 1995, I had a dedicated dialup connection from IQuest (an employee perk), and hosted much of my site on a recased Gateway 2000 i386 PC with 16MB of DIP RAM on a riser board and a 540MB hard drive that dual-booted the October beta of Windows 95.

From 1996 to 2003 the site was hosted in my apartments in Hayward and then Milpitas, California, on various levels of dedicated hardware (from a SPARCstation IPC and SPARCstation 2 to various Pentium Pro workstations to a 19-server multivendor multi-OS “colo” in my spare bedroom in Milpitas).

In 2003, I had been laid off and the job market was challenging, and I moved from home-hosting to a virtual machine at Johncompanies.com (technically a FreeBSD jail, but virtually the same thing). The site is still hosted there just over 10 years later.

Between 1995 and the mid-2000s I hosted/ran many mailing lists, play-by-email games, the original website/mailing list for Linux IP Masquerading, and a few other random things that came up.

Nowadays it’s mostly email filtering and a screen session for irc, but the name is still there and I have a lot of history out there on the interwebs with Indyramp.

So happy 18th birthday to Indyramp. I’m pretty sure the Fitbit on my arm has more memory than the first server that hosted you, but ultimately the Fitbit didn’t get me into hosting and networking, out to California, and to where I am today.

Mohs’ law and big data (Hadoop is hard)

I’ve spent more time than usual the past two weeks talking with people, and listening to people, about Hadoop. I’ve been administering Hadoop clusters for (part of) a living for about 4-5 years now, and I’ve gotten pretty good at answering questions people don’t have, or want, answers for.

In the past week or so I’ve heard one vendor claim that Hadoop gives you a free analytics environment with no need for expensive developers since it’s free software, and another claim that you can just virtualize Hadoop by putting lots of datanodes on a single host and save lots of money. Easy peasy, right?

I’m proposing we consider Mohs’ Law in this situation.

No, I’m not misspelling Moore’s Law, which tells us that transistor counts (and, loosely, compute power/efficiency) will double roughly every 24 months. I’m suggesting a law that’s more of a diamond in the rough, if you don’t mind.

Hadoop is hard. 

It’s based on Friedrich Mohs developing a method of describing hardness of materials about 200 years ago. And it’s a great pun. But it’s also a reminder that “yum install” does not a production application make.

But Rob, I can get Hadoop in 15 minutes!

It is pretty easy to get started with Hadoop. It’s even free of charge to get started (or even to go into production) with the platform itself. I recommend it. Go do it now. I’ll wait.

For starters, go grab the Cloudera QuickStart VM or the Hortonworks Sandbox VM from their respective websites. Pull it into your desktop virtualization platform of choice. Look at the docs. Run some of the tests. At that point you’re farther along than most people who promote Hadoop.

But at that point you don’t have a functioning business intelligence/data warehouse/analytics application environment, any more than installing Ubuntu 13.04 into VirtualBox gives you a production e-commerce site.

There’s still a lot of work to be done. Some of it is difficult, but a fair bit of it is just downright hard. Understand what you want to do, what data you can pull into your environment. Figure out what your customers/users/analysts need out of the data. Make sure you can validate the output. Automate all your tests. Go back to your data sources and make sure you’re getting all the data. Go back to your end users and make sure you’re giving them what they want. Lather, rinse, repeat.
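To make "validate the output" and "automate all your tests" a bit more concrete, here is a minimal Python sketch of a reconciliation check. Everything in it is hypothetical: the `day` field, the `aggregate_by_day` stand-in for a real job, and the `validate` helper are names I made up for illustration, not part of Hadoop or any vendor's tooling. The idea is only that the numbers coming out should be checkable against the data going in.

```python
from collections import Counter


def aggregate_by_day(events):
    """Stand-in for the real analytics job: count events per day."""
    return Counter(event["day"] for event in events)


def validate(events, report):
    """Compare a report against its source data.

    Returns a list of problems; an empty list means the checks passed.
    """
    problems = []

    # Every source row should be counted exactly once.
    if sum(report.values()) != len(events):
        problems.append("report total does not match source row count")

    # Every day present in the source should appear in the report.
    missing = {event["day"] for event in events} - set(report)
    if missing:
        problems.append(f"days missing from report: {sorted(missing)}")

    return problems


if __name__ == "__main__":
    events = [
        {"day": "2013-05-01", "user": "a"},
        {"day": "2013-05-01", "user": "b"},
        {"day": "2013-05-02", "user": "a"},
    ]
    print(validate(events, aggregate_by_day(events)))
```

A check like this is cheap to run after every load, which is the point: if a data source quietly stops delivering, you want the test to notice before your analysts do.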

Rob’s Corollaries to Mohs’ Law

If you remember nothing else, think about an analytics environment the way you would a monitoring environment. I’ve supported both for almost a decade, and the take-home I’ll save you ten years on is this:

Make sure you’re measuring what you think you’re measuring.

Make sure you’re measuring what you need to be measuring.

This rule also applies to a lot of other technology… customer surveys, dating sites, and so forth. But it takes formidable effort to get these two corollaries right (without coronaries), and even if you do throw together something with Insta-analytics.com (probably not a real site, not meant as an endorsement), they won’t be able to tell you what you need or whether you’re getting it.

So where do we go from here?

First of all, if you’re interested in getting familiar with Hadoop, go grab a VM above and give it a try. Simulate Pi Indiana-style. Grab a book and try some of the stuff it suggests.
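If you want a feel for what a Pi-estimating job is doing before you fire up the VM, here is a minimal single-machine Python sketch of the sample-and-count idea. It is not Hadoop's own code. The function names, the plain pseudo-random sampling, and the one-seed-per-task trick are my assumptions for illustration: each "map task" throws darts at a unit square, and the "reduce" step adds up how many landed inside the quarter circle.

```python
import random


def count_inside(samples, seed):
    """One 'map task': count random points that land inside the quarter circle."""
    rng = random.Random(seed)  # per-task seed keeps each task reproducible
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(num_maps, samples_per_map):
    """The 'reduce' step: sum the per-task counts and scale to an estimate of pi."""
    total_inside = sum(
        count_inside(samples_per_map, seed) for seed in range(num_maps)
    )
    return 4.0 * total_inside / (num_maps * samples_per_map)


if __name__ == "__main__":
    print(estimate_pi(num_maps=10, samples_per_map=100_000))
```

More tasks and more samples get you a closer estimate, slowly. That alone is a decent first lesson in what throwing hardware at a problem does and does not buy you.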

Then, go talk to the BI team in your company, or the analyst who does performance dashboards when she’s not writing code and designing employee event signage and chasing your kids out of the server closet, or whoever. Find out what they’re doing.

And finally, unless your vendor makes its livelihood supporting Hadoop, don’t take their take on Hadoop as gospel. Apocrypha maybe, mistranslation at worst, and probably not enough to go on.

Hey, I’m in Silicon Valley and want to learn more, what can I do?

Funny you should ask.

BayLISA is hosting a Hadoop meeting on Thursday, May 16, at Yahoo! in Sunnyvale. There’s a waiting list but it usually fades closer to the event. Come see Alan Gates of Hortonworks, Eric Sammer of Cloudera, and Ryan Orban of Nutanix talking about Hadoop innovations and how to get involved. (Disclaimer: I am president of BayLISA, but I don’t get any profit or direct benefit if people come to the meetups.)

There’s also a Hadoop User Group meetup on Wednesday, May 15, although it’s a bit more suited to advanced users who are already familiar with the technology. Their waitlist is also a fair bit longer. But check it out and see if it fits your needs.

If you’re not in Silicon Valley, check Meetup for local groups, or see if one of the Hadoop vendors has local meetings or events you can attend. If you find one, feel free to add it in the comments here so other people will know where to look.

Maybe Yahoo’s Mayer Made The Right Decision After All

Marissa Mayer, CEO of Yahoo, and Jackie Reses, their “EVP of People and Development,” made waves last week by apparently banning working from home (and to some extent, flexible schedules).

As a worker who can do 90% or more of my work from somewhere other than my own company’s offices, I take some offense at this, and it makes Yahoo another company I’d not want to work for.

But I can see why Yahoo’s management would do this, and it may well be the right choice for them.

The losers in this decision are many of the workers with families, long commutes, and/or existing agreements with their direct managers to accommodate working from home. Also losing out will be their coworkers, and anyone counting on their efficiency and effectiveness. Also the recruiters trying to bring in Silicon Valley talent with this albatross around their necks, the people on understaffed teams counting on new talent coming in, and quite possibly Yahoo’s shareholders.

I’ve been in a couple of environments where working from home (briefly) became a rift. In both cases, there was a problem with a very small number of employees who couldn’t work effectively from home, and rather than actively managing those few people, management attempted to overmanage a large group.

In one environment, the politics at the director level and above were toxic, and were chasing out good talent in droves. I think 2-3 of a dozen coworkers were still there at the end of six months, including none of the people who interviewed me. The director who stepped in felt that people wanted to leave (he was correct), and that the best way to keep that from happening was to quietly cancel work-from-home without communicating this change (he was incorrect). There were a lot of people who “had to take [their] cat to the vet” for a month or two, even people with no cat, and I hear the director finally figured out that he was doing it wrong. I was gone by then as well.

In another case, about two or three employees were abusing work-from-home privileges, sending their “I’m working from home” email at 2pm (if at all) and not answering their phones. So the person in management responsible for these people’s managers declared that anyone wanting (or needing) to work from home more than once a month had to take sick time to do it, even if they were working the whole day. This was thwarted by HR, to the company’s benefit–silly state laws would’ve gotten in the way anyway–but it communicated management’s respect for employees very clearly.

Don’t get me wrong here. Working from home (unless contractually obligated) is a privilege, and if it’s abused, it can and should be revoked on a per-employee basis. It should also be a matter between the abusing employee and his/her manager, not the C-levels and the press.

But if you want to acquire and retain the kind of talent that Silicon Valley thrives on, you need better leadership, not just suboptimal management and one-size-fits-all backtracking mandates. I think people wanted to believe Marissa Mayer brought that sort of leadership to Yahoo, but a lot are thinking better of that belief now.

What do you think? Is this another step in the recovery direction for Yahoo? Does it change how you view them as a prospective (or current) employer?