Saturday, February 21, 2009

Hyperic First Impressions

As someone who is in charge of a large network of different applications and hosts, I spend a decent portion of the day wondering if our software is working correctly. Some of our software is off-the-shelf, but much of it is application-specific, custom-built software. To further complicate things, much of this software interacts with other software and services. Staying "aware" in this type of large heterogeneous environment is serious work, and in the past we've developed custom apps to monitor other custom apps. Clearly there must be a better way.

BEEP!
Check out this unrelated picture I added to spice things up!

Enter Hyperic.

I learned about Hyperic through this pretty interesting round-up. For those of you who aren't familiar with it, Hyperic is an open source, Java-based monitoring platform. My first thought was that Hyperic kind of looks like Cacti (a rolled up fancy version of MRTG). In fact, it's quite a bit different.

Agent Based

The first big difference you notice between a grapher (like Cacti) and Hyperic is that Hyperic has an agent application. This caused me some initial alarm. I worry about having to install an agent on a bunch of boxes in the field. I also worry about the resources consumed by the agent. Fortunately it's pretty simple to install (just a directory containing a JRE and the software). The agent is also really smart at monitoring its own processor utilization, so you can keep tabs on how much work it's actually doing.

My agent-related fears assuaged, I started setting up a small Hyperic pilot test. Server installation was pretty easy on Linux. For the pilot I decided to use the built-in DBMS (Postgres) rather than setting up an outboard MySQL database. The server started up easily and configured itself. After setting up a few credentialed users I started installing agents.

Understanding Hyperic

When your first agent connects to Hyperic you start to learn about the system. Hyperic is pretty opaque at first. A new user is immediately confronted with the notions of Platforms, Servers, Services and more. The key to wrapping your head around Hyperic is understanding that a physical server is not a "Server" in Hyperic. It's a "Platform". This is important. It's called a platform because it can run multiple servers (FTP, HTTP, SQL, etc). A server process can then host multiple services. For example: IIS can serve multiple web sites (vhosts). Understanding this hierarchy is key to understanding the way that Hyperic monitors your environment.
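If it helps, here is how I picture that hierarchy as a data structure. This is only a sketch in Python: the class names mirror Hyperic's terms, but the fields, the `service_count` helper and the hostname are my own inventions for illustration, not anything from Hyperic's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    """Something a server process hosts, e.g. one web site (vhost)."""
    name: str


@dataclass
class Server:
    """A server process such as IIS or MySQL; it can host many services."""
    name: str
    services: List[Service] = field(default_factory=list)


@dataclass
class Platform:
    """The physical (or virtual) box; it can run many servers."""
    hostname: str
    servers: List[Server] = field(default_factory=list)

    def service_count(self) -> int:
        # Hypothetical helper: total services across every server on this box.
        return sum(len(server.services) for server in self.servers)


# One box running IIS (with two vhosts) and MySQL (no services listed yet).
box = Platform(
    hostname="web01.example.com",
    servers=[
        Server("IIS", [Service("vhost: intranet"), Service("vhost: public site")]),
        Server("MySQL"),
    ],
)
```

The point is just the nesting: one platform, many servers, many services per server.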

So having now added our first platform, we start to get a feel for the power that Hyperic brings to the table. Hyperic begins auto-discovering servers on each platform it's installed on. So on my first box it discovered IIS, .Net 2.0 and MySQL. It immediately provided detailed monitoring capabilities. When I say detailed I don't mean: How many MySQL processes are running? I mean: How many rows are in table X in database B?

Hyperic immediately provides granular monitoring of installed services.

RAR! I am pile of servers! HEAR ME ROAR!
Check out this other unrelated picture I added to show some contrast to the first unrelated picture!

It doesn't stop with "installed" services either. The next week I added a new service on the machine. Within hours, Hyperic had discovered the new server and made it available for monitoring.

Auto-discovery is all well and good, but it doesn't help us monitor those oddball custom applications. How do we find out if our FTP Uploader application is behind on its transfers? Well, Hyperic provides us two easy mechanisms to do this. The first is at the SQL level: I can have Hyperic monitor the result of an SQL query via the database server's agent. The other option I have is to go to the FTP Uploader's platform and modify its "inventory" manually. I can tell it about the input queue directory and have Hyperic monitor the number of files inside.
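To make the second option concrete, this is roughly the number I'm asking Hyperic to watch. It's a Python sketch of the idea, not how the agent actually collects it; the function names and the `max_files` limit are made up for illustration.

```python
from pathlib import Path


def queue_depth(queue_dir: str) -> int:
    """Count regular files sitting directly inside queue_dir (not recursive)."""
    return sum(1 for entry in Path(queue_dir).iterdir() if entry.is_file())


def is_behind(queue_dir: str, max_files: int) -> bool:
    """True when more than max_files are waiting to be uploaded."""
    return queue_depth(queue_dir) > max_files
```

Calling something like `is_behind("/path/to/upload/queue", 50)` on a schedule is the hand-rolled version of what the monitoring system does for you. Both the path and the limit are placeholders.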

Alerting

Which brings us to alerting. Hyperic has a pretty cool system for defining alerting criteria. Anyone who's managed a real time environment knows that false positives are a real problem. To prevent this Hyperic has threshold-based alerting. This allows you to say: Only alert me if the disk is full for more than 1 hour in a 12 hour period. This functionality prevents regular fluctuations from causing spurious alerts.
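Here's a rough Python sketch of that "more than 1 hour in a 12 hour period" rule, just to show the logic. I'm assuming the metric is sampled at a fixed interval and that each breached sample counts for one full interval; `should_alert` and its parameters are my own names, and Hyperic's real evaluation may well differ.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# (time of measurement, was the condition true at that moment?)
Sample = Tuple[datetime, bool]


def should_alert(
    samples: List[Sample],
    now: datetime,
    interval: timedelta,
    window: timedelta = timedelta(hours=12),
    threshold: timedelta = timedelta(hours=1),
) -> bool:
    """Alert only if the condition held for more than `threshold`
    in total during the last `window`.

    Assumption: samples are evenly spaced, `interval` apart.
    """
    cutoff = now - window
    breaches = sum(
        1 for taken_at, breached in samples
        if breached and cutoff < taken_at <= now
    )
    return breaches * interval > threshold
```

With 5-minute samples, twelve "disk full" readings add up to exactly an hour and stay quiet; the thirteenth tips it over. A single blip never fires.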

Once an alert is triggered Hyperic follows user-defined "escalation trees". An escalation tree is designed to gradually increase the loudness of your system's desperate cries for help. For example: suppose a process, for which I'm responsible, begins to choke. An escalation tree might send me an email. But since it's 2am on a Tuesday I'm hardly checking my email. This process is important though, so after waiting a predetermined time, Hyperic decides to send me a text message. Normally that's the kind of alert that I respond to, but on this particular Tuesday night I'm sleeping one off, so I forget to charge my iPhone. Hyperic decides that I'm clearly not the right man for the job and can thereafter begin alerting other people or group mailing lists. All of this is readily configurable and pretty cool stuff. It tracks all sorts of data under the hood as well (i.e., who fixed what, when and how).
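The 2am story boils down to a list of steps, each with a waiting period. Here's a minimal Python sketch, simplified to a straight chain rather than a true tree. The `Step` fields, the targets and the 30-minute waits are all invented for illustration; this is not Hyperic's configuration format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str        # e.g. "email" or "sms"
    target: str        # who gets notified
    wait_minutes: int  # how long to wait for an acknowledgement before moving on


def steps_fired(chain: List[Step], minutes_unacknowledged: int) -> List[Step]:
    """Return every step that has fired for an alert that has gone
    unacknowledged for the given number of minutes."""
    fired = []
    starts_at = 0  # minutes after the alert at which the current step fires
    for step in chain:
        if minutes_unacknowledged < starts_at:
            break
        fired.append(step)
        starts_at += step.wait_minutes
    return fired


chain = [
    Step("email", "me", 30),
    Step("sms", "me", 30),
    Step("email", "ops mailing list", 0),
]
```

With this chain, an alert that nobody acknowledges emails me right away, texts me at the 30-minute mark, and goes to the group list after an hour.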

Open Source

Hyperic is available in two varieties: an open source version called Hyperic HQ, and a closed source version called Hyperic HQ Enterprise. The enterprise version of the software has a number of cool features that could make my life a bit easier, but when I spoke to a sales representative the price proved to be prohibitive in my environment. They seem to charge a per-server-monitored subscription fee. Unfortunately, for the price they ballpark-quoted me, I could hire a full-time engineer whose job is solely to manage the open source version of the software.

Don't get me wrong though, the enterprise version has some very cool functionality (like the ability for the server to automatically react to certain conditions and send commands). But unfortunately Hyperic seems to charge on an all-or-nothing pricing model, so if I wanted to use advanced features on a small subset of my hardware I'd end up having to create two parallel installations or pay full price for every server I monitor.

What's next

Well, these are all my "first impressions", so over the next couple of months, as I get more involved using Hyperic, I'm going to need to get a grip on how much monitoring is appropriate. I have hundreds of servers that could benefit from this type of monitoring but that might create a load situation on the Hyperic server. Furthermore, there is so much data that we can keep track of - some of it incredibly useful. And some of it is... er... not-so-much. I hope that someone is keeping track of the "Number of Deleted Recurring Appointments" in Exchange. There is clearly a trade-off between performance of the Hyperic server and awareness of the environment. This is true at the network level as well: What if a significant portion of my bandwidth turns out to be monitoring-related messages?

The other factor is that the more systems we begin to monitor using Hyperic, the more Hyperic itself becomes a critical element in our infrastructure. We begin to need to keep tabs on Hyperic itself. This also means that we need to be methodical in our setting up and deleting of alerts. I suspect this will require some level of automation.

To me, this is all very interesting and exciting. To everyone else, my apologies.

Friday, February 20, 2009

Steel

Just in case you're wondering what this is all about: over the past couple of weeks I've been thinking a lot about what's involved with working with steel in our shop, and I decided to write down my thoughts.

Our scene shop has never really been a "steel" shop. We typically work with wood and wood-based products (note the distinction). When we started looking at the set drawings for our March production of Twelve Angry Men it was pretty clear that we were going to be working with steel.

This photo is not staged... I know it looks it.

So, what are the processes that we use when working with steel? Well, they're basically the same processes as working with wood (cutting, drilling, attaching, etc.). These processes look a bit different when working with a malleable material like steel and therefore require different tools.

Cutting

Cutting is the first major difference between working with steel and working with wood. Inexpensive toothed circular saw blades don't do so well when trying to get through most steel, so we end up using an abrasive cutting tool - an abrasive chop saw. Abrasive cutting uses a rough cutting wheel that deteriorates as it cuts. The result of this deterioration is that, rather than getting duller, the cutting wheel gets smaller as you go.

When dealing with sheet metal, cutting with a wood-type jigsaw generally doesn't work very well because of the tendency of the sheet metal to bend. Instead, the more consistent pressure of a band saw is generally preferable: it tends not to bend the metal, and it avoids overheating due to the large surface area of the blade.

Attaching

Attaching the metal to other surfaces is easily done using screws and bolts. Drilling out holes for bolts and screws is more time consuming and consumes far more drill bits, but otherwise is pretty similar. Attaching metal to metal can be done using bolts as well, but in this scenario we have a new process which has no analog in the world of wood: Welding.

The unit

Welding, essentially, involves turning two pieces of metal into one. To do this we heat two pieces of side-by-side metal up, then join them by adding metal to both pieces until they form one new lumpy piece of metal. To do this we need a welder.

Our scene shop only has 120V power, so I ended up choosing a Hobart Handler 140. For this show we need to be able to weld 1/8" steel - this is on the very outside edge of what 120V welders are capable of using the gas MIG process, but the Handler can do it provided you take your time.

The MIG process is pretty simple. Filler wire starts to come out the tip of the gun when you pull the trigger. As the filler wire approaches the work piece a little lightning bolt shoots out and starts to heat both the work piece and the filler. If you do it right, the filler wire should go into the now-molten work piece and allow you to build up the metal. If you build up two pieces of metal at the same time, you've joined them together. Lastly, to prevent the hot metal from immediately oxidizing, all of this lightning bolt action happens inside an invisible plume of shielding gas (in our case Argon and CO2).

But, while the process is simple, actually welding two pieces of metal together can get pretty complicated. For starters there are four important variables: Gas Pressure, Wire Feed Rate, Wire Feed Pressure, and Current. Fiddling with these variables will yield different results depending on the thickness of the metal. Additionally, there is pure skill involved. That little lightning bolt likes to jump around - you need a steady hand, and lots of practice.

A good weld!

Now you've welded two pieces of metal together. But wait, there's more! How do you know the weld will hold? On the surface the weld looks good. There's no excessive oxidation, there are no gaps in the weld, there's no stippling or balling of the bead. That's all the information that you get by looking at it. Next step you can either X-Ray the weld, or you can beat the hell out of it and see if it holds. I'll let you guess which method we use.

Safety

Shop safety is our foremost consideration when doing any kind of work in the shop, and working with steel brings with it some new issues to consider.

The first, most general, issue is that metal tends to be sharp. Anyone who's ever gotten a metal splinter knows what I'm talking about. To mitigate this factor we make sure to dull down any exposed surfaces that have been cut in our shop and make sure to clean up after cutting or drilling metal.

Cutting metal on an abrasive chop saw has some safety issues of its own. Fortunately they're quite obvious to anyone in the vicinity of a saw. As the abrasive blade cuts through the metal it creates a shower of sparks and dust. This shower presents a hazard to both the operator and any bystanders. Bystanders need to keep their distance; the operator needs to wear long sleeves and pants, gloves and a face shield. Additionally, abrasive tools should not be used around combustible materials or in the vicinity of volatile solvents.

Welding has a few safety issues associated with it as well. Welding creates heat and light. For the welder, heat is an obvious danger; it's why we wear gloves and use clamps. For the bystander it can be quite hidden. When a piece of steel has been welded it remains hot for some time - despite not necessarily looking hot. It's important that steps be taken to prevent bystanders from coming into contact with freshly welded metal.

The light created by welding has high ultraviolet content. It is therefore inadvisable to be directly exposed to the light cast by the welder. The operator wears a mask, gloves and sleeves to prevent exposure. Furthermore, it's the operator's responsibility to keep bystanders shielded from the direct light of the welder. Often, this can be as simple as avoiding reflective surfaces and keeping your body in between the weld and any bystanders.

Lastly, spatter from the welding presents the same types of fire hazards as operating a cutoff saw. Again, care should be taken to avoid situations that could result in fire. Additionally, consider that material you have just welded may be hot enough to start a fire and keep it appropriately safe.

*Important note: This is not a comprehensive list of safety considerations. Your mileage may vary.

Why Bother?

There are a number of reasons that having steel capabilities is a tremendous asset. In theater, one of the biggest is weight. Structural objects built out of steel can be significantly lighter than the same objects built out of wood. This is a crucial factor when designing large sets that need to be moved around or flown out.

Uniformity is another factor. Consider that when you purchase board lumber you're generally getting a natural product and as such it tends to have warps and bends. Warps and bends can really mess things up when you're dealing with close tolerances.

That having been said, it's important to maintain a balance. Carefully evaluate your material selection and, when choosing steel, be sure to factor in the safety considerations.

In my environment I further have to weigh the fact that this is a high school shop and that therefore there are bound to be students involved in the process. At the moment I've decided to operate the steel tools myself and have the students facilitate the process. This takes away the "Operator Risk" and makes me responsible for minimizing the "Bystander Risk". I'm okay with that responsibility.