Last week HP released the HP ProLiant Moonshot Server based on the HP Moonshot 1500 Chassis (link). Two things occurred to me: it has an awesome name, and it’s an ideal platform for distributed systems.
When it comes to distributed systems, I would rather have a larger number of small servers than a small number of large servers, particularly for distributed systems that require high throughput and are I/O bound. If I had a budget of $60K to purchase new hardware for a data grid, I would shoot for an HP Moonshot. It’s about the density. It’s about a data grid with 45 nodes in a single 4.3U chassis. It’s about the fabric. It’s about a data grid with 45 nodes that communicate via an internal fabric. No cables necessary.
Note: I realize that the HP Moonshot server cartridges only have 8GB of memory.
Does it undercut the need for virtualization?
If I had a server with multiple processors, I would be inclined to deploy multiple virtual servers. Perhaps one per processor. Perhaps one for every two processor cores. However, the HP Moonshot servers are powered by the Intel® Atom™ Processor S1260, a server class system on a chip (SoC). I would not run multiple data grid nodes on a single dual core processor, with or without virtualization.
A colleague, Jon Masters, introduced me to the term hyperscale. It turns out that is exactly the term I was looking for. Jon does a great job defining hyperscale (link).
- High Density
- Thousands of servers per rack.
- System on a Chip (SoC)
- Integrated Controllers *
- Low Performance (Wimpy Cores)
- Network / Storage / Cluster
- Fail in Place (FIP) / Hot Pluggable
To me, high density and fail in place are the byproducts of housing micro servers within a single chassis. If a distributed system supports node failure, the infrastructure it is deployed to should support fail in place. I rather like Jon’s analogy: dead pixels. Failed servers do not have to be replaced, nor do they require that the entire platform (chassis + servers) be replaced. However, with hot pluggable servers in the form of cartridges, a system administrator can simply replace failed servers with new ones. I like that the server itself, in the form of a cartridge, is the field replaceable unit (FRU).
It’s about the fabric, right?
I’m interested in the cluster fabric (2D Torus / link). What if the client-server communication was via the network fabric while the peer-to-peer communication was via the cluster fabric? For example, a data grid with failure detection and replication. Rather than replicate an entry to a random (with respect to networking) server, the entry is simply replicated to the north, south, east, and / or west.
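To make the neighbor idea concrete, here is a minimal sketch of how a node might pick its replication targets on a 2D torus. The 5 x 9 layout for 45 nodes, the row-major node numbering, and the function names are all my own assumptions for illustration; I don't know how the chassis actually wires or numbers its cartridges.

```python
from typing import Dict, List


def torus_neighbors(node: int, rows: int, cols: int) -> Dict[str, int]:
    """Return the north/south/east/west neighbors of a node on a rows x cols torus.

    Nodes are numbered row by row starting at 0 (an assumed numbering scheme).
    Edges wrap around, so every node has a neighbor in each direction.
    """
    if not 0 <= node < rows * cols:
        raise ValueError("node index out of range")
    r, c = divmod(node, cols)
    return {
        "north": ((r - 1) % rows) * cols + c,
        "south": ((r + 1) % rows) * cols + c,
        "east": r * cols + (c + 1) % cols,
        "west": r * cols + (c - 1) % cols,
    }


def replica_targets(node: int, rows: int, cols: int, copies: int) -> List[int]:
    """Pick up to `copies` distinct neighbors, trying north, south, east, west in order."""
    neighbors = torus_neighbors(node, rows, cols)
    targets: List[int] = []
    for direction in ("north", "south", "east", "west"):
        candidate = neighbors[direction]
        # On very small tori two directions can wrap to the same node (or to self).
        if candidate != node and candidate not in targets:
            targets.append(candidate)
    return targets[:copies]


if __name__ == "__main__":
    # Hypothetical 5 x 9 layout for a 45 node chassis.
    print(torus_neighbors(0, rows=5, cols=9))
    print(replica_targets(0, rows=5, cols=9, copies=2))
```

The point of the sketch is that the replica placement depends only on the node's position in the fabric, so no lookup or random selection is needed to decide where a copy goes.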
I’m interested in the storage fabric too. A data grid with a file based cache store comes to mind. The data is partitioned into segments, and there is a single file for each segment. Rather than storing the file for a segment on the local physical drive of the primary owner, the file is stored on a storage cartridge via the storage fabric. In the event that the primary owner for a given segment fails and a new primary owner is selected, the new primary owner of said segment can take over reading and writing the segment file.
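Here is a rough sketch of that idea. The mount point, the file naming scheme, the CRC32 based key to segment mapping, and the round robin reassignment are hypothetical choices of mine, not how any particular data grid does it. What it illustrates is that the segment file location never depends on the owner, so a new primary owner can pick up the same file.

```python
import zlib
from pathlib import PurePosixPath
from typing import Dict, List

# Hypothetical mount point for a storage cartridge reached via the storage fabric.
STORAGE_ROOT = PurePosixPath("/mnt/storage-cartridge/cache-store")


def segment_for_key(key: str, num_segments: int) -> int:
    """Map a key to a segment with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def segment_file(segment: int) -> PurePosixPath:
    """One file per segment; the path depends on the segment, never on the owner."""
    return STORAGE_ROOT / f"segment-{segment:04d}.dat"


def fail_over(owners: Dict[int, str], failed: str, survivors: List[str]) -> Dict[int, str]:
    """Reassign the failed node's segments round robin across the surviving nodes."""
    if not survivors:
        raise ValueError("no surviving nodes to take over")
    new_owners = dict(owners)
    reassigned = 0
    for segment in sorted(owners):
        if owners[segment] == failed:
            new_owners[segment] = survivors[reassigned % len(survivors)]
            reassigned += 1
    return new_owners


if __name__ == "__main__":
    owners = {0: "node-a", 1: "node-b", 2: "node-a"}
    after = fail_over(owners, failed="node-a", survivors=["node-b", "node-c"])
    for segment, owner in after.items():
        # The file path is the same before and after the failure.
        print(segment, owner, segment_file(segment))
```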
Note: I realize that the cluster fabric in the HP Moonshot Chassis is not utilized at this time.
* While the Intel Centerton SoC includes integrated I/O, it requires external controllers for SATA / Gigabit Ethernet / USB interfaces. The Intel Avoton SoC will not. The density of HP Moonshot servers with the Avoton SoC will quadruple. It will be about a data grid with 180 nodes in a single 4.3U chassis.
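For a sense of what those numbers mean at rack scale, here is a back-of-the-envelope calculation. The 42U rack and the idea of filling it with nothing but chassis (no switches, no power distribution) are my assumptions; only the 4.3U height and the 45 and 180 node counts come from above.

```python
def chassis_per_rack(rack_tenths_u: int = 420, chassis_tenths_u: int = 43) -> int:
    """Whole chassis that fit in a rack; heights are in tenths of a rack unit to avoid floats."""
    return rack_tenths_u // chassis_tenths_u


def nodes_per_rack(nodes_per_chassis: int) -> int:
    """Nodes in an (assumed) 42U rack filled entirely with 4.3U chassis."""
    return chassis_per_rack() * nodes_per_chassis


if __name__ == "__main__":
    print(chassis_per_rack())   # 9 chassis, using 38.7U of the 42U
    print(nodes_per_rack(45))   # 405 nodes with today's cartridges
    print(nodes_per_rack(180))  # 1620 nodes if the density quadruples
```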
Jon introduced me to the term physicalization too. This was in response to my questioning the need for virtualization. I think of it as creating multiple servers per rack unit rather than creating multiple servers per physical server. That’s not to say that there is no place for provisioning. It’s to say that it takes place in the form of bare metal provisioning. For example, via IPMI and PXE with OpenStack Compute / Nova (link).
A data grid node would not require a physical drive. If a cache store was configured, it could be configured to read and write to a file on a storage cartridge. What if the server image was loaded directly into memory? Perhaps the physical drive could be removed and the server cartridge would be even smaller resulting in even greater density.
I think the concept of a ‘smart chassis’ is interesting. What if provisioning was handled by the chassis? Perhaps it is pull based. When a server cartridge is inserted, it pulls down the server image. Maybe the server image is stored on a storage cartridge in the chassis. Perhaps it is push based. A system administrator can log in to a web application to push a server image to an empty server cartridge. What if the chassis scanned storage cartridges to identify system images? What if a system administrator could push a specific server image to a specific empty server cartridge via drag and drop?
Netflix / Redbox
What if HP treated Moonshot customers as subscribers who were allowed to have 15 to 45 cartridges per chassis out at a time? If a cartridge fails, the customer ships it back to HP and a replacement is shipped back to the customer. What if there was a selection of preconfigured cartridges? A customer could select 10 data grid cartridges and 5 application server cartridges for a proof of concept. If it was successful, they could upgrade their subscription and select 25 additional data grid cartridges and 5 additional application server cartridges. If it was not, they could return all of the cartridges and select 15 new cartridges. Perhaps vendors could sell subscriptions to HP, and HP could ‘rent’ those subscriptions to Moonshot customers in the form of cartridges a la Redbox.