Joyent Weblog
Amazon Web Services or Joyent Accelerators: Reprise
In the Fall of 2006, I wrote a piece On Grids, the Ambitions of Amazon and Joyent, and followed up with Why EC2 isn’t yet a platform for ‘normal’ web applications and the recognition that When you’re really pushing traffic, Amazon S3 is more expensive than a CDN.
The point of these previous articles was to put what wasn’t yet called “cloud computing” into some perspective and to contrast what Amazon was doing with what we were doing. I ventured that EC2 is fine when you’re doing batch, parallel things on data that’s sitting in S3, and that S3 is economically fine as long as you’re not externally interacting with that data to a significant degree (then the request pricing kicks in). Basically, it is incorrect to claim that either is universally applicable to all problems and goals in computing, or that either is always cost-effective. An example of a good use case is a spidering application: one launches a number of EC2 instances, crawls a bunch of sites, puts that information into S3, and then launches a number of EC2 instances to build an index of that data and store it back on S3.
Beyond point-by-point features and cost differences, I believe there are inherent philosophical, technical and directional differences between Joyent and Amazon Web Services. This is and has been our core business, and it’s a business model, in my opinion, that competes directly with hardware vendors and with customers taking direct possession of hardware and racking-and-stacking it in their own datacenters.
Cloud computing is meant to be inherently “better” than what most people can do themselves.
What’s changed with S3 and EC2 since these articles?
For S3? Nothing really. There are some additional data “silo” services now. SimpleDB is out and there have been some updates to SQS, but I would say that S3 is by far the most popular of the three. The reason is simple: it’s still possible for people to do silly things when storing files on a filesystem (like put a million directories in one directory), but it’s more difficult to do things as silly with a relational database (you still can, but they’re ultimately handled within the RDBMS itself, for example, bad queries).
I’m consistently amazed by how many times I have to go over the idea of hashed directory storage.
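For readers who haven’t met the idea, here is a minimal sketch of hashed directory storage in Python. The function names, the choice of SHA-256, and the two-level, two-character layout are illustrative assumptions, not a description of any particular system: the point is only that a file’s directory is derived from a hash of its name, so no single directory ends up with a million entries.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Return a nested relative path whose directories come from the file name's hash.

    levels and width are illustrative defaults: 2 levels of 2 hex characters each.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Slice the first levels * width hex characters into directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, filename)


def leaf_directory_count(levels: int = 2, width: int = 2) -> int:
    """Number of distinct leaf directories the layout can produce."""
    return (16 ** width) ** levels


if __name__ == "__main__":
    print(hashed_path("photo-123.jpg"))
    print(leaf_directory_count())
```

With these defaults there are 256 x 256 = 65,536 possible leaf directories, so a million files average out to roughly 15 per directory instead of a million in one.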
For EC2 there have been some improvements.
Annotating the list from “Why EC2 isn’t yet a platform for ‘normal’ web applications” we get:
1. No IP address persistence. EC2 now NATs and EC2 instances are on a private network. That helps. Are you able to get permanently assigned, VLAN’ed network address space? It’s not clear to me.
2. No block storage persistence. There is now an option to mount persistent storage in a “normal” way. Presumably it’s block storage over iSCSI (there aren’t many options for doing this); hopefully it’s not a formalized FUSE to S3. We’ll see how this holds up performance-wise. There’s now a bit more predictability for data stored in EC2, but experience has shown me that it only takes one really busy database to tap out storage that’s supposed to be serving 10-100 customers. Scaling I/O is still non-trivial.
3. No opportunity for hardware-based load balancing. This is still the case.
4. No vertical scaling (you get a 1.7 GHz CPU and 1 GB of RAM, that’s it). There are now larger instances but the numbers are still odd. 7.5 GB of RAM? I like powers of 2 and 10 (so does computer science).
5 & 6. Creation and handling of AMIs. Experience like this is still quite common, it seems.
Structure of modern applications
The three tiers of “web”, “application” and “database” are long dead.
Applications that have to serve data out (versus just pulling in like the spidering example earlier) are now typically structured like: Load Balancers/Application Switches (I prefer the second term) <-> Cache <-> Application <-> Cache <-> Data. Web and gaming applications are exhibiting similar structures. The caching tiers are optional, and each can exist either as a piece of middleware or as part of one of the sandwiching tiers. For example, you might cache as part of the application, or in memcached, or you might just be using the query cache in the database itself. And while there are tiers, there are also silos that exist under their own namespaces. You don’t store static files in a relational database: your static assets are CDN’ed and served from e.g. assets[1-4].yourdomain.com, the dynamic site is served from yourdomain.com, and users log in at login.yourdomain.com. Those are different silos.
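As a small illustration of the static-asset silo, here is a hedged Python sketch of how an application might pick one of the assets[1-4] hostnames for a given file. The function names and the CRC32-based mapping are assumptions made for the example; the only thing taken from the text above is the hostname pattern. A stable hash is used so the same asset always lands on the same hostname, which keeps browser and CDN caches effective.

```python
import zlib

ASSET_SHARDS = 4  # assets1 .. assets4, following the hostname pattern above


def asset_host(path: str, domain: str = "yourdomain.com", shards: int = ASSET_SHARDS) -> str:
    """Pick a stable asset hostname for a path (hypothetical helper).

    zlib.crc32 is used rather than Python's built-in hash(), which is
    randomized per process for strings and would break cache stability.
    """
    index = zlib.crc32(path.encode("utf-8")) % shards + 1
    return f"assets{index}.{domain}"


def asset_url(path: str) -> str:
    """Build the full URL for a static asset (hypothetical helper)."""
    return f"http://{asset_host(path)}/{path.lstrip('/')}"


if __name__ == "__main__":
    for p in ("/img/logo.png", "/css/site.css", "/js/app.js"):
        print(asset_url(p))
```

Dynamic pages and logins never go through this function; they stay on their own hostnames, which is what keeps the silos separate.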
How to scale each part and why do people have problems in the first place?
Each tier either has state or not. Web applications are over HTTP, an inherently stateless protocol. So as long as one doesn’t introduce state into the application, the application layer is stateless and “easy” to horizontally scale. However, since one is limited in the number of IP addresses one can use to get to the application, and network latency will have an impact at a point, the “front” has state. Finally, the back-end data stores have state, by definition. We end up with: stateful front (Network) <-> stateless middle <-> stateful back. So our options for scaling would be: Load Balancers/Application Switches/Networking (Vertical) <-> Cache (Horizontal or Vertical) <-> Application (Horizontal) <-> Cache (Horizontal or Vertical) <-> Data (Vertical).
The limit to horizontal scale is the network and its latency. For example, you can horizontally scale out multi-master MySQL nodes (with a small and consistent dataset), but you’ll reach a point (somewhere in the 10-20 node range on a gigabit network) where latency now significantly impacts replication time around that ring.
Developing and scaling a “web” application means that you (or someone) have to deal with networking and data management (and different types of data for that matter) if you want to be cost-effective and scalable.
The approach one takes through this stack matters: platform directions
With the view above you can see the different approaches one can take to provide a platform. Amazon started with data stores, made them accessible via APIs, offered an accessible batch compute service on top of those data stores, introduced some predictability into the compute service (by offering some normal persistence), and has yet to deal with load-balancing and traffic-direction as a service. Basically they started with the back and should be working their way to the front.
At Joyent, we had different customers: customers making the choice between staying with their own hardware and running on Joyent Accelerators. We started with the front (great networking, application switching) and persistence, we let people keep their normal backends (and made them fast), and we are working on better (horizontal) solutions for data stores. Solving data storage needs wasn’t as pressing because many customers were already wedded to a solution like MySQL or Oracle. An example of solving problems at the outermost edge of the network would be the article, The wonders of fbref and irules serving pages from Facebook’s cache. This is an example of programming in application switches to offload 5 pages responsible for 80% of an application’s traffic.
Joyent’s product progression is the opposite of AWS’s. We solved load-based scale with a platform that starts with great networking, well-performing Accelerators, and Accelerators that are focused on particular tasks (e.g. a MySQL cluster). We are working on data distribution for geographic scale, and on making it all easier to use and more transparent (solving the final “scale”, administrative scale).
The technology stack of choice does matter: platform technology choices
Joyent Accelerators are uniquely built on the three pillars of Solaris: ZFS, DTrace and Zones. This trio is currently only present in OpenSolaris. What you put on metal is your core “operating system”. Period. Even if you call it a hypervisor, it’s basically an OS that’s running other operating systems. We put a solid kernel on our hardware.
Accelerators are meant to be inherently more performant than a Xen-based EC2 instance per unit of hardware, and to do so within normal ratios: 1 CPU per 4 GB of RAM, with utilities available in 1, 2, 4, 8, 16, 32 and 64 GB chunks. DTrace adds unparalleled observability: it makes it possible for us to figure out exactly what’s going on in kernel and userland and act upon it for customers in production.
ZFS lets us wrap each accelerator in a portable dataset, and as we’ve stated many times before, it makes any “server” a “storage appliance”.
Add to this Joyent’s use of F5 BIG-IP load balancers, Force10 networking fabric, and dual-processor, quad-core, 32 GB RAM servers.
Open and portable: platform philosophy
At Joyent, I don’t see us having an interest in running large, monolithic “services” for production applications and services. Things need to remain modular, and breakage in a given part needs to have zero to minimal impact on customers. Production applications shouldn’t use a service like S3 to serve files; they should have access to software with the same functionality and be able to run it on their own set of Accelerators.
We want the software that powers services to be open and available, and to enable you to run it yourself here on Accelerators, or actually anywhere you want. We develop applications ourselves exactly like you do, we tend to open source them, and this is exactly what we would want from a “vendor”. This route also minimizes request (“tick”) pricing. We don’t want to entirely replace people’s choices in databases; instead, Accelerators have to be made to be a powerful, functional base unit for them. Want to run MySQL, PostgreSQL, Oracle, J-EAI/ejabberd, …? Then by all means do that. No vendor lock-in.
For both platforms, we have our work cut out for us.
Commenting is closed for this article.
Hi,
I agree the Joyent Accelerator compares favorably with Amazon’s offering. For me, though, the big win for Amazon is that I can deploy server instances in minutes, whereas my reading of Joyent’s offering was that provisioning an accelerator costs $50 and can take a few days. Amazon’s initial barrier to adoption is really low, and that is going to draw a developer in to start playing quick and cheap. It would be neat if Joyent could provide faster, cheaper turnaround on new services.
Sincerely,
— Daniel Howard 84 days ago #-daniel
An interesting read, but I am not entirely convinced you are comparing Apples with Apples. One could argue that EC2, S3 et al. are built on similarly scalable hardware and operating systems – functionality exposed to the end user is as much a choice of what makes commercial sense, as it is capabilities.
Joyent accelerators do compare well, but their strength comes from what they run on. Fundamentally, OS choice does matter – otherwise you wouldn’t pimp Open Solaris half so much. :)
The choice between an almost ‘instant on’ environment, versus one that requires some effort to construct really shouldn’t be dismissed too lightly. So it’s more a comparison between an operating system (which is really the core point here, given the discourse on kernels and hypervisors) and Amazon’s actual application framework, rather than what it actually runs on.
— Brendan 84 days ago #
Here’s an exercise:
- Take a single application from scratch. Something with a database, some sort of web service interface, and an AJAX or Flex/widget interface. Not a toy, but something reasonable.
- Now build and deploy it on Google AppEngine, Amazon EC2/S3, and Joyent accelerator.
- Keep meticulous note of how long it took to write, test, and debug, and more importantly, what steps were taken to deploy the app.
- Also keep track of how much each of them cost to deploy and some on-going cost/performance scenarios.
- Put them side-by-side.
Whoever has the fewest number of steps and the fastest build/deploy time is likely to attract the most developers. Whoever can show that the operating cost scales linearly with use will have developers casting flower petals in their path :-)
A few years ago there was a geek humor piece making the rounds. Someone had tried writing the ‘Hello World’ app across a number of languages/platforms. The C and Java versions were two lines. The OLE one was 3-4 pages long. It was a joke, but it reinforced the perception that OLE was hard to use and it never took off.
As an app developer, I don’t care that it runs on Solaris, FreeBSD, or Mac OS. I want it to work. I want an optimized deployment workflow and a simple way to monitor and keep things running.
If you guys present a way for developers to build and deploy apps quickly and easily and keep them running cost-effectively, you’ll have to beat people off with a stick.
Just my $0.02.
— Ramin 84 days ago #
Since you guys just wrote a blog post the other day that establishes a metric of 9 traits for cloud computing, let’s compare Joyent to Amazon.
http://www.joyeur.com/2008/05/08/cloud-nine-specification-for-a-cloud-computer-a-call-to-action
1) Virtualization Layer Network Stability
Joyent = Yes, Amazon = Yes
2) API for Creation, Deletion, Cloning of Instances
Joyent = No, Amazon = Yes
3) Application Layer Interoperability
Joyent = Yes, Amazon = Yes
4) State Layer Interoperability (most difficult)
Joyent = No, Amazon = Yes
5) Application Services (e.g. email infrastructure, payments infrastructure)
Joyent = Yes, Amazon = Yes
6) Automatic Scale (deploy and forget about it)
Joyent = No, Amazon = Yes
7) Hardware Load Balancing
Joyent = Yes, Amazon = No
8) Storage as a Service
Joyent = Yes, Amazon = Yes
9) “Root”, If Required
Joyent = Yes, Amazon = Yes
——————
Totals: Joyent 6/9, Amazon 8/9
— Teddy K 83 days ago #
@Teddy: Joyent has an API. Aptana is using it to deliver cloud computing services to their customers. Neither Joyent nor Amazon have state layer interoperability. Neither Joyent nor Amazon have automatic scale. That’s why I gave Joyent a 7. Without hardware load-balancing, I’d give Amazon a 6 cloud score.
Beyond the scoring, do you agree with the nine traits of a cloud computer?
— David Young 83 days ago #
Amazon has a very simple advantage: you pay for what you use. I can scale instantly and I can deploy servers as needed. I agree the basic 3 tier web deployment doesn’t work for high traffic sites, but for smaller startups it works fine.
No hardware load balancer, but I can set up a high availability deployment across different datacenters aka “Availability Zones”. That’s not something many cloud providers will be able to offer.
If the box where your load balancer lives dies, relaunch the instance and you are back online.
With the Joyent Accelerators, is there a cancellation process for each instance I might not need? With Amazon I can simply terminate the instance and I’m done.
This is a perfect example of what you need to be able to do with cloud computing.
— Seth 83 days ago #
http://blog.animoto.com/2008/04/21/amazon-ceo-jeff-bezos-on-animoto/
I have to admit I was seduced by AWS only a few weeks ago. I used a prominent service to build my virtual server image and experienced the fastest and most flawless deployment of any software ever. The low cost and fast build and deployment time, and knowing I’d only be billed for actual usage as opposed to paying upfront for bandwidth and disk space I might not necessarily use, was enthralling.
But my party soon ended. With no support and little experience in the Linux world, I’ve been driven back to a hosted solution.
The good news – I found Joyent.
— Dom 80 days ago #
@Dom
In case you weren’t aware, Amazon does provide paid support (Silver & Gold) – much like Joyent does for its Accelerators.
See the link below
http://www.amazon.com/gp/browse.html?node=566801011
— Vince 80 days ago #
And there’s OpenSolaris on EC2 as well:
http://blogs.sun.com/ec2/entry/launch_of_opensolaris_on_amazon
— Ivo 73 days ago #