MagLev recap
There has been a huge response to the MagLev demo I gave on Friday, most of it enthusiastic, though not without the inevitable skepticism that comes with any announcement.
For those who weren't at RailsConf, here's a quick summary of how the demo went.
I started off by describing MagLev as a "full stack Ruby implementation", in the same way that Rails is a full stack web framework. To understand what I mean by that, see my earlier post on the Gemstone architecture: not only does MagLev provide a new (and fast) VM for Ruby, but it also provides an integrated shared memory object cache, and integrated transparent persistence. This fully replaces the typical Rails stack of many mongrel instances + several memcached instances + MySQL.
As a first demo, I showed a "magic trick" with two MagLev instances running an irb-like shell in side-by-side terminal windows. A $hat global was defined in each, which just wraps an array and lets you put things in it. In the left window, I put a Rabbit into the $hat. I then looked at the $hat on the right and showed that the Rabbit had magically been transported there.
>> $hat
=> #<Hat:0x0c184bfd01 @contents=[
() ()
( '.' )
(")_(")
]>
How is this possible? Because they're the same hat. The integrated VMs, cache, and storage conspire to create an illusion that global state is shared across all instances: no matter how many VMs you add, over however many machines, they all see and work with the same set of Ruby objects.
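For readers who want to picture the Hat from the trick, here is a minimal sketch of what such a class might look like. The method names (`put`, `contents`) and the empty `Rabbit` class are assumptions for illustration, not the code from the demo.

```ruby
# Hypothetical reconstruction of the demo's Hat: a thin wrapper around an Array.
# The method names used here are assumptions, not taken from the demo itself.
class Hat
  attr_reader :contents

  def initialize
    @contents = []
  end

  # Put something into the hat; return the hat so calls can be chained.
  def put(thing)
    @contents << thing
    self
  end
end

# Stand-in for whatever object the demo pulled out of the hat.
class Rabbit; end

$hat = Hat.new
$hat.put(Rabbit.new)
```

In a standard Ruby interpreter this $hat lives and dies with a single process. The point of the trick is that under MagLev the same kind of global is visible from every VM.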
There's no limit to what kinds of objects can be shared this way: procs and classes work just as well as arrays and strings. This isn't RPC - the objects are copied into a shared cache when they're created or modified, and if (but only if) another VM needs the object, it will pull it out of the cache and work on the local copy. All of these copies are kept in sync, and any changes are also written to disk by the storage engine so that the entire model is persistent.
This only applies to globally reachable objects - local variables, method arguments and so on aren't generally shared.
Obviously, with this kind of synchronization there has to be some concern for concurrency. MagLev handles this with transactions. Each VM has its own transaction state. When a VM enters a transaction, all of its changes are only locally visible until it is asked to commit. At that point, all of its changes get recorded to the cache and to disk and are available to every other VM.
A transaction can be aborted, in which case *everything* that has happened in that VM since the last commit (object modifications, creation, method or class definition, etc) will get rolled back. A transaction commit can also fail if it conflicts with concurrent changes elsewhere (for example, two VMs modifying the same instance variable of the same object at once).
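The commit, abort, and conflict rules above can be modelled with a toy in plain Ruby. This is only a single-process sketch: `SharedStore`, `Session`, and their methods are hypothetical names, not MagLev's API, and the conflict check (a per-key version counter compared at commit time) is an assumed simplification that only detects write/write conflicts.

```ruby
# Toy, single-process model of the transaction semantics described above.
# SharedStore and Session are hypothetical names; this is not MagLev's API.
class SharedStore
  attr_reader :data, :versions

  def initialize
    @data = {}               # committed state, visible to every session
    @versions = Hash.new(0)  # per-key count of commits
  end
end

class Session
  class CommitConflict < StandardError; end

  def initialize(store)
    @store = store
    start
  end

  # Local, uncommitted writes win over the committed state.
  def read(key)
    @writes.fetch(key) { @store.data[key] }
  end

  # Writes stay private to this session until it commits.
  def write(key, value)
    @writes[key] = value
  end

  # Throw away everything written since the last commit.
  def abort_transaction
    start
  end

  # Publish local writes, unless another session committed the same key
  # after this transaction began.
  def commit_transaction
    stale = @writes.keys.select { |k| @store.versions[k] != @seen[k] }
    unless stale.empty?
      abort_transaction
      raise CommitConflict, "conflict on #{stale.inspect}"
    end
    @writes.each do |k, v|
      @store.data[k] = v
      @store.versions[k] += 1
    end
    start
  end

  private

  # Begin a fresh transaction: no pending writes, current versions noted.
  def start
    @writes = {}
    @seen = @store.versions.dup
  end
end

store = SharedStore.new
left  = Session.new(store)
right = Session.new(store)

left.write(:hat, ["Rabbit"])
before_commit = right.read(:hat)  # not yet visible on the right
left.commit_transaction
after_commit = right.read(:hat)   # now visible on the right

right.abort_transaction           # right starts a fresh transaction
left.write(:hat, ["Dove"])
right.write(:hat, ["Card"])
left.commit_transaction           # first committer wins
conflict = begin
  right.commit_transaction
  nil
rescue Session::CommitConflict => e
  e.message
end
```

Here the second committer loses: its pending write is discarded and it has to retry against the newly committed state. How a real application should react to a failed commit is a design decision this toy does not try to answer.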
Because these shared objects are stored on disk and are lazily loaded into the VMs only when needed, you can work with datasets that have many, many more objects than would fit into available RAM. I showed a dataset I had loaded that contained 100 million movie reviews and took up somewhere around 10GB. I could instantly pull in a single movie, modify it, and commit that change, without needing to load the other couple hundred million objects into RAM.
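To make the idea of loading only what you touch concrete, here is a rough sketch in plain Ruby. It assumes one Marshal file per record in a temporary directory, and `LazyRepository` is a hypothetical name; MagLev's storage engine works very differently and does this transparently, so treat this purely as an illustration of the access pattern.

```ruby
require "tmpdir"

# Hypothetical on-demand loader: one Marshal file per record,
# read into memory only when that record is asked for.
class LazyRepository
  attr_reader :loaded

  def initialize(dir)
    @dir = dir
    @loaded = {}  # records currently held in memory
  end

  # Write a record straight to disk without keeping it in memory.
  def store(id, record)
    File.binwrite(path(id), Marshal.dump(record))
  end

  # Load a single record from disk the first time it is requested.
  def fetch(id)
    @loaded[id] ||= Marshal.load(File.binread(path(id)))
  end

  # Write the in-memory version of one record back to disk.
  def commit(id)
    store(id, @loaded.fetch(id))
  end

  private

  def path(id)
    File.join(@dir, "#{id}.bin")
  end
end

in_memory = nil
reread_rating = nil

Dir.mktmpdir do |dir|
  repo = LazyRepository.new(dir)
  100.times { |i| repo.store(i, { title: "Movie #{i}", rating: 3 }) }

  movie = repo.fetch(42)   # only this record is read from disk
  movie[:rating] = 5
  repo.commit(42)

  in_memory = repo.loaded.size
  # A fresh repository object sees the committed change.
  reread_rating = LazyRepository.new(dir).fetch(42)[:rating]
end
```

Only one of the hundred records is ever held in memory, yet the change to it survives on disk.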
As a final demo, I showed how far MagLev has currently gotten with compatibility by running a simple WEBrick servlet.
At this point, Bob Walker took over. He gave some company background on Gemstone (they've been working on multi-user persistent dynamic language VMs since 1982), and some technical details on MagLev (the VM is a modified version of their Gemstone/S Smalltalk VM, with some Ruby-specific bytecodes; the bytecode is JITted to native code before execution). Then he showed some micro-benchmarks: for what it's worth, MagLev is anywhere from 6 times to (in the extreme case) 111 times faster than the standard 1.8.6 Ruby interpreter on things like fibonacci, block execution, method dispatch, and so on.
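For a sense of what a micro-benchmark of this kind looks like, here is a minimal sketch. It is not the code Bob ran: the `fib` and `time_it` helpers and the input size are assumptions, and a single timing like this says little about real application performance.

```ruby
# Naive recursive Fibonacci: mostly stresses method dispatch
# and small-integer arithmetic.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Run a block and return [result, elapsed seconds],
# measured with the monotonic clock.
def time_it
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

result, seconds = time_it { fib(20) }
puts format("fib(20) = %d in %.4f s", result, seconds)
```

Running the same script under two different Ruby implementations and comparing the timings gives the sort of ratio quoted above.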
Bob then talked about scale. Gemstone has many customers running things like commodities exchanges, derivatives trading, container shipping, and so on that operate at very large scale on top of the same underlying technology as MagLev. Here are a couple of recent unsolicited quotes from a thread on the Joel on Software forums:
"I work for a major shipping company. We have a massive OODB and Smalltalk Application (500 gig range) with 3 million lines of code. We have 2000 plus daily users. We can do 700 transactions a second before slowing down. We also have a Java + SQL +EMS system. On a good day they can do 70 transactions a second, with three times the hardware." --Timo (Saturday, February 16, 2008)
"Along side with the major shipping company, we are a major commodities exchange using GS and ST and while our operational DB is small (about 5 GB at the start of the trading day to less than 75 GB and the end) we are probably one of the fastest. We easily handle transaction rates approaching 6000/sec with about 8000+ daily users. Our average data center round trip times are in the 2-3 ms range." --GemStone Weenie (Monday, February 18, 2008)
It's worth noting that that's 6000 writes per second, sustained, and that this application peaks at about 3x that. By comparison, Twitter was once reported as having 600 requests/s (read and write).
Bob then moved on to the vision for MagLev going forward. A few important points:
- It doesn't run Rails yet, but it will.
- It will be RubySpec compliant.
- The Ruby source will be released. The C source code for the VM most likely will remain closed (but anything is possible).
- There will be a free version which will work for most uses, and a paid version for large-scale deployment.
- Look for another announcement/demo at RailsConf Europe in September.
After that we retired to the DoubleTree for a keg of Ruby ale.
As someone who's simultaneously desperate to find an alternative to relational databases and skeptical about OODBs, I found the maglev presentation fascinating.
The benchmarks seem to have poked a hornet's nest of rage, but even without a performance improvement the large distributed VM is really really interesting technology. You guys presented it brilliantly, and I'm looking forward to getting my hands on the bits. *cough* *hint*
Posted by: Michael Koziarski | June 01, 2008 at 12:10 PM
It was good meeting you at the conference; best of luck with Maglev.
Posted by: Wilson Bilkovich | June 01, 2008 at 12:22 PM
The part I find most exciting is that Maglev seems to be solving the very same problems as Erlang is touted for. Except that it's Ruby, not a really ugly language that's stuck in the 80s telco market.
And, as Michael says, *cough* *hint* I can't wait to get my grubby paws on this. :-)
Posted by: Graeme Mathieson | June 01, 2008 at 12:36 PM
Avi,
Any comment on running Seaside/Smalltalk applications side-by-side on top of the same Stone that the Ruby applications are running on?
Is that something that we're going to see?
Posted by: Kurt Schrader | June 01, 2008 at 01:39 PM
Avi,
great to hear this is all going so well. Since you are allowing the C for the VM to be open-sourced, that implies that the VM is completely new for Maglev - not just leveraging the Gemstone/SmallTalk by translating from Ruby to Smalltalk. Is that correct? Obviously you'd be re-using your VM tech for the Ruby one.
Posted by: Tiest Vilee | June 01, 2008 at 03:55 PM
Absolutely fascinating - can't wait to see this in action and potentially try it out.
Posted by: Rowan Hick | June 01, 2008 at 05:11 PM
Your presentation seems to have generated quite a bit of buzz. I would have liked to see it in person, but unfortunately I didn't make it to RailsConf. I look forward to playing with MagLev when it is released.
Posted by: Mirko Froehlich | June 01, 2008 at 09:31 PM
At least this clears up some things
Posted by: markus | June 02, 2008 at 01:39 AM
For the sake of clarity, how about addressing Charles Nutter's comments one by one ?
Or a roadmap with milestones and tentative deadlines ?
How about discussing the governance model of the code ?
I believe many points he made were at least interesting, and as you know his opinion is widely acknowledged as a relevant voice in the community.
Keep up with the great work.
Posted by: vruz | June 02, 2008 at 05:51 AM
It really sounded great until you got to the "It's not open source" part.
Posted by: Bob Aman | June 02, 2008 at 06:14 AM
Yes, because nothing could possibly have monetary value in this world.
Great job as usual Avi.
Posted by: petrilli | June 02, 2008 at 02:11 PM
Avi, can't wait to play around with this thing, please do continue writing about Maglev. I'd be very interested in hearing more about the practicalities of the persistence and how tightly it actually integrates with its Smalltalk big-brother, for instance, would I be able to use Smalltalk objects, like JRuby allows you to call Java libs?
Posted by: Johan Sørensen | June 02, 2008 at 02:21 PM
Sounds neat. One question though. You skimmed over the case where synchronization conflicts occur. How do you handle that, or does Gemstone resolve them somehow?
Posted by: Mark Miller | June 02, 2008 at 05:59 PM
@Graeme Mathieson: "The part I find most exciting is that Maglev seems to be solving the very same problems as Erlang is touted for."
Actually MagLev is doing the exact opposite -- creating a transparent, virtual shared memory mechanism that creates the illusion that your Ruby processes are all running in the same VM. Erlang is about different processes not sharing anything, and making the boundaries between processes explicit.
"Except that it's Ruby, not a really ugly language ..."
Well, there's that. :-)
Posted by: Alexander Staubo | June 04, 2008 at 02:33 AM
Avi,
Thanks for the linkable recap, thanks for the great presentation, and thanks for sharing a few minutes near the keg to talk about the motivations behind MagLev. I find this really inspiring and am really looking forward to being able to run this myself.
A vote for as much open source as is at all possible -- the returns may look different but they are always sizeable and worth the effort.
Best,
Rick
Posted by: Rick Bradley | June 04, 2008 at 12:53 PM
Hello Avi,
It's been 3-4 months now since the MagLev demo, and I haven't found any updates about it since. Would you mind sharing some news from that front?
Posted by: Curious | September 19, 2008 at 12:48 AM