September 24, 2008 - by jason
Hello everyone. My name is Kent Langley. This is my first post on the Joyeur weblog and this is the first post in what should be a series over time about the things going on inside and emerging from Joyent Labs.
One of the exciting projects that has been in the lab is a Java-based distributed application server called Gigaspaces XAP. We’ve been testing it on Joyent Accelerators over the last few weeks. I have been working closely with the Gigaspaces team to build out a Gigaspaces XAP deployment and run a fully functional test application, a Monte Carlo simulation.
The team at Gigaspaces I have been working with, Owen, Dekel, and Geva, has been a real pleasure to work with. I was excited about the opportunity because I’ve been following Gigaspaces since December 2004, when I first contacted Geva and downloaded an early version of their software, then called Gigaspaces Enterprise Application Grid 4.
Fast forward to now and we’re working with Gigaspaces XAP 6.5 and having a good bit of fun as well.
About the Monte Carlo Simulation: (Copied directly, with minor edits, from Owen’s blog entry about this application. Read: Owen’s work)
The application is rich in the patterns it utilizes:
Historical information for several funds is used to predict the future behavior of those funds over a 20-year period, with minute-by-minute price fluctuations calculated and applied. The funds are grouped in different combinations into portfolios, and each entire portfolio is assessed for its growth over the same 20-year period. In the end, the best combination of funds within a portfolio is determined and reported, and the best- and worst-behaving funds are also shown.

Technically, several portfolios and their associated funds are written as tasks into the system for analysis; the logic needed to process them and their historical information is sent along with them. Workers take the tasks, process the associated logic, and return a result showing the outcome of applying the expected price variation to each fund in the portfolio over the designated 20-year period. This is performed over a set of possible portfolios, each containing a different combination of funds. Each set is repeatedly simulated to avoid undue skewing and to allow better control of the randomness.

In the end, the client requests a summary of all the results, which is produced by a service operating in parallel across the available spaces, sorting and aggregating all the results to determine the most common, best [highest-value portfolio], and worst [lowest-value portfolio]. Finally, this information is returned to the client and printed to the console.
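To make that pattern a bit more concrete, here is a minimal, self-contained Java sketch of the kind of per-portfolio work a single worker performs: walking each fund minute by minute for 20 years and summing the portfolio’s end value. All class and method names, and the simple random-walk price model, are our own illustrative assumptions, not Owen’s actual code or the Gigaspaces API.

```java
import java.util.Random;

// Hypothetical sketch of the work one worker does for one portfolio task.
public class PortfolioSimulation {

    static final int YEARS = 20;
    static final int MINUTES_PER_YEAR = 252 * 390; // trading days x trading minutes (assumed)

    /** Historical statistics for one fund, used to drive its random walk. */
    static class Fund {
        final String name;
        final double startPrice;
        final double perMinuteDrift;      // estimated from historical data
        final double perMinuteVolatility; // estimated from historical data

        Fund(String name, double startPrice, double drift, double vol) {
            this.name = name;
            this.startPrice = startPrice;
            this.perMinuteDrift = drift;
            this.perMinuteVolatility = vol;
        }
    }

    /** Walk one fund's price minute by minute for 20 years; return its final price. */
    static double simulateFund(Fund f, Random rnd) {
        double price = f.startPrice;
        for (long m = 0; m < (long) YEARS * MINUTES_PER_YEAR; m++) {
            // apply the expected per-minute fluctuation plus random noise
            price *= 1.0 + f.perMinuteDrift + f.perMinuteVolatility * rnd.nextGaussian();
        }
        return price;
    }

    /** The portfolio's outcome is the sum of its funds' simulated end prices. */
    static double simulatePortfolio(Fund[] portfolio, Random rnd) {
        double total = 0.0;
        for (Fund f : portfolio) {
            total += simulateFund(f, rnd);
        }
        return total;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so repeated runs are comparable
        Fund[] portfolio = {
            new Fund("FundA", 100.0, 1e-7, 1e-4),
            new Fund("FundB", 100.0, 2e-7, 2e-4),
            new Fund("FundC", 100.0, 5e-8, 5e-5),
        };
        System.out.printf("Portfolio value after %d years: %.2f%n",
                YEARS, simulatePortfolio(portfolio, rnd));
    }
}
```

In the real deployment, many such tasks are written into the space, workers take and process them in parallel, and a separate service aggregates the results, which is what the runs below exercise.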
Test Environment Setup:
We used three 4 GiB Joyent Accelerators with no custom modifications for this test, running Gigaspaces XAP 6.5. We did have to make some very minor modifications to scripts to accommodate the Accelerator environment. We had our first test environment up well within an hour. Later, we came back, deployed Gigaspaces XAP across all three machines, and the always entertaining Owen managed the deployment and running of the Monte Carlo simulation.
These test results are taken directly from Dekel's notes during the tests. All tests were run with a single GSM (Grid Service Manager) in unicast mode.

*Run #1:* 20 iterations × 50 variants = 1,000 different sets (in each set we run 20 years of analysis for 3 stocks, including aggregating the results)
GS (Grid Services) Cluster: 5 partitions, each with a backup; 1 worker per partition; 14 GSCs (Grid Service Containers)
Joyent: 2 × 4 GiB Accelerators (8-core machines)
Run completed in: 2 seconds (compared to 1.5 minutes on a laptop)

*Run #2:* 200 iterations × 50 variants = 10,000 different sets
GS Cluster: 5 partitions, each with a backup; 1 worker per partition; 14 GSCs
Joyent: 2 × 4 GiB Accelerators
Run completed in: 22 seconds

*Run #3:* 200 iterations × 500 variants = 100,000 different sets
GS Cluster: 5 partitions, each with a backup; 1 worker per partition; 14 GSCs
Joyent: 2 × 4 GiB Accelerators
Run completed in: 266 seconds

*Run #4:* Now we scale hardware and software in a click. 200 iterations × 500 variants = 100,000 different sets
GS Cluster: 5 partitions, each with a backup; co-located workers scaled dynamically from 1 per partition up to 60 total (no application downtime, no reconfiguration); 23 GSCs
Joyent: 3 × 4 GiB Accelerators
1st iteration, scaled from 1 to 19 workers: 133 seconds
2nd iteration, scaled from 19 to 59 workers: 108 seconds
3rd iteration, steady at 60 workers (no effort spent scaling workers): 86 seconds

*Run #5:* True SBA (Space-Based Architecture): we scaled the number of partitions. 200 iterations × 500 variants = 100,000 different sets
GS Cluster: 8 partitions, each with a backup; 1 co-located worker per partition (15 total); 23 GSCs
Joyent: 3 × 4 GiB Accelerators
We scaled hardware and software by the same unit of scale, roughly 50% (2 to 3 Accelerators, 5 to 8 partitions). Linear scalability! Expected: 50% faster than Run #3, i.e. 266 s ÷ 1.5 ≈ 177 seconds
Run completed in: 160 seconds

*Run #6:* 200 iterations × 500 variants = 100,000 different sets
GS Cluster: 15 partitions, each with a backup; 1 co-located worker per partition (15 total); 23 GSCs
Joyent: 3 × 4 GiB Accelerators
Reconfiguring for 15 partitions took less than one minute and only a few clicks: undeploy, change a number in pu.xml, redeploy; no need to shut down the GS cluster or GSCs. (Expected: roughly a third of Run #3's time)
Run completed in: 134 seconds

*Run #7:* 200 iterations × 500 variants = 100,000 different sets
GS Cluster: 15 partitions, each with a backup; 1 co-located worker per partition (15 total), plus remote workers; 23 GSCs
Joyent: 3 × 4 GiB Accelerators
Reconfiguring for 15 partitions took less than one minute: undeploy, change pu.xml, redeploy; no need to shut down the GS cluster or GSCs.
1st iteration, scaled from 1 to 15 workers: 87 seconds
2nd iteration, scaled from 15 to 34 workers: 81 seconds
3rd iteration, steady at 60 workers (no effort spent scaling workers): 67 seconds

Note: 1 GiB = 1,073,741,824 bytes; 1 GB = 1,000,000,000 bytes.
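As an aside, the "expected" times quoted for Runs #5 and #6 are just linear-scaling arithmetic against Run #3's 266-second baseline. A tiny sketch of that arithmetic (run figures copied from above; the class name is our own):

```java
// The linear-scalability arithmetic behind the expected times quoted above.
public class ScalingExpectations {
    public static void main(String[] args) {
        double run3Seconds = 266.0; // Run #3 baseline: 5 partitions, 2 Accelerators

        // Run #5 grew capacity by ~50% (2 -> 3 Accelerators, 5 -> 8 partitions),
        // so ideal linear scaling predicts the baseline time divided by 1.5.
        System.out.printf("Run #5 expected: %.0f s (measured: 160 s)%n", run3Seconds / 1.5);

        // Run #6 tripled the partitions (5 -> 15), predicting a third of the time.
        System.out.printf("Run #6 expected: %.0f s (measured: 134 s)%n", run3Seconds / 3.0);
    }
}
```

Runs #5 and #6 both came in faster than the naive linear prediction would suggest relative to their own baselines, which is about as good as scale-out results get.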
Conclusions and Thoughts:
Cloud Computing is the act of deploying, elastically scaling, managing, and running Cloud Computing Applications on Cloud Computers; Cloud Computing Applications are those applications designed to run well on Cloud Computers. In this case, Joyent provided infrastructure as a service (IaaS), Gigaspaces provided the Gigaspaces XAP platform as a service (PaaS), and a savvy developer, Owen, created a cloud computing application, the Monte Carlo simulation. We definitely did some Cloud Computing in this test!
Emerging from the Joyent Labs program, we now know that Gigaspaces XAP runs and scales well on Joyent Accelerators. We also showed during the tests that the Sun Java stack is particularly powerful because JVMs can scale vertically well beyond the 2 GiB limitations some other platforms encounter, which is very useful for memory-hungry applications. In our tests we hit the limits of the 4 GiB Accelerators, but we could easily have scaled vertically by moving to an 8, 16, or 32 GiB Accelerator with more RAM and CPU, or horizontally by simply adding more Gigaspaces XAP-enabled Accelerators and GSC instances.
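As a quick illustration of the vertical-scaling point: a 64-bit Sun JVM will happily take a heap well past 2 GiB with a single flag. The snippet below (the class name and the 6 GiB figure are our own arbitrary examples) simply reports the heap ceiling the JVM was actually given:

```java
// Run with, e.g., `java -Xmx6g HeapCheck` on a 64-bit JVM to confirm the
// heap ceiling sits wherever you put it, not at 2 GiB.
public class HeapCheck {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory(); // heap limit the JVM will enforce
        System.out.printf("Max heap: %.2f GiB%n", maxBytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```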
At one point, we wanted a little more horsepower for the application. So, in about 10 minutes total, I set up the Gigaspaces software on a new node I had handy and turned it over to Owen, who quickly configured his application and brought the new node up for more capacity during the test.
Another exciting thing about this exercise in the lab was that it was, in general, easy. We added and removed nodes, scaled linearly, re-deployed the application in seconds to minutes, processed large amounts of data, and did all of this in the cloud with really minimal effort.
There is much more to the story, but I want to keep this blog post under 20 pages. I look forward to continuing to work with the Gigaspaces team and exploring the many very interesting possibilities this work presents.
Soon I will also be posting Gigaspaces installation instructions to the knowledge base. So, stay tuned!