Posts Tagged 'BigTable'

Google Releases BigTable, GFS Support – For Python

See the TechCrunch post here. Essentially this is a little bit of a letdown after the big (unsubstantiated) lead-up over the weekend, due to the Python-only support. Yes, it’s just the first of what may be many languages, but I suppose they could have foreseen the reaction from the Ruby, PHP, or other (C# *cough*) crowds that might be very interested in building out apps within Google’s cloud.

So, if you’re Djangoing, you’re probably dancing…otherwise, hurry up and wait, unless you have the free time to learn a new platform.

We could have used this at Seattle Startup Weekend, where we developed Skillbit (RIP). That was a Django app. For me? If I have to do any heavy lifting using methods out of Programming Collective Intelligence, I may give this a go. Later.

UPDATE: Here’s an excellent write-up of the Google App Engine from Brady Forrest.

Google BigTable Orientations

Coming on the heels of Google’s supposed upcoming BigTable announcement, I recommend the following if you want to learn more:

First, a 1-hour video talk given last summer by Jeff Dean about Google’s overall distributed architecture, including Google File System, MapReduce, and BigTable:

Next, a website I found called highscalability.com which talks about a lot of these topics in a blog format. There’s an interesting summary of Google’s architecture with links here. Ironically, this site seems to be down/overloaded a lot.

Next, a whitepaper on BigTable. Lots of details for the inquiring mind, but still approachable for a software person who is not an expert in distributed systems, or in BigTable in particular. This was linked from the TechCrunch article.

Finally, there’s this separate 1-hour video, also with Jeff Dean, that was given in 2005 at the UW.

I haven’t actually watched this one yet, having opted to watch the 2007 one linked earlier.

Have fun! P.S. I would appreciate notes about other good BigTable orientation information.

BigTable “Execute in the Cloud” Support via Sawzall

I’m going through the BigTable spec/API document and there’s this interesting nugget:

Finally, Bigtable supports the execution of client-supplied scripts in the address spaces of the servers. The scripts are written in a language developed at Google for processing data called Sawzall [28]. At the moment, our Sawzall-based API does not allow client scripts to write back into Bigtable, but it does allow various forms of data transformation, filtering based on arbitrary expressions, and summarization via a variety of operators.

Hmmm... this is interesting. Drop some data into BigTable and tie it to a Sawzall script you’ve created. But how do you get the results back if Sawzall can’t write _into_ BigTable? Have to figure that one out.
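To make the question a bit more concrete, here is a rough Python sketch of the pattern the quote describes: a per-row script that filters on an expression and summarizes by emitting values to an aggregator that lives outside the table. As I understand the Sawzall paper, scripts emit to aggregator "tables" rather than writing back to their input, so that may be how results come back, but that is my reading and not something the BigTable document spells out. Everything here (ROWS, SumTable, process, run) is made up for illustration; none of it is a real Sawzall or BigTable API.

```python
from collections import defaultdict

# Hypothetical rows standing in for data read from a table.
# This is plain Python data, not a real BigTable client call.
ROWS = [
    {"url": "com.example/a", "lang": "en", "bytes": 1200},
    {"url": "com.example/b", "lang": "de", "bytes": 800},
    {"url": "org.sample/c", "lang": "en", "bytes": 300},
]


class SumTable:
    """Hypothetical aggregator that sums emitted values per key."""

    def __init__(self):
        self.totals = defaultdict(int)

    def emit(self, key, value):
        self.totals[key] += value


def process(row, bytes_by_lang):
    """Per-row script: filter on an expression, then emit to the aggregator."""
    if row["bytes"] < 500:  # filtering based on an arbitrary expression
        return
    bytes_by_lang.emit(row["lang"], row["bytes"])  # summarization


def run(rows):
    """Run the script over every row and return the aggregated result.

    The input rows are never modified; results only leave via the aggregator.
    """
    table = SumTable()
    for row in rows:
        process(row, table)
    return dict(table.totals)


if __name__ == "__main__":
    print(run(ROWS))
```

The point of the sketch is only that the script never touches its input: the output lives in a separate aggregator, which the caller reads once the pass over the rows is done.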

For a computationally intensive product like the one I’m developing, this is very attractive. And I don’t have to switch platforms like I would to get cloud processing done in Amazon’s EC2. I want to find out more about Sawzall.

Google Releasing Amazon SimpleDB Killer

I say “killer” only because if Google gets in, it’s going to be good. BigTable, an internal Google database product that they use to support fast reads and writes on petabytes of data (yes, peta-), is going to be released as a consumer offering in the same mode as Amazon’s SimpleDB. See the TechCrunch writeup here.

Good news for web startups? Certainly. Good news for Amazon? Probably, only insofar as a new industry – cloud computing – will support lots of competitors, and Google getting in only further validates the concept (as if it needed validating to begin with).

If Google can make it as easy to use as their other consumer offerings, like Maps, then we’re all in for a treat.


