Science Communications - Publicity for Technology


Computing in the Cloud

"Our world is awash in data," CMU Dean of Computer Science, Randy Bryant said."  Bryant’s comment refers to the one zettabyte (that’s 1021 or one billion trillion bytes) of digital data being generated around the world each year.  The numbers are so mind-bogglingly large that they demand new ways of thinking about how we create, gather, store, access, move and crunch data.  One promising solution to the problem is cloud computing, a newly emergent set of ideas in which data management methods trump hardware and software solutions.  In its simplest form, cloud computing is about remote access.   So anytime you search, you are engaging in a rudimentary form of cloud computing, because you are controlling a large set of remote servers from your desktop.


But Dean Bryant's interest in cloud computing goes beyond searching for information and finding good deals on the Internet.  He wants scientific researchers to be able to dive into massive amounts of data and transform it into useful information, just the way Google, Amazon and eBay do.  Key to cloud computing’s promise is the continuing decline in hardware cost.  “Today you can store the entire Library of Congress collection on a set of hard drives this tall,” Bryant said as he held his hand alongside the arm of the chair in which he was sitting. “Modern disk drives have capacities measured in terabytes, and they cost less than $100 per terabyte, but the rate of transfer has not kept pace,” he continued.
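A rough calculation illustrates the gap Bryant is pointing to. The numbers below are illustrative assumptions (a one-terabyte drive read sequentially at about 100 megabytes per second, a plausible sustained rate for drives of that era), not figures from Bryant:

```python
# Illustrative look at the capacity-vs-transfer-rate gap (assumed figures).
CAPACITY_BYTES = 10**12        # a one-terabyte drive
TRANSFER_RATE = 100 * 10**6    # assume ~100 MB/s sustained sequential read

seconds = CAPACITY_BYTES / TRANSFER_RATE
print(f"Scanning the full drive once takes about {seconds / 3600:.1f} hours")
# -> about 2.8 hours just to read a single disk end to end
```

In other words, disks got much bigger far faster than they got faster to read, which is exactly the disparity the next paragraph's methods are meant to work around.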


Enter cloud computing. Three computing methods are key to reconciling the disparity between storage capacity and transfer rate: 1) virtualization, which allows a single processor to behave as though it were several; 2) parallel computing, which splits big problems into a lot of little ones, distributes them to different virtual processors for computation, then compiles all the tiny answers to come up with the big answer; and 3) Data-Intensive Supercomputing (DISC), which, put simply, brings the algorithms to the data rather than the other way around. So if you had a very large problem, you could use parallel computing to break it up into a lot of little ones, distribute them to the make-believe processors you created through virtualization, and use DISC to send a special algorithm for each one out into the cloud to hunt for the data it needs and compute its little answer while it's there. Each little answer then comes back home without any of the heavy-weight arithmetic, and the control processor combines the little answers into the big answer.
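The workflow that paragraph sketches is essentially a "split, compute near the data, combine" pattern. Below is a minimal, single-machine stand-in (the data, the per-chunk summation and the use of Python's multiprocessing pool are all illustrative assumptions, not CMU's or Yahoo!'s code): the big problem is split into chunks, each worker computes its little answer, and the control process combines the little answers into the big one.

```python
# Minimal sketch of the split / compute-near-the-data / combine pattern
# described above (illustrative stand-in only).
from multiprocessing import Pool

def little_answer(chunk):
    """Each worker computes its partial result on its own piece of the data."""
    return sum(chunk)   # stand-in for the 'special algorithm' sent to the data

def big_answer(data, workers=4):
    # 1) Split the big problem into many little ones.
    chunk_size = max(1, len(data) // workers)
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # 2) Hand each little problem to a (virtual) processor and compute in parallel.
    with Pool(processes=workers) as pool:
        partials = pool.map(little_answer, chunks)
    # 3) The control process combines the little answers into the big answer.
    return sum(partials)

if __name__ == "__main__":
    print(big_answer(list(range(1_000_000))))   # -> 499999500000
```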


To ensure that cloud computing doesn't become pie-in-the-sky, Bryant and his colleagues have spent the past year collaborating with Yahoo! on its 4,000-processor, 1.5-petabyte (1.5 million gigabytes) M45 cluster.  Much of the group's effort has been dedicated to developing ways to build, use, manage, search and secure the cloud infrastructure.  At the same time, machine translation researchers at CMU have put M45 to immediate use by teaching it to translate French to English and vice versa.


While cloud computing holds the promise of solving scientific and social problems nobody ever imagined tackling before, for the rest of us it is likely to mean that someday the cloud will make our mundane mouse-and-keyboard realities happier.  Imagine a light-duty cloud-access machine that you never have to upgrade.  A cloud where every piece of software you ever thought about owning is available on a pay-as-you-go basis.  No more storage problems.  No software installations or updates.  Your files safely tucked away in multiple puffs inside the cloud.  Automatic maintenance and backups.  No more crawling under your desk.  No more calls to tech support.  Ahhh! Life in the cloud might not be so bad.


This article first appeared in Tom Imerito’s TEQ column, Science Fare

© Copyright 2009, Thomas P. Imerito / dba Science Communications



 