What I Will (try to) Tell You
- Some General Background
- What Is Cloud Computing?
- Where Did It Come From?
- What is it good for?
- “Cloud Econ 101”
- Challenges and Opportunities
- Should I Cloudify?
- Real Experiences
General Background
- RAD Lab research Goals
- Why we might be on to something
Research engagement with ...
Google • Microsoft • Sun
Amazon • Cisco • Cloudera • eBay • Facebook • Fujitsu
HP • Intel • NetApp • SAP • VMWare • Yahoo! Research
- And in the interest of full disclosure …
These companies do support us financially.
All Hype?
If you ask one notable industry personality, this is all just more marketing hype and buzzwords …
“...we've redefined Cloud Computing to include everything that we already do... I don’t understand what we would do differently ... other than change the wording of some of our ads.”
--Oracle CEO Larry Ellison, Sept. 2008
There is a lot of hype, but we're here to cut through all that.
Definitions
- Software as a Service (SaaS)
- Utility Computing
- Public and Private Clouds
- Cloud Computing
- “The Cloud”
- Others as they come up
Defining SaaS
- Applications/Software Delivery over the Internet
- Name may be recent, but the idea is decades old
- Examples:
Google Apps • Gmail for business/education (Part of Google Apps)
SalesForce (CRM) • even CalMail at UC Berkeley
Also, various Payroll services • even Bill Payment services
In some cases, applications are “just outsourced”. But some apps, like CRM for sales/support/marketing, need to be mobile and also need a safe haven for customer/client data; they couldn’t really exist as traditional apps.
Defining Utility Computing
- Again an old idea, but now it exists
- (Usually Believable) illusion of infinite resources
- On-demand (order of minutes)
- Pay-As-You-Go, fine-grained billing, no upfront cost
- Examples:
Amazon EC2/AWS • Microsoft Azure • 3tera
Google AppEngine • Force.com • Cloudera
Utility Computing provides the base on top of which various Software Services can be built and provided.
Defining Clouds
- Clouds
- Instances of HW/SW Infrastructure
- Provide the service called Utility Computing.
- Public Clouds:
- available to external customers
- have a pay-as-you-go billing model
- Private Clouds:
- available only within an organization
- billing may be obfuscated - “Overhead”
So What is Cloud Computing?
It is the combination of Public Clouds providing the service called Utility Computing on top of which various Software Services (Software as a Service) are built and deployed.
Someone (“Cloud Provider”) builds a Cloud (a whole other talk) to provide Utility Computing.
Customers (“Cloud Users”) build their own application on top of Cloud(s) to provide Software as a Service (SaaS) to end-users.
Defining “The Cloud”
Originated with the use of a drawing of a cloud to represent the telephone system and later network systems such as the Internet. Now a vague term used to abstract away any “complex system that takes care of something for me.”
I’ll use it here to mean “in the context of Cloud Computing”.
Where did all this come from?
What made all these old ideas possible in their present form?
- Big Internet Companies with Big Datacenters
Huge Economies of Scale
- Widespread availability of broadband
Everyone online, mobile devices
- Commoditization/Virtualization of x86 HW
- Maturation of common software stacks (free and $$)
- Ability to bill at fine-grained level
Not All Clouds Are Alike
- Type of Virtualization
- Amazon EC2, 3Tera: Xen x86 VM, very DIY
- Microsoft Azure: Bytecode (.NET) VM
- Google AppEngine, Force.com, Cloudera: specific focus
- Low-level/DIY versus High-level/more built-in management
- Storage
- Amazon: S3 (http), Elastic Block Store, SimpleDB
- MS Azure: MS SQL, Azure Storage
- Google: BigTable (non-relational DB)
Cloud Econ - Provisioning
- Cloud User
- Provisioning for peak loads - wasted capacity
- Underprovisioning - lost business and eventually customers
- Use Cloud, as load rises, request more Capacity
- As load falls, return Capacity, pay only for what you use.
- Cloud Provider
- Save Energy - turn off systems to match (load + buffer)
- Turning systems off not as bad as once thought
- Not just saving energy from systems but also from AC systems
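The Cloud User trade-off above can be sketched with a toy calculation. The hourly demand figures below are hypothetical (not from the talk); the point is only to show how a fixed fleet either idles or turns work away, while an elastic user pays for exactly the server-hours consumed.

```python
def provisioning_report(load, fixed_capacity):
    """Compare a fixed fleet against hourly demand (all values in servers)."""
    # Server-hours that sit idle because capacity exceeds demand.
    wasted = sum(max(fixed_capacity - d, 0) for d in load)
    # Server-hours of demand that cannot be served at all.
    unserved = sum(max(d - fixed_capacity, 0) for d in load)
    return wasted, unserved

# Hypothetical demand over eight hours, in servers needed per hour.
hourly_load = [20, 30, 50, 100, 80, 40, 30, 20]

# Provision for the peak: nothing is turned away, but much sits idle.
peak_wasted, peak_unserved = provisioning_report(hourly_load, max(hourly_load))

# Provision for (roughly) the average: less waste, but demand is lost.
avg_capacity = sum(hourly_load) // len(hourly_load)
avg_wasted, avg_unserved = provisioning_report(hourly_load, avg_capacity)

# Elastic use: request and return capacity hourly, pay only for what is used.
elastic_server_hours = sum(hourly_load)

print("peak provisioning:", peak_wasted, "idle server-hours")
print("average provisioning:", avg_unserved, "unserved server-hours")
print("elastic:", elastic_server_hours, "server-hours paid for")
```

With this made-up profile, peak provisioning leaves more server-hours idle than it actually uses, which is the "wasted capacity" bullet in concrete form.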
Cloud Econ - When Is DIY Cheaper?
- A quick back of the envelope calculation assuming:
- $.085/hr for EC2 Instance (32bit 1.7GHz, 1.7GB, 160GB 2007)
- $.076/hr to buy server ($2000/server, 3 yr depreciation)
- If peak load is more than about 1.1 times the average, EC2 is cheaper
- This doesn’t factor in cost of Storage, Network, Power/AC, Staff.
- Lots of ‘hidden costs’, hard to deconstruct
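One reading of the back-of-the-envelope numbers above, as a minimal sketch: assume an owned fleet must be sized for peak load and is paid for every hour, while EC2 usage tracks average load. Those modeling assumptions and the function name `ec2_is_cheaper` are illustrative and not from the slides; the dollar figures are.

```python
HOURS_PER_YEAR = 24 * 365

EC2_RATE = 0.085          # $/hr per EC2 instance (from the slide)
SERVER_PRICE = 2000.0     # $ per purchased server (from the slide)
DEPRECIATION_YEARS = 3    # straight-line depreciation (from the slide)

# Hourly cost of an owned server, hardware only.
owned_rate = SERVER_PRICE / (DEPRECIATION_YEARS * HOURS_PER_YEAR)

# EC2 wins when EC2_RATE * avg < owned_rate * peak,
# i.e. when peak / avg exceeds this ratio.
break_even_ratio = EC2_RATE / owned_rate

def ec2_is_cheaper(peak_load, avg_load):
    """Assumes owned capacity is sized for peak and EC2 is billed on average use."""
    return EC2_RATE * avg_load < owned_rate * peak_load

print(f"owned server: ${owned_rate:.3f}/hr")
print(f"break-even peak/average ratio: {break_even_ratio:.2f}")
```

The owned-server rate comes out at about $.076/hr and the break-even ratio at about 1.1, matching the slide. As the slide notes, storage, network, power/AC, and staff are left out.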
Benefits to Cloud Provider
Utilize off-peak capacity, Sell SW (.NET), Defend a franchise, reuse existing infrastructure
Cloud Econ - Economies of Scale
Massive Economies of Scale - 5-7x factor [Hamilton 2008]
Resource | Cost in Med DC | Cost in Very Large DC | Ratio |
Storage | $2.20/GB-month | $0.40/GB-month | 5.7x |
Network | $95/Mbps-month | $13/Mbps-month | 7.1x |
Staff | ~140 servers/admin | ~1000 servers/admin | 7.1x |
Cloud Econ - Location = Power
10,000 systems need a lot of power
$/KWH | Where | Possible Reasons Why |
3.6¢ | ID | Hydroelectric power; not sent long distance |
10.0¢ | CA | Electricity transmitted long distance over the grid; limited transmission lines in Bay Area; no coal fired electricity allowed in California. |
18.0¢ | HI | Must ship fuel to generate electricity |
[Source: DOE]
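To put the table in annual terms, here is a rough sketch. The 300 W average draw per system is an assumption chosen for illustration (the slides give no per-system figure), and cooling overhead is ignored; only the $/kWh rates and the 10,000-system count come from the slides.

```python
HOURS_PER_YEAR = 24 * 365
SYSTEMS = 10_000
WATTS_PER_SYSTEM = 300  # ASSUMPTION: illustrative average draw, not from the talk

# Electricity rates from the table above, in dollars per kWh.
RATE_PER_KWH = {"ID": 0.036, "CA": 0.100, "HI": 0.180}

def annual_power_cost(rate_per_kwh):
    """Yearly electricity bill for the whole fleet, excluding cooling."""
    total_kw = SYSTEMS * WATTS_PER_SYSTEM / 1000
    return total_kw * HOURS_PER_YEAR * rate_per_kwh

for state, rate in RATE_PER_KWH.items():
    print(f"{state}: ${annual_power_cost(rate):,.0f} per year")
```

Whatever wattage is assumed, the ratio between sites is fixed by the rates: the same fleet costs five times as much to power in Hawaii as in Idaho.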
Cloud Econ - Cap/Op Ex
“But what about Capital versus Operational Expenses!!??”
We get this one a lot
- In the long run, a dollar spent is a dollar spent.
- A $ spent in highly utilized “Cloud Compute Time” or
- A $ spent on moderately used system that’s obsolete in 3-5 yrs
- But yes, if your capacity needs are well understood,
then DIY can be cheaper than Cloudifying
Cloud Econ - Risk Transfer
- Cost Associativity in Clouds
- cost(1K servers x 1 hr) = cost (1 server x 1K hrs)
- Washington Post: Hillary Clinton's travel docs processed <10 hrs after release of raw docs, posted to WWW within 26 hours.
- RAD Lab - publish academic results using 1000+ servers
- Major enabler for SaaS startups
- Animoto Facebook plugin - traffic 2x every 12 hours for 3 days
- Scaled from 50 to 3500+ servers … then scaled back down
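Two of the claims above can be checked with simple arithmetic. This is an idealized sketch: it assumes a flat per-server-hour price (the $.085 figure is reused here purely as an example rate) and that server count tracks traffic exactly, which the Animoto numbers only approximately bear out.

```python
RATE_PER_SERVER_HOUR = 0.085  # example flat rate; any flat rate gives the same result

def cost(servers, hours):
    """Pay-as-you-go cost with no volume or duration penalty."""
    return servers * hours * RATE_PER_SERVER_HOUR

# Cost associativity: 1K servers for 1 hour costs the same as 1 server for 1K hours.
burst = cost(1000, 1)
marathon = cost(1, 1000)

# Animoto-style growth: traffic doubles every 12 hours for 3 days.
start_servers = 50
doublings = (3 * 24) // 12
servers_after_surge = start_servers * 2 ** doublings

print(burst == marathon)
print(doublings, "doublings ->", servers_after_surge, "servers")
```

Six doublings from 50 servers gives 3,200, the same order as the 3500+ reported. Without pay-as-you-go pricing, that peak fleet would have had to be bought up front.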
Cool! What else works there?
- Big/Huge one-off or periodic jobs
- Batch Processing (example: Hadoop/MapReduce)
- All that satellite/seismic/collider data
- Startups (or projects) with more need for service than HW
- many “Web Apps” are deployed partially or wholly via EC2
- Prototyping
- Research at scale that can’t be done in-house
What else makes sense or is compelling?
- Desktop apps with large compute needs
- Maybe more generic desktop apps? MS Office?
- Mobile applications
- use mobile device as UI, compute/store “in the cloud”
- mobile device becomes generic and easily replaced.
- Early Example: Danger Hiptop (aka Sidekick), MobileMe
What doesn’t work so well … yet
- Anything requiring bulk data transfer. More on this later
- I/O bandwidth, low network latency guarantees
- Memory/disk intensive work, no I/O bw guarantees available
- HPC/Scientific computing with inter-thread communication
- Amazon announced a Cluster Compute Instance - may help.
- Fault/Delay-Tolerant, Low-communication methods under development
- Anything that needs real entropy
Challenges and Opportunities
Challenge | Opportunity |
Availability | Multiple providers & DCs |
Data lock-in | Standardization |
Data Confidentiality and Auditability | Encryption, VLANs, Firewalls; Geographical Data Storage |
Data transfer bottlenecks | FedEx-ing disks, Data Backup/Archival |
Performance unpredictability | Improved VM support, flash memory, scheduling VMs |
Scalable storage | Invent scalable store - much work done in this area |
Bugs in large distributed systems | Invent Debugger that relies on Distributed VMs |
Scaling quickly | Invent Auto-Scaler that relies on ML; Snapshots |
Reputation Fate Sharing | Offer reputation-guarding services like those for email |
Software Licensing | Pay-for-use licenses; Bulk use sales |
Should I “Cloudify”
There are a few parts to that question:
- Make use of some already available service “in the cloud”
- GMail for Education
- Salesforce
- Online backup
- Move already existing application “into the cloud”
(or develop new app in the cloud)
- “Web Apps”
- Database-driven applications
- Build a (Private) Cloud and have it all in-house?
Building Clouds - Private versus Public
Benefit | Public | Private |
Economy of scale | Yes | No |
Illusion of infinite resources on-demand | Yes | Unlikely |
Eliminate up-front commitment by users (1) | Yes | No |
True fine-grained pay-as-you-go (2) | Yes | ?? |
Better utilization (workload multiplexing) | Yes | Depends on size (2) |
Better utilization & simplified operations through virtualization | Yes | Yes |
(1) Doesn't factor in NRE costs to get app cloud-ified
(2) Implies ability to meter & incentive to release idle resources
Should I build a Cloud?
- Why? Is cost savings expected?
- Similar Economies of Scale unlikely for most
- Beware double paying for bundled costs
- Internal incentive to release unused resources?
- If not…don’t expect improved utilization
- Implies ability to meter (technical) and charge (nontechnical)
- But consider Surge Computing
Surge Computing
- Easy: Use cloud for separate/one-off jobs?
- Harder: Provision steady state, overflow to cloud?
- implies high degree of location independence, SW modularity
- must overcome most Cloud obstacles
- Technical means exist
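The "harder" case above (provision for steady state, overflow to the cloud) can be sketched as a simple split of each hour's demand. The demand figures, the local capacity, and the function name `split_load` are all hypothetical; a real deployment would also have to handle the data placement and location-independence issues the bullets mention.

```python
def split_load(demand, local_capacity):
    """Serve what fits in-house; send the remainder to the cloud."""
    local = min(demand, local_capacity)
    cloud = demand - local
    return local, cloud

# Hypothetical hourly demand in servers, with an in-house steady-state fleet.
hourly_demand = [40, 55, 90, 130, 70]
LOCAL_CAPACITY = 60

plan = [split_load(d, LOCAL_CAPACITY) for d in hourly_demand]
cloud_server_hours = sum(cloud for _, cloud in plan)

for hour, (local, cloud) in enumerate(plan):
    print(f"hour {hour}: {local} local, {cloud} in cloud")
print("cloud server-hours to pay for:", cloud_server_hours)
```

Only the overflow hours incur cloud charges, so the in-house fleet can be sized for typical load rather than for the surge.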
In the Cloud, best practices are critical
- Authentication, data privacy/sensitivity
- Data on public networks, stored in public infrastructure
- Weakest link in security chain == ?
- Support/lifecycle costs vs. alternatives
- Strong appliance market (e.g. spam filters)
- Accountability gap for support
Bottom Line - all those things we should do in our own shops but don’t always do - we need to do them when we move our data into the cloud.
Real Life Experiences - Research
- Eucalyptus deployed on small (~40 node) cluster
- LOTS of Amazon EC2/S3 usage
- We can surge and migrate from one to the other
- Same software, same VMs/AMIs
- Primarily for work that doesn't need Dept Infrastructure
- Kerberos/LDAP, File Servers
- Have to “package up” data/sw to use in the Cloud
- License Servers
- Work that relies upon these - open problem
Results - Research
- Higher Quality Research
- Scale (100s to 1000s of systems) we can’t do in-house
- Faster Results -> we can solve new problems
- Machine Learning/Data Mining research
- e.g. near-realtime system log analysis
- Save money? … Wasn’t a primary goal
- Goal was to increase scale of research
- But, cost to get scale in-house - $1M
- Easy to quantify cost of research
Obstacles Encountered
- Accounting/Billing that rewards cost-effective cloud use
- Funding/Grant Agency culture hasn’t caught up yet
- Funding for “Cluster” often can’t be used to buy time on EC2
- Tools require a lot of experience to use
- But good opportunity for “appliances”
- Software Licensing Sucks
- typically tied to user or fixed to CPUs/systems
- we have a lot of experience dealing with this
but typical user may not.
Education in the Cloud
- CS 98/198, Software as a Service
- Started as Ruby on Rails class
- gave them three cluster nodes to use - I was the sysadmin
- Moved it to Amazon AWS (EC2/S3)
- More realistic environment, Can actually watch a DB fall over.
Would have needed 200 servers for ~20 project teams
- VM image simplifies courseware distribution
- Students can have root - OMG!
- repair damage by re-instantiating image
Possibilities for Education
- No longer strictly limited by lack of in-house resources
- Class projects can take on life beyond end of semester
- Allows for wider collaboration - can be in other depts
- Students can provide more informed input on courseware
Summary - What’s new?
- New Scenarios are possible
- Startups and prototypes without HW/capital investment
- One-off/periodic tasks can exploit “cost-associativity”
- Research/Education at otherwise unheard-of scale
- Opportunity for cost savings/better utilization
- Scale up and down to match usage
- Economic motivation to scale down
Summary - Obstacles
- Dependency on 3rd parties
- Data expensive to move, no universal format/method (yet)
- Management interfaces not standardized (yet)
- Still may rely on proprietary software
- Software licensing still sucks
- Security Considerations, IT Best Practices
- Hard to compare cost of Cloud vs in-house
- Administration/Accountability boundaries
Summary - Cloudifying
- Save Money?
- Economies of Scale unlikely, but can get better utilization
- Need Resource Accounting/Incentive to Release Resources
- Consider use of Cloud for “Surge” computing
- Even if you don’t move to Cloud, use as a driver
- (Re)-examine adherence to best practices
- Identify bundled/hidden costs - “overhead”
Conclusions
- Is this all a bunch of hype?
- Is this all a fad?
- Is it for everyone?
- Not yet, but that will change