
Author Information

Georg Singer is a researcher in the area of cloud computing at the University of Tartu in Estonia. His research focuses on ROI, economic and business aspects of cloud computing.

To cloud or not to cloud – as a function of WAN costs

Despite the hype around cloud computing and the millions of dollars pumped into marketing these utility computing services, not every business operation will benefit from moving to the cloud. As the dust settles, we are left with the question of whether to cloud or not to cloud. This question requires rigorous case-by-case consideration and will probably consume significant business-analyst resources as we move ahead. Fortunately, some simple economic considerations can help companies along the way.

In his paper “Distributed Computing Economics” (reference at the bottom), Jim Gray argues that the ratios between network bandwidth costs, computation costs and storage costs are pivotal in determining what should be outsourced and what should remain local. He states:

“It is fine to send a gigabyte over the network if it saves years of computation, but it is not economic to send a kilobyte question if the answer could be computed locally in a second.”

Armbrust et al. (reference below) have taken Gray’s numbers and updated them to 2008. Their numbers look as follows:


Table 1: Cost considerations (taken from Armbrust et al. 2009)

 

The numbers in Table 1, especially the row showing the cost/performance improvements, underscore that computational power has become even cheaper relative to WAN bandwidth: cost/performance improved 16x for CPU hours, as opposed to only 2.7x for WAN. The last row of Table 1 compares what $1 buys in owned hardware with the cost of renting the same resources in the cloud. CPU hours are about 2.56x as expensive in the cloud, and disk storage is 20-50% more expensive.
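The ratios quoted from Table 1 can be turned into a quick back-of-the-envelope calculation. The Python sketch below uses only the figures mentioned above; the $100 hardware budget and the function names are illustrative assumptions.

```python
# Figures quoted in the text from Table 1 (Armbrust et al. 2009).
CPU_IMPROVEMENT = 16.0             # cost/performance gain for CPU hours
WAN_IMPROVEMENT = 2.7              # cost/performance gain for WAN bandwidth
CLOUD_CPU_PREMIUM = 2.56           # cloud CPU hours vs. owned hardware
CLOUD_DISK_PREMIUM = (1.20, 1.50)  # cloud disk is 20-50% more expensive


def relative_shift(cpu_gain, wan_gain):
    """How much cheaper computation became relative to WAN transfer."""
    return cpu_gain / wan_gain


def rental_cost(owned_dollars, premium):
    """Cloud cost of resources that would cost `owned_dollars` to own."""
    return owned_dollars * premium


# Hypothetical budget of $100 worth of owned hardware resources.
BUDGET = 100.0

shift = relative_shift(CPU_IMPROVEMENT, WAN_IMPROVEMENT)
cpu_rent = rental_cost(BUDGET, CLOUD_CPU_PREMIUM)
disk_rent_low, disk_rent_high = (rental_cost(BUDGET, p) for p in CLOUD_DISK_PREMIUM)

print(f"Computation got {shift:.1f}x cheaper relative to WAN bandwidth")
print(f"$100 of owned CPU hours costs about ${cpu_rent:.0f} to rent")
print(f"$100 of owned disk costs about ${disk_rent_low:.0f}-${disk_rent_high:.0f} to rent")
```

The first figure is the one that matters for the rest of the argument: the faster computation gets cheaper relative to bandwidth, the more the cost of moving data dominates the decision.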

So what does all this mean for our question? When does it make sense to move to the cloud, and when does it not? The following decision tree brings some clarity. It rests on the premise that the cost relations between WAN bandwidth, CPU hours and disk storage are crucial for the decision.

The crux of this decision tree is the distinction between two types of applications: batch and interactive. Batch jobs are set up so that they can run to completion without manual intervention, so all input data is preselected through scripts or command-line parameters. This is in contrast to “online” or interactive programs, which prompt the user for such input (Wikipedia).

 


Cloud Computing Decision Tree

 

Batch applications

Batch applications being considered for the cloud fall into two groups. In the first group, new data is constantly generated off the cloud. Examples include a cartoon movie that is drawn by graphics people offline and then sent to the cloud in its entirety for rendering, a surveillance company that sends video material to the cloud for further processing, and a scientific application such as fluid dynamics. All three examples illustrate an important fact: huge amounts of data have to be uploaded to the cloud. Costs for uploading and downloading are incurred twice, first with the cloud computing provider and second with the local ISP. Here it seems more cost-effective to keep the data in-house and process it in-house.

The second group of batch applications, which is more suitable for the cloud, consists of those where the data is shipped to the cloud only once (this could also be done physically by sending hard drives to the cloud computing provider) and is then only added to incrementally. Data warehouse applications are a good example of this kind of application. Once the initial data set is in the cloud, only incremental data (such as yesterday’s sales figures) has to be uploaded.
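The difference between the two groups is easiest to see in terms of the volume that crosses the WAN. The following is a minimal sketch in Python; the data set sizes, the number of runs and the per-gigabyte prices are hypothetical placeholders chosen for illustration, not measured values.

```python
def wan_volume_gb(initial_gb, per_run_gb, runs, ship_initial_on_disks=False):
    """Gigabytes sent over the WAN for an initial load plus `runs` uploads."""
    initial_over_wan = 0.0 if ship_initial_on_disks else initial_gb
    return initial_over_wan + per_run_gb * runs


def transfer_cost(volume_gb, provider_price_per_gb, isp_price_per_gb):
    """Transfer is paid twice: to the cloud provider and to the local ISP."""
    return volume_gb * (provider_price_per_gb + isp_price_per_gb)


# Hypothetical prices in dollars per gigabyte (assumptions, not quotes).
PROVIDER_PRICE = 0.10
ISP_PRICE = 0.05

# Group 1: a fresh 500 GB data set is produced off the cloud for each of 30 runs.
regenerated = wan_volume_gb(initial_gb=0.0, per_run_gb=500.0, runs=30)

# Group 2: 500 GB shipped once on hard drives, then 1 GB of new data per run.
incremental = wan_volume_gb(initial_gb=500.0, per_run_gb=1.0, runs=30,
                            ship_initial_on_disks=True)

for label, volume in (("regenerated", regenerated), ("incremental", incremental)):
    cost = transfer_cost(volume, PROVIDER_PRICE, ISP_PRICE)
    print(f"{label}: {volume:.0f} GB over the WAN, about ${cost:.2f}")
```

With these assumed inputs the first group moves 15,000 GB over the WAN and the second only 30 GB, which is why the recurring transfer bill, rather than the compute bill, separates the two cases.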

Interactive applications

Interactive applications are typically e-commerce applications such as banking, e-shopping and news portals, and also video streaming. They typically generate thousands or millions of transactions, with very little data transferred per transaction. Here it really boils down to whether the demand for an application is predictable or not. If it is predictable, like spikes in demand during Christmas, a hybrid model is the best choice: the application is hosted in the local IT infrastructure most of the time, handling the “normal” traffic, and the cloud is brought in only during demand spikes. For completeness, it should be added that not all applications can be designed in a way that makes this hybrid approach feasible.

If the demand is not predictable (bursty traffic), as when an established company launches a new product or in the case of a startup, the decision basically boils down to whether an IT infrastructure already exists. If this infrastructure exists and is sufficiently utilized (research states that at utilization rates above 70-80% the local IT infrastructure is cheaper than the cloud), it will most probably make sense to go for a hybrid approach again, minimizing risk by meeting the peaks in demand with the cloud. In the case of a startup, where no IT infrastructure exists and the demand is entirely unpredictable, a cloud-only approach is certainly the most suitable one.
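The decision tree described in the sections above can be written down as a small function. This is a sketch of one reading of the text, not a definitive rule: the names are invented for illustration, the 0.7 utilization threshold is an assumption taken from the lower end of the 70-80% range, and the case of an existing but under-utilized infrastructure is not covered by the text, so the function reports it as undecided.

```python
from enum import Enum


class Advice(Enum):
    IN_HOUSE = "keep in-house"
    CLOUD = "cloud"
    HYBRID = "hybrid"
    UNDECIDED = "not covered by the decision tree"


def advise(app_type, *, data_generated_off_cloud=False, demand_predictable=False,
           has_infrastructure=False, utilization=0.0, threshold=0.7):
    """Walk the decision tree for a 'batch' or 'interactive' application.

    `threshold` is an assumed cut-off for "sufficiently utilized".
    """
    if app_type == "batch":
        # Constantly regenerated data means repeated bulk uploads.
        if data_generated_off_cloud:
            return Advice.IN_HOUSE
        # Data is shipped once and then only added to incrementally.
        return Advice.CLOUD

    if app_type == "interactive":
        # Predictable spikes: run locally, use the cloud for peaks.
        if demand_predictable:
            return Advice.HYBRID
        # Unpredictable demand and nothing to reuse, e.g. a startup.
        if not has_infrastructure:
            return Advice.CLOUD
        # Existing, well-utilized infrastructure: cover peaks with the cloud.
        if utilization >= threshold:
            return Advice.HYBRID
        # Existing but under-utilized infrastructure is not discussed in the text.
        return Advice.UNDECIDED

    raise ValueError(f"unknown application type: {app_type!r}")


print(advise("batch", data_generated_off_cloud=True).value)
print(advise("interactive", has_infrastructure=True, utilization=0.8).value)
print(advise("interactive").value)
```

Keeping the threshold as a parameter makes it easy to test how sensitive the recommendation is to where in the 70-80% range the break-even point actually lies.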

Limitations of this model

We are aware that this model has significant limitations. We do not take into account other costs such as migration, personnel or electricity costs. Factors like data protection issues (certainly relevant in the case of banking applications) are also omitted.

Gray, J.: Distributed Computing Economics. Microsoft Research Technical Report MSR-TR-2003-24, Microsoft Research (2003)

http://research.microsoft.com/pubs/70001/tr-2003-24.pdf

Armbrust, M. et al.: Above the Clouds: A Berkeley View of Cloud Computing. EECS Department, University of California, Berkeley, Technical Report UCB/EECS-2009-28 (2009)



4 Responses to “To cloud or not to cloud – as a function of WAN costs”

  1. I am not so sure how representative these numbers really are. Are we talking about hosted servers and their bandwidth, or are we talking about possibilities for SMEs to analyze their own datacenter and its connection to other services? There are several offers concerning bandwidth nowadays which offer unlimited transfer (not so sure how this will actually work out in the end) for a fixed price. I think it might also make a huge difference if the problem provider is already located in the cloud (the designers draw already online) or locally on your desktop or in your datacenter.

    March 23, 2011 at 10:09
    • If designers already draw online, then this WAN cost issue is certainly not relevant. Yet in theory, moving huge amounts of data in and out of the cloud does not make economic sense. But again, there are other very important factors that are not accounted for here, such as overall company strategy, growth, and the power to innovate and roll out new products.

      March 23, 2011 at 16:07

