Taming the public cloud beast: Your monthly bill

Beyond a doubt, the public clouds have been a godsend to practically all the startups out there today. On top of that, the ongoing price wars between the major players (Amazon, Microsoft, Google, and IBM) have meant that the price per unit of compute, storage, and network capacity has been on the decline. Even adoption among traditional large enterprises is on the increase, with success stories being written every hour on the hour. So how does one explain articles such as the one below?

Here’s why this startup ditched Amazon Web Services by John Cook

And this is not an odd-one-out case. Other similar articles are available, although they don’t exactly show up on the first page of most search queries pertaining to cloud pricing and costs, thanks to the excellent SEO efforts of all the big providers.

The bottom line: just as dining out every day at an unlimited buffet leads to obesity, ad hoc usage of cloud computing resources leads to a bloated bill that can take many a startup by surprise. Is the answer then to simply jump ship and switch over to a private cloud or, worse yet, a traditional infrastructure model? All of the players in the private cloud space are trying hard to convince you to do so. Here is an excellent white paper from Eucalyptus to help you take the leap. But maybe not just yet, if some of the strategies outlined below are adopted appropriately.

Utilize compute resources for the shortest possible time: Throughout the history of the World Wide Web, the one golden rule followed religiously has been: be always available, lest your loyal consumer ditch you the instant your site is down. Given that public clouds rely on sharing the same physical resources across multiple customers, it should come as no surprise that the cheapest pricing plan available for the longest time was the one wherein you spun up reserved instances with a guaranteed uptime of 99.995% or better. Not only do you end up paying an upfront charge, but the costs can also spiral quickly as you keep adding nodes. To add to your woes, as application and data complexity increases along with the upsurge in customers, you start spinning up the more expensive high-end instances.

Invest instead in re-architecting your applications to utilize micro instances by adopting a microservices-based approach or, better still, invest in building up your in-house DevOps skills to leverage the on-demand and spot pricing plans. The latest PaaS offerings, such as AWS Lambda from Amazon and Bluemix from IBM, provide a host of ready-to-use microservices that can be leveraged on an as-needed basis. To add to that, the newest auto-scaling offerings from some of the providers also allow you to spin up container-based compute instances instead of entire VMs.
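To make the "shortest possible time" argument concrete, here is a minimal back-of-the-envelope sketch in Python. The hourly rate, instance counts, and peak window are made-up illustrative numbers, not actual prices from any provider, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # rough average number of hours in a month


def always_on_cost(rate_cents_per_hour, instances):
    """Monthly cost (in cents) of keeping every instance up around the clock."""
    return rate_cents_per_hour * instances * HOURS_PER_MONTH


def scaled_cost(rate_cents_per_hour, baseline, peak, peak_hours_per_day, days=30):
    """Monthly cost (in cents) when extra instances run only during peak hours."""
    peak_hours = peak_hours_per_day * days
    off_peak_hours = HOURS_PER_MONTH - peak_hours
    instance_hours = baseline * off_peak_hours + peak * peak_hours
    return rate_cents_per_hour * instance_hours


# Hypothetical scenario: 10 cents per instance-hour, 4 instances needed
# only during a 6-hour daily peak, 1 instance the rest of the time.
fixed = always_on_cost(10, 4)
elastic = scaled_cost(10, baseline=1, peak=4, peak_hours_per_day=6)
print(f"always on: ${fixed / 100:.2f}, scaled: ${elastic / 100:.2f}")
```

Under these assumed numbers the always-on fleet costs more than twice as much as the scaled one; the real gap depends entirely on your traffic shape and the pricing plan you pick.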

Have a crystal clear strategy for processing raw usage data and/or archiving it as quickly as you can: Success in boosting site traffic, which invariably leads to more business, brings with it a deluge of raw usage data that in turn holds the secrets to the next chapter of your growth. Hence it is very tempting to hold on to as much usage data as you can. Plus, there may not be a clean separation between transactional and raw usage data. All the cloud providers leverage this aspect of the growth phase of any startup to drive up your monthly spend. Hence it is critical to watch your storage needs very closely and adapt to increasing raw usage data very quickly.

To start with, ensure that you can clearly demarcate between transactional data and all other data generated until the time the transaction is actually completed. Also make sure you can easily distinguish between anonymous usage data and that associated with a known, logged-in customer. Store all usage data using object-based storage services such as AWS S3, limiting each bucket to a relatively short time duration, say five minutes, and employ data aggregation to reduce data volume by rolling up to a longer time duration, say one hour. The key here is not to try to convert the data to a full-fledged data warehouse/mart schema at this stage. Once the raw data has been processed in this way, it should be archived on a daily basis using solutions such as AWS Glacier. If you don’t have a strategy to further utilize the semi-processed usage data to populate a data warehouse, then archive that as well, say on a weekly or monthly basis.
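The roll-up from five-minute buckets to hourly aggregates could look something like the Python sketch below. It assumes each five-minute bucket has already been reduced to a dictionary of event counts; the function names and event types are hypothetical, and reading from or writing to the actual storage service is deliberately left out.

```python
from collections import defaultdict
from datetime import datetime


def hour_key(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def rollup_hourly(five_minute_buckets):
    """Sum per-event counts from five-minute buckets into hourly totals.

    five_minute_buckets: iterable of (bucket_start, counts) pairs, where
    bucket_start is a datetime and counts maps event type -> count.
    """
    hourly = defaultdict(lambda: defaultdict(int))
    for bucket_start, counts in five_minute_buckets:
        hour = hour_key(bucket_start)
        for event_type, n in counts.items():
            hourly[hour][event_type] += n
    # Convert nested defaultdicts to plain dicts for easy serialization.
    return {hour: dict(totals) for hour, totals in hourly.items()}


# Illustrative data only.
buckets = [
    (datetime(2014, 12, 17, 10, 0), {"page_view": 3}),
    (datetime(2014, 12, 17, 10, 5), {"page_view": 2, "search": 1}),
    (datetime(2014, 12, 17, 11, 0), {"page_view": 7}),
]
hourly_totals = rollup_hourly(buckets)
```

Once the hourly totals are written out, the underlying five-minute objects become candidates for the daily archive step.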

Reduce network traffic to the compute instances and between different availability zones: This is probably the most easily overlooked aspect of your monthly bill. Most savvy startups will quickly adopt a CDN for static content and script caching, thereby reducing network traffic to the compute instances hosting the web applications. But as your overall cloud infrastructure grows and you start spanning availability zones to ensure high availability and disaster recovery, the corresponding increase in network traffic across availability zones will start adding up quickly. Luckily, your startup will have to be wildly successful before this component of the monthly bill requires too much attention, and by that time you will be able to afford the real high-end talent required to optimize the architecture further.

The kind of monthly spend on public clouds described in the article referenced at the start of this post represents a dream come true for most startups just out of the gate, but it is always a good idea to start adopting the right strategies and architectures to manage your monthly spend from the very beginning, when even a thousand bucks out of your pocket can seem like a million. Furthermore, the right architecture will help you eventually transition to a hybrid cloud model at the right time in the future with the least amount of effort and risk.

This blog was first published on the ContractIQ site at http://blog.contractiq.com/taming-the-public-cloud-beast-your-monthly-cloud-computing-bill/ on December 17, 2014.

Architecting solutions for Cluster Computing as opposed to Cloud Computing

Recently, while evaluating storage options as part of a consulting engagement, I came across the Isilon offering from EMC, and some of the articles in the associated literature talked about the use of Isilon for cluster computing. Given that the emphasis is still on storage, specifically HDFS, it was intriguing that the possibility of compute functions being delegated to nodes à la MapReduce was discussed quite a bit. Further reading into what is considered to be cluster computing got me to the Wikipedia article on Computer Clusters.

So it is quite clear what the difference is between cloud computing and cluster computing, to the extent that we can even safely say that cluster computing is a subset of cloud computing, especially given the offerings from Amazon Web Services such as Elastic MapReduce and the newly launched Lambda. Hence in this blog article I will focus instead on how a solution needs to be architected to leverage cluster computing effectively to get the best bang for the buck out of cloud computing.

Let's start by addressing the biggest challenge with implementing cluster computing: co-location of data on the compute node. While this is an easy problem to solve when utilizing the MapReduce paradigm, it represents a real challenge when using cluster computing to achieve scalability in typical usage scenarios. Although the use of technologies such as InfiniBand may be an option in some cases, the cost-benefit analysis would rule it out for most typical business applications.

One immediate option is to utilize a microservices-based architecture. But it is clear from the description in the seminal article by Martin Fowler that it does not address co-location of transaction data, although he does talk about decentralized data management and polyglot persistence. Clearly it is not really meant to allow for easy adoption in a cluster computing scenario. Interestingly, though, there is a reference to the Enterprise Service Bus as an example of smart endpoints, and that is what got me thinking about extending the concept to cluster computing.

The trick then is to apply the event-based programming model to the microservices architecture concept, leveraging in turn the smart endpoints aspect. All the transaction data needs to be embedded in the event, combined with any contextual state data. Through the use of interceptors or other adapters, the data can be deserialized into the appropriate service-specific representation. This is key, since the service need not, and actually should not, be built to consume the event data structure.
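A minimal Python sketch of this idea follows. The event layout (a "transaction" section plus a "context" section), the OrderRequest type, and the discount rule are all hypothetical and chosen only for illustration. The point is that the service function works purely on its own representation, while a separate adapter is the only piece that knows the shape of the event.

```python
import json
from dataclasses import dataclass


@dataclass
class OrderRequest:
    """Service-specific representation; it knows nothing about events."""
    order_id: str
    amount_cents: int
    customer_tier: str


def price_order(request: OrderRequest) -> int:
    """The service itself: hypothetical rule, 10% off for 'gold' customers."""
    if request.customer_tier == "gold":
        return request.amount_cents * 90 // 100
    return request.amount_cents


def order_event_adapter(raw_event: str) -> OrderRequest:
    """Adapter: deserialize the event into the service's own representation."""
    event = json.loads(raw_event)
    tx = event["transaction"]          # transaction data travels with the event
    ctx = event.get("context", {})     # contextual state travels with it too
    return OrderRequest(
        order_id=tx["order_id"],
        amount_cents=tx["amount_cents"],
        customer_tier=ctx.get("customer_tier", "standard"),
    )


def handle_event(raw_event: str) -> int:
    """Entry point any compute node can run: no data lookup is needed."""
    return price_order(order_event_adapter(raw_event))


# Illustrative event only.
sample_event = json.dumps({
    "type": "order.placed",
    "transaction": {"order_id": "o-1", "amount_cents": 2000},
    "context": {"customer_tier": "gold"},
})
final_price = handle_event(sample_event)
```

Because everything the service needs arrives inside the event, whichever node picks it up can process it without first fetching data from elsewhere, which is what sidesteps the co-location problem.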

While the approach described above would require you to invest significantly in setting up the requisite infrastructure components to provision compute nodes on the fly to handle events, the recent release of the AWS Lambda service provides us with an opportunity to apply this concept more easily, albeit with some new terminology: microservices are implemented as AWS Lambda functions! It would be very interesting to figure out if argument reduction is supported! Check this blog again in a few weeks to find out…

Virtual Machines, Containers, and now LXD: What’s best for me?

First there were virtual machine instances running on bare metal or paravirtual hypervisors, then came containers allowing for better utilization of virtual machine instances, and now we have the Linux Container Daemon (LXD), pronounced “lex-dee”. Its tag line goes “The new hypervisor isn’t a hypervisor, and it’s much, much faster”, and one quickly realizes that hyperbole is not where your troubles end! Instead, your choices for virtualization just got trickier to sort out.

So here are some simplifying rules to get going quickly:

1. If you have the luxury of hosting your application entirely on the public cloud, stick to good old-fashioned VMs managed using the auto-scale functionality provided by the cloud platform of your choice.

2. If you have the luxury of creating your private cloud using paid software such as the vCloud suite, stick to good old-fashioned VMs managed using the components of the suite such as vRealize.

3. If you are stuck with the free bare metal hypervisors such as vSphere ESXi or Hyper-V (not entirely free), or with paravirtual hypervisors such as KVM, then get the right DevOps skills on your team and use Docker containers managed using Fig for development and Chef for production environments.

4. If you are the brave spirit capable of dealing with hardcore hardware and low-level kernel configuration matters, opt for the Metal As A Service (MAAS) offering from Canonical combined with Juju.

5. If you are lucky enough to have the hardware set up exactly as required for running OpenStack/CloudStack and have the chops to customize the provided management apps, then by all means rock your world by replicating most of the basic AWS offerings within your datacenter.

6. Finally, if you are truly bored out of your mind by all the mundane options listed above, and have the distinct air of “Been there – Done that” around you with the appropriate management support in your back pocket, venture into the brand new entrants such as LXD!

Of course you might also just be lucky enough to be in my position: Make recommendations and then sit back to enjoy the show! And, of course, come back to this blog for more …