As Market Dojo expands, as all good startups hopefully do, they have begun to need some additional technical staff on hand to support, expand and maintain the eponymous software on which the business depends.
I am the first of these. James, at your service. Well, more at Market Dojo’s service, but since their goal (now, indeed, our goal) is to provide the best possible service to their clients there is a certain amount of inheritance.
One of my first duties in this role has been to take at least partial ownership of the question “how and where do we host Market Dojo?”. Owning and managing our own ‘bare metal’ servers would be uneconomical both in terms of outlay and maintenance hours, so we have for the last few years outsourced that particular problem to $MediumHostingProvider.
For a while now, however, the prevailing view within Market Dojo has been that our contract with them has been ‘coming to an end’. This is in some ways a shame as they were an early partner in our journey and have provided us with excellent service over the last few years. Nevertheless, as our platform expands, with new features each release and a growing number of clients, the demands we place on our infrastructure itself grow accordingly.
Currently the Dojo operates as a single shard on a single VPS (Virtual Private Server); this has been fine and workable for a long time, but our desire for resilience and scalability is pushing us to look further afield for solutions – there’s only so much peace of mind a good backup system can produce.
It was suggested, and indeed attempted, that we could improve performance at peak times by increasing our various provisions. Since we use Ruby on Rails as our app framework at Market Dojo, the expectation was that more memory in particular would improve responsiveness: the number of CPU cores available makes some difference, but Rails is less parallel than one might like.
On evaluation, this solution struck the team as decidedly suboptimal. In particular, the difference in usage between times of high demand (such as the Monday morning rush of auctions from our larger clients getting a head start on the week) and times of low demand (such as late on a Sunday afternoon) grows linearly with the number of users; usage in the latter case is negligibly close to zero by comparison. Simply increasing our VPS size was therefore more apt to waste money and electricity than to provide a noticeable all-round benefit to us or our clients.
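To put some rough arithmetic behind that, here is a back-of-the-envelope sketch. The weekly demand profile, the price per capacity unit and the function names are all invented for illustration and are not Market Dojo's figures. It compares paying for peak capacity around the clock with paying only for what each hour actually needs.

```python
# Illustrative only: the demand profile and the price are invented numbers,
# not measurements from Market Dojo or quotes from any provider.

PRICE_PER_UNIT_HOUR = 0.05  # hypothetical cost of one capacity unit for one hour


def fixed_cost(demand, price=PRICE_PER_UNIT_HOUR):
    """Cost of sizing the server for the busiest hour and keeping it all week."""
    return max(demand) * len(demand) * price


def elastic_cost(demand, price=PRICE_PER_UNIT_HOUR, floor=1):
    """Cost when capacity follows demand hour by hour, never below `floor` units."""
    return sum(max(units, floor) for units in demand) * price


def weekly_profile(peak_units, quiet_units=1, peak_hours=4, total_hours=168):
    """A crude week: a short Monday-morning peak, then quiet for the rest."""
    return [peak_units] * peak_hours + [quiet_units] * (total_hours - peak_hours)


if __name__ == "__main__":
    for peak in (4, 8, 16):
        week = weekly_profile(peak)
        fixed = fixed_cost(week)
        elastic = elastic_cost(week)
        print(f"peak={peak:>2}  fixed={fixed:7.2f}  elastic={elastic:7.2f}  "
              f"idle spend={fixed - elastic:7.2f}")
```

Under these made-up numbers, the money spent on idle capacity grows with the size of the peak, which is the pattern that made a simple upgrade so unappealing.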
Better solutions are out there. There are some very large cloud providers offering both IaaS (Infrastructure as a Service; Virtual Private Servers, for example, managed by the user) and PaaS (Platform as a Service; managed services which take care of the deployment, building and running of the background pieces behind an app, such as databases and web servers, as well as hosting) for various needs, most of which are intended to be scalable.
After some initial winnowing, we settled on a shortlist of three providers: IBM Bluemix/Cloud Foundry, AWS Elastic Beanstalk and Google App Engine for PaaS, and the same companies’ offerings for IaaS – SoftLayer, EC2 and Compute Engine respectively.
We had a number of requirements to consider in order to bring this down to a final choice.
That last point was underpinned by a side concern of backward compatibility. Some portions of the Dojo are in the process of being refactored, improved and reconstructed; others have already been seen to; others still are yet to be addressed in this cycle. It’s an undeniable fact that any piece of software older than one or two years will be at least partly outdated, unless it is particularly small or has a particularly deific development team behind it. For our purposes this ruled out any platform which enforced restrictions on versions of Ruby, Rails, or any of the various Gems upon which we rely.
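As a minimal sketch of how that filter works in practice, the snippet below keeps only the platforms that offer every pinned version an app needs. The platform names, the version numbers and the helper functions are placeholders made up for illustration; they are not the real support matrices of any provider, nor the versions Market Dojo runs.

```python
# Hypothetical example: platform names and version numbers are placeholders,
# not the actual support matrix of any provider.


def parse(version):
    """Turn '2.1.5' into (2, 1, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports(offered, required):
    """True if the platform offers every required component at the pinned version."""
    return all(
        parse(version) in {parse(v) for v in offered.get(name, [])}
        for name, version in required.items()
    )


# Versions the app is pinned to (placeholders).
required = {"ruby": "2.1.5", "rails": "4.1.8"}

# What each candidate platform allows (placeholders).
platforms = {
    "platform_a": {"ruby": ["2.2.0", "2.3.0"], "rails": ["4.2.0"]},
    "platform_b": {"ruby": ["2.1.5", "2.2.0"], "rails": ["4.1.8", "4.2.0"]},
}

compatible = [name for name, offered in platforms.items() if supports(offered, required)]

if __name__ == "__main__":
    print(compatible)
```

A platform that lets you bring your own runtime passes this check trivially, which is part of the appeal of the less restrictive offerings.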
The decisions became easier from there on.
| | IaaS | PaaS |
|---|---|---|
| Benefits | Full control | Someone else administers |
| | Choice of web server | Provided managed web server |
| | Billed by spec (predictable) | Billed by usage (auto-flexible) |
| | Powerful CLI (usually) | Easy GUI (usually) |
| | Typically cheaper than PaaS | Low technical knowledge barrier to administration |
| | Can be expanded in real time to cope with changes in demand | Can expand themselves in real time to cope with changes in demand |
| | Easy to automate setup via images or Puppet | Easy to automate setup via config file and push hooks |
| Drawbacks | Have to administer manually | No control over software |
| | Responsible for software and dependencies | Often hard to install additional dependencies |
| | Requires technically competent administrator to make any changes | Narrow access routes |
| | | Typically more expensive than VPS |
| | | Can require official technical support to get anything major done |
Tables make everything better. There wasn’t a justifiable reason for a good chart or map in this particular investigation, but there were plenty of data to work with all the same.
There were also a great many emails. An important thing to consider when you’re contemplating a move of this kind, especially where it affects the nature of your underlying infrastructure, is how your larger clients might be disposed towards the changes.
In this case, we consulted with two of our most active large clients, who were helpful in providing both recommendations and requests for the new infrastructure, as both companies have their own security teams. The ability to engage in this sort of collaboration with clients is a boon to all involved, relying heavily on the fact that commercial relationships in general, and security in particular, are inherently positive sum.
Elements of the feedback are confidential, but from our perspective one of the most important points raised was that there is an expectation of external assessment. If you happen to be planning a move of this kind, it would be advisable to budget from the beginning for engaging an independent assessor after the move has been made. We encountered this request before it became relevant, which is another advantage of having this consultation early in the process, so it didn’t constitute a temporal setback despite being effectively a change in scope.
In the ‘known knowns’ column, some of our clients have a strong preference for their data being stored in the EU, subject to the Data Protection Act. This rules out the major US data centres. Fortunately, all the services under consideration had data centres in the UK, Ireland or Belgium. We also had to consider encryption and colocation of data, such as would occur in a shared database or shared hosting. An encrypted-at-rest VPS is the very minimum expected, however, so those were not significant barriers.
Had we been looking at smaller hosting providers, or at self-hosting, we might also have had to consider redundancy of hardware in addition to the redundancy of data; one of the major advantages of using the large cloud providers is that they have all those concerns in hand; barring natural disasters, enemy action or Outside Context Problems they are highly unlikely to suffer full loss of data.
Mitigation of at least the first risk, and in many cases the second, can be performed relatively simply by having backups in a data centre in another environment. Our particular use case rules out the optimal configuration of having replicas on every continent and under multiple jurisdictions, which lowers the risk of coordinated attack or localised natural disaster (albeit raising the chance of encountering a subpoena); however, all three of our potential suppliers had multiple EU data centres.
All this in hand, we engaged in a deeper dive.
For testing, we set a procedure:
The nice thing about establishing a procedure beforehand is that there isn’t much wiggle room for preference, not enough coffee, or whatever else might be distracting at a given moment. I’m always mindful of a study demonstrating that the simple act of having and using a checklist correctly reduces the incidence of mistakes and negative outcomes.
The outcome of all this data collection is still to be determined, but so far we’re quite pleased with the results. Overall, we expect to see both an annual cost saving and an increase in performance, at least initially; as we begin to use the potential of the cloud services to mirror, expand and scale, the former of those gains may be sacrificed to the latter. That, however, is another article.