The market is swarming with cheap Virtual Private Server (VPS) or Virtual Dedicated Server (VDS) offers these days, and the specs don’t look bad: you can get as much as 2GB of RAM, 50GB of storage and a dual-core 2GHz CPU for less than $10 / €8, with nearly unlimited traffic. That’s ideal for developing applications while giving testers and coworkers access to new release candidates. This post deals with the non-obvious shortcomings of VPSs and workarounds for them.
You’d usually run a database, a web server and/or servlet container, and maybe a source control system on the box, also taking advantage of the automated backup facilities that hosters frequently offer.
The deal looks fair at first glance: you share the CPU and RAM with other users, so depending on your luck or the hoster’s generosity your box will get a varying amount of resources, which might not be perfect in all cases, but should be enough to get your work done. Right?
Unfortunately, not quite. The type of virtualization the casual user of VMware or VirtualBox (like yours truly) is used to simulates an entire set of hardware on which the guest operating system can make itself at home. This type of virtualization goes by the name of “full virtualization” and offers the advantage that any type of OS can run inside it, with full control over the resources assigned to it.
The downside is that resources assigned to one guest OS cannot be shared with other guest OS instances. Since a webhoster runs a business and (hopefully) trims his business model for efficiency, he’ll run a different type of virtualization called “paravirtualization”. Here the guest OS is fully aware that it runs in a simulated machine: instead of talking to simulated hardware, it talks directly to an API provided by the host OS. In fact, you can imagine the guest OS as a process running inside the host OS; a (far-fetched) analogy would be a Java VM running a servlet container which in turn runs multiple web applications.
While this obviously makes it possible for the hoster to run many more guest OS instances on any given hardware, these instances now compete not only for physical resources but also for logical resources such as connections, file handles, and the number of threads and processes.
I’ve been playing around with a low-budget Virtuozzo instance (Ubuntu 8.04, 64bit) and found a couple of caveats:
1. Memory Fragmentation
What looks nice in the ad is that you get a “guaranteed minimum of $ramMin MB RAM” which then goes “up to $ramMax MB RAM”. Since the guest OS does not manage memory on its own but rather leaves allocation and management to the host OS, an increasing number of processes running on the host makes it harder and harder to get contiguous chunks of memory for your application. While you’ll be able to run a lot of small processes, it can become impossible to run a single large one. Unfortunately I have not found a way around this: my box refuses to run a Java VM with anything more than 32MB of heap. It is, however, easy to run multiple instances of the same VM, as long as each instance stays below that limit. So better be a sport and take it as an exercise in load balancing 🙂
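To find the actual ceiling on your own box, you can probe it: a JVM that cannot reserve its heap fails to start, so launching throwaway `java -Xmx<N>m -version` processes with growing heap sizes narrows the limit down. A minimal sketch, assuming a `java` binary on the PATH; the candidate sizes are arbitrary:

```java
import java.util.Arrays;

public class HeapProbe {

    // Returns true if a JVM with the given max heap (in MB) can start at all.
    static boolean canStart(int heapMb) {
        try {
            Process p = new ProcessBuilder("java", "-Xmx" + heapMb + "m", "-version")
                    .redirectErrorStream(true)
                    .start();
            return p.waitFor() == 0; // non-zero exit: the heap could not be reserved
        } catch (Exception e) {
            return false;
        }
    }

    // Probe a few candidate sizes and report the largest one that still works.
    static int largestWorkingHeap(int[] candidatesMb) {
        int best = -1;
        for (int mb : candidatesMb) {
            if (canStart(mb)) {
                best = Math.max(best, mb);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] candidates = {16, 32, 64, 128};
        System.out.println("largest startable heap: "
                + largestWorkingHeap(candidates) + " MB of " + Arrays.toString(candidates));
    }
}
```

On an unconstrained machine all four probes succeed; on my Virtuozzo box everything above 32 fails.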
2. Maximum Number of Processes
This one hurts almost more than the memory fragmentation problem. Linux tends to spawn a process for nearly anything, be it an SSH login or a simple ls, and Apache can create a separate process for every HTTP request it handles. After load-testing a Tomcat installation, that very same installation refused to accept any further connections or start any further processes, which was due to Tomcat’s default worker thread pool of 150 threads: Virtuozzo apparently counts threads against the process pool, so the box refuses to start any further threads. The solution is to reduce the worker thread pool in server.xml:
<Connector protocol="HTTP/1.1" connectionTimeout="20000" maxThreads="4" maxKeepAliveRequests="20" enableLookups="false"/>
A note of caution: the setting above will just get you through a load test under the conditions described in this post; it’s probably unusable for production.
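The same lesson applies to your own application code: anything that spawns a thread per task will run into the same quota that Tomcat’s worker pool did. A small, fixed-size `ExecutorService` keeps the thread count predictable; this is a sketch, and the pool size of 4 (mirroring the `maxThreads` setting above) is an arbitrary choice for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedWork {

    // Run all tasks on a fixed pool of 4 threads instead of one thread per task,
    // so a burst of work never exceeds the VPS-wide process/thread quota.
    static List<Integer> process(List<Integer> inputs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int n : inputs) {
                futures.add(pool.submit(() -> n * n)); // stand-in for real request handling
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(process(List.of(1, 2, 3, 4, 5))); // prints [1, 4, 9, 16, 25]
    }
}
```

Even a thousand queued tasks would still occupy only four threads against the quota.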
3. Maximum Number of Filehandles
Linux counts TCP connections and open files against the same pool of file handles. While your webapp probably won’t keep too many files open, connections from browsers to the web server, connections from a reverse proxy or load balancer to the web server, and connections from the web server to the database all count against that pool. Unfortunately, one of my favourite performance boosts is certain death in a restricted VPS environment: keep-alive.
Keep-alive is intended to reduce TCP handshakes between browsers and servers. When keep-alive is enabled (and by default it is!), a browser will keep a connection to the server open for a substantial amount of time, even if nothing is transferred over that connection. While it might be best to find the limit for your setup, the example below effectively disables keep-alive:
<Connector protocol="HTTP/1.1" connectionTimeout="20000" maxThreads="4" maxKeepAliveRequests="1" enableLookups="false"/>
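To see how close you are to the handle limit, the JDK can report the process’s open file descriptor count on Unix-like systems via the `com.sun.management` extension. A sketch, assuming a Sun/Oracle/OpenJDK runtime (the interface is not part of the standard API):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {

    // Returns the number of file descriptors (files + sockets) this process
    // currently holds open, or -1 if the platform bean does not expose it.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + openFileDescriptors());
    }
}
```

Logging this figure during a load test shows how fast keep-alive connections eat into the pool.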
4. Virtual Memory [update 01.01.2011]
An issue I overlooked entirely in the first draft of this post is the availability of virtual memory on a VPS. The JVM (at least Sun’s) allocates a surprisingly large amount of virtual memory, even when instructed to reserve merely a few MB of heap via the -Xmx switch. Here is a top report from an Ubuntu 8.04 VPS running a JVM with 64MB heap:
PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16 0 10296 772 648 S 0 0.1 0:00.01 init
18 0 552m 258m 11m S 0 25.2 2:01.76 java
… which wouldn’t be much of a drama if the VPS itself weren’t constrained even on virtual memory. To my knowledge, a VPS operator will not allow you to reconfigure your instance to trade disk space for virtual memory.
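You can observe the gap yourself on Linux by comparing the heap ceiling the JVM reports with the virtual size the kernel has booked for the process. A sketch that parses the `VmSize` line from `/proc/self/status` (Linux-only; it returns -1 elsewhere):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class VirtSize {

    // Parses the "VmSize:   552712 kB" line from /proc/self/status and
    // returns the value in kB, or -1 if /proc is unavailable (non-Linux).
    static long vmSizeKb() {
        try {
            for (String line : Files.readAllLines(Paths.get("/proc/self/status"))) {
                if (line.startsWith("VmSize:")) {
                    return Long.parseLong(line.replaceAll("\\D+", ""));
                }
            }
        } catch (IOException e) {
            // /proc not mounted, or not running on Linux
        }
        return -1;
    }

    public static void main(String[] args) {
        long heapKb = Runtime.getRuntime().maxMemory() / 1024;
        System.out.println("max heap: " + heapKb + " kB, VmSize: " + vmSizeKb() + " kB");
    }
}
```

On the box from the top report above, the second number dwarfs the first by several hundred MB.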
Further reading:
- Wikipedia on virtualization
- Tomcat connector configuration
- JVM memory usage on Bug Parade