Cloud-specific software architecture patterns

This post is about software application architecture patterns that simplify application design by leveraging cloud features.


Packaged configuration

Packaged configuration cue card
What
Configuration is packaged with deployment artefacts
Motivation
Simplify system, increase resilience by removing runtime dependency on configuration service
How
Configuration is managed in configuration repository, CI/CD combines generic application artefact with stage/tenant-specific configuration and deploys it
When
Multiple stages / tenants, build pipeline is flexible
Cons
No runtime update of configuration, configuration changes require redeployment

PaaS products often come with a facility (e.g. Spring Cloud Config [1]) for distributing configuration to services and applications at run-time, such as the locations of 3rd-party systems, tuning parameters and i18n resources (locale-specific translations of UI elements). Relying on such a facility means that:

  • the facility needs to be maintained
  • it is critical for operations and uptime
  • it needs a service discovery mechanism
  • applications need to implement a specific configuration API
[Illustration: packaged configuration]

Application configuration in a cloud environment leverages the fact that deployments are fast and cheap enough that configuration can be deployed together with the deployment artefact (e.g. a WAR file). The illustration depicts a possible implementation of that pattern: a CI server produces the application binary and stores it in an artefact repository. A further CI stage then pulls the deployment artefact from that repository and combines it with configuration files from a dedicated configuration repository into a stage/tenant-specific deployment artefact. That artefact is then deployed to the cloud runtime and optionally stored again in the artefact repository.

The “Packaged configuration” architectural pattern moves a runtime concern into the build phase, thus eliminating a central runtime service, simplifying the system design and increasing its reliability. It does, however, require a powerful build pipeline and managed access to the configuration repository. Also, configuration updates require redeployments, which may affect system uptime (but not necessarily, see “Swarm uptime” below).
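
The packaging stage described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the artefact is a ZIP-based archive (as WAR files are), the configuration repository is checked out locally with one sub-directory per stage or tenant, and the names `package_for_stage` and `WEB-INF/classes/config` are made up for this example rather than taken from any particular CI product.

```python
import shutil
import zipfile
from pathlib import Path


def package_for_stage(generic_artefact: Path, config_repo: Path,
                      stage: str, out_dir: Path,
                      config_prefix: str = "WEB-INF/classes/config") -> Path:
    """Combine a generic artefact with the configuration of one stage.

    Assumed (hypothetical) layout: config_repo/<stage>/... holds the files
    that belong to that stage or tenant.
    """
    stage_dir = config_repo / stage
    if not stage_dir.is_dir():
        raise FileNotFoundError(f"no configuration for stage '{stage}'")

    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{generic_artefact.stem}-{stage}{generic_artefact.suffix}"

    # Never modify the generic artefact; work on a copy.
    shutil.copyfile(generic_artefact, target)

    # Append the stage-specific files to the copied archive.
    with zipfile.ZipFile(target, "a") as archive:
        for path in sorted(stage_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(stage_dir).as_posix()
                archive.write(path, f"{config_prefix}/{relative}")
    return target
```

A pipeline would call this once per stage or tenant, e.g. `package_for_stage(Path("app.war"), Path("config"), "prod", Path("out"))`, and then hand the resulting file to the deployment step. Because the generic artefact is never changed, the same binary is promoted through all stages.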

Natural multi-tenancy

Multi-tenancy pattern cue card
What
Each logical tenant and/or stage is an isolated installation
Motivation
Simplify system by designing only for single tenant use
How
Each tenant runs in a dedicated and isolated environment
When
Groups of users requiring isolated setups
Cons
Problematic when users have access to multiple tenants

A typical multi-tenant-capable application is able to run multiple tenants on a single appliance, which requires that the tenant differentiator is present on all APIs, sessions and persistence entities. Furthermore, the application needs to include access control logic which hides other tenants from a logged-in user. Finally, multi-tenant applications can’t scale easily for selected tenants but only as a whole system.

Cloud platforms simplify multi-tenancy by provisioning isolated and distinct environments for each tenant or stage, allowing application designers to push tenant access control logic to the cloud environment (also see “Security and access control”).

Possible limits to this pattern are use cases where information needs to cross tenant boundaries, e.g. users who access multiple tenants during a session, request reports that span multiple tenants or exchange information between tenants.
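
To make the idea more concrete, here is a minimal sketch of how one isolated environment per tenant and stage could be derived from a single description. Everything in it is an assumption for illustration: the `Environment` fields, the naming scheme and the `example.com` domain do not correspond to any specific cloud platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """One isolated installation; all field names are illustrative."""
    name: str
    route: str
    database: str
    instances: int


def environments_for(tenants: dict[str, int], stages: list[str],
                     domain: str = "example.com") -> list[Environment]:
    """Derive one isolated environment per (tenant, stage) pair.

    `tenants` maps a tenant name to the instance count it needs, so a busy
    tenant can be scaled without touching the others.
    """
    envs = []
    for tenant, instances in sorted(tenants.items()):
        for stage in stages:
            name = f"{tenant}-{stage}"
            envs.append(Environment(
                name=name,
                route=f"{name}.{domain}",
                database=f"db-{name}",  # dedicated store, no tenant column
                instances=instances,
            ))
    return envs
```

Calling `environments_for({"acme": 4, "globex": 1}, ["test", "prod"])` yields four environments, each with its own route and database. The application deployed into them needs no tenant identifier in its APIs or persistence entities.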

Swarm uptime

Uptime pattern cue card
What
Combining uptime of multiple, unreliable service instances into high application uptime
Motivation
Lower cost by using cheap, volatile instances
How
The cloud runtime sends tasks to running service instances, but not to failing instances. Failing instances are restarted automatically.
When
App instances don’t rely on internal state, or that state can be restored easily. App instances boot quickly.
Cons
Containerised applications must be able to handle arbitrary restarts

In the good old days of computing, application uptime was bounded by server uptime, so server maintenance also brought down the applications running on that server. Cloud-aware applications distribute tasks to multiple service instances on multiple servers, so a true application downtime occurs only when large parts of the data centre are offline. For all of that to work, service instances need to be able to start quickly and be either stateless or resume conversations with clients from a serialized persistence store.
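
A back-of-the-envelope calculation shows why many unreliable instances can add up to a reliable application. The sketch below assumes that instances fail independently of each other and that a single healthy instance is enough to serve requests. Both are simplifications, and the function name is made up for this example.

```python
def combined_availability(instance_availability: float, instances: int) -> float:
    """Probability that at least one of `instances` replicas is up.

    Assumes independent failures, which real data centres only approximate.
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("need at least one instance")
    # The application is down only if every single instance is down.
    return 1.0 - (1.0 - instance_availability) ** instances
```

Under these assumptions, three instances that are each available only 90% of the time yield roughly 99.9% for the application, since 1 - 0.1³ = 0.999.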

Automated maintenance

Automation pattern cue card
What
Offload maintenance tasks to runtime platform
Motivation
Simplify application & project design
How
The cloud runtime handles maintenance tasks transparently
When
Periodic instance restarts to combat latent resource leaks, clean filesystem, backup, store logs, produce metrics
Cons
Can’t handle overly specific tasks

Regardless of how diligent a programmer I am, at some point my application is going to fail inexplicably in production: it won’t handle more requests, won’t connect to a database, will run out of memory or will just consume all CPU without any obvious reason. The immediate solution is almost always to restart the application and, in tougher cases, the server. Sometimes the issue occurs too rarely for a debugging and fixing cycle to make economic sense; then a nightly cron job restarting the server is an acceptable workaround. This is just one example of a maintenance task; others are making backups of local data and configuration, rotating and storing log files, and collecting metrics. Previously, much of that had to be either implemented in the application or scripted by administrators in cron jobs or shell scripts.

Cloud runtimes handle such jobs routinely as part of their service description, which simplifies both developers’ and administrators’ lives, not to mention change management. Tying in with “Swarm uptime”, that’s another great example of how cloud runtimes simplify application design.

Outscale caching

Pattern cue card
What
Achieve high application performance
Motivation
Simplify application design by avoiding the complexity of caching
How
Scale service instances instead of caching data
When
Caching (and cache invalidation) are overly complex for the domain, service instances don’t rely on internal state, service invocation cascades are shallow, latency requirements can be met
Cons
Potentially resource-hungry

The primary motivation for the “Outscale caching” pattern is to throw hardware at a problem. While caching is the second-best way to speed up a computation (the best way being to avoid the computation in the first place…), it doesn’t come without issues: caches need to be primed first (slowing down startup), they consume memory and they hinder scalability as they need to be synced. Caching is a trade-off between staleness and performance. Invalidating caches can be tricky, and so can syncing caches between app instances.

Service discovery

Service discovery pattern cue card
What
Broker between service instances
Motivation
Simplify application design by avoiding the complexity of service discovery
How
The cloud platform injects service location information as part of “Packaged configuration” and routes requests dynamically at runtime
When
Always
Cons
None

Configuring an application can involve providing IP addresses and ports of other services, with DNS helping a bit by mapping names to IP addresses. When service locations change, then all dependent services need to get the new configuration, or wait till DNS updates the stale entry. More downtime ahead.

A cloud runtime will wire dependent services together not only by injecting aliases for their locations as part of “Packaged configuration”, but also by dynamically routing requests to live service instances and around failing instances (see “Swarm uptime”). Some modern microservice architectures I have seen implement service discovery with brokers like Eureka in order to remove the dependency on a particular cloud runtime, at the cost of more complexity.
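
From the application’s point of view, injected service locations can be as simple as an environment variable lookup. The following sketch assumes a naming convention of `<ALIAS>_URL`, invented for this example; real platforms each have their own way of exposing bound services.

```python
import os
from urllib.parse import urlsplit


def service_url(alias: str, environ=os.environ) -> str:
    """Look up the location the platform injected for a service alias.

    The variable naming scheme (<ALIAS>_URL) is an assumption for this
    example, not the convention of any particular platform.
    """
    key = f"{alias.upper().replace('-', '_')}_URL"
    try:
        url = environ[key]
    except KeyError:
        raise LookupError(
            f"no location injected for service '{alias}' ({key})") from None
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"{key} is not an absolute URL: {url!r}")
    return url
```

The application only ever refers to the alias, e.g. `service_url("order-service")`. Which live instance answers a request to that URL is left to the platform’s router, so no IP addresses or ports appear in application code.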

Security and access control

Security pattern cue card
What
Secure application easily
Motivation
Simplify application design by delegating (more) access control to the cloud runtime
How
Less infrastructure is exposed, access is controlled by the platform
When
Always
Cons
Reduced flexibility

Cloud runtimes famously eliminate the server from a system overview, as servers are replaced by containers which are volatile and run with minimal software and privileges. Access paths to applications are strictly defined through the cloud network routers and subject to the cloud’s own access control. Last but not least, environment isolation reduces the need for application-level access check logic because the cloud runtime won’t permit certain users to access an application.

Infrastructure as code

Infrastructure pattern cue card
What
Configure infrastructure through (versioned & audited) code
Motivation
Simplify change management
How
Infrastructure as code allows defining entire systems through declarative scripts
When
In a cloud environment
Cons
Hard to get used to, slows down development, won’t work on non-cloud-native infrastructure

Infrastructure as code is a major contributor towards repeatable deployments and software-defined infrastructure. Developers or administrators prescribe the exact constitution of a system, with all its components, deployment artefacts and dependencies, in a set of configuration files. The configuration resides in a versioned and audited source code repository, from where the cloud runtime obtains it and constructs suitable application environments.
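
The core of the declarative approach is that the repository states what should exist and the platform works out what to change. A minimal sketch of that comparison follows. The component names, the dictionary structure and the `plan` function are hypothetical and far simpler than what real tooling does.

```python
def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Compare the desired state from the repository with what is running.

    Returns (action, component) pairs; applying them is left to the platform.
    """
    steps = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            steps.append(("create", name))
        elif name not in desired:
            steps.append(("delete", name))
        elif desired[name] != actual[name]:
            steps.append(("update", name))
    return steps


# Desired state as it might be read from the versioned repository.
desired = {
    "web": {"image": "shop:2.0", "instances": 3},
    "db": {"image": "postgres:15", "instances": 1},
}
# State currently found in the environment.
actual = {
    "web": {"image": "shop:1.9", "instances": 3},
    "db": {"image": "postgres:15", "instances": 1},
    "cache": {"image": "redis:7", "instances": 1},
}
print(plan(desired, actual))  # [('delete', 'cache'), ('update', 'web')]
```

Because the desired state is plain data under version control, every change to the system shows up as a reviewable diff, which is what simplifies change management.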

A possible downside is that trial and error development can slow down to a standstill when every minor change has to be coded into configuration and then deployed to a cloud environment as, despite all advances, typical enterprise applications will still take several minutes to start even on contemporary cloud platforms. Furthermore, the development lifecycle of applications relying on 3rd party systems that do not reside in the cloud (as is sometimes the case with mainframe services and databases) won’t benefit as much from infrastructure as code.

 

Resources

[1] Spring Cloud Config
https://cloud.spring.io/spring-cloud-config/
