Continuous Delivery: my notes
Summary
Continuous Delivery, by Jez Humble and David Farley, is about three big ideas for getting your code into production more reliably:
- Make a deployment pipeline: commit -> unit test -> acceptance test -> … -> deploy -> release
- Automate everything.
- DevOps. The project team should be a mix of development, operations, and quality assurance / test. Involve operations (sysadmins) from the start.
Stages of the deployment pipeline:
Stage 1: Commit Tests
Triggered by a version control push. Usually happens on the Continuous Integration server.
- Static analysis (lint, code metrics like cyclomatic complexity & coupling)
- Compile
- Unit test (output code coverage, see the sketch after this list):
  - Check that a single part of the app does what the programmer intended.
  - Should be very fast.
  - Do not touch the database, filesystem, frameworks, system time or external systems. Mock or stub these, or use an in-memory db.
  - Avoid the UI.
  - Try to avoid testing async code. Should never need to sleep in unit tests.
  - Include one or two end-to-end tests to prove the app basically runs.
- Package a release candidate. Bake in the version number. Use the OS's packaging tools (deb, rpm): the operations team will be familiar with them, and all the tools support them.
- Push the release candidate to the artifact store (a file system, or a full-fledged artifact repository).
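A minimal sketch of the kind of unit test described above, using Python's unittest and unittest.mock; PriceService and its payment gateway are made-up names for illustration:

```python
import unittest
from unittest.mock import Mock

# Hypothetical class under test: applies a discount and charges a payment gateway.
class PriceService:
    def __init__(self, gateway):
        self.gateway = gateway            # external system, injected so tests can mock it

    def charge(self, amount, discount):
        total = amount - discount
        self.gateway.charge(total)
        return total

class PriceServiceTest(unittest.TestCase):
    def test_discount_is_applied_before_charging(self):
        gateway = Mock()                  # no real network call, keeps the test fast
        total = PriceService(gateway).charge(amount=100, discount=15)
        self.assertEqual(total, 85)
        gateway.charge.assert_called_once_with(85)

if __name__ == "__main__":
    unittest.main()
```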
Stage 2. Automated Acceptance Tests
All later stages use the same steps.
- A human user (for production) or automated system (for testing) selects a release candidate to deploy from the available ones in the artifact store.
- Prepare the environment. Provision machines, install libraries, middleware, etc. A tool like Ansible or Puppet is essential here.
- Deploy the release candidate binaries, pulling from the artifact store.
- Configure the application. During acceptance testing this means stubbing/mocking out interfaces to third-party systems.
- Run data migrations:
Have a SQL script to create the database, and load reference data if necessary, to set up new environments.
Have automated database migration scripts for each step, with a forward and a back migration. Version the database (an ID in a table). To get to a specific version, run the scripts forwards or backwards from the version the database is currently at to the desired version.
Try to do non-destructive or additive work on the database: copy data to an archive table instead of deleting it, rename old columns instead of dropping them, etc. Clean up a few versions later, once you're sure a rollback won't be needed. The interim version of the app should maintain both schemas (maybe write to two tables, etc.). This allows decoupling of database migrations and app deploys.
Use Expand / Contract pattern. Add fields / tables in one release (Expand), remove them in a later release (Contract).
Rails / Rake database migrations seem to be the most popular tool for all this, but there are also Django migrations, Flyway, and others.
Application binaries should know which version of the database they work with, or keep an external mapping between application version and database version.
Try very hard not to share the database with other apps. Use service-oriented architecture instead.
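A minimal sketch of the versioned forward/back migration idea, assuming a single-row schema_version table; the migration contents are illustrative, and in practice you'd reach for one of the tools above (Rails, Django, Flyway):

```python
import sqlite3

# Each version has a forward and a back script. Version 2 is an Expand-style,
# purely additive change; its Contract step would come several releases later.
MIGRATIONS = {
    1: ("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE customer"),
    2: ("CREATE TABLE customer_email (customer_id INTEGER, email TEXT)",
        "DROP TABLE customer_email"),
}

def current_version(db):
    db.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = db.execute("SELECT version FROM schema_version").fetchone()
    return row[0] if row else 0

def migrate(db, target):
    version = current_version(db)
    while version < target:                  # run forward scripts up to the target
        db.execute(MIGRATIONS[version + 1][0])
        version += 1
    while version > target:                  # or back scripts down to it
        db.execute(MIGRATIONS[version][1])
        version -= 1
    db.execute("DELETE FROM schema_version")
    db.execute("INSERT INTO schema_version VALUES (?)", (version,))
    db.commit()

db = sqlite3.connect(":memory:")
migrate(db, target=2)   # new environment: runs both forward scripts
migrate(db, target=1)   # rollback: runs the back script for version 2
```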
- Environment tests / Smoke tests:
Did everything deploy and start correctly?
Is the environment as expected? Are all our dependencies available? Measure how long the app takes to start (including cache priming). Ideally these tests are part of the application's own startup code.
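A minimal sketch of a smoke test along these lines, assuming the app exposes a /health endpoint that reports its version and dependency status (the URL and fields are made up):

```python
import json
import sys
import time
import urllib.request

def smoke_test(base_url, timeout=60):
    """Poll the app until it answers, then check it reports its dependencies as healthy."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(base_url + "/health") as resp:
                status = json.load(resp)
                # e.g. {"version": "1.4.2", "database": "ok", "cache": "ok"}
                return all(v == "ok" for k, v in status.items() if k != "version")
        except OSError:
            time.sleep(2)     # not up yet; the loop also roughly measures startup time
    return False

if __name__ == "__main__":
    sys.exit(0 if smoke_test("http://localhost:8080") else 1)
```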
- Run the Tests, or Release to end-users:
Test whole stories at a time against a running version of the application in a production-like environment. Can be functional (does feature x do what is expected) or non-functional (security, resilience, etc). Check that the app does what the customer meant it to: “Given some initial context, When event occurs, Then there are some outcomes.”
Developed in partnership with the customer. Good acceptance tests need good requirements, and a shared understanding of what value the software delivers. Pushed to the edge, acceptance tests are an executable specification (aka Behaviour Driven Development).
Tests should use business language to talk to a driver layer. The driver will either talk to the same API the UI uses (best, UI should be thin layer anyway), or fill in UI fields (brittle, painful).
Make each test atomic: create the data it needs and tidy up after. Use different users / naming for each test so they can run in parallel. Be tolerant of initial conditions: if a test creates three items, don't check that 3 exist, check that num_before + 3 exist. Database transaction isolation can be used for this, with each test in its own transaction.
Acceptance tests use real versions of everything we control (database, cache, etc) but still stub out external / third-party systems. We need to be able to read and control their state. This replacement would happen at configuration time.
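A minimal sketch of a Given/When/Then acceptance test talking to a driver layer instead of the UI; OrderDriver, its methods, and the injected API client are illustrative assumptions, not a real framework:

```python
import unittest
import uuid

class OrderDriver:
    """Driver layer: speaks the application's API so tests stay in business language."""
    def __init__(self, api_client):
        self.api = api_client

    def create_customer(self):
        # Unique name per test so tests are atomic and can run in parallel.
        return self.api.post("/customers", {"name": f"test-{uuid.uuid4()}"})

    def place_order(self, customer, item):
        return self.api.post(f"/customers/{customer['id']}/orders", {"item": item})

    def orders_for(self, customer):
        return self.api.get(f"/customers/{customer['id']}/orders")

class PlaceOrderTest(unittest.TestCase):
    # Injected by the test harness: a client pointed at the production-like environment,
    # configured so that third-party systems are stubbed out.
    api_client = None

    def test_placing_an_order_adds_it_to_history(self):
        driver = OrderDriver(self.api_client)
        customer = driver.create_customer()                    # Given a new customer
        before = len(driver.orders_for(customer))
        driver.place_order(customer, item="book")              # When they place an order
        # Then: tolerant of initial conditions, check before + 1 rather than exactly 1.
        self.assertEqual(len(driver.orders_for(customer)), before + 1)
```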
Stage 3. Automated Capacity Tests
Capacity testing needs realistic scenarios. Replay live traffic, or analyse it to define the scenarios. Choose a handful of key acceptance tests that match common production scenarios and improve them into capacity tests (ideally keeping them as acceptance tests too). They should be able to run in parallel and sequentially with each other.
Define the minimum acceptable performance for each one and fail the test if it falls below that. Optimizing code beyond that level is not in the customer's best interest.
We want to know the maximum load we can handle, and how that changes between versions.
The bottleneck in capacity tests is often the test itself. Have a no-op version of the application to see how fast the tests themselves can run.
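A minimal sketch of the pass/fail threshold idea: run a scenario under concurrency and fail if throughput drops below the agreed minimum. The numbers and the place_order hook are illustrative; pointing it at a no-op shows how fast the harness itself can go:

```python
import time
from concurrent.futures import ThreadPoolExecutor

MIN_ORDERS_PER_SECOND = 50      # agreed minimum acceptable capacity, not a stretch goal

def run_capacity_test(place_order, requests=1000, workers=20):
    """Fire `requests` calls at the scenario and fail if throughput is below the minimum."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda _: place_order(), range(requests)))
    throughput = requests / (time.time() - start)
    assert throughput >= MIN_ORDERS_PER_SECOND, f"only {throughput:.0f} requests/second"
    return throughput

# Pointing the harness at a no-op shows how fast the test itself can go,
# since the bottleneck is often the test rather than the application.
print(run_capacity_test(place_order=lambda: None))
```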
Longevity testing is capacity testing but with less load, over a much longer period of time.
Stage 4. Integration Tests
During acceptance testing we stub out third-party systems so we can direct their behaviour. We still need to check that the real external systems behave as expected.
We can't read or control their state, and their operators usually don't welcome our test scripts. Integration testing must be basic, low intensity, and probably run less often than acceptance testing.
In a service oriented architecture setup, this stage might be much bigger, and the Acceptance Tests stage correspondingly smaller.
Stage 5. Manual tests
The QA team do their thing. If successful, they mark that on the build / release candidate in the artifact store, so we know this build is ready for the next stage.
Stage 6. Deploying and Releasing
Continuous Delivery enables incremental regular deployments, which are far less risky than occasional, big-bang deployments. Deploy small units, often.
Use a single way to build and deploy to all your environments. This way the process gets a lot of testing. The process may involve several tools, such as a Continuous Integration server, a configuration management tool (Ansible, Puppet), probably some bash.
Ensure the deployment process is idempotent. Tools like Ansible do this. You can also build a new (virtual) cluster from scratch each time (see blue-green deployment).
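A minimal sketch of what idempotent means for one deploy step: describe the desired end state and only act if the machine isn't already in it (paths are illustrative; tools like Ansible give you this behaviour out of the box):

```python
import os

def ensure_current_symlink(release_dir, link="/srv/app/current"):
    """Point the 'current' symlink at the given release; do nothing if it already does."""
    if os.path.islink(link) and os.readlink(link) == release_dir:
        return False                      # already in the desired state, nothing to do
    tmp = link + ".tmp"
    os.symlink(release_dir, tmp)          # build the new link...
    os.replace(tmp, link)                 # ...and swap it in atomically
    return True

# Running it twice is safe; the second call is a no-op:
# ensure_current_symlink("/srv/app/releases/1.4.2")
```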
Don't log in to beta and production machines; all access is via automation tools.
Rolling back is deploying the previous known-good version.
Backup data assets (db, files) before releasing. Use Expand/Contract (see Stage 2 / Run Data Migrations) so that you never have to roll back data.
Decouple deployment from release. Aim for Zero-downtime / Hot deployment. Run new versions side-by-side with old ones. Several ways to do this:
- Blue / Green deployments: You don't necessarily need two sets of machines (although that's best). You can run two copies of the app on the same boxes, using different directories and ports. Releasing or rolling back is just a load balancer / router config change.
- Canary Release: Release to a small sub-set of servers. Switch a small sub-set of users to those. If all is well, switch more servers and more users. To roll back, just switch the users off those servers.
- Dark Launch: Release non-user-visible parts first, so that you get real production traffic, but failures don't impact the users.
- Feature Toggles: Have a runtime way of switching behaviour back to the previous version (see the sketch below).
However you do it, the shared resources (database, cache, message bus, external services) need to work with both versions of the application at the same time.
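A minimal sketch of a feature toggle, assuming toggles are re-read from a config file at runtime so behaviour can be switched back without a redeploy; all names and the pricing logic are made up:

```python
import json

def toggles(path="/etc/myapp/toggles.json"):
    """Re-read toggles from config each time so operators can flip them at runtime."""
    try:
        with open(path) as f:
            return json.load(f)          # e.g. {"new_pricing_engine": true}
    except FileNotFoundError:
        return {}                        # no file means every toggle is off

def old_pricing(order):
    return order["quantity"] * order["unit_price"]

def new_pricing(order):
    # Hypothetical new behaviour, live only where the toggle is switched on.
    return round(order["quantity"] * order["unit_price"] * 0.9, 2)

def price_order(order):
    if toggles().get("new_pricing_engine", False):
        return new_pricing(order)        # new code path
    return old_pricing(order)            # previous behaviour, still available for rollback

print(price_order({"quantity": 3, "unit_price": 4.50}))
```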
Continuous Deployment: The end-goal of the deployment pipeline is fully automated releases to production. Even if you can’t actually fully release, go as far as possible. Maybe auto-deploy to the ‘blue’ cluster, but don’t flip the switch. Or auto-deploy to your canary users (maybe internal company users).
Misc
Monitoring:
- Collect data. This comes from:
- Hardware: temperatures, fan speed, etc.
- OS: CPU, memory and swap usage, disk I/O, net I/O, etc.
- Middleware: number of DB connections, slow queries, Redis latency, etc.
- App: Version, garbage collection, external systems info (liveness, connection pool), and lots of business level metrics.
- Logs: Treat writing good logs as a first-class requirement. Must be human readable and machine parseable (see the sketch after this list).
- Display it: Dashboards for dev / ops and business. Show only the most important.
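A minimal sketch of log lines that are both human readable and machine parseable, using Python's standard logging with key=value fields (the field names are made up):

```python
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s %(message)s", level=logging.INFO
)
log = logging.getLogger("orders")

# key=value pairs stay readable to a human and trivially parseable by the log pipeline
log.info("order_placed order_id=%s customer_id=%s total_pence=%d duration_ms=%d",
         "o-123", "c-456", 1999, 87)
```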
Making changes without branching (because branching breaks continuous integration):
- Hide changes until they are ready. Different namespace (URL), config switch.
- Break big changes into small incremental changes.
- Branch by abstraction (see the sketch after this list):
- Create abstraction over part of the system that needs to change
- Refactor rest of the system to use the abstraction layer
- Create new implementation
- Switch abstraction layer to use new implementation
- Remove old implementation, and maybe abstraction layer
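A minimal sketch of branch by abstraction with illustrative names: the rest of the system depends only on the abstraction, so the new implementation can be built incrementally on trunk and switched in without a long-lived branch:

```python
from abc import ABC, abstractmethod

class TaxCalculator(ABC):
    """Steps 1-2: the abstraction the rest of the system is refactored to depend on."""
    @abstractmethod
    def tax(self, net: float) -> float: ...

class LegacyTaxCalculator(TaxCalculator):
    """The old implementation, wrapped behind the abstraction."""
    def tax(self, net):
        return net * 0.20

class NewTaxCalculator(TaxCalculator):
    """Step 3: the new implementation, built incrementally on trunk."""
    def tax(self, net):
        return round(net * 0.20, 2)

def make_calculator(use_new: bool) -> TaxCalculator:
    # Step 4: switch the abstraction to the new implementation (often behind a toggle).
    # Step 5: once it has bedded in, delete LegacyTaxCalculator and maybe the abstraction.
    return NewTaxCalculator() if use_new else LegacyTaxCalculator()

print(make_calculator(use_new=True).tax(100.0))
```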
Treat build and deploy tools as seriously as the main project.
The continuous integration server the authors use is open source: Go.CD