For decades it was assumed that architecture was something decided at the start and then extremely costly to change: "the important thing is to get it right from day one." In reality, no design survives contact with evolving requirements intact. Evolutionary architecture, formulated by Neal Ford, Rebecca Parsons, and Patrick Kua, proposes the opposite: instead of trying to predict the future, we build systems that support guided, incremental change across multiple dimensions. How do we prevent the architecture from degrading unnoticed as it evolves? With fitness functions: objective, automated mechanisms that measure whether an architectural characteristic stays within acceptable limits. In this lesson we will look at what guided incremental change is, the types of fitness functions (atomic, holistic, triggered, continuous), real examples of architecture tests with ArchUnit, and how to automate all of this in the pipeline.

Contents

  1. What evolutionary architecture is
  2. Guided incremental change
  3. What a fitness function is
  4. Types of fitness functions
  5. Architecture tests with ArchUnit
  6. Other fitness functions: performance, coupling, security
  7. Automation in the pipeline
  8. Common Mistakes and Tips
  9. Exercises
  10. Conclusion

  1. What evolutionary architecture is

An evolutionary architecture is one that supports guided, incremental change as a first-class principle, across multiple dimensions. Let's break down that definition:

  • Guided change: not just any change, but change in the right direction. We need something that tells us whether we are on the right track: that is where fitness functions come in.
  • Incremental: the architecture changes in small, reversible steps, just as we saw with the Strangler Fig pattern, not in huge leaps.
  • Multiple dimensions: not just the code. Also performance, security, scalability, accessibility... Each important characteristic is a dimension to watch.

The underlying premise is humble: we cannot predict the future, so instead of over-designing for imaginary requirements, we design to be able to change and we set up guardrails that warn us if the architecture drifts.

  2. Guided incremental change

"Incremental" and "guided" are two separate properties that need each other:

  • Without incrementality, each change is large and risky. Moving the system in small steps requires zero-downtime deployments, test automation, and reliable pipelines.
  • Without guidance, incremental change can take you slowly but surely... toward disaster. The guidance is provided by fitness functions: they check that each increment maintains the desired architectural characteristics.

A useful analogy: fitness functions are to architecture what unit tests are to code. Tests do not guarantee that the code is good, but they prevent what already worked from breaking. Fitness functions do not guarantee a perfect architecture, but they prevent it from degrading silently as it evolves.

  3. What a fitness function is

The term comes from genetic algorithms, where a fitness function measures how close a solution is to the objective. In architecture, a fitness function is any mechanism that provides an objective assessment that one or more architectural characteristics stay within acceptable limits. The key word is objective: it must give a measurable result, not an opinion.

Examples of characteristics and their fitness function:

Architectural characteristic | Fitness function (example)
Modularity                   | Test that forbids cyclic dependencies between packages
Performance                  | The 95th percentile of latency stays < 200 ms
Coupling                     | The afferent coupling of a module does not exceed N
Security                     | No dependency with a known critical CVE
Maintainability              | The domain layer does not import infrastructure frameworks

The important thing is that each one can be measured and automated. If you cannot write a check that returns pass/fail (or a number comparable to a threshold), it is not a fitness function, it is a wish.
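As a minimal sketch of such a check, the performance row of the table can be written as a function that reduces latency samples to one number and compares it with a limit. The class and method names are hypothetical, the nearest-rank percentile is only one common definition, and in a real system the samples would come from a load test or a monitoring tool rather than a hard-coded array.

```java
import java.util.Arrays;

// Hypothetical sketch: a latency fitness function reduced to pass/fail.
class LatencyFitness {

    // Nearest-rank percentile: the smallest sample such that at least
    // `percent` % of all samples are less than or equal to it.
    static double percentile(double[] samples, int percent) {
        if (samples.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need samples and 1 <= percent <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percent * n / 100, avoiding floating-point rounding.
        int rank = (percent * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    // The fitness function itself: true means "within acceptable limits".
    static boolean p95Below(double[] latenciesMs, double limitMs) {
        return percentile(latenciesMs, 95) < limitMs;
    }

    public static void main(String[] args) {
        double[] latenciesMs = {120, 95, 180, 150, 110, 130, 170, 90, 140, 450};
        System.out.println("p95 = " + percentile(latenciesMs, 95) + " ms");
        System.out.println(p95Below(latenciesMs, 200) ? "PASS" : "FAIL");
    }
}
```

With only ten samples, the single 450 ms outlier is the p95, so the check fails: the result is a number compared to a threshold, not an opinion.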

  4. Types of fitness functions

Fitness functions are classified along several axes. The most useful ones:

Axis       | Types                    | Meaning
Scope      | Atomic vs. Holistic      | Atomic: measures a single characteristic in isolation. Holistic: measures the interaction of several at once (e.g., security under load).
Execution  | Triggered vs. Continuous | Triggered: runs on an event (a commit, a deployment). Continuous: monitors in production in real time.
Metric     | Static vs. Dynamic       | Static: fixed result (pass/fail, like a test). Dynamic: depends on the context (a threshold that varies with load).
Automation | Automated vs. Manual     | The vast majority should be automated; manual ones (reviews) are reserved for what cannot be measured.

The most common and cheapest to implement are the atomic, triggered, and automated ones: tests that run on every build and check a specific rule. Architecture tests with ArchUnit fall right here, and they are the best starting point.
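The static/dynamic axis from the table is the least intuitive one, so here is a small sketch of the difference. Everything in it is an assumption made for illustration: the 200 ms base limit, the nominal load of 100 requests per second, the 0.5 ms allowance per extra request, and the 400 ms cap are invented numbers, not recommendations.

```java
// Hypothetical sketch: a static limit versus a limit that depends on load.
class DynamicLatencyLimit {

    // Static: the same limit regardless of context.
    static double staticLimitMs() {
        return 200.0;
    }

    // Dynamic: the limit relaxes as load grows beyond the nominal level,
    // up to a hard cap that is never exceeded.
    static double dynamicLimitMs(int requestsPerSecond) {
        final int nominalRps = 100;      // assumed normal load
        final double baseMs = 200.0;     // limit at or below normal load
        final double extraPerRps = 0.5;  // assumed allowance per extra request/s
        final double capMs = 400.0;      // assumed absolute maximum
        if (requestsPerSecond <= nominalRps) {
            return baseMs;
        }
        double relaxed = baseMs + (requestsPerSecond - nominalRps) * extraPerRps;
        return Math.min(relaxed, capMs);
    }

    public static void main(String[] args) {
        System.out.println("static limit:       " + staticLimitMs() + " ms");
        System.out.println("dynamic at 300 rps: " + dynamicLimitMs(300) + " ms");
    }
}
```

Both variants still end in a comparison against a number; the dynamic one only computes that number from the context first.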

  5. Architecture tests with ArchUnit

ArchUnit is a Java library that lets you write architecture rules as normal tests (JUnit). It analyzes the bytecode of your classes and checks assertions about packages, dependencies, names, layers, etc. It is the most direct way to turn an ADR into an executable fitness function.

import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.core.domain.JavaClasses;
import org.junit.jupiter.api.Test;
import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.*;

class ArchitectureRulesTest {

    // Imported once for the whole test class (static field), not once per test method
    private static final JavaClasses classes =
        new ClassFileImporter().importPackages("com.fiatc.store");

    @Test
    void the_domain_does_not_depend_on_the_infrastructure() {
        noClasses()
            .that().resideInAPackage("..domain..")
            .should().dependOnClassesThat().resideInAPackage("..infrastructure..")
            .check(classes);
    }
}

Let's analyze this test, which materializes a key rule of hexagonal architecture:

  • importPackages("com.fiatc.store") loads the bytecode of all the classes under that package. ArchUnit does not run your code; it inspects it statically.
  • noClasses().that().resideInAPackage("..domain..") selects the domain classes. The .. is an ArchUnit wildcard: "any package that contains domain at any level."
  • .should().dependOnClassesThat().resideInAPackage("..infrastructure..") defines what is forbidden: depending on the infrastructure.
  • .check(classes) runs the rule and fails the test if any domain class depends on a class in an infrastructure package (as a field type, parameter, method call, superclass, and so on).

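If tomorrow someone makes a domain class depend on a class from an infrastructure package (a JPA repository implementation, for example), this test will turn red in the pipeline. Note that the rule only matches packages named infrastructure: a direct dependency on a framework such as Spring lives in org.springframework.., so forbidding it needs a second rule with that package pattern. Either way, the architectural rule stops depending on human discipline.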

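// These rules go inside ArchitectureRulesTest and need two more imports at the top of the file:
//   import org.springframework.stereotype.Service;   (assuming @Service is Spring's annotation)
//   import static com.tngtech.archunit.library.dependencies.SlicesRuleDefinition.slices;
@Test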
void application_services_are_named_correctly() {
    classes()
        .that().resideInAPackage("..application..")
        .and().areAnnotatedWith(Service.class)
        .should().haveSimpleNameEndingWith("ApplicationService")
        .check(classes);
}

@Test
void there_must_be_no_cyclic_dependencies_between_modules() {
    slices()
        .matching("com.fiatc.store.(*)..")
        .should().beFreeOfCycles()
        .check(classes);
}

These two tests add more guardrails:

  • The first imposes a naming convention: every @Service class in the application layer must end in ApplicationService. It seems minor, but naming consistency is a real maintainability characteristic.
  • The second is among the most valuable: slices().matching("com.fiatc.store.(*)..") divides the system into "slices", one per first-level subpackage (orders, catalog, payments...), and beFreeOfCycles() checks that there are no cyclic dependencies between them. Cycles are the beginning of the "big ball of mud"; detecting them automatically protects modularity.
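To make the idea of a cycle between slices concrete, here is a minimal sketch of the kind of check involved: a depth-first search over a "slice depends on slices" graph. This is not ArchUnit's implementation, and the class name and the hand-written graph are hypothetical; ArchUnit derives the real graph from bytecode.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: detects a cycle in a "slice depends on slices" graph.
class SliceCycles {

    // deps maps each slice name to the slices it depends on.
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> finished = new HashSet<>(); // fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();   // slices on the current search path
        for (String slice : deps.keySet()) {
            if (visit(slice, deps, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String slice, Map<String, List<String>> deps,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(slice)) {
            return true;  // we came back to a slice we are still exploring
        }
        if (finished.contains(slice)) {
            return false; // already explored without finding a cycle
        }
        onPath.add(slice);
        for (String next : deps.getOrDefault(slice, List.of())) {
            if (visit(next, deps, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(slice);
        finished.add(slice);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
            "orders", List.of("payments"),
            "payments", List.of("catalog"),
            "catalog", List.of("orders")); // closes the loop
        System.out.println(hasCycle(deps) ? "cycle found" : "free of cycles");
    }
}
```

The example graph is cyclic on purpose: orders needs payments, payments needs catalog, and catalog needs orders again, which is exactly the situation the ArchUnit rule rejects.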

ArchUnit even offers a high-level API for layers:

// Needs: import static com.tngtech.archunit.library.Architectures.layeredArchitecture;
@Test
void the_layers_are_respected() {
    layeredArchitecture().consideringAllDependencies()
        .layer("Presentation").definedBy("..presentation..")
        .layer("Application").definedBy("..application..")
        .layer("Domain").definedBy("..domain..")
        // Only the Application layer may access the Domain layer
        .whereLayer("Domain").mayOnlyBeAccessedByLayers("Application")
        .whereLayer("Presentation").mayNotBeAccessedByAnyLayer()
        .check(classes);
}

Here we declare the layers and their access rules: the presentation cannot be accessed by anyone (it is at the top) and the domain is only accessible from application. ArchUnit verifies that the code's actual dependencies respect that hierarchy. It is an ADR about layers turned into a test that watches over itself.
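The access rules above boil down to a lookup: for each dependency between two layers, is the source layer on the target's list of allowed accessors? The following sketch shows that logic with plain collections. It is a simplification under stated assumptions: the class name is hypothetical, the dependencies are written by hand instead of being read from bytecode, and a layer with no rule is treated as unrestricted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a "may only be accessed by" check between layers.
class LayerRules {

    // allowedAccessors: target layer -> layers allowed to depend on it.
    // A layer with no entry is unrestricted. Each dependency is (from, to).
    static List<String> violations(Map<String, Set<String>> allowedAccessors,
                                   List<Map.Entry<String, String>> dependencies) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> dep : dependencies) {
            String from = dep.getKey();
            String to = dep.getValue();
            if (from.equals(to)) {
                continue; // dependencies inside one layer are always allowed
            }
            Set<String> allowed = allowedAccessors.get(to);
            if (allowed != null && !allowed.contains(from)) {
                found.add(from + " -> " + to);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> rules = Map.of(
            "Domain", Set.of("Application"), // only Application may access Domain
            "Presentation", Set.of());       // nobody may access Presentation
        List<Map.Entry<String, String>> deps = List.of(
            Map.entry("Presentation", "Application"),
            Map.entry("Application", "Domain"),
            Map.entry("Presentation", "Domain")); // skips the Application layer
        System.out.println(violations(rules, deps));
    }
}
```

Only the last dependency is reported, because Presentation reaches Domain without going through Application.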

  6. Other fitness functions: performance, coupling, security

Not everything is measured with ArchUnit. Other dimensions need other tools:

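  • Performance (atomic; triggered or continuous): a load test (Gatling, k6) that fails if the p95 of latency exceeds a threshold. It can run in CI (triggered) or be monitored in production (continuous). It becomes holistic only when it is measured together with another characteristic, such as security under load.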
  • Coupling (atomic, triggered): afferent/efferent coupling metrics with tools like JDepend or ArchUnit itself, compared against a maximum.
  • Security (atomic, triggered): a dependency scanner (OWASP Dependency-Check, Trivy) that fails the build if a critical CVE appears.
  • Coverage/size (atomic): rules that prevent a module from growing beyond a certain number of classes without review.

# Security fitness function in the pipeline: dependency scanning
dependency-scan:
  stage: verification
  script:
    - trivy fs --severity CRITICAL --exit-code 1 .
  # exit-code 1 makes the job (and the build) fail if there are critical CVEs

This job turns the characteristic "the system does not use dependencies with known critical vulnerabilities" into an automated fitness function: trivy scans the dependencies and, with --exit-code 1, makes the build fail if it finds a critical CVE. Security stops being a one-off audit and comes to be verified on every change.
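The coupling bullet from the list above can be sketched in the same spirit. This is a deliberately simplified, module-level version with hypothetical names and a hand-written dependency map; real tools compute these metrics from the compiled code. Afferent coupling counts how many other modules depend on a module, efferent coupling counts how many other modules it depends on.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: coupling metrics over a module dependency map.
class CouplingFitness {

    // deps maps each module to the set of modules it depends on.

    // Afferent coupling: how many other modules depend on `module`.
    static int afferent(Map<String, Set<String>> deps, String module) {
        int count = 0;
        for (Map.Entry<String, Set<String>> entry : deps.entrySet()) {
            boolean isOtherModule = !entry.getKey().equals(module);
            if (isOtherModule && entry.getValue().contains(module)) {
                count++;
            }
        }
        return count;
    }

    // Efferent coupling: how many other modules `module` depends on.
    static int efferent(Map<String, Set<String>> deps, String module) {
        Set<String> outgoing = new HashSet<>(deps.getOrDefault(module, Set.of()));
        outgoing.remove(module); // ignore self-references
        return outgoing.size();
    }

    // The fitness function: afferent coupling must not exceed the maximum.
    static boolean afferentWithin(Map<String, Set<String>> deps, String module, int max) {
        return afferent(deps, module) <= max;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = Map.of(
            "orders", Set.of("catalog", "payments"),
            "payments", Set.of("catalog"),
            "catalog", Set.of());
        System.out.println("Ca(catalog) = " + afferent(deps, "catalog"));
        System.out.println("Ce(orders)  = " + efferent(deps, "orders"));
        System.out.println(afferentWithin(deps, "catalog", 1) ? "PASS" : "FAIL");
    }
}
```

The maximum (the N of the earlier table) is a team decision; the sketch only shows that, once chosen, the comparison is mechanical.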

  7. Automation in the pipeline

A fitness function that has to be run by hand will end up being forgotten. The real value appears when they live in the CI/CD pipeline and fail the build automatically.

graph LR
    Commit[Commit / PR] --> Build[Compile]
    Build --> Unit[Unit tests]
    Unit --> Arch[Fitness: ArchUnit]
    Arch --> Sec[Fitness: security]
    Sec --> Perf[Fitness: performance]
    Perf -->|all green| Deploy[Deploy]
    Arch -->|red| Stop[Build fails]

The diagram shows the fitness functions as pipeline stages, at the same level as the unit tests. If the architecture rule (ArchUnit) fails, the build stops and the change does not reach production. Recommendations to make this work:

  • Fast first. Put the atomic and cheap fitness functions (ArchUnit, linters) at the start; the expensive ones (load) at the end or in parallel.
  • Clear messages. When a rule fails, the error must explain which rule and why, not just "test red."
  • Versioned with the code. The rules live in the repo and evolve in pull requests, just like ADRs.
  • Few and meaningful at first. Start with 3-4 rules that truly matter; add more as they start to hurt.
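Two of these recommendations, "fast first" and "clear messages", can be illustrated with a tiny runner. It is only a sketch: in practice the CI tool does the ordering and reporting, and the Check class, its fields, and the cost estimates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: runs fitness functions cheapest-first and explains failures.
class FitnessPipeline {

    static final class Check {
        final String name;
        final int costSeconds;        // rough estimate, used only for ordering
        final String rule;            // human-readable statement of the rule
        final BooleanSupplier passes; // the actual check

        Check(String name, int costSeconds, String rule, BooleanSupplier passes) {
            this.name = name;
            this.costSeconds = costSeconds;
            this.rule = rule;
            this.passes = passes;
        }
    }

    // Runs the cheapest checks first and stops at the first failure,
    // returning a message that names the rule, or "OK" if everything passes.
    static String run(List<Check> checks) {
        List<Check> ordered = new ArrayList<>(checks);
        ordered.sort(Comparator.comparingInt(c -> c.costSeconds));
        for (Check check : ordered) {
            if (!check.passes.getAsBoolean()) {
                return "FAILED " + check.name + ": " + check.rule;
            }
        }
        return "OK";
    }

    public static void main(String[] args) {
        List<Check> checks = List.of(
            new Check("load-test", 600, "p95 latency must stay below 200 ms", () -> true),
            new Check("archunit-layers", 5, "domain must not depend on infrastructure", () -> false));
        System.out.println(run(checks));
    }
}
```

Because the cheap architecture check runs before the ten-minute load test, the developer gets the failure, with the rule spelled out, in seconds.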

  8. Common Mistakes and Tips

  • Writing subjective fitness functions. "The code must be readable" is not a fitness function. If it does not return pass/fail or a number, it is useless.
  • Too many rules at once. A project that adds 50 ArchUnit rules on day one ends up with red builds everywhere and people disabling them. Start with few.
  • Rules that do not fail the build. A fitness function that only generates a report nobody reads is decorative. It must be able to stop the pipeline.
  • Confusing fitness functions with unit tests. Unit tests test behavior; fitness functions test architectural characteristics (structure, performance, security).
  • Not maintaining them. When an architectural decision changes (a new ADR), the associated rules must be updated. An obsolete rule that fails for no reason erodes trust.
  • Tip: pair each important ADR with a fitness function that verifies it. That way the documented decision and the actual decision never diverge.

  9. Exercises

Exercise 1. Classify these fitness functions by scope (atomic/holistic) and execution (triggered/continuous): (a) an ArchUnit test in CI that forbids cycles; (b) a production monitor that alerts if the p95 exceeds 300 ms; (c) a load test, launched by the pipeline, that measures latency while a security scan runs.

Exercise 2. Write (in pseudocode or with the ArchUnit API) a rule that forbids any class in the presentation layer (..presentation..) from directly accessing the persistence layer (..persistence..), bypassing the application layer.

Exercise 3. Your organization has accepted an ADR that says "no dependency with a critical CVE in production." How would you turn it into an automated fitness function and at what stage of the pipeline would you put it?

Solutions

Solution 1. (a) Atomic and triggered (it measures a characteristic—modularity—and runs on a commit). (b) Atomic and continuous (it measures performance permanently in production). (c) Holistic and triggered (it measures the interaction of performance and security at the same time, triggered by the pipeline).

Solution 2.

noClasses()
    .that().resideInAPackage("..presentation..")
    .should().dependOnClassesThat().resideInAPackage("..persistence..")
    .check(classes);

The presentation must always go through application; this rule fails the build if a presentation class directly imports something from persistence.

Solution 3. With a dependency scanner (Trivy, OWASP Dependency-Check) run in the pipeline and configured to fail on critical findings: with Trivy, --severity CRITICAL --exit-code 1; with OWASP Dependency-Check, a CVSS failure threshold such as failBuildOnCVSS in its Maven plugin. The build then fails if a critical CVE appears. It goes in a verification stage, before deployment, to prevent the vulnerable artifact from reaching production.

  10. Conclusion

Architecture is not a fixed snapshot decided at the start, but something alive that evolves. Evolutionary architecture embraces that reality by relying on guided incremental change, and fitness functions are the compass that keeps the direction: objective, automated checks that prevent architectural characteristics from degrading silently. We have seen how to classify them (atomic/holistic, triggered/continuous), how to write real architecture tests with ArchUnit to protect layers, names, and cycles, how to cover other dimensions such as performance and security, and how to integrate it all in the pipeline so that the architecture defends itself. With this we close the conceptual block of the module. In the last lesson of the course we will put everything learned into practice—styles, data, deployment, governance, and evolution—in an end-to-end case study: the complete design of an e-commerce platform.

Application Architecture Course

© Copyright 2026. All rights reserved