If there is one phrase that sums up the architect's craft, it is this: "it depends." Hardly any architectural decision is good or bad in absolute terms; it is good or bad for a specific context and specific priorities. Designing a system consists, at bottom, of a chain of decisions where gaining in one attribute almost always means giving up another. We call these deliberate sacrifices trade-offs or compromises. In this lesson you will learn what makes a decision "architectural," what the most universal trade-offs are (starting with the CAP theorem), how to document decisions with ADRs, and how to evaluate architectures in a structured way with the ATAM method.
Contents
- What a significant architectural decision is
- The concept of trade-off
- Classic trade-offs
- The CAP theorem
- Documenting decisions: ADRs
- Evaluating the architecture: the ATAM method
What a significant architectural decision is
Not every technical decision is architectural. A decision is architecturally significant (what the literature calls an Architecturally Significant Decision) when it meets one or more of these characteristics:
- High cost of change: reversing it would mean rewriting important parts of the system.
- Broad scope: it affects several components, teams, or quality attributes.
- Impact on quality attributes: it shapes performance, security, availability, etc.
- Creates or removes constraints: it opens or closes future options.
Examples of significant decisions: choosing between a monolith and microservices, choosing the data consistency model, defining the authentication mechanism, selecting the style of communication between services. Examples of non-significant decisions: the name of a variable, the indentation format, which minor utility library to use.
The concept of trade-off
A trade-off is an exchange: to gain something, you give up something else. In architecture, resources and quality attributes are in permanent tension.
graph LR
D["Decision:<br/>add caching"]
D --> R["+ Performance"]
D --> C["- Consistency<br/>(possibly stale data)"]
D --> M["+ Complexity<br/>(cache invalidation)"]
This Mermaid diagram (graph LR, left to right) shows that a single decision, adding a cache, simultaneously produces one positive effect (more performance) and two negative ones (less consistency and more complexity). The key lesson is that every decision has consequences across several dimensions at once, and the architect's job is to make them explicit so as to decide with eyes open.
The right question is never "is this option good?" but "what do I gain, what do I lose, and is it worth it given my priorities?"
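The cache trade-off can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not a production cache: the `TTLCache` class, the `stock:42` key, the in-memory `database` dictionary, and the injected fake clock are all hypothetical names chosen for the example. It shows a read returning a stale value while the entry is younger than its TTL, which is the consistency given up in exchange for fewer database reads.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Read-through cache: serves a stored value until it is older than ttl seconds."""

    def __init__(self, load: Callable[[str], int], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, int]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> int:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1      # fast path: no trip to the database
            return entry[1]
        self.misses += 1        # slow path: reload from the source and remember
        value = self._load(key)
        self._entries[key] = (now, value)
        return value


# Hypothetical "database" and a fake clock so the demo is deterministic.
database = {"stock:42": 10}
fake_now = [0.0]
cache = TTLCache(load=lambda k: database[k], ttl=30.0, clock=lambda: fake_now[0])

first = cache.get("stock:42")   # miss: loads 10 from the database
database["stock:42"] = 7        # the source of truth changes
stale = cache.get("stock:42")   # hit: still returns the cached 10
fake_now[0] = 31.0              # the entry is now older than the TTL
fresh = cache.get("stock:42")   # miss: reloads and returns 7

print(first, stale, fresh, cache.hits, cache.misses)
```

Under these assumptions the script prints `10 10 7 1 2`. The second read is fast but wrong for up to 30 seconds. Shortening the TTL narrows that window at the price of more misses, and removing it by invalidating on write is where the extra complexity comes from.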
Classic trade-offs
These compromises appear again and again in any system:
| Trade-off | One extreme | The other extreme | How to decide |
|---|---|---|---|
| Consistency vs. availability | Data always correct | System always responds | Depending on the domain (banking vs. social network) |
| Cost vs. performance | Cheap infrastructure | Ultra-fast response | Depending on the value of latency to the business |
| Simplicity vs. flexibility | Direct, rigid solution | Generic, configurable solution | Depending on how much change is anticipated |
| Time-to-market vs. technical debt | Ship now, take shortcuts | Build well, slower | Depending on the product's phase |
| Coupling vs. autonomy | Integrated components | Independent services | Depending on the need for autonomous deployment |
| Monolith vs. microservices | Simple to develop/deploy | Scalable and autonomous in parts | Depending on team size and scaling needs |
Let's look at the last one in detail:
| Criterion | Monolith | Microservices |
|---|---|---|
| Initial complexity | Low | High |
| Deployment | A single unit | Many independent units |
| Scaling | All or nothing | Selective per service |
| Data consistency | Simple (one DB, transactions) | Complex (eventual consistency) |
| Operational cost | Low | High (orchestration, observability) |
| Best for... | Small teams, small or moderately sized domains | Large organizations with well-defined domains |
The usual conclusion: start with a well-structured monolith (sometimes called a "modular monolith") and extract microservices only when a concrete pain point justifies it. Jumping straight to microservices "because it's modern" is a very common and costly mistake.
The CAP theorem
The CAP theorem, formulated by Eric Brewer and later proved by Seth Gilbert and Nancy Lynch, is the most famous trade-off in distributed systems. It states that a distributed system can guarantee at most two of the following three properties at the same time:
- C (Consistency): every read receives the most recent write or an error.
- A (Availability): every request receives a non-error response, even if it does not reflect the most recent write.
- P (Partition tolerance): the system keeps working even if messages between nodes are lost or delayed.
The key point that many overlook: in a real distributed system, network partitions do happen; they are not optional. P is therefore mandatory, and the real choice is between C and A while a partition lasts.
graph TD
P["Is there a network partition?"]
P -->|"No"| N["Works with C and A normally"]
P -->|"Yes"| E["You must choose:"]
E --> CP["CP: prioritize Consistency<br/>(reject requests to avoid returning wrong data)"]
E --> AP["AP: prioritize Availability<br/>(respond even if the data is not up to date)"]
This Mermaid diagram illustrates that the CAP dilemma only manifests during a partition. When there is no partition, a system can offer both consistency and availability. When there is one, it must choose: a CP system prefers to reject requests rather than return incorrect data (typical in banking); an AP system prefers to keep responding even if the data may be out of date (typical in social networks or catalogs).
| Type | Chooses | Example use | Example technology |
|---|---|---|---|
| CP | Consistency | Bank balance, critical inventory | Clustered relational databases, MongoDB (configurable), HBase |
| AP | Availability | Shopping cart, social feed | Cassandra, DynamoDB, Riak (with default settings; consistency is tunable) |
An important nuance: in practice, consistency is not binary. Many systems use eventual consistency, where replicas may disagree for a brief period but converge to the same value once updates stop arriving. The PACELC model extends CAP: if there is a partition (P), the system chooses between availability (A) and consistency (C); else (E), even with no partition, it chooses between latency (L) and consistency (C).
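The CP/AP choice and the convergence behind eventual consistency can be sketched with a toy two-replica store. Everything here is an assumption made for illustration: the `TwoNodeStore` and `Replica` classes, the `partitioned` flag, and the shared version counter used for last-writer-wins reconciliation (real systems cannot share a counter across a partition and rely on mechanisms such as quorums or vector clocks). With only two nodes neither side has a majority, so the CP variant rejects every request during the partition.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


class Unavailable(Exception):
    """Raised by the CP store when it refuses to answer during a partition."""


@dataclass
class Replica:
    # key -> (version, value); the version is a simple counter (simplification)
    data: Dict[str, Tuple[int, str]] = field(default_factory=dict)


class TwoNodeStore:
    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False
        self._version = 0

    def write(self, node: Replica, key: str, value: str) -> None:
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot replicate, write rejected")
        self._version += 1
        node.data[key] = (self._version, value)
        if not self.partitioned:  # replicate only while the link is up
            other = self.b if node is self.a else self.a
            other.data[key] = (self._version, value)

    def read(self, node: Replica, key: str) -> str:
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm latest value, read rejected")
        return node.data[key][1]

    def heal(self) -> None:
        """End the partition and reconcile with last-writer-wins."""
        self.partitioned = False
        for key in set(self.a.data) | set(self.b.data):
            newest = max(r.data[key] for r in (self.a, self.b) if key in r.data)
            self.a.data[key] = self.b.data[key] = newest


def run(mode: str) -> str:
    store = TwoNodeStore(mode)
    store.write(store.a, "balance", "100")
    store.partitioned = True
    try:
        store.write(store.a, "balance", "80")
        during = store.read(store.b, "balance")  # AP answers with the old value
    except Unavailable:
        during = "unavailable"
    store.heal()
    return f"{mode}: during={during} after={store.read(store.b, 'balance')}"


print(run("CP"))
print(run("AP"))
```

The CP run reports `during=unavailable after=100`: no wrong answer was ever given, but the update was refused. The AP run reports `during=100 after=80`: the store kept answering, served a stale balance for a while, and converged once the partition healed.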
Documenting decisions: ADRs
An architectural decision that lives only in the head of the person who made it is a lost decision. ADRs (Architecture Decision Records) are brief documents that capture a decision and, above all, its why. Popularized by Michael Nygard, they are now a standard practice.
A typical ADR has this structure:
# ADR-007: Use asynchronous messaging between Orders and Shipping

## Status
Accepted (2026-03-15)

## Context
The Shipping service suffers occasional outages. Today Orders calls it synchronously, so a Shipping outage causes errors visible to the customer and blocks order creation. We need to decouple both services and improve the availability of order creation.

## Decision
Orders will publish an "OrderCreated" event to a messaging broker. Shipping will consume that event asynchronously. Orders will no longer call Shipping directly.

## Consequences
Positive:
- Order creation no longer depends on Shipping's availability (higher availability).
- The services become decoupled and can be deployed separately.
Negative:
- Eventual consistency is introduced: shipping is processed with a small delay.
- Greater operational complexity: the broker must be operated and monitored.
- Duplicate messages and event ordering must be handled.
This ADR is written in Markdown and follows Nygard's format. The key sections are: Status (proposed, accepted, deprecated, superseded), Context (the problem and the forces at play), Decision (what is decided, in the present tense and affirmatively), and Consequences (the trade-offs, both good and bad, written honestly). The most valuable part is the Negative Consequences section: documenting what we sacrifice prevents someone in the future from "rediscovering" the problem and questioning the decision without understanding why it was made. ADRs are kept versioned alongside the code (for example, in a docs/adr/ folder).
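Teams that write ADRs often sometimes generate the skeleton from a small script so that every record has the same sections. The sketch below is one possible way to do that, not a standard tool: the `ADR` class, the zero-padded file naming under `docs/adr/`, and the rule that rendering fails when no negative consequence is listed are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ADR:
    number: int
    title: str
    status: str          # e.g. "Proposed", "Accepted", "Deprecated", "Superseded"
    context: str
    decision: str
    positive: List[str] = field(default_factory=list)
    negative: List[str] = field(default_factory=list)

    def filename(self) -> str:
        # Hypothetical naming convention: docs/adr/NNNN-title-in-kebab-case.md
        slug = "-".join(self.title.lower().split())
        return f"docs/adr/{self.number:04d}-{slug}.md"

    def to_markdown(self) -> str:
        # House rule assumed for this sketch: an ADR must admit what it sacrifices.
        if not self.negative:
            raise ValueError("an ADR should state at least one negative consequence")
        lines = [
            f"# ADR-{self.number:03d}: {self.title}",
            "",
            "## Status",
            self.status,
            "",
            "## Context",
            self.context,
            "",
            "## Decision",
            self.decision,
            "",
            "## Consequences",
            "Positive:",
            *[f"- {item}" for item in self.positive],
            "Negative:",
            *[f"- {item}" for item in self.negative],
        ]
        return "\n".join(lines) + "\n"


adr = ADR(
    number=7,
    title="Use asynchronous messaging between Orders and Shipping",
    status="Accepted (2026-03-15)",
    context="A Shipping outage blocks order creation because Orders calls it synchronously.",
    decision='Orders will publish an "OrderCreated" event; Shipping will consume it asynchronously.',
    positive=["Order creation no longer depends on Shipping's availability."],
    negative=["Eventual consistency: shipping is processed with a small delay."],
)

print(adr.filename())
print(adr.to_markdown())
```

Refusing to render a record without negative consequences is a deliberately strict choice: it turns the advice about honesty into something the tooling enforces.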
Evaluating the architecture: the ATAM method
How do you know whether a proposed architecture is suitable before building it? ATAM (Architecture Tradeoff Analysis Method), developed by the SEI, is a structured method that evaluates an architecture precisely by analyzing its trade-offs against the prioritized quality attributes.
Essential steps of ATAM (simplified):
1. Present the method to the participants (business and technical).
2. Present the business drivers: business goals and priority quality attributes.
3. Present the proposed architecture.
4. Identify architectural approaches (patterns and tactics used).
5. Build the utility tree: quality attributes are broken down into concrete scenarios, prioritized by importance and by risk.
6. Analyze the approaches against the priority scenarios.
7. Identify key points, in three categories.
8. Present results.
In step 7, ATAM classifies findings into three types that are useful even outside the formal method:
| Concept | Definition |
|---|---|
| Sensitivity point | A decision that decisively affects one quality attribute |
| Tradeoff point | A decision that affects several attributes in opposite directions (an explicit trade-off) |
| Risk | A decision that could have negative consequences given the priorities |
Full ATAM is a heavyweight process, designed for critical systems. But its philosophy is always applicable: evaluate the architecture against prioritized quality scenarios and make the tradeoff points and risks explicit. Even a lightweight version, on a whiteboard and in a couple of hours, provides enormous value.
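For a lightweight session, the utility tree can be kept as plain data and sorted so that discussion starts with the scenarios that matter most. The sketch below assumes a High/Medium/Low scale for both importance and risk; the `Scenario` class, the `prioritize` function, and the three sample scenarios are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List

LEVEL = {"H": 3, "M": 2, "L": 1}  # High / Medium / Low


@dataclass(frozen=True)
class Scenario:
    attribute: str    # quality attribute, e.g. "Availability"
    description: str  # concrete, testable scenario
    importance: str   # importance to the business: "H", "M" or "L"
    risk: str         # technical risk or difficulty: "H", "M" or "L"


def prioritize(scenarios: List[Scenario]) -> List[Scenario]:
    """Order scenarios by importance first, then by risk, highest first."""
    return sorted(
        scenarios,
        key=lambda s: (LEVEL[s.importance], LEVEL[s.risk]),
        reverse=True,
    )


# Hypothetical leaves of a utility tree for an order-management system.
tree = [
    Scenario("Modifiability", "A new payment provider is added in under two weeks", "M", "L"),
    Scenario("Performance", "95% of catalog reads answer in under 200 ms", "H", "M"),
    Scenario("Availability", "Orders can be created while Shipping is down", "H", "H"),
]

ranked = prioritize(tree)
for s in ranked:
    print(f"({s.importance},{s.risk}) {s.attribute}: {s.description}")
```

The (H,H) scenarios at the top of the list are the ones worth analyzing first for sensitivity points, tradeoff points, and risks.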
Common Mistakes and Tips
- Making decisions by fashion. "Microservices because they're trendy" ignores context. Decide according to your priority attributes, not the hype.
- Not documenting the why. Without ADRs, decisions are questioned endlessly and knowledge is lost when team members move on.
- Hiding the negative consequences. A good ADR is honest about what is sacrificed. Glossing over trade-offs leads to painful surprises.
- Believing the "correct" decision exists. Almost everything is a compromise. Seek the decision appropriate for your priorities, not the perfect one.
- Tip: prioritize before deciding. Ask the business to rank the quality attributes. Without priorities, trade-offs cannot be resolved.
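Why priorities must come first can be shown with a weighted decision matrix, a simple technique that is not part of ATAM itself. In the sketch below the scores (1 to 5, higher is better) and the two weight profiles are invented for illustration and carry no empirical value; `SCORES`, `totals`, and `preferred` are hypothetical names.

```python
from typing import Dict

# Hypothetical scores per option and quality attribute (5 = best).
SCORES: Dict[str, Dict[str, int]] = {
    "modular monolith": {
        "simplicity": 5, "operational cost": 5,
        "selective scaling": 2, "team autonomy": 2,
    },
    "microservices": {
        "simplicity": 2, "operational cost": 2,
        "selective scaling": 5, "team autonomy": 5,
    },
}


def totals(weights: Dict[str, int]) -> Dict[str, int]:
    """Weighted sum of each option's scores under the given priorities."""
    return {
        option: sum(weights[attr] * score for attr, score in scores.items())
        for option, scores in SCORES.items()
    }


def preferred(weights: Dict[str, int]) -> str:
    result = totals(weights)
    return max(result, key=result.get)


# Two hypothetical rankings supplied by the business.
early_product = {"simplicity": 4, "operational cost": 4,
                 "selective scaling": 1, "team autonomy": 1}
large_org = {"simplicity": 1, "operational cost": 1,
             "selective scaling": 4, "team autonomy": 4}

print(totals(early_product), preferred(early_product))
print(totals(large_org), preferred(large_org))
```

The scores never change between the two runs; only the weights do, and the preferred option flips from the modular monolith (44 versus 26) to microservices (26 versus 44). The numbers are a conversation aid, not a verdict.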
Exercises
Exercise 1. For a bank balance system and for a social network feed, state whether you would choose CP or AP in the event of a network partition, and justify it.
Exercise 2. Write a brief ADR (status, context, decision, consequences) for the decision "Adopt a single relational database instead of a database per microservice" in an application that is still small.
Exercise 3. For the decision "introduce a distributed cache in front of the database," identify a sensitivity point, a tradeoff point, and a risk according to ATAM terminology.
Solutions
Solution 1. Bank balances: CP. It is preferable to reject an operation (be momentarily unavailable) rather than show or allow operating on an incorrect balance; consistency is critical. Social network feed: AP. It is preferable to keep showing content (even if the latest post is missing) rather than stop responding; availability outweighs seeing the exact data instantly.
Solution 2. Example: Status: Accepted. Context: The application is small, the team is small, and the domain is not yet stabilized; managing several databases would add operational and consistency complexity with no clear benefit. Decision: Use a single relational database shared by the modules, keeping them logically separated (modular monolith). Consequences: Positive: simple transactions, strong consistency, lower operational cost, faster development. Negative: coupling to a single schema, a possible bottleneck when scaling, and greater future effort if we want to migrate to a database per service.
Solution 3. Sensitivity point: the cache's expiration time (TTL) decisively affects performance, because it determines the hit rate (how often a request is served from the cache instead of the database). Tradeoff point: the cache improves performance but harms consistency (potentially stale data) and increases complexity. Risk: an incorrect invalidation strategy could serve stale data to users or, worse, cause inconsistencies that are hard to detect.
Conclusion
You have learned to recognize what makes a decision architecturally significant, to reason in terms of trade-offs, to handle the classic compromises and the CAP theorem, to document decisions with honest ADRs, and to evaluate architectures with the ATAM philosophy. The idea that runs through the whole lesson is that architecture is the art of consciously choosing what to sacrifice. But brilliant decisions are useless if no one understands them or if they cannot be communicated to the team. In the next and final lesson of the module you will learn to document architecture through views and the C4 model, so that your decisions are conveyed clearly.
