Additionally, I want to share examples of how to define rules and responsibilities from the previous video overview. I believe it's a valuable tool to use in practice.
#architecture #tools
Architecture Without an End State
`Architecture Without an End State` is a concept introduced by M. Nygard. It describes architecture as a mix of past decisions, the current state, and future vision. This means there is no one-time transition from one architecture to a final one; instead, the system moves in iterative steps toward current business goals. But business goals and requirements change over time, and this series of changes is endless.
Nygard suggests 8 rules for designing systems that adapt to change:
✏️Embrace Plurality: Avoid creating a single source of truth. Each system should have its own entity model with its own related data. Different representations are connected through a system of global identifiers (URNs and URIs). Assume an open world that can be extended anytime: new entities will be added, new integrations will appear, and new consumers will start using the system. (A small sketch of this idea follows the list below.)
✏️Contextualize Downstream: Upstream changes can affect all downstream systems. Minimize the impact by localizing the context (since business rules have different interpretations and requirements across business units) and reducing the number of entities that all systems need to know about.
✏️Beware Grandiosity: Avoid enterprise modeling attempts and global object model creation, as they are often too complex to be practical. Instead, narrow the scope and build a model for a particular problem or area. Start small and make incremental changes based on feedback.
✏️Decentralize: This enables local optimization of resources and budgets.
✏️Isolate Failure Domains: Use modularity to define proper system boundaries and isolate failures. Resilient architecture admits componentwise changes.
✏️Data Outlives Applications: Data lives longer than technology and specific applications. Use a hexagonal architecture (ports & adapters) to separate business logic from data storage.
✏️Applications Outlive Integrations: Integrations with other systems will change over time. Build a hexagonal architecture so that all systems are connected via explicit boundaries, making communication a boundary problem rather than a domain layer problem.
✏️Increase Discoverability of Information: Poor information flow within an organization leads to duplicate system functions and reinventing the wheel between different teams. Improving information discovery reduces costs of development (fewer duplicates). Some ideas include internal blogs, open code repositories, and modern search engines.
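To make the first rule more concrete, here is a minimal sketch (all names and the URN scheme are invented for illustration): two systems keep their own entity models and correlate them only through a shared global identifier.
```python
from dataclasses import dataclass

# Each system owns its local model; the only shared piece is the global identifier.
@dataclass
class CrmCustomer:        # CRM's view of a customer
    urn: str
    full_name: str
    segment: str

@dataclass
class BillingAccount:     # Billing's view of the same customer
    urn: str
    iban: str
    balance: float

shared_id = "urn:example:customer:42"   # invented URN scheme
crm = CrmCustomer(urn=shared_id, full_name="Ada Lovelace", segment="premium")
billing = BillingAccount(urn=shared_id, iban="DE00 0000 0000 0000", balance=12.5)
# No single global schema: correlation between the two representations happens via the URN only.
```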
The only true end state for a system is when the company shuts down and all services are turned off. The author's main recommendation is to stop chasing the end state; it's not achievable. Continuous adaptation is the most effective way to build a system, and these rules can help guide that process.
#architecture #systemdesign
Source: InfoQ talk `Architecture Without an End State` by Michael Nygard
Creative Thinking
According to the World Economic Forum survey, creative thinking is the most in-demand skill in 2024. But is creativity important in software development?
Definitely yes. Robert E. Franken, in his book “Human Motivation,” defines creativity as
“the tendency to generate or recognize ideas, alternatives, or possibilities that may be useful in solving problems, communicating with others, and entertaining ourselves and others.”
In our work, we frequently generate ideas to meet business requirements and solve technical issues (and occasionally to entertain others too! 😄). The more creative we are, the more ideas we can produce that lead to better and more efficient solutions.
The human brain generates new ideas based on existing knowledge, connecting it in innovative ways. This means the foundation for creativity is so-called `visual experience`: the more different elements we have in our minds, the more ideas we can produce.
In software development, it's essential to continually learn new things, even if they don't directly relate to our current technology stack or tasks. While it may seem pointless to study something unrelated to our immediate work because we might forget it quickly, this isn't entirely true. New knowledge serves as building blocks for generating new ideas. Designers even have a practice of studying the works of others to develop a sense of quality before creating their own style. Similarly, in software development, studying various use cases and technologies broadens our minds and enhances our visual experience.
Like any other skill, creativity can be trained. Many exercises to train creativity can be found on the Internet. Here are a few examples to start with:
✏️Study real use cases and patterns from your area of professional interest to expand your visual experience and sense of quality.
✏️Extend your standard subscriptions with content unrelated to your primary interests, such as science, drawing, medicine, etc. This helps introduce fresh ideas from other fields.
✏️ Select a simple object like a ruler, pen, or paper clip, and spend 10 minutes brainstorming as many alternative uses for it as possible. Try this exercise daily for a month and track whether the number of generated ideas increases.
To summarize, creativity is a valuable skill that needs to be trained to enhance problem-solving abilities and boost professional efficiency.
#softskills #creativity
UUID vs ULID
UUID (Universally Unique Identifier) and ULID (Universally Unique Lexicographically Sortable Identifier) are unique identifier formats widely used in software development. While both are 128 bits long and rely on random generation, they differ fundamentally in structure and use cases.
UUID Key Implementation Details:
- Represented as 32 hexadecimal characters, typically formatted with hyphens into a 36-character string. Example: 8ba8b814-9dad-11d1-80b4-00c04fd430c8
- Typically generated using the current time and a random number, but may include other input parameters
- Has 8 versions with different generation logic (e.g., v1 uses a MAC address and timestamp, while v4 uses only random or pseudorandom numbers); v4 is currently the most popular
- Has an official RFC (RFC 4122)
ULID Key Implementation Details:
- Represented as a 26-character string. Example: 01ARZ3NDEKTSV4RRFFQ69G5FAV
- Generated based on the current time and a random number
- Common structure: 48-bit timestamp, 80-bit random number
- Lexicographically sortable: comparing the strings alphabetically also orders them by creation time
- No official standard, but there is a spec on GitHub
Key Differences
✏️ Length: the ULID string is shorter (26 vs. 36 characters).
✏️ Generation Speed: Some sources claim ULID is generated faster, but this depends on the implementation and technology used.
✏️ URL Safety: ULID is URL-safe without any encoding or escaping.
✏️ Sortability: ULID is sortable, which is useful for quickly sorting and searching large numbers of identifiers; it can also be efficient for data partitioning.
✏️ Security: UUID is generally safer from a security perspective as it is less predictable.
✏️ Adoption: UUID is widely adopted and can be generated at various levels (database, application).
Ultimately, the best choice depends on specific project requirements. Consider the importance of uniqueness, sortability, performance, and string length to make a final decision.
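As a quick illustration of both formats, here is a small Python sketch: the UUID v4 comes from the standard library, while the ULID generator is a hand-rolled approximation of the GitHub spec (48-bit millisecond timestamp plus 80 random bits, Crockford Base32); use a maintained library for real workloads.
```python
import os
import time
import uuid

# UUID v4 from the standard library: 122 random bits, rendered as a 36-character string.
print(uuid.uuid4())

# Minimal ULID sketch: 48-bit ms timestamp + 80 random bits, encoded as 26 Crockford Base32 chars.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid() -> str:
    timestamp = int(time.time() * 1000)                 # 48-bit millisecond timestamp
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp << 80) | randomness              # 128 bits total
    # 26 chars * 5 bits = 130 bits, so the two leading pad bits are zero.
    return "".join(CROCKFORD[(value >> shift) & 0x1F] for shift in range(125, -1, -5))

print(ulid())  # lexicographically sortable: later timestamps sort after earlier ones
```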
#architecture #systemdesign
Snowflake ID
One more popular algorithm for generating IDs in distributed systems is Snowflake.
It was initially created by Twitter (X) to generate IDs for tweets.
In 2010, Twitter migrated their infrastructure from MySQL to Cassandra. Since Cassandra doesn’t have a built-in ID generator, the team needed an approach for ID generation that met the following requirements:
- Generate tens of thousands of ids per second in a highly available manner
- IDs need to be roughly sortable by time; the accuracy of the sort should be about 1 second (tweets within the same second are not sorted)
- IDs have to be compact and fit into 64 bits
As a solution, the Snowflake service was introduced, generating IDs with the following structure:
✔️ Sign bit: The first bit is always 0 to keep the ID a positive integer.
✔️ Timestamp (41 bits): Time when ID was generated
✔️ Node ID (10 bits): Unique identifier of the worker node generating the ID
✔️ Step (12 bits): A counter that is incremented for each ID generated within the same timestamp
Snowflake IDs are sortable by time, because they are based on the time they were created. Additionally, the creation time can be extracted from the ID itself. This can be used to get objects that were created before or after a particular date.
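A minimal sketch of composing such an ID and recovering its creation time (the epoch constant and helper names are illustrative, not Twitter's actual service code):
```python
import time

# Bit layout from the post: 41-bit timestamp | 10-bit node id | 12-bit step.
EPOCH_MS = 1288834974657  # custom epoch in milliseconds (Twitter's published value; treat as an example)

def make_snowflake(node_id: int, step: int, now_ms: int | None = None) -> int:
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return ((now_ms - EPOCH_MS) << 22) | ((node_id & 0x3FF) << 12) | (step & 0xFFF)

def creation_time_ms(snowflake_id: int) -> int:
    # The timestamp is recovered by shifting off the node id and step bits.
    return (snowflake_id >> 22) + EPOCH_MS

sid = make_snowflake(node_id=7, step=0)
print(sid, creation_time_ms(sid))
```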
There is no official standard for the Snowflake approach, but there are several implementations available on GitHub. The approach has also been adopted by major companies like Discord, Instagram, and Mastodon.
#architecture #systemdesign
GeoSharding in Tinder
Let’s continue improving our visual experience in system design. Today we’ll check the Tinder search architecture. I hope everyone knows what Tinder is. One of its main components is the search with real-time recommendations. The initial implementation was based on a single Elasticsearch cluster with only 5 shards. Over time the number of shards grew and more replicas were added until the system hit its scaling limits. In 2019 this led to a decision to re-architect the component to satisfy new performance requirements.
Main Challenges:
📍 Location-Based Search with a maximum distance of 100 miles. For example, when serving a user in California, there is no need to include the users in London.
📍 Performance: performance degrades roughly linearly as the index size grows. Multiple smaller indexes demonstrated better performance results.
Decisions Made:
✏️ Split Data: Storing users who are physically near each other in the same geoshard (a Tinder-specific term for their sharding implementation).
✏️ Limit the Number of Geoshards: 40–100 geoshards globally results in a good balance of P50, P90, and P99 performance under average production load.
✏️ Use the Google S2Geometry library to work with geo data:
- The library is based on the Hilbert curve: two points that are close on the Hilbert curve are close in physical space.
- It allows hierarchical decomposition of the sphere into "cells"; each cell can be further decomposed into smaller cells.
- Each smallest cell represents a small area of the earth.
- The library provides built-in functionality for location mapping.
- S2 supports cells of different sizes, from around a square centimeter up to very large regions.
✏️ Balance Geoshards: Not all locations have the same population density, so the proper shard size must be defined and the data balanced to avoid a hot-shard issue. S2 cells were scored and combined into geoshards; as a result, each geoshard can contain a different number of equally sized cells.
✏️ Mapping: Create a mapping between geoshards and S2 cells. For queries, the data service gets the S2 cells covering the query circle from the S2 library, then maps those cells to geoshards using the shard mapping.
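A toy sketch of the cell-to-geoshard mapping idea (the cell tokens and shard ids below are invented; in the real system the covering cells come from the S2 library):
```python
# Invented mapping from S2 cell tokens to geoshard ids.
CELL_TO_SHARD = {
    "89c25": 12,
    "89c26": 12,
    "89c27": 13,
}

def shards_for_query(covering_cells: list[str]) -> set[int]:
    # A query circle is covered by a set of cells; each cell belongs to exactly one
    # geoshard, so the query fans out only to shards intersecting the circle.
    return {CELL_TO_SHARD[cell] for cell in covering_cells if cell in CELL_TO_SHARD}

print(shards_for_query(["89c25", "89c27"]))  # {12, 13}
```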
It’s reported that the new approach improved performance by 20 times compared to the previous single-index setup. More importantly, it can scale further in the future by increasing the number of geoshards. For me it was really interesting to read about the location-based search approach, the S2 Geometry library, and the concepts behind its implementation.
#architecture #systemdesign #scaling #usecase
Source: Medium, `Geosharded Recommendations Part 1: Sharding Approach` by the Tinder backend engineering team
Behavior Hormones
When I reviewed `Leaders Eat Last` I highlighted that chemicals like endorphins, dopamine, serotonin, and oxytocin can drive human behavior, so it’s good to know how they work. Let’s go into more detail.
Endorphin:
- Biological painkiller that masks real pain to continue moving forward
- Produced during physical activities such as exercising, running, laughing
- Decreases stress level (it’s not possible to feel fear and laugh at the same time)
- Has a short-term effect
- Practical Usage: exercise to switch focus away from the problem; use humor to defuse tense situations
Dopamine:
- Satisfaction from achievements
- Generated on getting things done
- Helps to focus on the task and end goal: the closer and clearer the goal, the more dopamine is released, increasing motivation
- Has a short-term effect: regular intermediate doses are needed to keep focus and motivation
- Can have negative effects as it causes different types of addictions: social media, KPIs, shopping, etc. Each event generates dopamine, leading to repetitive behaviors
- Practical Usage: set clear goals (for own tasks or for the team), plan intermediate steps, visualize the progress
Serotonin:
- Feeling of pride, respect, recognition
- Released when personal achievements are recognized by a group
- Improves self-confidence and responsibility towards the group
- A person who supports and assists others is typically accepted as a leader by the group
- Has a long-term effect
- Practical Usage: recognize team members’ achievements, offer help and support when needed
Oxytocin:
- Friendship, trust, love. It’s everything that makes us social
- Produced when an individual is accepted as part of the group
- Being part of a group provides a sense of safety, calm, and happiness, reducing stress levels. Physical contact (handshakes, hugs, etc.) can also reinforce these feelings
- Has a long-term effect
- Practical Usage: build an atmosphere of trust within your group, organize team buildings, invent something that indicates group membership
Endorphins, dopamine, serotonin, and oxytocin have a significant impact on human behavior. Understanding the basic principles of how they work can help improve team cohesion, motivation, and collaboration, and most importantly establish trust within the team.
#softskills #leadership
Our behavior drivers are also a good topic to visualize. Check it out!
#softskills #leadership #sketchnote
Distroless Images
The concept of distroless images was initially presented at the swampUP conference in 2017.
So what does it mean?
"Distroless" images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.
That’s how Google deploys software in production. But what problem does it solve? Why is a small distro image like alpine not enough?
So let’s start with what an application actually needs to run:
- Compiled sources
- Dependencies
- Language runtime
It doesn’t need a package manager, shell utilities, or other tools from an OS distribution. But their presence increases image size and download time and extends the compliance scope (security hardening, CVE scans). That's the problem Google tried to solve.
Initially Google used Alpine as the smallest available distro. But Alpine contains an unnecessary package manager and BusyBox, and it is based on musl libc, which makes glibc usage mostly impossible. So Google decided to create images that contain only what is really needed. That’s how distroless was created.
Distroless images are based on the Debian Linux distribution and support a variety of languages: Go, Python, Java, C++, Node.js. The smallest distroless image, gcr.io/distroless/static-debian11, is around 2 MiB. That's about 50% of the size of alpine (~5 MiB), and less than 2% of the size of debian (124 MiB).
Since March 2023, distroless images are based on OCI manifests and support multiple architectures (more about multi-arch images in an earlier post).
We have been using distroless for some time already, and the experience is really positive. Of course, in some cases I miss having bash in the runtime 😃, but that pushes us to improve our other debugging and observability tools. Additionally, I want to highlight that the images are actively supported by Google and receive regular updates, including security patches.
#engineering
Source: GitHub, GoogleContainerTools/distroless (language-focused Docker images, minus the operating system)
Hexagonal Architecture
The hexagonal architecture (also known as Ports & Adapters) is quite a popular pattern in the modern software world. Let’s check what it is and when it can be helpful.
The pattern was originally introduced by Alistair Cockburn in 2005 in his blog post Hexagonal Architecture and was intended to solve the following problems:
- Undesired dependencies between system layers
- Business logic mixed with external interaction logic
- Poor business logic testability with automated tests
Key points of the suggested solution:
✏️ Split the system into what is “inside” the application and what is “outside”. “Inside” logic should not leak to the “outside” part
✏️ Introduce “ports” - communication channels with outside entities that form the application boundaries: contracts and interfaces
✏️ Introduce “adapters” - entities that transform one interface into another (web UI, mobile UI, databases, queue system integrations, etc.). Adapters are the components that allow the application to interact with specific technologies.
✏️ Each port can have multiple adapters
✏️ Adapters can be primary or secondary.
✏️ Primary adapters (also called “driving”) are responsible for receiving external signals and converting them into port calls; such adapters initiate the communication flow. Examples: different types of UI, direct calls to REST APIs, CLIs, etc.
✏️ Secondary adapters (also called “driven”) convert commands from ports to a specific technology. Examples: reads/writes to a database, message queue integration, storing files in S3 storage, etc.
✏️ Outside entities can interact with the application only via the defined ports
Usage examples: supporting different operating systems via separate adapters, integrating with different storage types (filesystem, S3, Azure Blob Storage, and others), separating application logic from UI development, defining proper contracts between systems developed by different teams.
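As a minimal illustration (all names are invented for the example): a port, a core use case that depends only on the port, and a secondary adapter wired in from the driving side.
```python
from abc import ABC, abstractmethod

# Port (inside the application): a contract for notifying about new blog posts.
class BlogPostNotifier(ABC):
    @abstractmethod
    def notify(self, title: str) -> None: ...

# Core domain logic depends only on the port, never on a concrete technology.
class PublishPostUseCase:
    def __init__(self, notifier: BlogPostNotifier) -> None:
        self._notifier = notifier

    def publish(self, title: str) -> None:
        # ... domain rules would live here ...
        self._notifier.notify(title)

# Secondary ("driven") adapter: binds the port to a concrete technology.
class EmailNotifier(BlogPostNotifier):
    def notify(self, title: str) -> None:
        print(f"sending email: new post '{title}'")

# Primary ("driving") side, e.g. a CLI entry point wiring everything together.
if __name__ == "__main__":
    PublishPostUseCase(EmailNotifier()).publish("Hexagonal Architecture")
```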
Pros:
- Testability improvements: it’s possible to test application logic without external dependencies
- Increased maintainability: separation of concerns and business logic decoupling make it easier to locate the code to be modified
- Flexibility: adapters can be easily swapped without any impact on core logic
- Adaptation to technology evolution: technologies change more frequently than business logic. As the technology part is encapsulated in a specific adapter, it’s enough to change only the adapter.
- Deferred technology decisions: focus on business logic and postpone decisions about particular technologies until real production usage.
Cons:
- Complexity: application has a complex structure with a lot of modules and explicit dependencies defined between them
- Indirection: extra calls to methods when an adapter converts between port and specific technology interfaces
- In some contexts it may be inapplicable or inefficient: the approach works best for cases with complex and stable business logic.
So the main idea of the pattern is to design software around the domain logic and isolate it from external factors and constantly changing technologies.
#architecture #systemdesign #patterns
A good visualization of the Hexagonal Architecture pattern.
Sources: Ports and Adapters Pattern and Hexagonal Architecture
#architecture #systemdesign #patterns
Hexagonal Architecture: Implementation Tips
While preparing the overview of Hexagonal Architecture, I read multiple articles on the topic, and some of them contain useful implementation recommendations for the pattern. So let's go through them.
The starting implementation point is the core application logic, with port interfaces defined around it, at both driving and driven sides. New ports and adapters may appear during implementation.
Tips for ports implementation:
- Do not mention the particular technology in port naming
- Do not include word ‘Port’ in the name
- Define port as a contract (interface) with no default behavior
- Port is part of inside logic, so keep it close to application logic
- Use the "this port is for …ing something" trick to define the name. For example, "this port is for notifying about blog posts" will result
- A port should follow the Single Responsibility Principle
- A port should follow the Interface Segregation Principle: narrow-focused ports are preferable to generic ones
Tips for adapters implementation:
- Do not use the word ‘Adapter’ in the name
- If possible, specify the technology in the name
- Adapter should be used for a single port only
- Use Adapter pattern for implementation
- Do not introduce dependencies between different adapters
- Follow SOLID principles
- It should be possible to change and add adapters without impact on business logic
- Technology specific errors should not leave adapter boundaries (e.g., application doesn’t know anything about particular SQL errors)
- Cover adapters with integration tests
As a result it should be possible to run the application with any combination of configured ports and adapters.
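A small sketch of how the naming tips might look (all names are invented): the port name comes from the "…ing something" trick, the adapters name the technology rather than using the word 'Adapter', and swapping adapters needs no change to the core logic.
```python
from abc import ABC, abstractmethod

class BlogPostNotifier(ABC):                      # port: no 'Port' suffix, no technology named
    @abstractmethod
    def notify(self, title: str) -> None: ...

class SmtpBlogPostNotifier(BlogPostNotifier):     # adapter: technology in the name
    def notify(self, title: str) -> None:
        print(f"SMTP: new post '{title}'")        # real SMTP errors would stay inside this class

class InMemoryBlogPostNotifier(BlogPostNotifier): # test double for fast, dependency-free tests
    def __init__(self) -> None:
        self.sent: list[str] = []

    def notify(self, title: str) -> None:
        self.sent.append(title)
```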
References:
- Ports and Adapters as They Should Be
- Ports and Adapters Pattern
#architecture #systemdesign #patterns
Estimates or NoEstimates?
As developers, tech leads, team leads, and architects, we often provide different kinds of estimates: for sprint tasks, release planning, new requirements, project ideas, etc. But are they really helpful? Sounds a bit provocative, but that's what the video Estimates or NoEstimates is about.
The author starts by investigating the real meaning of the term and comes to the conclusion that estimates are just guesses about uncertainty:
An estimate is a guess of the amount of time [work time or elapsed time] to create a project, a feature or some bit of work in developing software
An interesting note here is that we should remember that task effort is not the same as task duration:
- Effort (Work time): the amount of time to do the actual work, not including time waiting
- Duration (Elapsed Time): the amount of time that passes from when we first start on some work until it's completed (this time includes interruptions, clarifications, waiting for some resources, etc.)
If estimates are just guesses, why do we need them? The answer is that they help us make decisions:
- Should we do project A or project B?
- Should we do this project or feature?
- Do we need to hire more people?
- Do we need to fire some people?
So estimates give us a feeling of control. But is there any control?
The author shares a funny story to illustrate how estimates are really made. When a manager asked him to provide estimates, he asked his more experienced colleagues how to do it. The answers were like "I just make up some numbers really quick because I'm not going to waste my time on that", "I just write down a bunch of small numbers because he [the manager] seems to love small numbers", "I'll quickly look through each of the requirements and try to think about something I did before and how long it took me and I write that number down". As a result the author "quickly wrote down a bunch of numbers that were high" and gave them to the manager. This is the type of information we base our plans and decisions on 😃
So the main idea is that we cannot rely on estimates to make serious decisions. Moreover, there is no sense in using estimation accuracy as a metric for project success or team effectiveness. Instead, we should build an innovative and creative environment that promotes a culture of better software creation (where estimates are probably not needed at all).
The talk is really nice; it makes you rethink some everyday routines, but it's more about the problem statement than about a particular solution (there is also a hashtag #NoEstimates to follow on X to read more). As for me, I don't think we can get rid of estimates in the near future, but it's good to remember what estimates really are and not to rely on them heavily for important decisions.
I've also revisited how I do estimates. Typically I use a scale like 1, 3, 5, 10, 15, 20, 30, 50, 100+ to quickly define the size of a task. The more uncertainty in the task, the higher my estimate 😉
#management #engineering
Source: YouTube, `Estimates or No Estimates?` by Woody Zuill (YOW! 2017)
GreenOps: Renewable Energy Trend
In recent news it's reported that Amazon has fully switched to renewable energy:
All of the electricity consumed by Amazon’s operations, including its data centers, was matched with 100% renewable energy in 2023.
Amazon has invested billions of dollars in more than 500 solar and wind projects globally, which together are capable of generating enough energy to power the equivalent of 7.6 million U.S. homes.
Any data center infrastructure comes with an environmental cost; the IT sector alone is responsible for 1.4% of carbon emissions worldwide. A carbon-aware approach has become a trend not only for Amazon but also for all major cloud providers, such as Google and Microsoft. Of course, this shift is primarily driven by government regulations, social responsibility, and company reputation.
A new concept, GreenOps, has been introduced to make applications greener. It's mostly an evolution of FinOps but with a focus on environmentally friendly optimizations: using less energy, being aware of carbon emissions, and utilizing more efficient hardware.
References:
- Using GreenOps to Improve Your Operational Efficiency and Save the Planet
- Amazon Renewable Energy Goal
- What Is GreenOps? Putting a Sustainable Focus on FinOps
#news #engineering
Make Architecture Reliable
Reliability is the top-priority feature of modern systems. Your customers always expect service reliability, even if they don't realize it. Nobody is interested in a super cool feature that cannot be used (because of service unavailability or bad performance, for example).
Reliability can be defined as the ability of a system to carry out its intended function without interruption. Good definition, but not really actionable. I prefer Google's definition:
Your service is reliable when your customers are happy.
The logic is simple: if a system is not reliable, users will not use it. If users don't use it, it's worth nothing. So reliability matters.
Let's check what reliable architecture usually includes:
📍Measurable Reliability Targets: SLO, SLI and error budget (a small error-budget calculation follows the list below)
📍High-Availability:
- Redundancy: multiple replicas for the same service
- Self-Healing: the ability to remediate issues without manual interventions
- Graceful Degradation: degrade service levels gracefully when overloaded
- Fail Safe: be ready for unexpected failure, no data or system corruption
- Retriable APIs: make your operations idempotent, allow retries
- Critical Dependencies Minimization: the reliability level of a service is defined by the reliability of its least reliable component or dependency.
- Multiple Availability Zones (AZ): spread instances across multiple AZ, ability to survive in case of AZ outage
📍Disaster Recovery:
- Multiple Regions: spread instances across multiple regions (each region has multiple AZ), ability to survive in case of region failure
- Data Replication Across Regions
📍Scalability: ability to scale for increased workload
📍Observability: code instrumentation, tools for data collection and analysis, fast failure detection
📍Recovery Procedures: rollback strategies, recovery from outages
📍Chaos Engineering: practices to test failures internally
📍Operational Excellence: a fully automated operational experience with minimal manual steps and low cognitive complexity
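As mentioned above, a tiny sketch of the error-budget arithmetic behind a reliability target (the 99.9% SLO and 30-day window are example values):
```python
SLO = 0.999                     # example availability target (99.9%)
WINDOW_MINUTES = 30 * 24 * 60   # 30-day rolling window

error_budget_minutes = (1 - SLO) * WINDOW_MINUTES
print(f"Allowed downtime per window: {error_budget_minutes:.1f} minutes")  # ~43.2
```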
References:
- Google Cloud Architecture Framework: Reliability
- AWS Well-Architected Framework
- AWS Well-Architected Framework: Reliability Pillar
- Azure Well-Architected Framework: Reliability
#architecture #systemdesign #reliability
S3 for Kafka Storage
In version 3.6.0, Kafka introduced early access to the Tiered Storage feature (KIP-405), which significantly improves the operational experience and decreases cluster costs.
Existing problems with scalability and efficiency that this change is intended to solve:
📍Huge disk capacity is required to keep data for a long period of time (retention policies of days, weeks or even months)
📍Processing speed may be impacted by the large amount of data kept in the cluster
📍Home-grown implementations to copy old data to external storages like HDFS
📍Expensive scaling approach: Kafka is scaled by adding new brokers, which also require RAM and CPU; it is not possible to scale disks only
📍A lot of data must be copied in case of node failure, as the new node must copy all the data that was on the failed broker from other replicas
📍High recovery time: the time for recovery and rebalancing is proportional to the amount of data stored locally on a Kafka broker
Suggested solution:
✏️ Use the Tiered Storage pattern: split data management into separate tiers based on performance and access requirements and cost considerations. Most commonly used tiers:
- "Hot": Local storage that keeps the most critical and frequently accessed data
- "Warm": Remote lower-cost storage that keeps less critical or infrequently accessed data
- "Cold": Low-cost storage that keeps periodic backup data
✏️ Kafka storage is split into local and remote storage. Local storage is the same as in Kafka today. Remote storage is pluggable and can be HDFS, S3, Azure Blob, etc.
✏️ Inactive segments are copied to the remote storage according to the configured retention policy
✏️ Remote and local storage have their own retention policies, so local retention can be very short, e.g., a few hours
✏️ Any data that exceeds the local retention threshold will not be removed until successfully uploaded to the remote storage
✏️ Clients can still get older data, it will be read from the remote storage
✏️ The feature is enabled by remote.log.storage.system.enable on the cluster and remote.storage.enable on the topic (see the sketch after this list)
✏️ New metrics are introduced to monitor integration performance with the remote storage
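A hedged sketch of enabling it for a topic via an admin client (assumes the confluent-kafka Python package and a cluster whose brokers already run with remote.log.storage.system.enable=true plus a configured remote storage plugin; the topic-level config names follow KIP-405):
```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Topic with tiered storage: short local retention, longer total retention in remote storage.
topic = NewTopic(
    "events",
    num_partitions=6,
    replication_factor=3,
    config={
        "remote.storage.enable": "true",                 # KIP-405 topic-level switch
        "local.retention.ms": str(6 * 60 * 60 * 1000),   # keep ~6h on broker disks
        "retention.ms": str(30 * 24 * 60 * 60 * 1000),   # ~30 days overall
    },
)

for topic_name, future in admin.create_topics([topic]).items():
    future.result()  # raises if creation failed
    print(f"created {topic_name}")
```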
Current limitations:
* Compacted topics are not supported
* To disable remote storage, the topic must be recreated
To sum up, Tiered Storage allows scaling storage independently from cluster size, which reduces overall costs. It is still in early access (as of version 3.8.0) and is not recommended for production use. However, there is significant interest in it: AWS has announced S3 support in their MSK service, and Uber has reported successfully running the feature in production.
#news #architecture #technologies #kafka
Architecture Decision Records
One of the most popular tools for documenting architectural decisions is an Architectural Decision Record (ADR). An ADR is a document that describes a choice made by the team regarding a significant aspect of the software architecture they’re planning to build. "Significant" means that the decision has a measurable impact on the architecture and quality of a software or hardware system.
The collection of ADRs created and maintained for a project is referred to as the project decision log.
A basic ADR typically includes the following parts:
- Title: The name of the change.
- Status: The current status from the ADR lifecycle, such as draft, accepted, rejected, deprecated, etc.
- Context: The purpose of the ADR, the issue it aims to solve, business priorities, team skills, and limitations.
- Decision: A description of the proposed change.
- Consequences: The effects of the change, including what becomes easier or more difficult, as well as any outputs and after-review actions.
It is a good practice to review an ADR after implementation to compare the documented decision with what was actually implemented.
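A minimal example of what such a record might look like (the content is invented for illustration):
```
Title: Use PostgreSQL as the primary datastore
Status: Accepted
Context: The service needs transactional consistency for orders; the team has strong SQL experience; the platform already operates managed PostgreSQL.
Decision: Store order data in PostgreSQL and access it through the existing data-access layer.
Consequences: Simpler transactions and reporting; schema migrations become part of the release process; horizontal write scaling will need a follow-up decision.
```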
Company-specific ADR process examples:
- Github ADR
- AWS ADR Process
- Google Cloud ADR Recommendations
ADR is a good tool to document architectural decisions, improve internal communications and facilitate knowledge sharing within the team or across the organization.
#architecture #documentation