TechLead Bits – Telegram
TechLead Bits
About software development with common sense.
Thoughts, tips and useful resources on technical leadership, architecture and engineering practices.

Author: @nelia_loginova
STAMP Framework

As I wrote previously, I think the STAMP framework deserves its own overview.

The framework was introduced by MIT professor Nancy Leveson in the book "Engineering a Safer World" in 2011.

STAMP (System-Theoretic Accident Model and Processes) is a functional model of controllers that interact with each other through control actions and feedback:
- A system is treated as a control system.
- The control system consists of hierarchical control-feedback loops.
- The control system enforces safety constraints to prevent accidents.
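
To make the control-feedback loop more tangible, here is a minimal Python sketch. It assumes a hypothetical autoscaling controller: the names `Feedback`, `ControlAction` and `ScalingController`, as well as the thresholds, are illustrative and do not come from the book.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Measurement sent from the controlled process back to the controller."""
    cpu_usage: float  # fraction between 0.0 and 1.0


@dataclass
class ControlAction:
    """Command sent from the controller down to the controlled process."""
    replicas: int


class ScalingController:
    """Hypothetical controller enforcing one safety constraint:
    CPU usage must stay below max_cpu."""

    def __init__(self, max_cpu: float = 0.8, max_replicas: int = 10):
        self.max_cpu = max_cpu
        self.max_replicas = max_replicas

    def decide(self, current_replicas: int, feedback: Feedback) -> ControlAction:
        # Constraint violated: issue a control action that adds capacity.
        if feedback.cpu_usage >= self.max_cpu:
            return ControlAction(min(current_replicas + 1, self.max_replicas))
        # Constraint holds: keep the process as it is.
        return ControlAction(current_replicas)


controller = ScalingController()
action = controller.decide(current_replicas=3, feedback=Feedback(cpu_usage=0.92))
print(action.replicas)  # 4
```

In a hierarchical structure, a controller like this would itself be the controlled process of a higher-level controller (for example, a capacity planning team that sets `max_replicas`).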

Two main methods are built on STAMP: Systems-Theoretic Process Analysis (STPA) and Causal Analysis Using System Theory (CAST).

Systems-Theoretic Process Analysis (STPA) - a hazard analysis technique performed at the design stage of development (proactive analysis):
🔸 Define the purpose. For example, meet RPO/RTO requirements, prevent data loss, meet GDPR requirements, etc.
🔸 Model the control structure. Describe and document interactions between key components for this type of hazard.
🔸 Identify unsafe control actions. For example, deploying a new version before testing, failing to scale up during high load, routing traffic to an unhealthy backend, etc.
🔸 Identify loss scenarios. Find scenarios that could lead to unsafe actions. For example, a failed update causes a service outage, broken autoscaling leads to dropped users, etc.
🔸 Define safety constraints. Create controls and design changes to prevent unsafe control actions. For example, any deployment must have a rollback strategy, service load must be monitored and alerts must be sent if resource usage exceeds 80%, etc.
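
The safety constraints from the examples above can be expressed as simple automated checks. The sketch below is only an illustration under my own assumptions: `Deployment`, `unsafe_control_actions` and `needs_alarm` are hypothetical names, not part of STPA or any tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    service: str
    tested: bool
    rollback_strategy: Optional[str]  # e.g. "blue-green"; None if missing


def unsafe_control_actions(deployment: Deployment) -> List[str]:
    """Return the safety constraints that this deployment would violate."""
    violations = []
    if not deployment.tested:
        violations.append("deploying a new version before testing")
    if not deployment.rollback_strategy:
        violations.append("deployment has no rollback strategy")
    return violations


def needs_alarm(resource_usage: float, threshold: float = 0.80) -> bool:
    """Alert when resource usage exceeds the threshold (80% in the example)."""
    return resource_usage > threshold


release = Deployment(service="checkout", tested=True, rollback_strategy=None)
print(unsafe_control_actions(release))  # ['deployment has no rollback strategy']
print(needs_alarm(0.85))  # True
```

A check like this could run as a gate in a delivery pipeline, so the constraint is enforced by the system rather than by convention.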

Causal Analysis Using System Theory (CAST) - an accident investigation method performed after an incident has occurred (reactive analysis):
🔸 Collect information about the incident.
🔸 Model the control structure.
🔸 Analyze each component involved in the loss. Define the reason why the component didn't prevent the incident.
🔸 Identify control structure flaws: communication and coordination, safety management, culture, environment, etc.
🔸 Create an improvement program. Prepare recommendations for changes to prevent similar losses in the future.

To sum up, the STAMP framework suggests enforcing safety constraints instead of just trying to prevent system failures. What I really like about this approach is that it makes it possible to build reliability into the system design itself.

P.S. If the topic sounds interesting to you, the book "Engineering a Safer World" is available for free at MIT Press.

#engineering #reliability #systemdesign
Three Layers of Architectural Debt

Technical debt is a very popular topic in software development. Architectural debt is less popular, but I have seen more and more interest in it over the last year.

Today I want to share the article Architectural debt is not just technical debt by Eoin Woods. The author highlights the importance of identifying and managing architectural debt as it can impact the whole organization.

The author defines architectural debt as "structural decisions that come back to bite you six months later". It's not very scientific 😃, but it gives a good feel for what it is.

According to the article, this debt can be split into 3 layers according to its impact: the application layer, the business layer and the strategy layer (please refer to the attached picture with the model in the next post).

Application Layer
It's the level of a particular service, its integrations and technologies. Problems are easy to detect, and issues there directly impact delivery time and day-to-day operations.

Business Layer
It's the level of organizational structure and team topologies, defined ownership and stewardship. Poorly designed structures produce heavy communication flows (Conway's law, remember?) that can impact the overall system architecture and produce duplicated functionality and conflicts of interest between teams. Issues here multiply issues on the operational side.

Strategy Layer
Debt at this level may impact the whole organization. A single strategic misstep creates a cascade of misalignment that amplifies at each level:
Strategy debt (wrong capability decisions) -> Wrong Business Assumptions -> Faulty Requirements -> Technical Issues -> Operational Chaos


The responsibility of an architect here is to raise a red flag, describe the debt with AS-IS and TO-BE states and explain to the business the risks of not handling it.

One more important idea from the article is that modern architecture cannot be the responsibility of one person or a small group of people. To be successful, architecture should be a shared activity of understanding and learning, guided by common principles. And at this point it's more about company culture than technical knowledge:
Trust and curiosity allow principles to live and decisions to evolve. This turns architecture from a static artefact into an ongoing activity.


#architecture
ML Nested Learning Approach

Recently Google published a quite interesting piece of research, "Introducing Nested Learning: A new ML paradigm for continual learning".
So let's check what it is about and why it can be interesting for the ML community.

Modern LLMs tend to lose performance on old tasks while learning new ones. This effect is called catastrophic forgetting (CF). Researchers and data scientists spend a significant amount of time inventing architectural tweaks or better optimization methods to deal with it.

The authors see the root cause of this issue in treating the model's architecture and its optimization algorithm as two separate things.

Nested Learning treats an ML model as a system of interconnected, multi-level learning problems that are optimized simultaneously. In this approach, the model's architecture and the rules used to train it are fundamentally the same concept.

The overall idea is heavily based on the concept of associative memory: the ability to map and recall one thing based on another (like recalling a name when you see someone's face):
🔸 The training process is modeled as an associative memory. The model learns to map a given data point to the value of its local error, which serves as a measure of how "surprising" or unexpected that data point was.
🔸 Key architectural components (like transformers) are also formalized as simple associative memory modules that learn the mapping between tokens in a sequence.
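
To get a feel for what an associative memory is, here is a small Python sketch. It is not the method from the research: it uses the classical delta rule on a linear memory, and the class name `LinearAssociativeMemory` and the learning rate are my own assumptions. It only illustrates two ideas from the list above: mapping a key to a value, and using the local error as a "surprise" signal.

```python
from typing import List

Vector = List[float]


class LinearAssociativeMemory:
    """Stores key -> value associations in a weight matrix w,
    so that recall(key) gets close to the stored value."""

    def __init__(self, key_dim: int, value_dim: int, lr: float = 0.5):
        self.lr = lr
        self.w = [[0.0] * key_dim for _ in range(value_dim)]

    def recall(self, key: Vector) -> Vector:
        # Matrix-vector product: one output per row of w.
        return [sum(w_ij * k_j for w_ij, k_j in zip(row, key)) for row in self.w]

    def write(self, key: Vector, value: Vector) -> float:
        """One delta-rule step. Returns the squared error measured
        before the update, i.e. how "surprising" the pair was."""
        predicted = self.recall(key)
        error = [v - p for v, p in zip(value, predicted)]
        for i, e in enumerate(error):
            for j, k in enumerate(key):
                self.w[i][j] += self.lr * e * k
        return sum(e * e for e in error)


memory = LinearAssociativeMemory(key_dim=2, value_dim=2)
face, name = [1.0, 0.0], [0.0, 1.0]

first = memory.write(face, name)   # the pair is new: high surprise
second = memory.write(face, name)  # the pair is partly known: lower surprise
print(first, second)  # 1.0 0.25
```

The more often the same pair is seen, the smaller the error becomes, and `recall(face)` converges to `name`.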

Proof-of-concept tests show superior performance in language modeling and better long-context memory management than existing models (you can check the measurements in the article or in the full text of the research).

ML keeps evolving. What’s interesting is that the best architecture ideas come from systems theory and attempts to copy human brain behavior.

#ai #architecture
What Changes as You Grow

When I was an engineer, and later a team lead, I used to think that my managers and senior architects always knew what to do and how to do it. I could ask them for directions, bring them problems and request help (I've been lucky to have really great managers during my career). And, of course, that was because they were super smart, experienced and kind of "grown-up".

What I realized is that at each level there are people just like us, with their own problems and fears, and they may not know what to do either. Moreover, they can make mistakes too.

The key difference is the ability to continue working in situations with a high level of uncertainty: good leaders stay focused, accept risks, take the responsibility, choose a way to go forward and engage the right people.

And here we can really help our leaders with the right details and expertise, new ideas, suggestions. Believe me, it would be really appreciated. I say this as someone who’s now expected to always know what to do for my team 😉.

#selfreflection #offtop
Architectural Debt

Let's continue the topic of architectural debt. The previous post focused on the impact and importance of the debt itself, but it had no information about what to do with it.

To fix that, I suggest reading Technical Debt vs. Architecture Debt: Don’t Confuse Them. The article may not fully answer this question, but it provides actionable recommendations on how to measure architectural debt and which strategies can be applied to reduce it.

Architectural debt indicators:
🔸 Duplicated functionality: Count how many systems perform overlapping functions.
🔸 Integration complexity: Measure the number of point-to-point connections vs API gateways, enterprise service buses (ESBs) or event-driven models.
🔸 Principle violations: Track how many systems lack defined owners, documented interfaces or compliance with internal architectural standards.
🔸 Latency chains: Calculate end-to-end data flow time between multiple hops.
🔸 Configuration management completeness: Measure the percentage of applications with filled ownership, life cycle and dependency fields.
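
Two of these indicators are easy to calculate from a system inventory. Below is a minimal Python sketch, assuming a hypothetical inventory format: the `SystemRecord` fields and the sample data are invented for illustration and do not come from the article.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SystemRecord:
    name: str
    capability: str  # business function the system implements
    owner: Optional[str] = None
    lifecycle: Optional[str] = None
    dependencies: Optional[List[str]] = None  # None means "not filled in"


def duplicated_capabilities(systems: List[SystemRecord]) -> Dict[str, int]:
    """Capabilities implemented by more than one system."""
    counts = Counter(s.capability for s in systems)
    return {cap: n for cap, n in counts.items() if n > 1}


def config_completeness(systems: List[SystemRecord]) -> float:
    """Percentage of systems with owner, lifecycle and dependencies filled in."""
    if not systems:
        return 0.0
    complete = sum(
        1 for s in systems
        if s.owner and s.lifecycle and s.dependencies is not None
    )
    return 100.0 * complete / len(systems)


inventory = [
    SystemRecord("billing-v1", "invoicing", "team-a", "retire", ["crm"]),
    SystemRecord("billing-v2", "invoicing", "team-b"),
    SystemRecord("crm", "customers", "team-c", "invest", []),
    SystemRecord("reports", "reporting"),
]
print(duplicated_capabilities(inventory))  # {'invoicing': 2}
print(config_completeness(inventory))      # 50.0
```

Even such rough numbers are enough to start a dashboard and track the trend over time.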

What can be done:
🔸 Officially define architecture debt. The first step to fixing a problem is admitting it. :)
🔸 Build metrics and dashboards.
🔸 Practice architecture observability. Track system dependencies, integration bottlenecks and principle compliance in near-real time.
🔸 Run architecture reviews.
🔸 Manage debt as a portfolio. Not all debt needs immediate repayment. Like managing a project portfolio, organizations should prioritize the debt by business impact.
🔸 Link debt to business KPIs.
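
Managing debt as a portfolio can start from a simple ranking. The Python sketch below is an assumption on my side, not a method from the article: the `DebtItem` fields, the 1 to 5 scales and the sample items are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DebtItem:
    title: str
    business_impact: int  # 1 (low) .. 5 (high), scored with stakeholders
    effort: int           # 1 (small) .. 5 (large), estimated by the team


def prioritize(items: List[DebtItem]) -> List[DebtItem]:
    """Order debt by business impact first; among equals, cheaper fixes go first."""
    return sorted(items, key=lambda item: (-item.business_impact, item.effort))


backlog = [
    DebtItem("Undocumented reporting interfaces", business_impact=3, effort=2),
    DebtItem("Two overlapping billing systems", business_impact=5, effort=4),
    DebtItem("CRM without a defined owner", business_impact=5, effort=1),
]
for item in prioritize(backlog):
    print(item.title)
# CRM without a defined owner
# Two overlapping billing systems
# Undocumented reporting interfaces
```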

As you can see, there is no rocket science here: it is the standard cycle of make the problem evident -> measure -> improve.

I think the article is a good starting point for analyzing whether you have architectural debt in your organization and for preparing the first steps to deal with it.

#architecture
AI Impact on Developer's Productivity

Over the past year, almost everyone has predicted that AI will replace developers. New tools appear every week, and the hype continues to grow. But where are these tools in real life? Do they really help developers to be more productive?

Anthropic recently published research on how AI tools have transformed development in their own company. Anthropic is well known for its Claude Code agent, and it's quite interesting to see how AI tools impact development in an AI company (they definitely should know how to do it effectively, right?).

Key points:
🔸 AI is mostly used for fixing bugs and code understanding.
🔸 Engineers reported a 50% productivity boost (self-reported, so a subjective estimate).
🔸 27% of assisted work consists of tasks that wouldn't have been done otherwise: minor issues, small refactoring, nice-to-have tools, additional tests, documentation.
🔸 Only 20% of real work can be delegated to the AI assistant. Moreover, engineers tend to delegate tasks that can be easily verified and reviewed.
🔸 Everyone is becoming more "full-stack": for example, a backend developer can do simple frontend tasks.
🔸 Claude Code became the first place to ask questions, which reduces mentorship and collaboration.
🔸 Many engineers show deep uncertainty about their future and what development will look like in several years.

So, according to the survey, AI assistants can significantly help with routine tasks and can act as a knowledge base about the code. But there is still not enough trust to delegate complex tasks or architectural decisions to them.

#ai #engineering
Personal Goals & Well-Being

Do you already have plans to start doing something in January? 😉
December is traditionally a time to sum up the year and start planning the next achievements.

So today I want to share Gallup's key elements of well-being, which can help define areas for personal goals. Gallup did extensive research to identify the aspects of human life that we can do something about to make our lives better:
🔸 Career: You like what you do every day.
🔸 Social: You have meaningful friendships in your life.
🔸 Financial: You manage your money well.
🔸 Physical: You have energy to get things done.
🔸 Community: You like where you live.

For each area you can define several goals for the year. To make them real, break the goals down into concrete steps (a plan) and activities to start with (it's better to add them to the calendar immediately).

For many years I focused only on career and finance: more expertise, more experience, interesting tasks to solve, enough money to feel safe. As a result, this year I had various health issues.
So I learned my lesson, and for next year I'm preparing a separate plan for the other areas of well-being, especially the physical part.

Take care and be balanced!

#softskills #productivity
Dear Friends, Happy New Year! 🎄

I wish you motivation that doesn’t burn out, career growth in the direction you want, and progress you can actually be proud of.
Interesting challenges, reasonable deadlines, clean architecture and teams you enjoy working with.

I hope you stay healthy, have enough energy, and keep your closest people nearby.

Take care, rest well, and have a great 2026. 🎄🎄🎄

Warm wishes,
Nelia
Stanford Engineering: Transformers & LLMs

The New Year holidays are over, and it’s the perfect time to start learning something new 🤔.

Stanford Engineering has made the course CME 295: Transformers & LLMs fully open. It explains the core components of LLMs, their limitations and how to use them effectively in real-world applications.
The course instructors are engineers with work experience at Uber, Google and Netflix, so they really know what they are talking about.

Topics covered in the course:
- Transformers architecture
- Decoding strategies & MoEs
- LLMs finetuning & optimizations
- Results evaluation & Reasoning
- RAG & Agentic workflows

To really understand a topic, I need to know how everything is organized under the hood and what the core architectural principles are. That's why I really like such courses as they provide a structured and systematic view of the topic with all the necessary theory.

#ai #engineering
A2UI Protocol

Google has introduced a new protocol for AI: A2UI. It allows agents to generate rich user interfaces that can be displayed in different host applications. Lit, Angular and Flutter renderers are currently supported; others are on the roadmap.

The main idea is that LLMs can generate a UI from a catalog of predefined widgets and send them as a message to the client.

The workflow looks as follows:
🔸 User sends a message to an AI agent
🔸 Agent generates A2UI messages describing the UI (structure + data in JSON lines format)
{"surfaceUpdate": {
  "surfaceId": "booking",
  "components": [
    {"id": "root", "component": {"Column": {"children": {"explicitList": ["header", "guests-field"]}}}},
    {"id": "header", "component": {"Text": {"text": {"literalString": "Confirm Reservation"}, "usageHint": "h1"}}},
    {"id": "guests-field", "component": {"TextField": {"label": {"literalString": "Guests"}, "text": {"path": "/reservation/guests"}}}}
  ]
}}


🔸 Messages stream to the client application
🔸 Client renders them using native components (Angular, Flutter, React, etc.)
🔸 User interacts with the UI, sending actions back to the agent
🔸 Agent responds with updated A2UI messages
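
To show what the client side of this workflow roughly does, here is a toy Python renderer for the message from the example above. It is only a sketch under my own assumptions: real A2UI renderers map components to native widgets, while this one prints text, and the helper names `resolve` and `render`, the data model dictionary and the choice of `"root"` as the entry point are hypothetical.

```python
import json
from typing import Any, Dict, List

# The A2UI message from the example above, written as a single JSON line.
MESSAGE = (
    '{"surfaceUpdate": {"surfaceId": "booking", "components": ['
    '{"id": "root", "component": {"Column": {"children": '
    '{"explicitList": ["header", "guests-field"]}}}}, '
    '{"id": "header", "component": {"Text": {"text": '
    '{"literalString": "Confirm Reservation"}, "usageHint": "h1"}}}, '
    '{"id": "guests-field", "component": {"TextField": {"label": '
    '{"literalString": "Guests"}, "text": {"path": "/reservation/guests"}}}}'
    ']}}'
)


def resolve(value: Dict[str, Any], data: Dict[str, Any]) -> str:
    """Return a literal string, or look up a '/a/b' style path in the data."""
    if "literalString" in value:
        return value["literalString"]
    node: Any = data
    for part in value["path"].strip("/").split("/"):
        node = node[part]
    return str(node)


def render(message_line: str, data: Dict[str, Any]) -> List[str]:
    """Turn one surfaceUpdate message into lines of plain text."""
    update = json.loads(message_line)["surfaceUpdate"]
    # The message carries a flat list of components; index them by id.
    by_id = {c["id"]: c["component"] for c in update["components"]}

    def walk(component_id: str) -> List[str]:
        kind, props = next(iter(by_id[component_id].items()))
        if kind == "Column":
            lines: List[str] = []
            for child in props["children"]["explicitList"]:
                lines.extend(walk(child))
            return lines
        if kind == "Text":
            hint = props.get("usageHint", "text")
            return [f"[{hint}] {resolve(props['text'], data)}"]
        if kind == "TextField":
            return [f"{resolve(props['label'], data)}: {resolve(props['text'], data)}"]
        return [f"<unsupported component: {kind}>"]

    return walk("root")


for line in render(MESSAGE, {"reservation": {"guests": 2}}):
    print(line)
# [h1] Confirm Reservation
# Guests: 2
```

Note that the client only interprets a declaration and chooses its own widgets; nothing generated by the LLM is executed.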

According to the article, the main benefits of the protocol are:
🔸 Security: No LLM-generated code is executed; only a declarative description is passed to the client.
🔸 LLM-friendly: Flat structure, easy for LLMs to generate incrementally.
🔸 Framework-agnostic: Separation of the UI structure from the UI implementation.

Right now the project is in early public preview. It looks promising, especially once REST is supported (currently only A2A and AG-UI are supported).

#ai #news