DevOps & SRE notes

Have you ever heard of a company migrating from a microservice architecture back to a monolith?
Moving our service to a monolith reduced our infrastructure cost by over 90%. It also increased our scaling capabilities. Today, we’re able to handle thousands of streams and we still have capacity to scale the service even further. Moving the solution to Amazon EC2 and Amazon ECS also allowed us to use the Amazon EC2 compute saving plans that will help drive costs down even further.
https://www.primevideotech.com/video-streaming/scaling-up-the-prime-video-audio-video-monitoring-service-and-reducing-costs-by-90

288 views · tutunak, 14:37

DevOps & SRE notes

macOS and Linux VMs on Apple Silicon to use in CI and other automations

https://github.com/cirruslabs/tart

🔥1

281 views · tutunak, 06:04

DevOps & SRE notes

In this post, the author explores several load balancing algorithms: round robin, weighted round robin, dynamic weighted round robin, and least connections. Simulations show how each algorithm performs in different scenarios and highlight its strengths and weaknesses. Round robin performs well on median latency but struggles at higher percentiles. Least connections offers a good balance between simplicity and performance but may not be optimal for latency. The PEWMA algorithm, which combines techniques from dynamic weighted round robin and least connections, shows significant improvements across all latency percentiles, but it adds complexity and may not handle dropped requests as well as least connections. Ultimately, the choice of load balancing algorithm depends on the requirements of the workload and on which performance characteristics need to be optimized.

https://samwho.dev/load-balancing/
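
As a rough illustration of the simpler strategies summarized above, here is a minimal Python sketch. The function names and the dictionary of in-flight request counts are assumptions made for this example and are not taken from the linked post. PEWMA is left out.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that rotates through servers in order,
    ignoring how busy each one currently is."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Return an endless iterator in which each server appears
    `weight` times per rotation (a naive, non-interleaved variant)."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def least_connections(active):
    """Return the server with the fewest in-flight requests.
    `active` maps server name -> current number of open requests.
    Ties go to the server that comes first in the mapping."""
    return min(active, key=active.get)


if __name__ == "__main__":
    rr = round_robin(["a", "b", "c"])
    print([next(rr) for _ in range(4)])

    wrr = weighted_round_robin({"a": 2, "b": 1})
    print([next(wrr) for _ in range(4)])

    print(least_connections({"a": 3, "b": 1, "c": 2}))
```

Round robin needs no feedback from the servers, while least connections requires the balancer to track open requests per server. That tracking is the extra bookkeeping the summary alludes to.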

👍1

303 views · tutunak, 14:06

DevOps & SRE notes

Adrien "ZeratoR" Nougaret's annual charity event, Zevent, returned this year with a new addition called Zevent Place. Inspired by Reddit's r/place, Zevent Place is a collaborative canvas where donors can draw pixels based on the amount they donate. Developers William Traoré and Alexandre Moghrabi built the platform with several features, such as a Pixel Upgrade system and real-time updates, to protect community creations and improve the user experience.

The team utilized various technologies like GraphQL, NestJS, Redis, and MinIO, and managed to handle massive amounts of updates while maintaining a low CPU and bandwidth footprint. Although there were challenges, such as unexpected rate limit errors with Cloudflare, the event achieved 98.4% uptime, with the downtime being addressed and resolved promptly.

Overall, Zevent Place was a successful project, and valuable lessons were learned throughout its development and implementation.

https://medium.com/@alexmogfr/zevent-place-how-we-handled-100k-ccu-on-a-real-time-collective-canvas-71d3d346e0ab

Medium

ZEvent Place: How we handled 100k+ CCU on a real-time collective canvas

Each year, Adrien “ZeratoR” Nougaret runs a charity event named Zevent.

301 views · tutunak, 06:08

DevOps & SRE notes

Debug a target container in a Kubernetes cluster by automatically creating a new, non-invasive, 'debug' container in the same pid, network, user, and ipc namespace as the target container without disrupting the target container.

https://github.com/JamesTGrant/kubectl-debug

👍1

312 views · tutunak, 14:09

DevOps & SRE notes

A Helm plugin that maps deprecated or removed Kubernetes APIs in a release to supported APIs

https://github.com/helm/helm-mapkubeapis

👍1

304 views · tutunak, 06:10

DevOps & SRE notes

RHEL-compatible distributions are in danger: Red Hat has changed its policy and license agreements.
https://www.jeffgeerling.com/blog/2023/dear-red-hat-are-you-dumb

293 views · tutunak, 14:02

DevOps & SRE notes

https://www.pagerduty.com/blog/debugging-kubernetes-with-ephemeral-containers/

PagerDuty

Debugging Kubernetes with Automated Runbooks & Ephemeral Containers

Data retrieved during an incident can be useful for both triage as well as post-incident root-cause analysis. But capturing that data from production can be difficult. PagerDuty’s Automated Runbooks make use of new technologies - such as Kubernetes Ephemeral…

303 views · tutunak, 06:13

DevOps & SRE notes

Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more

https://github.com/aquasecurity/trivy

284 views · tutunak, 14:41

DevOps & SRE notes

kubectl plugin to list allocations (cpu, memory, gpu,... X utilization, requested, limit, allocatable,...)

https://github.com/davidB/kubectl-view-allocations

353 views · tutunak, 06:41

DevOps & SRE notes

An interesting article about running Kubernetes on bare metal

https://medium.com/geekculture/a-retrospective-of-working-with-bare-metal-kubernetes-or-to-there-and-back-1868c0356eff

Medium

A Retrospective of Working with Bare Metal Kubernetes, or To There and Back

Five and a half years of Kubernetes evolution in Quadcode

392 views · tutunak, 14:42

DevOps & SRE notes

https://www.nslookup.io/learning/the-life-of-a-dns-query-in-kubernetes/

NsLookup.io

The life of a DNS query in Kubernetes

In Kubernetes, DNS queries follow a specific path to resolve the IP address of a hostname. Here are all the steps and components it goes through.
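
One step on that path is the resolver's search-list expansion inside the pod. The sketch below is only an approximation of it in Python, assuming the commonly seen pod defaults: a cluster domain of `cluster.local`, `ndots:5`, and a three-entry search list. The function name `candidate_names` is hypothetical and is not from the linked article. Real pods may carry extra search domains inherited from the node.

```python
def candidate_names(name, namespace="default", cluster_domain="cluster.local", ndots=5):
    """Return the fully qualified names a resolver would try, in order.

    Assumed defaults (may differ per cluster): search list of
    <namespace>.svc.<cluster_domain>, svc.<cluster_domain>, <cluster_domain>
    and options ndots:5.
    """
    # A trailing dot marks the name as fully qualified: no expansion at all.
    if name.endswith("."):
        return [name]

    search = [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."

    # Names with at least `ndots` dots are tried as-is first;
    # shorter names go through the search list first.
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


if __name__ == "__main__":
    for query in ("redis", "example.com", "example.com."):
        print(query, "->", candidate_names(query))
```

Under these assumptions an external name such as `example.com` is tried against all three search domains before it is queried as-is. Writing the name with a trailing dot avoids those extra lookups.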

460 views · tutunak, 06:43

DevOps & SRE notes

https://medium.com/exness-blog/ebpf-and-its-capabilities-9a3a1dce3802

Medium

eBPF and its capabilities

Discover modern GNU/Linux kernel capabilities useful for monitoring, observability, security, performance engineering, and profiling using…

384 views · tutunak, 14:51

DevOps & SRE notes

https://enterprisersproject.com/article/2023/2/tech-teams-prioritize-leadership-skills-2023

The Enterprisers Project

Why tech teams must prioritize leadership skills in 2023

Strong leadership will be more important than ever in the year ahead. Here’s how to identify the 'power' skills your IT team needs

426 views · tutunak, 06:54

DevOps & SRE notes

Automatically monitor your local dev environment (services, repos, and more)

https://github.com/tylermumford/localstatus

412 views · tutunak, 14:57

DevOps & SRE notes

An operator for running distributed k6 tests.

https://github.com/grafana/k6-operator

456 views · tutunak, 06:04

DevOps & SRE notes

https://clickhouse.com/blog/building-clickhouse-cloud-from-scratch-in-a-year

ClickHouse

Building ClickHouse Cloud From Scratch in a Year

Have you ever wondered what it takes to build a serverless software as a service (SaaS) offering in under a year? In this blog post, we will describe how we built ClickHouse Cloud from the ground up

586 views · tutunak, 14:08

DevOps & SRE notes

An example of using the n8n platform for incident management: https://touilleio.medium.com/alertmanager-incident-response-automation-with-n8n-c61227e196e9

Medium

Alertmanager incident response automation with n8n

The prometheus stack includes an alert dispatching component. But how to bring easily and efficiently automated responses to these alerts?

👍2

584 views · tutunak, 06:10

DevOps & SRE notes

A comprehensive guide to RBAC in Kubernetes: https://learnk8s.io/rbac-kubernetes

LearnKube

Limiting access to Kubernetes resources with RBAC

Learn how to recreate the Kubernetes RBAC authorization model from scratch and practice the relationships between Roles, ServiceAccounts, RoleBindings, etc.
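
To make the relationship between Roles, ServiceAccounts, and RoleBindings concrete, here is a toy model in Python. It is a deliberately simplified sketch, not the guide's code and not the Kubernetes implementation: all class and function names are hypothetical, and API groups, ClusterRoles, wildcards, and resource names are left out. It only shows that access is denied by default and granted when a binding in the same namespace links the subject to a Role with a matching rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    resources: frozenset  # e.g. {"pods"}
    verbs: frozenset      # e.g. {"get", "list"}


@dataclass(frozen=True)
class Role:
    namespace: str
    name: str
    rules: tuple  # tuple of Rule


@dataclass(frozen=True)
class RoleBinding:
    namespace: str
    role_name: str
    subjects: frozenset  # e.g. {"system:serviceaccount:default:viewer"}


def is_allowed(subject, verb, resource, namespace, roles, bindings):
    """Return True if any binding in `namespace` grants `subject`
    a Role whose rules cover (verb, resource). Default is deny."""
    role_index = {(role.namespace, role.name): role for role in roles}
    for binding in bindings:
        if binding.namespace != namespace or subject not in binding.subjects:
            continue
        role = role_index.get((binding.namespace, binding.role_name))
        if role is None:
            continue  # binding points at a Role that does not exist
        for rule in role.rules:
            if resource in rule.resources and verb in rule.verbs:
                return True
    return False


if __name__ == "__main__":
    viewer = "system:serviceaccount:default:viewer"
    roles = [Role("default", "pod-reader",
                  (Rule(frozenset({"pods"}), frozenset({"get", "list"})),))]
    bindings = [RoleBinding("default", "pod-reader", frozenset({viewer}))]

    print(is_allowed(viewer, "list", "pods", "default", roles, bindings))
    print(is_allowed(viewer, "delete", "pods", "default", roles, bindings))
```

Because rules only ever add permissions, there is no "deny" rule to model: a request is rejected simply when no binding and rule pair matches it.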

584 views · tutunak, 14:33

DevOps & SRE notes

https://medium.com/lightricks-tech-blog/step-by-step-guide-how-to-create-a-dynamic-service-endpoint-via-k8s-api-1024309cb226

Medium

Step by Step Guide: How to create a Dynamic Service Endpoint via K8S API

By: Andrey Orlov

576 views · tutunak, 05:55

DevOps & SRE notes

SRE Report 2023 Catchpoint.pdf

16.2 MB

Now in its fifth year, The SRE Report has become the trusted source of trends and insights for reliability-as-a-feature practices. This year in partnership with Blameless, the report contains special contributions from Adrian Cockcroft and Steve McGhee and highlights findings from a global community of reliability practitioners, including SREs, managers, architects, and executives. As ever, we found some familiar trends and some thought-provoking anti-patterns.

Key findings include:

Organizations that operate with a “just culture” are 500% more likely to be Elite-performing organizations.
Elite-performing organizations are 260% more likely to substantially focus on Customer Experience reliability versus Low-performing organizations.
Organizations (59%) say that maintaining innovation velocity occasionally or often impacts employee productivity or morale – 14% unsure.
Organizations (59%) say tool sprawl is a non-existent or minor problem – challenges other research which simply equates tool sprawl to, ‘how many tools are in the stack’.

562 views · tutunak, 14:30
