Dear friend, you have built a Kubernetes

I am afraid to inform you that you have built a Kubernetes. I know you wanted to "choose boring tech" to just run some containers. You said that "Kubernetes is overkill" and "it's just way too complex for a simple task" and yet, six months later, you have a pile of shell scripts that do not work, breaking every time there's a slight shift in the winds of production.

https://www.macchaffee.com/blog/2024/you-have-built-a-kubernetes

DevOps&SRE Library

Choosing the right Postgres indexes

Indexes can make a world of difference to performance in Postgres, but it’s not always obvious when you’ve written a query that could do with an index. Here we’ll cover:

- What indexes are
- Some use cases for when they’re helpful
- Rules of thumb for figuring out which sort of index to add
- How to identify when you’re missing an index

https://incident.io/blog/choosing-the-right-postgres-indexes
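
As a rough illustration of the last point in the list above (spotting a missing index), here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server; the `incidents` table, its columns, and the index name are made up for the example. In Postgres the analogous check would be done with EXPLAIN, whose output looks different.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would execute sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail column is the readable part.
    return " | ".join(row[3] for row in rows)


# Hypothetical table, filled with synthetic rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (id INTEGER PRIMARY KEY, org_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO incidents (org_id, status) VALUES (?, ?)",
    [(i % 100, "open" if i % 7 == 0 else "closed") for i in range(10_000)],
)

sql = "SELECT id FROM incidents WHERE org_id = ?"

# Without an index on org_id, the planner has to read every row.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, it can look up matching rows directly.
conn.execute("CREATE INDEX idx_incidents_org_id ON incidents (org_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a scan of the whole table and the second a search that uses the new index; the exact wording depends on the SQLite version.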

BemiDB

BemiDB is a Postgres read replica optimized for analytics. It consists of a single binary that seamlessly connects to a Postgres database, replicates the data in a compressed columnar format, and allows you to run complex queries using its Postgres-compatible analytical query engine.

https://github.com/BemiHQ/BemiDB

65,000 nodes and counting: Google Kubernetes Engine is ready for trillion-parameter AI models

As generative AI evolves, we're beginning to see the transformative potential it is having across industries and our lives. And as large language models (LLMs) increase in size — current models are reaching hundreds of billions of parameters, and the most advanced ones are approaching 2 trillion — the need for computational power will only intensify. In fact, training these large models on modern accelerators already requires clusters that exceed 10,000 nodes.

With support for 15,000-node clusters — the world’s largest — Google Kubernetes Engine (GKE) has the capacity to handle these demanding training workloads. Today, in anticipation of even larger models, we are introducing support for 65,000-node clusters.

With support for up to 65,000 nodes, we believe GKE offers more than 10X larger scale than the other two largest public cloud providers.

https://cloud.google.com/blog/products/containers-kubernetes/gke-65k-nodes-and-counting

netavark

Netavark is a Rust-based network stack for containers.

https://github.com/containers/netavark

mise

mise is a polyglot tool version manager. It replaces tools like asdf, nvm, pyenv, rbenv, etc.

mise allows you to switch sets of env vars in different project directories. It can replace direnv.

mise is a task runner that can replace make or npm scripts.

https://github.com/jdx/mise

Migrating billions of records: moving our active DNS database while it’s in use

According to a survey done by W3Techs, as of October 2024, Cloudflare is used as an authoritative DNS provider by 14.5% of all websites. As an authoritative DNS provider, we are responsible for managing and serving all the DNS records for our clients’ domains. This means we have an enormous responsibility to provide the best service possible, starting at the data plane. As such, we are constantly investing in our infrastructure to ensure the reliability and performance of our systems.

https://blog.cloudflare.com/migrating-billions-of-records-moving-our-active-dns-database-while-in-use

Against Incident Severities and in Favor of Incident Types

About a year ago, Honeycomb kicked off an internal experiment to structure how we do incident response. We looked at the usual severity-based approach (usually using a SEV scale), but decided to adopt an approach based on types, aiming to better play the role of quick definitions for multiple departments put together. This post is a short report on our experience doing it.

https://www.honeycomb.io/blog/against-incident-severities-favor-incident-types

Way too many ways to wait on a child process with a timeout

https://gaultier.github.io/blog/way_too_many_ways_to_wait_for_a_child_process_with_a_timeout.html
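
For a sense of what the problem in the title looks like in practice, here is a minimal sketch using only Python's standard library. It is not taken from the linked post and is not claimed to be one of the approaches it covers; the helper name `run_with_timeout` is made up for the example. It relies on the documented behavior of `subprocess.run`, which kills the child and waits for it before raising `TimeoutExpired`.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point.
        return None


# A child that exits immediately, with a generous limit.
fast = run_with_timeout([sys.executable, "-c", "pass"], 30.0)

# A child that sleeps far longer than the limit allows.
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)

print("fast child exit code:", fast)
print("slow child exit code:", slow)
```

Returning None for a timed-out child keeps the two outcomes distinguishable without inventing a fake exit code.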

How to Build Smaller Container Images: Docker Multi-Stage Builds

https://labs.iximiuz.com/tutorials/docker-multi-stage-builds

slackdump

Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.

https://github.com/rusq/slackdump

automatisch

The open source Zapier alternative. Build workflow automation without spending time and money.

https://github.com/automatisch/automatisch

pglite-fusion

Embed an SQLite database in your PostgreSQL table. AKA multitenancy has been solved.

https://github.com/frectonz/pglite-fusion

There’s No Such Thing as a Free Lunch!

How Slack trains engineers in incident response by ordering lunch together.

https://slack.engineering/theres-no-such-thing-as-a-free-lunch

lla

lla is a high-performance, extensible alternative to the traditional ls command, written in Rust. It offers enhanced functionality, customizable output, and a plugin system for extended capabilities.

https://github.com/triyanox/lla

wesql

WeSQL is an innovative MySQL distribution that adopts a compute-storage separation architecture, with storage backed by S3 (and S3-compatible systems). It can run on any cloud, ensuring no vendor lock-in.

WeSQL has completely replaced MySQL’s traditional disk storage with S3. All MySQL data (binlogs, schemas, storage engine metadata, WAL, and data files) is stored entirely (not partially!) as objects in S3. The 11 nines of durability provided by S3 significantly enhance data reliability. Additionally, WeSQL can start from a clean, empty instance, connect to S3, load the data, and begin serving immediately with no additional setup required.

It is ideal for users who need an easy-to-manage, cost-effective, and developer-friendly MySQL database solution, especially for those needing support for both Serverless and BYOC (Bring Your Own Cloud).

https://github.com/wesql/wesql

10 Essential AWS Security Steps for Your AWS Account

After spending years helping teams set up their AWS infrastructure, I've noticed something interesting: many of us face the same security challenges when starting out. You know what I mean if you've ever wondered "Wait, is my S3 bucket actually secure?" or "Should I really be using the root account for this?" (Spoiler: probably not!)

The good news? I've put together this guide to help you build a rock-solid AWS security foundation from day one. We'll cover 10 essential security measures that I've seen make a real difference in protecting AWS environments. While absolute security is a journey rather than a destination, implementing these steps will put you way ahead of the game in defending against common attack vectors.

And I've also created a Terraform project that you can use as a baseline for securing your AWS account!

The best part? It's all under the AWS free tier! 😉

Essentially, I got tired of reading the same posts about people (or organizations) getting their accounts hacked. Here's my solution for that!

https://cloudnature.net/blog/10-essential-aws-security-steps-for-your-aws-account

terrateam

Terrateam is an open-source GitOps CI/CD platform for automating infrastructure workflows. It integrates with GitHub to orchestrate Terraform, OpenTofu, CDKTF, and Terragrunt operations via pull requests. Use our hosted service or run on-premise.

https://github.com/terrateamio/terrateam

How I came to build a cheap server cluster for VDI

https://medium.com/@mnl_584/how-i-came-to-build-a-cheap-server-cluster-for-vdi-ca2ed6028eb2

Service Meshes Decoded Part One: A performance comparison of Istio vs Linkerd vs Cilium

A service mesh is a dedicated infrastructure layer that facilitates service-to-service communications between services or microservices using a proxy.

https://livewy…

Service Meshes Decoded Part Two: Is Istio Ambient worth it?

https://livewyer.io/blog/2024/06/06/comparison-of-service-meshes-part-two

Using Sealed Secrets with Your Kubernetes Applications