The article describes how to improve the performance of Loki, Grafana's log aggregation system, with Recording Rules, caching, and parallel queries.
https://itnext.io/grafana-loki-performance-optimization-with-recording-rules-caching-and-parallel-queries-28b6ebba40c4
Medium
Grafana Loki: performance optimization with Recording Rules, caching, and parallel queries
Improve the performance and CPU/Memory resources usage by Grafana Loki components with Recording Rules and caching
The blog post discusses the complexities of achieving compliance in a dynamic, ephemeral environment such as Kubernetes, and offers insights and guidance on maintaining a secure and compliant cloud environment.
https://www.armosec.io/blog/kubernetes-compliance-challenges
ARMO
Kubernetes Compliance Challenges and Guidance | ARMO
Learn about Kubernetes compliance challenges, consequences of non-compliance, and get guidance on maintaining a secure and compliant cloud environment in a dynamic Kubernetes setup.
The article gives a detailed account of the Cloudflare outage of November 2, 2023, its causes, and its resolution. It covers the unintended power failure at a data center, the impact on Cloudflare's control plane and analytics systems, and the measures taken to restore services and prevent similar incidents in the future.
https://blog.cloudflare.com/post-mortem-on-cloudflare-control-plane-and-analytics-outage/
The Cloudflare Blog
Post mortem on the Cloudflare Control Plane and Analytics Outage
Beginning on Thursday, November 2, 2023 at 11:43 UTC Cloudflare's control plane and analytics services experienced an outage. Here are the details
Hacker News discussion of the Cloudflare outage post-mortem:
https://news.ycombinator.com/item?id=38138640
The article covers best practices for Kubernetes cluster architecture, including different cluster, node, and tenancy configurations, to improve security, efficiency, and ease of management.
https://www.armosec.io/blog/kubernetes-cluster-architecture-best-practice
ARMO
Kubernetes Cluster Architecture Best Practices | ARMO
In this post, we will explore various key best practices for optimizing a Kubernetes cluster architecture, including different cluster, node, and tenancy configurations
The blog post provides insights into creating effective environments that foster productivity, creativity, and well-being.
https://medium.com/@julian.klas/make-environments-that-work-cd3404fe83e8
Medium
Make Environments That Work
When I think of building an API, I spend most of my time thinking about the entities, fields, and the underlying design of the system…
The author explains how to automate Helm dependency updates, minimize version gaps, and simplify updates with Helm, and shares a Bash script that works with the artifacthub.io API to identify and update Helm dependencies whenever changes are detected.
https://blog.devops.dev/charting-the-course-helm-dependencies-updates-made-easy-%EF%B8%8F-48656bfc59c
Medium
Charting the Course: Helm Dependencies Updates Made Easy 🗺️
Maximizing Security, Minimizing Version Gaps, and Simplifying Updates with Helm
The blog post presents a comprehensive guide to Python’s role in DevOps, including use cases, resources, and a learning roadmap.
https://blog.devgenius.io/python-for-devops-a-definitive-guide-f4785a60007e
Medium
Python for DevOps — A Definitive Guide
Python Python Python! Apple might have to add the pronunciation of the word Python in its noise-canceling AirPods!
Good article about improving Terraform code with pre-commit hooks
https://medium.com/@seifeddinerajhi/a-guide-to-using-pre-commit-hooks-for-terraform-save-time-and-improve-code-quality-ba658ce41a77
Medium
A Guide to Using Pre-Commit Hooks for Terraform: Save Time and Improve Code Quality
Automate Terraform Code Checks and Security Standards ↪️
A comprehensive guide to PagerDuty’s role in SRE practices: setting up schedules, integrating with existing tools, automating routine tasks, and managing incidents. It is a must-read for anyone starting their journey in SRE or looking to sharpen their existing skills.
https://blog.devgenius.io/pagerduty-101-the-ultimate-guide-for-first-time-site-reliability-engineers-c8864dceebf0
Medium
PagerDuty 101: The Ultimate Guide for First-Time Site Reliability Engineers
Empowering Your On-Call Experience: A Step-by-Step Introduction to Streamlining Incident Management with PagerDuty
The article explains how to keep applications highly available in a Kubernetes cluster: it covers the two types of disruptions, voluntary and involuntary, and shows how a Pod Disruption Budget (PDB) limits how many Pods can be down at once.
https://dev.to/oshi36/pod-disruption-budget-in-kubernetes-6kg
DEV Community
Pod Disruption Budget in Kubernetes
Maintaining highly available applications in a Kubernetes cluster can be hard, especially when nodes...
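A rough illustration of the idea behind a PDB, not taken from the article: with an integer `minAvailable`, a voluntary disruption (for example, a node drain) is only permitted while the number of healthy Pods stays at or above that minimum. The function name and parameters below are hypothetical and only model the arithmetic.

```python
def allowed_voluntary_disruptions(healthy_pods: int, min_available: int) -> int:
    """Return how many Pods may be evicted voluntarily right now.

    Simplified model (an assumption for illustration): min_available is an
    absolute integer, and the budget is the surplus of healthy Pods over it.
    """
    # Never report a negative budget: if the app is already below the
    # minimum, no further voluntary evictions are allowed.
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    # 3 healthy replicas with minAvailable=2: one Pod can be evicted at a time.
    print(allowed_voluntary_disruptions(healthy_pods=3, min_available=2))
    # Already at the minimum: a drain has to wait.
    print(allowed_voluntary_disruptions(healthy_pods=2, min_available=2))
```

Involuntary disruptions (hardware failure, kernel panic) are not blocked by the budget; they only reduce the number of healthy Pods that the calculation starts from.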
Have you ever thought about taking the Prometheus Certified Associate (PCA) exam?
https://medium.com/@dgoscn/prometheus-certified-associate-pca-tips-on-how-to-pass-the-exam-72cc573e2d06
Medium
Prometheus Certified Associate (PCA)— Tips on how to pass the exam
Welcome to a journey that promises to elevate your understanding and mastery of Prometheus! Whether you’re just starting your Prometheus…
A Kubernetes operator for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly
https://github.com/senthilrch/kube-fledged
GitHub
GitHub - senthilrch/kube-fledged: A kubernetes operator for creating and managing a cache of container images directly on the cluster…
A kubernetes operator for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly - senthilrch/kube-fledged
Interesting thoughts about the second wave of DevOps:
1. DevOps culture is at risk
2. DevOps tools are outdated
https://www.systeminit.com/blog-second-wave-devops/
Good guide to troubleshooting Kubernetes deployments
https://learnk8s.io/troubleshooting-deployments
LearnKube
A visual guide on troubleshooting Kubernetes deployments
Troubleshooting in Kubernetes can be a daunting task. In this article you will learn how to diagnose issues in Pods, Services and Ingress.