The challenge of making artificial intelligence more transparent is at the heart of Andrew Mallaband's exploration of the "black box" dilemma. This insightful editorial delves into the real-world implications of explainability in AI systems.
https://www.linkedin.com/pulse/explainability-black-box-dilemma-real-world-andrew-mallaband-ogvae/
LinkedIn
Explainability: The Black Box Dilemma in the Real World
The software industry is at a crossroads. I believe those who embrace explainability as a key part of their strategy will emerge as leaders.
Optimizing autoscaling in Kubernetes involves much more than just monitoring CPU and memory, as this blogpost by Cristian Sepulveda demonstrates through a practical application workflow. By leveraging KEDA to scale based on real-world metrics like message queue length, teams can achieve faster, cost-effective scaling tailored to specific application needs.
https://medium.com/@csepulvedab/how-to-optimize-autoscaling-in-kubernetes-using-metrics-based-on-application-workflows-7f899fdef4d9
Medium
How to Optimize Autoscaling in Kubernetes Using Metrics Based on Application Workflows
One of the key advantages of using Kubernetes in modern infrastructure is the ease with which we can scale computing resources. Both the…
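The idea of scaling on queue length rather than CPU can be illustrated with a little arithmetic. This is a minimal sketch, not KEDA's implementation or the article's code: it assumes the replica count is the queue length divided by a target number of messages per replica, rounded up and clamped to a range. The function name, parameters, and default bounds are hypothetical.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Return how many workers a queue of the given length would call for.

    Assumption: one replica is expected to handle `target_per_replica`
    pending messages. All names and bounds here are illustrative only.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Round up so that a partially filled "slot" still gets a replica.
    raw = math.ceil(queue_length / target_per_replica)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    for pending in (0, 12, 1000):
        print(pending, "messages ->", desired_replicas(pending, 5), "replicas")
```

With a minimum of zero, an empty queue asks for no workers at all, which is where the cost savings of queue-driven scaling come from.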
As the complexity of modern software systems grows, the meaning and practice of "observability" have become increasingly muddled. In this personal essay, Charity Majors argues that it's time to "version" observability—differentiating the traditional metrics-logs-traces approach (Observability 1.0) from a new, more flexible model built on wide, structured log events (Observability 2.0).
https://charity.wtf/2024/08/07/is-it-time-to-version-observability-signs-point-to-yes/
charity.wtf
Is It Time To Version Observability? (Signs Point To Yes)
Augh! I am so behind on so much writing, I’m even behind on writing shit that I need to reference in order to write other pieces of writing. Like this one. So we’re just gonna do this quick and dir…
Designing a robust network architecture for K3s multi-cluster environments can be challenging, especially when integrating Layer 2 and BGP routing on Unifi UDM devices. In this guide, David Elizondo walks through practical considerations and strategies for planning private RFC 1918 address spaces and achieving effective communication between clusters using tools like Cilium and native routing.
https://medium.com/@david-elizondo/planning-a-k3s-multi-cluster-network-with-l2-and-bgp-on-unifi-udm-ae4480a7b4f7
Medium
Planning a K3s Multi-Cluster Network with L2 and BGP on Unifi UDM
In my journey to rebuild my Kubernetes Lab to use a multi cluster design, I needed to put some thought into where in my network, services…
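Planning private address space for several clusters mostly comes down to carving non-overlapping ranges out of an RFC 1918 block. The sketch below uses Python's standard `ipaddress` module to show one way to do that. The parent block, the /16 sizes, and the cluster names are hypothetical choices for illustration, not the layout used in the guide.

```python
import ipaddress
from typing import Dict, List


def plan_cluster_cidrs(cluster_names: List[str],
                       parent: str = "10.0.0.0/8",
                       new_prefix: int = 16) -> Dict[str, Dict[str, ipaddress.IPv4Network]]:
    """Give each cluster its own pod and service range from one private block.

    Assumption: every cluster gets two equally sized subnets. The parent
    block and prefix length are illustrative defaults only.
    """
    parent_net = ipaddress.ip_network(parent)
    if not parent_net.is_private:
        raise ValueError("parent block must be private (RFC 1918) space")
    subnets = parent_net.subnets(new_prefix=new_prefix)
    plan = {}
    for name in cluster_names:
        # Consecutive subnets from the generator never overlap.
        plan[name] = {"pods": next(subnets), "services": next(subnets)}
    return plan


if __name__ == "__main__":
    for cluster, ranges in plan_cluster_cidrs(["lab-a", "lab-b"]).items():
        print(cluster, ranges["pods"], ranges["services"])
```

Keeping the ranges disjoint across clusters is what lets native routing or BGP advertise them without address conflicts.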
Virtual Kubelet is an open source Kubernetes kubelet implementation.
https://github.com/virtual-kubelet/virtual-kubelet
GitHub
GitHub - virtual-kubelet/virtual-kubelet: Virtual Kubelet is an open source Kubernetes kubelet implementation.
Learning from unexpected service failures can be a catalyst for long-term improvement, as Tines software engineer Shayon Mukherjee shares in this blog post. The story reveals how a Redis upgrade exposed a hidden point of failure in their webhook system, ultimately leading to stronger resilience and more comprehensive testing practices.
https://www.tines.com/blog/engineering-incidents-improvement/
Tines
Thankful for incidents: embracing chaos to find clarity | Tines
How lessons from a recent incident led to improved platform resilience and more comprehensive testing practices.
Slow container startup times can cripple the productivity of Kubernetes teams managing large Docker images—sometimes dragging deployments out for hours. In this feature, Kazakov Kirill shares a practical strategy for pre-warming nodes and leveraging image caching, dramatically reducing cold starts and disk pressure during mass pod rollouts in Amazon EKS clusters.
https://hackernoon.com/how-to-optimize-kubernetes-for-large-docker-images
HackerNoon
How to Optimize Kubernetes for Large Docker Images
Discover how a creative warm-up process transformed our Kubernetes deployments, addressing ContainerCreating issues, reducing cold start times, and minimizing d
Kaniko is dead
https://github.com/GoogleContainerTools/kaniko
🧊 This project is archived and no longer developed or maintained. 🧊
GitHub
GitHub - GoogleContainerTools/kaniko: Build Container Images In Kubernetes
Build Container Images In Kubernetes. Contribute to GoogleContainerTools/kaniko development by creating an account on GitHub.
Tail-based sampling unlocks deeper insights into distributed systems by allowing OpenTelemetry users to prioritize traces that matter most, such as those with errors or slow responses. This guide explains how tail-based sampling works, its differences from head-based sampling, and provides a practical walkthrough for setting up a two-tier OpenTelemetry Collector architecture that intelligently filters traces for more actionable observability.
https://itnext.io/empower-your-observability-tail-based-sampling-for-better-tracing-with-opentelemtry-243ca2cc55d1
Medium
Empower Your Observability: Tail-Based Sampling for Better Tracing with Opentelemetry
In the era of microservices and distributed systems, observability has become a cornerstone for maintaining robust, reliable, and scalable…
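The difference from head-based sampling is that the keep-or-drop decision is made only after all spans of a trace have arrived. A minimal sketch of that decision in plain Python follows. It is not the OpenTelemetry Collector's implementation or its configuration. The `Span` class, the thresholds, and the hash-based baseline sampling are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical, stripped-down span record (not the OpenTelemetry type)."""
    trace_id: str
    duration_ms: float
    is_error: bool = False


def keep_trace(spans: List[Span],
               latency_threshold_ms: float = 500.0,
               baseline_rate: float = 0.1) -> bool:
    """Decide whether to keep a completed trace.

    Assumed policy: always keep traces with an error or a slow span, and
    keep a deterministic fraction (`baseline_rate`) of the rest.
    """
    if not spans:
        return False
    if any(s.is_error for s in spans):
        return True
    if max(s.duration_ms for s in spans) >= latency_threshold_ms:
        return True
    # Hash the trace ID into [0, 1) so the same trace always gets the same answer.
    digest = hashlib.sha256(spans[0].trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < baseline_rate


if __name__ == "__main__":
    failed = [Span("t1", 20.0), Span("t1", 35.0, is_error=True)]
    print("failed trace kept:", keep_trace(failed))
```

Because the decision needs every span of a trace in one place, a setup like this typically requires routing all spans with the same trace ID to the same collector instance, which is the motivation for the two-tier architecture the guide describes.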
Achieving end-to-end visibility for Python data pipelines is essential for ensuring quality and reliability in modern data architectures. This hands-on walkthrough from Elastic Observability Labs explains how to implement OpenTelemetry (OTEL) in your Python ETL scripts, covering automatic instrumentation, manual tracing, performance metrics, and anomaly-driven alerting, so you can proactively monitor, troubleshoot, and optimize your entire pipeline lifecycle using Elastic’s platform.
https://www.elastic.co/observability-labs/blog/monitor-your-python-data-pipelines-with-otel
www.elastic.co
Monitor your Python data pipelines with OTEL — Elastic Observability Labs
Learn how to configure OTEL for your data pipelines, detect any anomalies, analyze performance, and set up corresponding alerts with Elastic.
Generate JSON Schema files based on a Terraform configuration
https://github.com/HewlettPackard/terraschema
GitHub
GitHub - HewlettPackard/terraschema: Generate JSON Schema files based on a Terraform configuration
While GitOps has brought consistency and innovation to Kubernetes deployments, its reliance on git-based workflows and tools like ArgoCD and Flux still leaves important challenges unsolved. This article explores both the real-world progress and the limitations of GitOps, from deployment strategies and multi-cluster rollouts to issues around permissions, secrets management, and the need for solutions that go beyond git as the sole source of truth.
https://itnext.io/realizing-the-potential-of-gitops-263051baff04
Medium
Realizing the potential of GitOps
GitOps hasn’t realized its full potential yet. What else is needed or needs to be improved?
Meeting customers’ rising expectations for security, speed, and personalization demands a new approach to computing infrastructure, which is exactly where distributed cloud comes in. This feature explains why developers must look beyond traditional centralized cloud models—adopting distributed cloud computing to optimize performance, comply with data regulations, and deliver truly customized services at scale.
https://thenewstack.io/why-developers-need-to-care-about-distributed-cloud-computing/
The New Stack
Why Developers Need To Care About Distributed Cloud Computing
Gathering and processing customers’ data via distributed cloud enables real-time experience no matter where the customers are on the globe.