KubeFM – Telegram
KubeFM
300 subscribers
83 photos
813 videos
1.01K links
Podcast episodes, fireside chats, roundtables and educational programs about Kubernetes.
John Platt, CTO at StormForge, discusses the complex infrastructure requirements for running AI workloads on Kubernetes beyond just code deployment.

He highlights several technical challenges, including configuring proper GPU drivers and CUDA versions, implementing GPU virtualization tools, and addressing storage and networking needs for large models. While cloud providers offer proprietary solutions, John emphasizes the need for greater investment from the open source community to develop vendor-neutral alternatives that prevent lock-in and promote portability.

Watch the full interview: https://ku.bz/mt_lTMFwF

This interview is a reaction to John McBride's episode https://ku.bz/wP6bTlrFs
Bhavani Indukuri, Staff Platform Engineer at Zscaler, shares her perspective on how much Kubernetes knowledge developers should have in today's platform-oriented world.

While she acknowledges that platforms should handle most of the complexity, Bhavani believes developers still need minimal Kubernetes knowledge for effective debugging—such as listing resources and checking logs. She emphasizes that developers should primarily focus on building features while platform engineers work to enable and simplify their experience.

Watch the full interview: https://ku.bz/Znfx9Z0-x
A 45-minute production outage at Weaveworks changed how we deploy software forever.

We just released the first episode of "The Making of Flux," a four-part KubeFM original series in which we interview the people who built, maintained, and deployed Flux at scale.

Episode 1 features Alexis Richardson (former Weaveworks CEO), Chris Aniszczyk (CNCF CTO), and Andrew Martin (ControlPlane CEO) discussing how that production disaster led to GitOps, what CNCF graduation actually means, and how Flux is thriving.

You'll hear about the technical decisions, governance challenges, and production failures that shaped the project and what these practitioners learned the hard way.

Thanks to our guests for their candor, to ControlPlane for making this series possible, and to @Birthmarkb.

Episode 1 is live now: https://ku.bz/5Sf5wpd8y

P.S. If you're going to KubeCon, FluxCon is on November 11th in Salt Lake City https://ku.bz/L843kg0CK
Thibault shares the technical details of debugging a complex VPA failure at Adevinta, where webhook timeouts triggered continuous pod evictions across their multi-tenant Kubernetes platform.

You will learn:

- VPA architecture deep dive - How the recommender, updater, and mutating webhook components interact
- Hidden Kubernetes limits - How default QPS and burst rate limits in the Kubernetes Go client can cause widespread failures
- Monitoring strategies for autoscaling - What metrics to track for webhook latency and pod eviction rates to catch similar issues
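To give a feel for the "hidden limits" point, here is a minimal sketch of a token-bucket limiter of the kind the Kubernetes Go client applies on the client side. It is not the client-go code, and it is not taken from the episode. The defaults of 5 QPS and a burst of 10 match client-go's documented defaults as far as we know; the function name, the 100-request spike, and the 10-second timeout are assumptions chosen for illustration.

```python
def simulate_client_rate_limit(num_requests, qps=5.0, burst=10):
    """Return the time in seconds at which each request is let through,
    assuming every request is issued at t=0 against a full token bucket.

    Illustrative model only; qps and burst are assumed defaults.
    """
    admit_times = []
    tokens = burst   # the bucket starts full
    refills = 0      # tokens we had to wait for after the burst was spent
    for _ in range(num_requests):
        if tokens > 0:
            tokens -= 1      # served immediately from the burst allowance
        else:
            refills += 1     # must wait for one more token to be refilled
        admit_times.append(refills / qps)
    return admit_times


def count_late(admit_times, timeout_seconds):
    """Count requests that are admitted only after the timeout has passed."""
    return sum(1 for t in admit_times if t > timeout_seconds)


if __name__ == "__main__":
    times = simulate_client_rate_limit(100)
    print("last request admitted after", times[-1], "seconds")
    print("requests later than a 10s timeout:", count_late(times, 10.0))
```

In this model a spike of 100 simultaneous calls leaves the last one waiting 18 seconds, so a caller with a 10-second timeout would give up on 40 of them even though nothing is actually broken on the server side.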

Watch (or listen to) it here: https://ku.bz/rf1pbWXdN

🌟 This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform https://ku.bz/lnxYK3s0L

With @Birthmarkb "Reading Rainbow" Farrell
Jim Bugwadia, Co-Founder & CEO @ Nirmata, discusses the concerning statistic that 71% of Kubernetes security vulnerabilities stem from misconfigurations (according to a 2021 Red Hat report).

He explains how policy engines like Kyverno enable teams to not only "shift left" but also "shift down" by building security directly into the platform layer. Jim shares how Nirmata customers implement this approach by enforcing policies upfront, scanning in pipelines, and blocking problematic configurations at admission control points, resulting in cleaner Kubernetes environments.
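As a rough illustration of blocking misconfigurations at admission, the sketch below checks a Pod manifest against two rules. Kyverno policies are written as Kubernetes resources rather than Python, so this is not Kyverno's API; the function names and the two rules (no privileged containers, resource limits required) are hypothetical examples, not rules quoted in the interview.

```python
def find_violations(pod):
    """Return a list of human-readable policy violations for a Pod manifest (dict)."""
    violations = []
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext", {})
        # Hypothetical rule 1: privileged containers are rejected.
        if security.get("privileged", False):
            violations.append(f"{name}: privileged containers are not allowed")
        # Hypothetical rule 2: every container must declare resource limits.
        if "limits" not in container.get("resources", {}):
            violations.append(f"{name}: resource limits are required")
    return violations


def admit(pod):
    """Mimic an admission decision: (allowed, reasons)."""
    violations = find_violations(pod)
    return (len(violations) == 0, violations)


if __name__ == "__main__":
    bad_pod = {
        "spec": {
            "containers": [
                {"name": "app", "securityContext": {"privileged": True}}
            ]
        }
    }
    print(admit(bad_pod))
```

The same check could run in a pipeline scan and again at admission, which is the idea behind enforcing policy in the platform layer instead of relying on each team to remember it.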

Watch the full interview: https://ku.bz/hYZXTmPV9
Phil Estes, Principal Engineer at Amazon Web Services (AWS), explains why container security extends far beyond using minimal images.

He emphasizes examining the entire supply chain including dependencies, software composition analysis, and software bill of materials (SBOM).

He discusses the importance of image signing, package signing, and certificate management initiatives, including the OpenSSF's work providing maintainers with physical keys for proper package signing.

Watch the full interview: https://ku.bz/K4LmmL2NN

This interview is a reaction to Harsha Koushik's episode https://ku.bz/n_sJ04xMY
Forwarded from LearnKube news
This week on Learn Kubernetes Weekly 149:

🔥 More DevOps than I Bargained For
🧪 Testing to See if You Can Run a MariaDB Cluster on a $150 Kubernetes Lab
Ceph on NVMe Made No Sense to Us — So We Built a 40x Better Alternative
🌐 Observing Egress Traffic with Istio
🐍 Trying to Break Out of the Python REPL Sandbox in a Kubernetes Environment: A Practical Journey

Read it now: https://learnkube.com/issues/149

⭐️ This newsletter is brought to you by Tigera, the Creators of Project Calico — Learn how Calico uses eBPF for high performance, low latency, & enhanced networking https://ku.bz/b7Nm3GkwL
Andy Suderman, CTO @ Fairwinds, discusses the transition from playground to production Kubernetes environments. He agrees that clusters only feel real once they're serving actual customer traffic, but expands beyond just Ingress and DNS to emphasize the broader ecosystem requirements.

Andy explains that vanilla Kubernetes clusters are not 100% functional out of the box. Production readiness requires a comprehensive suite of add-ons including ingress controllers, DNS controllers, cert managers, and various operators.

The key insight is that the ecosystem of add-ons is crucial to any functional Kubernetes platform, and once these components are in place serving production workloads, the cluster graduates from playground status to a real operational environment.

Watch the full interview: https://ku.bz/ZQTRkMpz5

This interview is a reaction to Dan Garfield's episode https://ku.bz/m3YNgCh1W
This interview is brought to you with support from Fairwinds — expert-led, fully managed Kubernetes that frees your engineers from infrastructure headaches and puts you on the fast track to production-grade success https://ku.bz/0-rnZ5Sjs
Brian Douglas discusses the need for better user interfaces in the Kubernetes ecosystem. He highlights the ongoing challenge of YAML complexity and command-line interfaces that can be barriers for developers who prefer more visual, point-and-click experiences.

Brian specifically mentions Headlamp, an open-source Kubernetes dashboard, as an example of the type of community-driven UI solutions the ecosystem needs more of. He emphasizes that while many vendors offer proprietary solutions, open-source alternatives developed by the community are crucial for making Kubernetes more accessible to developers with different backgrounds and preferences.

Watch the full interview: https://ku.bz/qxHdJnnQv
This interview is brought to you with support from Fairwinds — expert-led, fully managed Kubernetes that frees your engineers from infrastructure headaches and puts you on the fast track to production-grade success https://ku.bz/0-rnZ5Sjs
Abdel Sghiouar, Senior Cloud Developer Advocate at Google, discusses the need for more balanced corporate contributions to Kubernetes development, particularly for AI/ML workloads. Speaking from Google's position as one of the largest contributors, he argues that uneven participation from major cloud providers threatens the project's future.

Abdel identifies specific challenges Kubernetes maintainers face: infrastructure management, backward compatibility, and long-term support (LTS). He emphasizes the critical need for companies to converge toward standardization rather than building fragmented specifications, warning that insufficient collaboration could lead to Kubernetes diverging and the loss of its unified community.

Watch the full interview: https://ku.bz/VlwNFGX07

This interview is a reaction to John McBride's episode https://ku.bz/wP6bTlrFs
This interview is brought to you with support from Fairwinds — expert-led, fully managed Kubernetes that frees your engineers from infrastructure headaches and puts you on the fast track to production-grade success https://ku.bz/0-rnZ5Sjs
Stéphane Goetz, a Principal Software Engineer, highlights the advantages of a well-managed Jenkins cluster on Kubernetes equipped with Prometheus.

Teams benefit from this setup by obtaining real-time and historical data on network, IO, memory, and CPU usage.

Additionally, resource optimization is facilitated through customizable profiles (big, small, medium, and large), empowering teams to manage their resource allocation efficiently based on their specific needs.

Watch the full episode: https://ku.bz/Rg42-LLvQ
Graziano Casto, DevRel Engineer @ Mia-Platform, shares a 5-point framework for closing the gap between developers and platform engineers.

His approach covers GitOps workflows for familiar developer interfaces, creating reusable infrastructure components with tools like Crossplane and Terraform that provide sensible defaults, and implementing developer self-service portals to eliminate bottlenecks.

Watch the full interview: https://ku.bz/7JRQh4626
Harsha Koushik, a Security Researcher and Technical Product Manager at Palo Alto Networks, discusses the nuanced decision of using empty scratch containers.

He emphasizes that while experienced developers might manage fine with these minimalistic containers, less experienced developers could face significant issues due to missing essential libraries and configurations.

Koushik points out that missing elements like C libraries, /etc/passwd files, and timezone defaults can lead to failures in logging, cron jobs, and other system functions.
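One way to catch these gaps early is to inspect an unpacked image filesystem before shipping it. The sketch below is an assumption-laden example rather than anything shown in the episode: the helper name is hypothetical, the zoneinfo path is the conventional location on most Linux images, and the C library is left out because its path varies by distribution and architecture.

```python
import os
import tempfile

# Files named in the discussion above, relative to the image root.
# The zoneinfo path is an assumed conventional location.
EXPECTED_PATHS = {
    "etc/passwd": "user and group lookups",
    "usr/share/zoneinfo": "timezone data",
}


def missing_paths(rootfs):
    """Return the expected paths that do not exist under rootfs, sorted."""
    return sorted(
        path
        for path in EXPECTED_PATHS
        if not os.path.exists(os.path.join(rootfs, path))
    )


if __name__ == "__main__":
    # An empty directory stands in for the root filesystem of a scratch image.
    with tempfile.TemporaryDirectory() as rootfs:
        for path in missing_paths(rootfs):
            print(f"missing {path}: needed for {EXPECTED_PATHS[path]}")
```

Pointing `missing_paths` at a directory where an image has been exported (for example, an extracted tarball of the image filesystem) would list what a scratch-based build left out.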

Watch the full episode: https://ku.bz/n_sJ04xMY