Мониторим ИТ

How we scaled our new Prometheus TSDB Grafana Mimir to 1 billion active series

A week and a half ago Grafana announced Mimir, its own TSDB, and now it describes how the team tested Mimir with one billion active series.

Grafana blog

2.67K views · 13:35

Мониторим ИТ

How relabeling in Prometheus works

Relabeling is a powerful tool that allows you to classify and filter Prometheus targets and metrics by rewriting their label set. Grafana blog.

2.95K views · 18:00

Мониторим ИТ

How summary metrics work in Prometheus

A summary is a metric type in Prometheus that can be used to monitor latencies (or other distributions like request sizes). For example, when you monitor a REST endpoint you can use a summary and configure it to provide the 95th percentile of the latency. If that percentile is 120ms, that means that 95% of the calls were faster than 120ms, and 5% were slower. Read more.
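To make the percentile statement concrete, here is a minimal sketch that computes a 95th percentile over a fixed list of latencies with the nearest-rank method. The sample values and the function name `nearest_rank_percentile` are made up for illustration. Prometheus client libraries estimate summary quantiles over a streaming window, so this shows only what the number means, not how a client computes it.

```python
def nearest_rank_percentile(samples, percent):
    """Return the given percentile (1..100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(samples)
    # 1-based rank = ceil(percent * n / 100), computed with integers only.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (20 observations).
latencies_ms = [20, 35, 40, 45, 50, 55, 60, 70, 80, 85,
                90, 95, 100, 105, 110, 112, 115, 118, 120, 400]

p95 = nearest_rank_percentile(latencies_ms, 95)
slower = sum(1 for value in latencies_ms if value > p95)

print(f"p95 = {p95} ms")
print(f"{slower} of {len(latencies_ms)} calls were slower than p95")
```

With these 20 samples, 19 calls are at or below the 95th percentile and one call (5%) is slower.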

4.37K views · 05:00

Мониторим ИТ

How To Troubleshoot Slow Linux Servers

atop, free, ncdu, iotop and nethogs

4.81K views · 10:10

Мониторим ИТ

5 Network Performance and Analysis Tools For Linux

iperf, tcpdump, hping, netstat and scapy

4.96K views · 13:30

Мониторим ИТ

SRE Revisited: SLO in the Age of Microservices

Once more on SLI, SLA, SLO, error budgets and the like, plus a video

Medium

Site Reliability Engineering practice was established by Google nearly 20 years ago. How to apply to microservices and cloud native…

2.73K views · 08:29

Мониторим ИТ

Simplifying Docker container monitoring and management with CLI tools

Dockly, Dive, Ctop, Dry, Lazy Docker, Poco, Sen and Skopeo.

4.75K views · 07:16

Мониторим ИТ

Intro to metrics with Grafana: Prometheus, Grafana Mimir, Graphite, and beyond

The webinar is tomorrow at 19:30 MSK. Registration.

Grafana Labs

In this webinar, we’ll go over challenges when scaling metrics systems, with a particular focus on Prometheus and Grafana Mimir.

2.62K views · 18:30

Мониторим ИТ

How to drop and delete metrics in Prometheus

Keeping your Prometheus optimized can be a tedious task over time, but it is essential for maintaining its stability and keeping cardinality under control. Identifying unnecessary metrics at the source and regularly deleting unneeded metrics from your TSDB will keep your Prometheus storage and performance intact.

In this article we'll look at identifying unneeded metrics, dropping them at the source, and deleting the ones already stored in Prometheus.

Read more on Medium.
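For the deletion half, Prometheus exposes a TSDB admin endpoint, `POST /api/v1/admin/tsdb/delete_series`, which takes one or more `match[]` series selectors and is available only when the server runs with `--web.enable-admin-api`. The sketch below only builds the request URL and defines a sender that is never called here. The base URL, the metric name `demo_unused_metric`, and both function names are assumptions for illustration and are not taken from the linked article.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request, urlopen


def build_delete_series_url(base_url, selectors):
    """Build the URL for the Prometheus delete_series admin endpoint."""
    if not selectors:
        raise ValueError("at least one series selector is required")
    # Each selector is passed as a repeated match[] query parameter.
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/api/v1/admin/tsdb/delete_series?{query}"


def send_delete_request(url, timeout=10):
    """Send the POST request; not called in this sketch."""
    request = Request(url, method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical server address and metric name.
url = build_delete_series_url(
    "http://localhost:9090",
    ['{__name__="demo_unused_metric"}'],
)
decoded_selectors = parse_qs(urlsplit(url).query)["match[]"]
print(url)
print(decoded_selectors)
```

Dropping a metric at the source is a configuration change rather than an API call: it is typically done with a `metric_relabel_configs` rule that uses the `drop` action in the scrape configuration.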

3.66K views · 07:25

Мониторим ИТ

Postmortem culture, or how we learn from o̶u̶r̶ ̶o̶w̶n̶ screwups

About three years ago I spoke at a small meetup on the topic that became the title of this article. In that talk I described how, over several years, we built up our incident-handling process in the acquisition unit at Tinkoff. And to make the talk less boring, I shared several postmortems from incidents that happened in the teams of “my friend”. Read more.

3.88K views · 13:30

Мониторим ИТ

Calculating composite SLA

How serial and parallel dependencies affect the total SLA. Read more.
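As a rough illustration of the arithmetic: if failures are assumed to be independent, components in series multiply their availabilities, while redundant components in parallel fail only when all of them fail. The function names and the example percentages below are hypothetical and are not taken from the linked article.

```python
from math import prod


def serial_availability(availabilities):
    """All components are required: availabilities multiply."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """Any one component suffices: 1 minus the chance that all fail."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical example: an API (99.9%) that depends on a database (99.95%).
serial = serial_availability([0.999, 0.9995])

# Hypothetical example: two redundant replicas, each 99% available.
parallel = parallel_availability([0.99, 0.99])

print(f"serial:   {serial:.5%}")
print(f"parallel: {parallel:.5%}")
```

A serial chain is always less available than its weakest component, whereas redundancy raises availability above that of any single replica. Both results depend on the independence assumption, which real systems often violate.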

2.59K views · 07:18

Мониторим ИТ

15 months of 24x7 Primary On-Call — Here’s How I Survived

I just finished 15 months of primary 24x7 on call. Although it is always stressful to be paged in the middle of the night or on a weekend or holiday, I was able to lean on my SRE background to ensure that every alert that woke me up faithfully indicated a critical issue with our system and required human intervention. Here’s how I did it. Read more.

3.65K views · 08:00

Мониторим ИТ

Monitoring the Firebird DBMS with Zabbix

With monitoring tools being rolled out everywhere, I wanted to collect basic statistics and check the health of both the DBMS and the database itself. For monitoring I use Zabbix installed on Ubuntu 20.04 LTS, while the DBMS runs on a virtual machine with Windows 2008 Server. The monitoring method described below was used with Firebird 2.5.9 and Zabbix 6.0, but I expect it to work with other versions as well. Read more.

3.35K views · 15:00

Мониторим ИТ

A day in the life of an SRE: updating a production-critical Redis cluster

In this article, I share best practices in how to fully capitalise on your migration efforts. I take you through the steps of our Redis cluster update, explain the challenges I faced and highlight potential pitfalls. After all, operating safely comes with experience. Read more.

3.5K views · 08:00

Мониторим ИТ

Linux — How to Evaluate Network Performance?

Useful tools for benchmarking the Linux network stack. Medium.

6.37K views · 12:08

Мониторим ИТ

There is an interesting YouTube channel dedicated to Zabbix. It is run by Dmitry Lambert, head of the technical support team at Zabbix. It regularly publishes videos with useful Zabbix tips and tricks.

Link to the channel.

3.36K views · 18:05

Мониторим ИТ

Monitor Nginx Metrics with GrafanaDR: A Step-by-Step Guide

Let’s imagine that you have a small project where not everything (or nothing) is containerized. Therefore orchestration, convenient Loki, and other tools for monitoring and analytics of requests are not used (but if I missed something, you can correct it in the comments). Read more.

3.17K views · 15:58

Мониторим ИТ

New Zabbix template for Proxmox

Download

3.29K views · 08:45

Мониторим ИТ

What makes VictoriaMetrics the next leading choice for open-source monitoring

After researching a few solutions such as Thanos, Cortex, Grafana Mimir, and VictoriaMetrics, it is clear that, in my opinion, VictoriaMetrics is the winner and the best fit for my purposes and needs. Read more.

4.33K views · 15:09

Мониторим ИТ

6 Metrics to Watch for on Your K8s Cluster

We’ll be covering the most critical metrics based on k8s’s metadata which form a good baseline for monitoring your workloads and making sure they’re in a healthy state. Read more.

4.66K views · 17:00

Мониторим ИТ

Running a cloud monitoring stack across multiple data centers

When I talk to customers, they tell me that their applications run in two data centers, but on closer inspection it turns out that their observability stack is available in only one of them.

For many, this came as a revelation in March 2021. One of the largest European cloud providers (OVHcloud) suffered a major fire in one of its data centers, which caused serious outages even for customers as large as the French government.

The day after the incident, a colleague responsible for quality management asked me whether we could withstand a similar disaster. That prompted me to think about turning our single monitoring stack into a highly available stack running across multiple data centers. Read more.

2.91K views · 06:34
