Metabase + Clickhouse tutorial on Yandex.Cloud
1. Spin up a Virtual Machine
2. Configure VM: SSH, Docker
3. Download Clickhouse plug-in
4. Deploy Metabase
5. Connect to Clickhouse Playground
6. Visualize a question
https://gist.github.com/kzzzr/ecec7dca8bb70586a23569993df470e8
#bi #clickhouse
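As a rough illustration of steps 3 and 4, the sketch below composes a `docker run` command that starts Metabase with a plug-in directory mounted. It is not taken from the linked gist. The host path, container name, and helper function are hypothetical, and it assumes the Clickhouse driver JAR has already been downloaded into the plug-in directory.

```python
import shlex


def metabase_docker_command(plugins_dir, port=3000, image="metabase/metabase"):
    """Build a `docker run` argument list for Metabase.

    Assumption: plugins_dir is a host directory that already contains the
    Clickhouse driver JAR downloaded in step 3.
    """
    return [
        "docker", "run", "-d",
        "--name", "metabase",             # hypothetical container name
        "-p", f"{port}:3000",             # Metabase listens on port 3000 inside the container
        "-v", f"{plugins_dir}:/plugins",  # mount the host plug-in directory
        "-e", "MB_PLUGINS_DIR=/plugins",  # tell Metabase where to look for plug-ins
        image,
    ]


# Hypothetical host path; adjust to wherever the driver was saved on the VM.
cmd = metabase_docker_command("/home/user/metabase/plugins")
print(shlex.join(cmd))
```

The printed command can be pasted into the VM's shell, or the list can be passed to `subprocess.run` on a machine where Docker is installed.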
Before diving deep into complex analytics modeling topics like Sessionization, Attribution, and RFM, one has to understand the motivation behind them.
The questions that drive business decisions come first; the instruments and practices for finding reliable answers to them come second.
Which questions does your business ask?
#analytics #modeling
[RU] Building a Data Vault on TPC-H data – Greenplum + dbtVault
In this post:
- Preparing the TPC-H dataset
- Spinning up a Greenplum cluster in Yandex.Cloud
- Diving into dbtVault code generation and macros
- Simulating incremental loading of the Data Vault
#dbt #dbtvault
Habr
Building a Data Vault on TPC-H data – Greenplum + dbtVault
Hi! This is Artemy, an enthusiast in Data Warehousing, Analytics, and DataOps. I have been modeling DWHs with dbt for quite a while, and today it is time to introduce you...
I just tried to upgrade from 21.08 to 21.10.
After 30 minutes the cluster still has not come back, and it looks like it never will.
I hope I can create a new one and restore the backup onto it.
We are in the middle of a migration from Amazon Redshift DC2 nodes (2nd gen) to RA3 nodes (3rd gen) at Wheely.
What this means for us:
– Almost unlimited disk space (RA3 separates compute and storage)
– Faster Data Marts, down to a 2-hour delay from real time
– Blue/green deployments
I will follow up as soon as we are finished.
A simplified checklist plan is attached.
Any questions are welcome.
Hi! Today, November 18, at 15:00, I invite you to a webinar.
Semi-structured Data in Analytical Warehouses: Nested JSON + Arrays
- Sources of semi-structured data: Events, Webhooks, Logs
- Approaches: JSON functions, special data types, External tables (Lakehouse)
- Performance optimization
We will walk through examples on Amazon Redshift and Clickhouse.
Registration link: https://otus.ru/lessons/dwh/#event-1661
The link to the YouTube stream will be posted here 5 minutes before the start.
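To give a feel for what "Nested JSON + Arrays" means in practice, here is a minimal Python sketch that flattens one nested event into one flat row per array element, which is roughly what unnesting does inside a warehouse. The event payload, field names, and the `explode_items` helper are hypothetical and are not taken from the webinar materials.

```python
import json


def explode_items(raw_event, array_key="items"):
    """Turn one nested JSON event into one flat row per element of its array.

    Assumptions: objects are nested at most one level deep and every array
    element is itself a JSON object.
    """
    event = json.loads(raw_event)
    items = event.pop(array_key, [])

    # Flatten one level of nested objects: {"user": {"id": 7}} -> {"user_id": 7}
    base = {}
    for key, value in event.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                base[f"{key}_{sub_key}"] = sub_value
        else:
            base[key] = value

    # Emit one row per array element, repeating the flattened parent fields.
    return [
        {**base, **{f"{array_key}_{k}": v for k, v in item.items()}}
        for item in items
    ]


# Hypothetical webhook payload
raw = (
    '{"event": "order_paid", "user": {"id": 7, "city": "Moscow"}, '
    '"items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}'
)
rows = explode_items(raw)
for row in rows:
    print(row)
```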
[RU] Webinar: Semi-structured Data in Analytical Warehouses: Nested JSON + Arrays
Webinar slides: https://docs.google.com/presentation/d/1dUxzGkBgXAp6s-VrFKT8Qw8UF6eZVQywpeQT4-ZohPM/edit?usp=sharing
Webinar recording: https://youtu.be/dtu0yeFdxvY?t=276
Webinar feedback survey: https://forms.gle/JPFqoDYhJJjvnMj7A
The Amazon Redshift cluster migration is almost complete.
The new cluster is far more powerful. Now I am looking for ways to fully utilize its resources 😄
Not everything went as expected.
The most painful parts turned out to be:
– Migrating an S3 bucket with 1M+ files to a new region (took ~4-5 hours), which was really challenging
– Not losing data events while switching between clusters
– VPC and network issues (connecting from the BI tool)
– Hotfixing several Python UDFs that suddenly stopped working in the new environment
I will publish a detailed retrospective on this process later.
A nice remark from Dmitry Anoshin @rockyourdata
How can one visualize the ER (Entity-Relationship) model of one's own DWH?
I would use these two options (both applicable to my DWH @ Wheely):
- DBeaver's ER diagram feature
- Looker's LookML Diagram
Both require relationships to be modeled in advance, i.e. by defining FOREIGN KEY / REFERENCES constraints or JOIN conditions.
Can anybody suggest more options?
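One more possible option, sketched here under assumptions: once the relationships are written down as a plain list of foreign keys, a few lines of Python can render them as Graphviz DOT text, which any DOT viewer can draw. The table and column names and the `foreign_keys_to_dot` helper are hypothetical; in a real DWH the list would have to come from declared constraints or documented JOIN conditions.

```python
def foreign_keys_to_dot(foreign_keys):
    """Render foreign keys as a Graphviz DOT digraph.

    Each foreign key is a tuple:
    (child_table, child_column, parent_table, parent_column).
    """
    # Collect every table mentioned on either side of a relationship.
    tables = sorted({t for fk in foreign_keys for t in (fk[0], fk[2])})

    lines = ["digraph er {", "  rankdir=LR;", "  node [shape=box];"]
    for table in tables:
        lines.append(f'  "{table}";')
    # One edge per relationship, pointing from the child to the parent table.
    for child, child_col, parent, parent_col in foreign_keys:
        lines.append(
            f'  "{child}" -> "{parent}" [label="{child_col} = {parent_col}"];'
        )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical relationships for illustration only
fks = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
]
dot = foreign_keys_to_dot(fks)
print(dot)
```

Like the two tools above, this still depends on the relationships being modeled in advance; it only changes where the diagram is drawn.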