Play with Docker
▫️Docker 101 Tutorial - Self-paced tutorials to increase your Docker knowledge.
▫️Lab Environment - Complete a workshop without installing anything using this Docker playground.
▫️Community Training - Free and paid learning materials from Docker Captains.
https://www.docker.com/play-with-docker/
Live stream tomorrow at 12
https://youtu.be/jF3YemOVofQ
YouTube
Data Processing with Apache Airflow in Yandex Cloud
A DBMS and visualization tools alone are not enough for data analysis in the cloud; you also need a straightforward tool that automates data collection, preparation, and processing. In this webinar we talked about such a service, Apache Airflow.
Yandex Cloud experts discussed:…
How to Build a Data Processing Platform "With Your Own Hands"?
@devops_dataops
https://habr.com/ru/company/itsumma/blog/679516/
Habr
How to Build a Data Processing Platform "With Your Own Hands"?
A large number of Russian companies have run into software restrictions and can no longer use many important tools for working with data. But, as the saying goes, one...
Nico_Loubser_Software_Engineering_for_Absolute_Beginners_Your_Guide.epub
1.5 MB
Software Engineering for Absolute Beginners - 2021
What You Will Learn
🔹 Explore the concepts that you will encounter in the majority of companies doing software development
🔹 Create readable code that is both neat and well-designed
🔹 Build code that is source controlled, containerized, and deployable
🔹 Secure your codebase
🔹 Optimize your workspace
🔥 Awesome Docker Compose samples
These samples provide a starting point for integrating different services with a Compose file and managing their deployment with Docker Compose (a minimal sketch below).
👉 @devops_dataops
https://github.com/docker/awesome-compose
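For a flavor of what these samples look like, here is a minimal Compose file sketch: a hypothetical web app backed by Redis. Service names, the port, and the image tag are illustrative, not taken from any specific sample in the repo.

```yaml
# compose.yaml: a hypothetical web app backed by Redis
services:
  web:
    build: .               # build the app image from the local Dockerfile
    ports:
      - "8000:8000"        # publish the app port on the host
    depends_on:
      - redis              # start redis before the web service
  redis:
    image: redis:7-alpine  # official Redis image as the backing store
```

Running `docker compose up -d` brings both services up together on a shared default network.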
ETL Pipeline with Airflow, Spark, s3, MongoDB and Amazon Redshift
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
https://github.com/renatootescu/ETL-pipeline
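As a rough sketch of what such a DAG looks like: the stubs below stand in for the repo's Spark, S3, MongoDB and Redshift steps, and the task names are illustrative, not the project's own.

```python
# Sketch of an extract -> transform -> load DAG with placeholder callables.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data, e.g. from an API into S3")

def transform():
    print("clean and reshape the data, e.g. with a Spark job")

def load():
    print("write results, e.g. to MongoDB or Amazon Redshift")

with DAG(
    dag_id="etl_pipeline_sketch",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load  # linear dependency chain
```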
GitHub - martandsingh/ApacheSpark: This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, uses PySpark and Spark SQL for development, and closes with a few case studies.
https://github.com/martandsingh/ApacheSpark
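A small taste of the PySpark + Spark SQL style this kind of course teaches; the data and column names below are made up for illustration, not taken from the repo.

```python
# Minimal PySpark + Spark SQL demo: the same aggregation done both ways.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-sparksql-demo").getOrCreate()

df = spark.createDataFrame(
    [("alice", "books", 12.5), ("bob", "books", 7.0), ("alice", "music", 3.2)],
    ["user", "category", "amount"],
)

# DataFrame API: total spend per user
df.groupBy("user").agg(F.sum("amount").alias("total")).show()

# Spark SQL: the same aggregation through a temporary view
df.createOrReplaceTempView("purchases")
spark.sql("SELECT user, SUM(amount) AS total FROM purchases GROUP BY user").show()
```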
Designing an ETL Pipeline in Apache Airflow / Habr
https://habr.com/ru/company/otus/blog/679402/
Habr
Designing an ETL Pipeline in Apache Airflow
Hi, Habr! This is Rustem, IBM Senior DevOps Engineer, and today I'd like to continue our acquaintance with Apache Airflow, a DataOps engineering tool. Today we will design an ETL pipeline. Not...
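For context, the same extract/transform/load shape can also be written in Airflow's TaskFlow style, where dependencies fall out of the data flow instead of explicit >> chaining. A minimal sketch (the article's own pipeline may differ; names are illustrative):

```python
# TaskFlow-style DAG: Airflow infers extract -> transform -> load from the calls.
from datetime import datetime

from airflow.decorators import dag, task

@dag(start_date=datetime(2022, 1, 1), schedule_interval="@daily", catchup=False)
def designed_etl():
    @task
    def extract() -> dict:
        return {"rows": [1, 2, 3]}

    @task
    def transform(payload: dict) -> dict:
        return {"rows": [r * 10 for r in payload["rows"]]}

    @task
    def load(payload: dict) -> None:
        print(f"loading {len(payload['rows'])} rows")

    load(transform(extract()))  # dependencies follow the data flow

designed_etl()
```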
A Deep Dive into Data Quality / Habr
https://habr.com/ru/company/vk/blog/674876/
Mara Pipelines
This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. It has a number of baked-in assumptions/principles (a minimal usage sketch follows at the end of this post):
- Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code.
- PostgreSQL as a data processing engine.
- Extensive web UI. The web browser as the main tool for inspecting, running and debugging pipelines.
- GNU make semantics. Nodes depend on the completion of upstream nodes. No data dependencies or data flows.
- No in-app data processing: command line tools as the main tool for interacting with databases and data.
- Single-machine pipeline execution based on Python's multiprocessing. No need for distributed task queues. Easy debugging and output logging.
- Cost-based priority queues: nodes with higher cost (based on recorded run times) are run first.
https://github.com/mara/mara-pipelines
GitHub
GitHub - mara/mara-pipelines: A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
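To make the pipelines-as-code principle concrete, a minimal sketch; the module paths and constructor arguments are my assumption from the project's README, so treat the repo itself as authoritative.

```python
# Pipelines-as-code: a pipeline is plain Python. Module paths and IDs are
# assumptions based on the project's README, not verified against it.
from mara_pipelines.commands.bash import RunBash
from mara_pipelines.pipelines import Pipeline, Task

pipeline = Pipeline(
    id="demo",
    description="Shows how pipelines, tasks and commands fit together")

# a Task wraps one or more commands; ordering comes from upstream nodes,
# not from data dependencies (the GNU make semantics noted above)
pipeline.add(Task(
    id="ping_localhost",
    description="Pings localhost",
    commands=[RunBash("ping -c 3 localhost")]))
```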