What will you look for to become a data engineer?
Anonymous Poll
31%
A book
19%
A video course
25%
A training or bootcamp
25%
An internship / real project
Azure Synapse Analytics vs Azure Databricks: a detailed comparison of features and use cases
https://www.element61.be/en/resource/when-use-azure-synapse-analytics-andor-azure-databricks
element61
When to use Azure Synapse Analytics & Azure Databricks?
What is Azure Synapse Analytics? Azure Synapse Analytics is the Azure SQL Datawarehouse rebranded. Azure Synapse Analytics v2 (workspaces incl. Azure Synapse Studio) is still in preview. This version of Azure Synapse Analytics integrates existing and new analytical…
As you may know, AWS has Lambda and Step Functions. Azure seems to have taken a slightly different approach with Durable Functions. I am not sure this is the right comparison, but it is an interesting idea to have stateful functions as well as stateless ones.
Docs
Durable Functions Overview - Azure
Introduction to the Durable Functions extension for Azure Functions.
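To make the "stateful function" idea a bit more concrete, here is a toy sketch of replay-based orchestration, the general pattern the Durable Functions docs describe (event sourcing plus replay). It is not the Durable Functions API: `run`, `greet`, `orchestrator`, and the `history` list are all hypothetical names made up for this illustration, and the real extension stores history durably rather than in memory.

```python
# Toy model of replay-based orchestration; NOT the Durable Functions API.
# All names here (run, greet, orchestrator, history) are hypothetical.

def run(orchestrator, activities, history):
    """Drive the orchestrator generator once.

    Results already in `history` are replayed without re-running the activity.
    The first activity with no recorded result is executed, its result is
    appended to `history`, and the run stops (simulating the function being
    unloaded after a checkpoint). Returns (done, value).
    """
    gen = orchestrator()
    step = 0
    try:
        request = next(gen)
        while True:
            if step < len(history):
                result = history[step]           # replay a recorded result
            else:
                name, arg = request
                history.append(activities[name](arg))  # execute and checkpoint
                return False, None                # unload until the next run
            step += 1
            request = gen.send(result)
    except StopIteration as stop:
        return True, stop.value


calls = []  # records which activities really executed


def greet(city):
    calls.append(city)
    return f"Hello {city}"


def orchestrator():
    # Looks like ordinary sequential code; the state lives in the history.
    first = yield ("greet", "Tokyo")
    second = yield ("greet", "London")
    return [first, second]


history = []
runs = 0
done, value = False, None
while not done:
    done, value = run(orchestrator, {"greet": greet}, history)
    runs += 1

print(runs, value, calls)
```

The orchestrator body is re-entered from the top on every run, yet each activity executes only once, because earlier results come from the recorded history.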
I wonder how this is implemented under the hood, and whether AWS is planning a similar feature for S3.
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace
Docs
Azure Data Lake Storage Gen2 Hierarchical Namespace
Describes the concept of a hierarchical namespace for Azure Data Lake Storage Gen2
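As a rough illustration of why a hierarchical namespace matters, the toy model below compares renaming a "directory" in a flat key/value store (every object under the prefix is rewritten) with renaming it in a tree of directory nodes (one entry is re-linked). This is only a sketch under simplifying assumptions, not how Azure Storage is implemented; `FlatStore`, `HierarchicalStore`, and the sample paths are hypothetical.

```python
# Toy model only; NOT how Azure Data Lake Storage Gen2 is implemented.
# FlatStore, HierarchicalStore and the sample paths are hypothetical.

class FlatStore:
    """Object store where a 'directory' is just a key prefix."""

    def __init__(self):
        self.objects = {}  # full key -> data

    def put(self, key, data):
        self.objects[key] = data

    def rename_dir(self, old, new):
        """Rewrite every key under the prefix; returns how many were touched."""
        moved = [k for k in self.objects if k.startswith(old + "/")]
        for key in moved:
            self.objects[new + key[len(old):]] = self.objects.pop(key)
        return len(moved)


class HierarchicalStore:
    """Store with real directory nodes (nested dicts)."""

    def __init__(self):
        self.root = {}

    def _walk(self, parts, create=False):
        node = self.root
        for part in parts:
            node = node.setdefault(part, {}) if create else node[part]
        return node

    def put(self, key, data):
        *dirs, name = key.split("/")
        self._walk(dirs, create=True)[name] = data

    def get(self, key):
        *dirs, name = key.split("/")
        return self._walk(dirs)[name]

    def rename_dir(self, old, new):
        """Re-link a single directory entry; returns how many were touched."""
        *old_parent, old_name = old.split("/")
        *new_parent, new_name = new.split("/")
        subtree = self._walk(old_parent).pop(old_name)
        self._walk(new_parent, create=True)[new_name] = subtree
        return 1


flat, tree = FlatStore(), HierarchicalStore()
for i in range(1000):
    key = f"raw/2021/part-{i}.parquet"
    flat.put(key, b"x")
    tree.put(key, b"x")

flat_cost = flat.rename_dir("raw/2021", "raw/archive")
tree_cost = tree.rename_dir("raw/2021", "raw/archive")
print(flat_cost, tree_cost)
```

In the flat store the cost grows with the number of objects under the prefix, while the tree version touches one entry regardless of directory size.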
A new Data Engineering Podcast episode about Superset, with the author of Superset and Airflow.
Data Engineering Podcast
Data Engineering Podcast: Self Service Data Exploration And Dashboarding With Superset
An interview with Maxime Beauchemin about how to use Apache Superset as a platform for self-service data exploration and analytics.
A great talk about Kusto (Azure Data Explorer) from its creators. According to the talk, almost every team at Microsoft uses it. It can replace Elasticsearch and Solr. To me it looks a bit like Druid.
https://youtu.be/Kkd2rYQZAVU
YouTube
Alexander Slutsky, Gleb Lesnikov: Kusto (Azure Data Explorer), an interactive Big Data platform
Upcoming conference: SmartData 2022, October 17-18 (online), October 29 (offline)
Tickets: https://bit.ly/3amdcNO. Kusto is a new and rapidly growing platform for working with Big Data. A few years ago it took over all of Microsoft…
New support for Databricks and Apache Spark in dbt Cloud
https://blog.getdbt.com/analytics-engineering-for-everyone-databricks-in-dbt-cloud/
Transform data in your warehouse
Analytics Engineering for Everyone: Databricks in dbt Cloud
This SQL-first integration with Databricks means that analysts can build fully automated data pipelines in the same space that data engineers & data scientists work in their preferred frameworks.
#Rust is making its way into data engineering
https://arrow.apache.org/blog/2021/04/12/ballista-donation/
Apache Arrow
Ballista: A Distributed Scheduler for Apache Arrow
We are excited to announce that Ballista has been donated to the Apache Arrow project. Ballista is a distributed scheduler for the Rust implementation of Apache Arrow.