Long read about Apache Hudi internals with good visuals.
hudi.apache.org
Apache Hudi - The Data Lake Platform | Apache Hudi
As early as 2016, we set out a bold, new vision reimagining batch data processing through a new “incremental” data processing stack - alongside the existing batch and streaming stacks.
Now you can create a free Azure Data Explorer (Kusto) cluster without an Azure subscription.
Docs
Create a free Azure Data Explorer cluster. - Azure Data Explorer
In this article, you'll learn how to create a free cluster, ingest data, and run queries to gain insights into your data using your free cluster.
Amazon created a game to help you learn AWS.
YouTube
AWS Cloud Quest - Cloud Practitioner | Amazon Web Services
Learn more about AWS Cloud Quest - https://bit.ly/3v2tX6X
Play AWS Cloud Quest and have fun building AWS Cloud Skills. AWS Cloud Quest is the first and only role-playing game to help you build in-demand AWS Cloud skills. Collect gems and earn points as you…
Scala, Spark, Books, and Functional Programming: An Essay - Christian Hollinger
https://chollinger.com/blog/2022/02/scala-spark-books-and-functional-programming-an-essay/
Scala, Spark, Books, and Functional Programming: An Essay
Reviewing 'Essential Scala' and 'Functional Programming Simplified', while explaining why Spark has nothing to do with Scala, and asking why learning Functional Programming is such a pain. A (maybe) productive rant (or an opinionated essay).
Data Quality Unit Tests in PySpark Using Great Expectations | by Karen Bajador Valencia | Feb, 2022 | Towards Data Science
https://towardsdatascience.com/data-quality-unit-tests-in-pyspark-using-great-expectations-e2e2c0a2c102
Medium
Data Quality Unit Tests in PySpark Using Great Expectations
Integrating Great Expectations with a ubiquitous Big data engineering platform
Kappa Architecture - Where Every Thing Is A Stream
https://milinda.pathirage.org/kappa-architecture.com/
How to Pick a Career (That Actually Fits You) — Wait But Why
https://waitbutwhy.com/2018/04/picking-career.html
Wait But Why
How to Pick a Career (That Actually Fits You)
Our career path is how we spend our time, how we support our lifestyles, how we make our impact, and even sometimes how we define our identity. Let’s make sure we’re on the right track.
Query 10 new data sources with Amazon Athena | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/query-10-new-data-sources-with-amazon-athena/
Amazon
Query 10 new data sources with Amazon Athena | Amazon Web Services
When we first launched Amazon Athena, our mission was to make it simple to query data stored in Amazon Simple Storage Service (Amazon S3). Athena customers found it easy to get started and develop analytics on petabyte-scale data lakes, but told us they needed…
So now dbt also works with AWS Glue
Amazon
Build your data pipeline in your AWS modern data platform using AWS Lake Formation, AWS Glue, and dbt Core | Amazon Web Services
dbt has established itself as one of the most popular tools in the modern data stack, and is aiming to bring analytics engineering to everyone. The dbt tool makes it easy to develop and implement complex data processing pipelines, with mostly SQL, and it…