Kubernetes is probably the only major topic in our field that I have never had a chance to work or interact with, but it seems to be turning into a meta-OS or abstraction layer for major platforms and projects, in data engineering and beyond.
Amazon Redshift announces public preview of Streaming Ingestion for Kinesis Data Streams
https://aws.amazon.com/about-aws/whats-new/2022/02/amazon-redshift-public-preview-streaming-ingestion-kinesis-data-streams/
This article combines several individually useful techniques. I especially like the idea of indexing S3 files with a cluster of Lambda functions.
Amazon
Doing more with less: Moving from transactional to stateful batch processing | Amazon Web Services
Amazon processes hundreds of millions of financial transactions each day, including accounts receivable, accounts payable, royalties, amortizations, and remittances, from over a hundred different business entities. All of this data is sent to the eCommerce…
I think there are three major platforms I would like to work/play with to get more experience:
1. Google Cloud Platform
2. Databricks (not just Spark)
3. Kubernetes (maybe to run Spark)
Google Cloud
Google Cloud Platform Services Summary
A complete list of services that form a part of Google Cloud.
Every product in the Google Cloud family described in the visual sketchnote format to grasp the capability of the tools quickly and easily.
GitHub
GitHub - priyankavergadia/GCPSketchnote: If you are looking to become a Google Cloud Engineer, then you are at the right place.…
If you are looking to become a Google Cloud Engineer, then you are at the right place. GCPSketchnote is a series where I share Google Cloud concepts in a quick and easy-to-learn format. - priyankaverg...
Long read about Apache Hudi internals with good visuals.
hudi.apache.org
Apache Hudi - The Data Lake Platform | Apache Hudi
As early as 2016, we set out a bold, new vision reimagining batch data processing through a new “incremental” data processing stack - alongside the existing batch and streaming stacks.
Now you can create a free Azure Data Explorer (Kusto) cluster without an Azure subscription.
Docs
Create a free Azure Data Explorer cluster. - Azure Data Explorer
In this article, you'll learn how to create a free cluster, ingest data, and run queries to gain insights into your data using your free cluster.
Amazon created a game to help people learn AWS.
YouTube
AWS Cloud Quest - Cloud Practitioner | Amazon Web Services
Learn more about AWS Cloud Quest - https://bit.ly/3v2tX6X
Play AWS Cloud Quest and have fun building AWS Cloud Skills. AWS Cloud Quest is the first and only role-playing game to help you build in-demand AWS Cloud skills. Collect gems and earn points as you…
Scala, Spark, Books, and Functional Programming: An Essay - Christian Hollinger
https://chollinger.com/blog/2022/02/scala-spark-books-and-functional-programming-an-essay/
Reviewing 'Essential Scala' and 'Functional Programming Simplified', while explaining why Spark has nothing to do with Scala, and asking why learning Functional Programming is such a pain. A (maybe) productive rant (or an opinionated essay).