What’s new in Amazon Redshift – 2021, a year in review | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/whats-new-in-amazon-redshift-2021-a-year-in-review/
Amazon
What’s new in Amazon Redshift – 2021, a year in review | Amazon Web Services
Amazon Redshift is the cloud data warehouse of choice for tens of thousands of customers who use it to analyze exabytes of data to gain business insights. Customers have asked for more capabilities in Redshift to make it easier, faster, and secure to store…
2022 Big Data Predictions from the Cloud
https://www.datanami.com/2021/12/23/2022-big-data-predictions-from-the-cloud/?utm_source=rss&utm_medium=rss&utm_campaign=2022-big-data-predictions-from-the-cloud
Datanami
2022 Big Data Predictions from the Cloud
The pandemic marked an inflection point for the growth of cloud platforms in 2020, as organizations scrambled to keep their applications running. That
Amazon EMR now supports Apache Iceberg, a highly performant, concurrent, ACID-compliant table format for data lakes.
Amazon
Amazon EMR now supports Apache Iceberg, a highly performant, concurrent, ACID-compliant table format for data lakes
Cost Efficiency @ Scale in Big Data File Format
https://eng.uber.com/cost-efficiency-big-data/
I think I missed this one. So now both Athena and EMR work with Iceberg.
Amazon
Announcing Amazon Athena ACID transactions, powered by Apache Iceberg (Preview)
The Ubiquity of the Delta Standalone Project for Delta Lake - The Databricks Blog
https://databricks.com/blog/2022/01/28/the-ubiquity-of-delta-standalone-java-scala-hive-presto-trino-power-bi-and-more.html
I came across Argo in an AWS blog post, in particular Argo Workflows, an orchestration tool like Airflow that you can use if you already have a K8s cluster.
argoproj.github.io
Home
Open source Kubernetes native workflows, events, CI and CD
Kubernetes is probably the only major topic in our field that I have never had a chance to work with, but it seems to be turning into a meta OS or abstraction layer for major platforms and projects, in data engineering and beyond.
Amazon Redshift announces public preview of Streaming Ingestion for Kinesis Data Streams
https://aws.amazon.com/about-aws/whats-new/2022/02/amazon-redshift-public-preview-streaming-ingestion-kinesis-data-streams/
Amazon
Amazon Redshift announces public preview of Streaming Ingestion for Kinesis Data Streams
This article combines several individually useful techniques. I especially like the idea of indexing S3 files with a cluster of Lambda functions.
Amazon
Doing more with less: Moving from transactional to stateful batch processing | Amazon Web Services
Amazon processes hundreds of millions of financial transactions each day, including accounts receivable, accounts payable, royalties, amortizations, and remittances, from over a hundred different business entities. All of this data is sent to the eCommerce…