Data1984 – Telegram
This channel is mostly about data-related topics. The main ones are #DataEngineering #SQL #Python #cloud.

Contact: @gorros
Amazon EMR now supports Apache Iceberg, a highly performant, concurrent, ACID-compliant table format for data lakes.
Cost Efficiency @ Scale in Big Data File Format
https://eng.uber.com/cost-efficiency-big-data/
I think I missed this one. So now both Athena and EMR work with Iceberg.
I came across Argo in an AWS blog post, in particular Argo Workflows, an orchestration tool like Airflow that you can use if you already have a K8s cluster.
Kubernetes is probably the only major topic in our field that I have never had a chance to work with, but it seems to be turning into a meta OS or abstraction layer for major platforms and projects, in data engineering and beyond.
I think there are three major platforms I would like to work/play with to get more experience:
1. Google Cloud Platform
2. Databricks (not just Spark)
3. Kubernetes (maybe to run Spark)