Empower every BI professional to do more with Microsoft Fabric
https://build.microsoft.com/en-US/sessions/8b23c96e-7c35-463d-88b4-564d23dc14a5
Choosing an open table format for your transactional data lake on AWS | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/
Amazon
Choosing an open table format for your transactional data lake on AWS | Amazon Web Services
August 2023: This post was updated to include Apache Iceberg support in Amazon Redshift. Disclaimer: Due to rapid advancements in AWS service support for open table formats, recent developments might not yet be reflected in this post. For the latest information…
AWS Glue Data Quality is Generally Available | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/aws-glue-data-quality-is-generally-available/
Amazon
AWS Glue Data Quality is Generally Available | Amazon Web Services
We are excited to announce the General Availability of AWS Glue Data Quality. Our journey started by working backward from our customers who create, manage, and operate data lakes and data warehouses for analytics and machine learning. To make confident business…
Data1984
Choosing an open table format for your transactional data lake on AWS | AWS Big Data Blog https://aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/
While AWS tries to support all three formats and helps you choose the right one for your use case, Databricks introduces a unified format, so you will not need to pick 😎
Datanami
Databricks Puts Unified Data Format on the Table with Delta Lake 3.0
Databricks today rolled out a new open table format in Delta Lake 3.0 that it says will eliminate the possibility of picking the wrong one. Dubbed
This reminds me of a solution designed and implemented a couple of years ago. Back then we used DynamoDB Streams to capture item-level changes with exactly-once semantics, Lambda to modify the data, and Kinesis Firehose to deliver it to Redshift. Looks like things are simpler now.
Amazon
Near-real-time analytics using Amazon Redshift streaming ingestion with Amazon Kinesis Data Streams and Amazon DynamoDB | Amazon…
Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, easy, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical…
A VP and distinguished engineer at S3 tells the story of building S3.
YouTube
FAST '23 - Building and Operating a Pretty Big Storage System (My Adventures in Amazon S3)
Building and Operating a Pretty Big Storage System (My Adventures in Amazon S3)
Andy Warfield, Amazon
Five years ago I decided to leave my faculty position at UBC and join Amazon. A lot of that time has been spent working as an engineer on the S3 team.…
LinkedIn remains one of the coolest places for data engineering, where many popular open-source technologies emerge.
https://engineering.linkedin.com/blog/2023/declarative-data-pipelines-with-hoptimator
LinkedIn
Declarative Data Pipelines with Hoptimator
Microsoft today announced the public preview of Python in Excel. So this is what Guido van Rossum was working on 😁 (The creator of Python joined Microsoft 3 years ago.)
TECHCOMMUNITY.MICROSOFT.COM
Announcing Python in Excel
Announcing Python in Excel: Combining the power of Python and the flexibility of Excel.
Looks like data lineage, profiling, and quality are getting more attention in data tools.
Finally, there is a dedicated certification for data engineers from AWS.
https://aws.amazon.com/certification/certified-data-engineer-associate/
Amazon
certified-data-engineer-associate
Category: Associate. Exam duration: 130 minutes. Exam format: 65 questions, either multiple choice or multiple response. Cost: 150 USD.
https://www.dremio.com/blog/exploring-the-architecture-of-apache-iceberg-delta-lake-and-apache-hudi/
Dremio
Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi | Dremio
Understand how different formats handle metadata for ACID transactions, time travel, and schema evolution in data lakehouses.
RAPIDS cuDF Accelerates pandas Nearly 150x with Zero Code Changes | NVIDIA Technical Blog
https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/
NVIDIA Technical Blog
RAPIDS cuDF Accelerates pandas Nearly 150x with Zero Code Changes
At NVIDIA GTC 2024, it was announced that RAPIDS cuDF can now bring GPU acceleration to 9.5 million pandas users without requiring them to change their code. pandas, a flexible and powerful data…