Here are some major Lambda updates that will probably make you rethink your serverless architecture 😉
#reInvent #AWS #Lambda
✅ New for AWS Lambda – Container Image Support
✅ New for AWS Lambda – 1ms Billing Granularity Adds Cost Savings
✅ New for AWS Lambda – Functions with Up to 10 GB of Memory and 6 vCPUs
Amazon
New for AWS Lambda – Container Image Support | Amazon Web Services
February 9, 2021: Post updated with the current regional availability of container image support for AWS Lambda. With AWS Lambda, you upload your code and run it without thinking about servers. Many customers enjoy the way this works, but if you’ve invested…
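A rough way to see what the 1 ms billing granularity means in practice: before this change, Lambda rounded each invocation's duration up to the nearest 100 ms. The sketch below compares billed duration only. `billed_ms` and `duration_saving` are hypothetical helper names, and real cost also depends on configured memory and regional prices, which are not modeled here.

```python
import math


def billed_ms(duration_ms: float, granularity_ms: int) -> int:
    """Round a duration up to the next multiple of the billing granularity."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


def duration_saving(duration_ms: float, old_ms: int = 100, new_ms: int = 1) -> float:
    """Fraction of billed duration saved by moving from old_ms to new_ms granularity."""
    old = billed_ms(duration_ms, old_ms)
    new = billed_ms(duration_ms, new_ms)
    if old == 0:
        # Nothing was billed in the first place, so there is nothing to save.
        return 0.0
    return 1 - new / old


if __name__ == "__main__":
    # Illustrative durations only (assumed values, not measurements).
    for d in (28, 101, 950):
        print(d, billed_ms(d, 100), billed_ms(d, 1), round(duration_saving(d), 2))
```

Short functions gain the most: the shorter the invocation, the larger the share of the old 100 ms block that was rounding overhead.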
#reInvent
Amazon S3 Update – Strong Read-After-Write Consistency:
Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags, ACLs, or metadata, are now strongly consistent!
This is especially important if you use S3 as a data lake and process the data with EMR.
Amazon
Amazon S3 Update – Strong Read-After-Write Consistency | Amazon Web Services
When we launched S3 back in 2006, I discussed its virtually unlimited capacity (“…easily store any number of blocks…”), the fact that it was designed to provide 99.99% availability, and that it offered durable storage, with data transparently stored in multiple…
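Before this update, a LIST issued right after a PUT could miss the new object, so pipelines often added retry loops or an extra consistency layer. Below is a minimal sketch of the write-then-list pattern that should no longer need retries. It assumes a boto3-style S3 client is passed in; the function name `write_then_list` and the bucket and key values in the usage comment are hypothetical.

```python
from typing import Any, List


def write_then_list(s3: Any, bucket: str, key: str, body: bytes, prefix: str = "") -> List[str]:
    """PUT an object, then immediately LIST the prefix.

    `s3` is assumed to be a boto3-style client, e.g. boto3.client("s3").
    With strong read-after-write consistency the new key should appear
    in the listing without any sleep or retry.
    """
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    # Only the first page of results is read here; a real job would paginate.
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent when nothing matches the prefix.
    return [obj["Key"] for obj in response.get("Contents", [])]


# Usage (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   keys = write_then_list(boto3.client("s3"), "my-bucket",
#                          "raw/part-0000.parquet", b"...", prefix="raw/")
```

Taking the client as a parameter keeps the sketch runnable against a stub in tests, without touching a real bucket.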
Version control for data.
https://podcasts.google.com/?feed=aHR0cHM6Ly93d3cuZGF0YWVuZ2luZWVyaW5ncG9kY2FzdC5jb20vZmVlZC9tcDMv&ep=14&episode=cG9kbG92ZS0yMDIwLTExLTAydDIzOjUxOjQzKzAwOjAwLTBhYTlmMWYyODQ1ZTEzYQ&pe=1&pep=0
Google Podcasts
Data Engineering Podcast - Add Version Control To Your Data Lake With LakeFS
Data lakes are gaining popularity due to their flexibility and reduced cost of storage. Along with the benefits there are some additional complexities to consider, including how to safely integrate new data sources or test out changes to existing pipelines.…
#reInvent #AWS
It seems ML is becoming a standard feature everywhere.
https://aws.amazon.com/about-aws/whats-new/2020/12/aws-announces-amazon-redshift-ml-preview/
Amazon
AWS announces Amazon Redshift ML (preview)
#Scala #Kafka
I will leave this here just in case
https://www.confluent.io/blog/kafka-scala-tutorial-for-beginners/
Confluent
Apache Kafka and Scala - A Beginner’s Tutorial
Introduction to Apache Kafka and Scala. Learn about Kafka clients, how to use it in Scala, the Kafka Streams Scala module, and popular Scala integrations with code examples.
#reInvent #AWS
Data analytics and engineering related updates:
✅ Announcing Amazon Redshift data sharing (preview)
✅ Amazon EMR Studio (Preview): A new notebook-first IDE experience with Amazon EMR
✅ Amazon Redshift announces native console integration with partners (Preview)
✅ Amazon Redshift announces support for native JSON and semi-structured data processing (preview)
✅ Simplify running Apache Spark jobs with Amazon EMR on Amazon EKS
Amazon
Announcing Amazon Redshift data sharing (preview) | Amazon Web Services
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL. Amazon Redshift offers up to 3x better price performance than any other cloud data warehouse.…
#AWS #reInvent #Redshift
✅ Amazon Redshift announces Automatic Table Optimization
✅ Amazon Redshift now includes Amazon RDS for MySQL and Amazon Aurora MySQL databases as new data sources for federated querying (Preview)
✅ Amazon Redshift launches RA3.xlplus nodes with managed storage
Amazon
Amazon Redshift announces Automatic Table Optimization
A great review of metadata catalog evolution.
https://engineering.linkedin.com/blog/2020/datahub-popular-metadata-architectures-explained
LinkedIn
DataHub: Popular metadata architectures explained
#Scala
An old but good article about the cake pattern for dependency injection in #Scala
https://medium.com/rahasak/scala-cake-pattern-e0cd894dae4e
Medium
Scala cake pattern
Be simple, look stylish
Forwarded from Инжиниринг Данных (Dmitry Anoshin)
Lakehouse = DW + Data Lake.
Lakehouse examples:
- Redshift + Redshift Spectrum
- Snowflake
- Databricks Delta Lake
- Azure Synapse Analytics
I came across a very interesting paper that was published only recently by the founders of Databricks.
This paper argues that the data warehouse architecture as we know it today will wither in the coming years and be replaced by a new architectural pattern, the Lakehouse, which will (i) be based on open direct-access data formats, such as Apache Parquet, (ii) have first class support for machine learning and data science, and (iii) offer state-of-the-art performance. Lakehouses can help address several major challenges with data warehouses, including data staleness, reliability, total cost of ownership, data lock-in, and limited use-case support.
Forwarded from Инжиниринг Данных (Dmitry Anoshin)
The data community has big plans for dbt.
Medium
Why DBT will one day be bigger than Spark
The world of data is moving and shaking again. Ever since Hadoop came around, people were offloading workloads from their data warehouses…
Data visualization tools comparison from Dropbox.
https://dropbox.tech/application/why-we-chose-apache-superset-as-our-data-exploration-platform
dropbox.tech
Why we chose Apache Superset as our data exploration platform