Recently I posted about #Spark 3.0 updates. Here is a detailed article from the creators of Spark about the most important updates.
https://databricks.com/blog/2020/06/18/introducing-apache-spark-3-0-now-available-in-databricks-runtime-7-0.html
Databricks
Introducing Spark 3.0 - Now Available in Databricks Runtime 7.0
Learn more about the latest release of Apache Spark, version 3.0.0, including new features like AQE and how to begin using it through Databricks Runtime 7.0.
#KinesisFirehose has received an interesting new feature that can help with real-time #ETL.
https://youtu.be/MELPeni0p04?t=1179
YouTube
High Performance Data Streaming with Amazon Kinesis: Best Practices and Common Pitfalls
Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website…
#Koalas 1.0 is here!
If you use #Pandas and #Spark, it is worth checking out.
https://databricks.com/blog/2020/06/24/introducing-koalas-1-0.html
Databricks
Koalas 1.0 Introduction, Overview and Quick How-to Guide
Learn more about the latest release of Koalas, version 1.0.0, including new features, and how you begin using it.
Long read about deploying machine learning models to production.
#ML
https://link.oreilly.com/I0C0YFp0W7LLQr0Q0y0MS00
A long-awaited feature from #AWS #EMR, similar to what Google #Dataproc already offered.
https://aws.amazon.com/blogs/big-data/introducing-amazon-emr-managed-scaling-automatically-resize-clusters-to-lower-cost/
Amazon
Introducing Amazon EMR Managed Scaling – Automatically Resize Clusters to Lower Cost | Amazon Web Services
AWS is happy to announce the release of Amazon EMR Managed Scaling—a new feature that automatically resizes your cluster for best performance at the lowest possible cost. With EMR Managed Scaling you specify the minimum and maximum compute limits for your…
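For the curious, here is a minimal sketch of setting those minimum and maximum compute limits from Python. It assumes boto3's EMR client and its `put_managed_scaling_policy` call; the helper names, the example limits (2 and 10), and the min/max sanity check are my own and not from the announcement.

```python
from typing import Any, Dict


def managed_scaling_policy(min_units: int, max_units: int,
                           unit_type: str = "Instances") -> Dict[str, Any]:
    """Build the ManagedScalingPolicy structure with min/max compute limits."""
    # Local sanity check added for this sketch, not an AWS rule lookup.
    if min_units > max_units:
        raise ValueError("min_units must not exceed max_units")
    return {
        "ComputeLimits": {
            "UnitType": unit_type,  # "Instances", "VCPU" or "InstanceFleetUnits"
            "MinimumCapacityUnits": min_units,
            "MaximumCapacityUnits": max_units,
        }
    }


def apply_policy(emr_client: Any, cluster_id: str, policy: Dict[str, Any]) -> None:
    """Attach the policy to a running cluster.

    `emr_client` is expected to be boto3.client("emr"); it is passed in
    so the helper can be tried without AWS access.
    """
    emr_client.put_managed_scaling_policy(
        ClusterId=cluster_id,
        ManagedScalingPolicy=policy,
    )


# Example limits, chosen only for illustration.
POLICY = managed_scaling_policy(min_units=2, max_units=10)
```

With real credentials the call would look like `apply_policy(boto3.client("emr"), "j-XXXXXXXX", POLICY)`, where the cluster ID is a placeholder.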
#AWS Dev Day about Databases and Analytics
#Kafka #EMR #Redshift
https://pages.awscloud.com/emea_dba_devday.html
Amazon
AWS Support and Customer Service Contact Info | Amazon Web Services
On this page, you’ll find info regarding the different ways to get in touch with AWS support, including Sales, Technical, Compliance, and Login support.
An interesting website for remote job opportunities.
#remotework
https://www.remotefit.io/
www.remotefit.io
Remotefit: Find a remote job with a great culture fit
Find a remote job in engineering, product, design, management, sales, or customer service, and work from anywhere.
If you still do not know what a columnar data format is, you should read this article.
#Parquet #Arrow
https://towardsdatascience.com/apache-arrow-read-dataframe-with-zero-memory-69634092b1a
Medium
Apache Arrow: Read DataFrame With Zero Memory
Theoretical & practical introduction to Arrow file format
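To make the row vs. column idea concrete, here is a minimal pure-Python sketch. It is not Arrow or Parquet themselves; the sample records and the `to_columnar` helper are made up for illustration. It stores the same three records row by row and column by column, and shows that an aggregate over one field only has to read one list in the columnar layout.

```python
from typing import Any, Dict, List

# Hypothetical sample data: three records in a row-oriented layout.
ROWS: List[Dict[str, Any]] = [
    {"user": "ann", "country": "AM", "amount": 10.0},
    {"user": "bob", "country": "FI", "amount": 2.5},
    {"user": "eva", "country": "AM", "amount": 7.5},
]


def to_columnar(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot a list of records into one list per field.

    Assumes every record has the same fields; otherwise the lists
    would no longer line up by position.
    """
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


COLUMNS = to_columnar(ROWS)

# Row layout: every whole record is visited to read a single field.
total_from_rows = sum(row["amount"] for row in ROWS)

# Column layout: only the "amount" list is read.
total_from_columns = sum(COLUMNS["amount"])
```

Real columnar formats add typed, contiguous buffers and compression on top of this idea, which is where most of the speedup comes from.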
A good summary of Kinesis Data Streams
By a compatriot of ours 😉
https://dev.solita.fi/2020/05/28/kinesis-streams-part-1.html
/dev/solita
Mastering AWS Kinesis Data Streams, Part 1
I have been working with AWS Kinesis Data Streams for several years now, dealing with over 0.5TB of streaming data per day. Rather than telling you about all the reasons why you should use Kinesis Data Streams (plenty is written on that subject), I'll talk…
So the #python runtime is the fastest overall for #AWS #Lambda. Didn't see that coming.
https://medium.com/the-theam-journey/benchmarking-aws-lambda-runtimes-in-2019-part-i-b1ee459a293d
https://medium.com/the-theam-journey/benchmarking-aws-lambda-runtimes-in-2019-part-ii-50e796d3d11b
Medium
Benchmarking AWS Lambda runtimes in 2019 (part I)
Have you ever wondered whether your AWS Lambda could be faster if you used a different runtime?
Here is a great resource for getting familiar with AWS Data Analytics solutions, even if you are not planning to take the exam. It is a very interactive and visually pleasing course.
https://www.aws.training/Details/eLearning?id=46612
Forwarded from DataEng
CAP theorem for data engineers: https://www.analyticsvidhya.com/blog/2020/08/a-beginners-guide-to-cap-theorem-for-data-engineering/
Analytics Vidhya
A Beginner's Guide to CAP Theorem for Data Engineering
CAP theorem helps to handle your distributed database systems when a few database servers refuse to communicate with each other.
As you may have noticed, I have not been posting actively in the channel lately because I was busy preparing for an exam. I want to share this guide in case you are considering taking it as well.
https://towardsdatascience.com/becoming-an-aws-certified-data-analytics-new-april-2020-4a3ef0d9f23a?gi=b2f48e1e3986
Medium
Becoming an AWS Certified Data Analytics — NEW April 2020
A guide to become the next AWS Certified Data Analytics expert.
An interesting new feature-rich data warehouse based on AWS, with an unusual pricing approach. I especially like the dedup feature.
https://www.dataengineeringpodcast.com/firebolt-cloud-data-warehouse-episode-148/
https://www.firebolt.io/
Data Engineering Podcast
Building A Better Data Warehouse For The Cloud At Firebolt - Episode 148
Data warehouse technology has been around for decades and has gone through several generational shifts in that time. The current trends in data warehousing are oriented around cloud native architectures that take advantage of dynamic scaling and the separation…
Forwarded from DataEng
A skill map for the modern data engineer: https://github.com/datastacktv/data-engineer-roadmap
It complements my article nicely: https://khashtamov.com/ru/data-engineer/
GitHub
GitHub - datastacktv/data-engineer-roadmap: Roadmap to becoming a data engineer in 2021
Roadmap to becoming a data engineer in 2021. Contribute to datastacktv/data-engineer-roadmap development by creating an account on GitHub.
Forwarded from DataEng
Amazon Redshift now lets you work with the database over HTTPS: https://aws.amazon.com/ru/about-aws/whats-new/2020/09/announcing-data-api-for-amazon-redshift/
Amazon
Announcing Data API for Amazon Redshift
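A rough sketch of what that could look like from Python, assuming boto3's `redshift-data` client and its `execute_statement`, `describe_statement`, and `get_statement_result` calls. The `run_query` helper and its polling interval are my own assumptions, not something from the announcement, and it only handles statements that return rows.

```python
import time
from typing import Any, Dict, List


def run_query(client: Any, sql: str, cluster_id: str, database: str,
              db_user: str, poll_seconds: float = 1.0) -> List[List[Dict[str, Any]]]:
    """Submit SQL through the Redshift Data API and wait for the rows.

    `client` is expected to be boto3.client("redshift-data"); it is passed
    in so the helper can be tried with a stub instead of a real cluster.
    """
    statement_id = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )["Id"]

    # The API is asynchronous: poll until the statement reaches a final state.
    while True:
        status = client.describe_statement(Id=statement_id)["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"statement {statement_id} ended with {status}")
        time.sleep(poll_seconds)

    # Only the first page of results is returned; NextToken is ignored here.
    return client.get_statement_result(Id=statement_id)["Records"]
```

With real credentials it would be called as `run_query(boto3.client("redshift-data"), "SELECT 1", "my-cluster", "dev", "awsuser")`, where the cluster, database, and user names are placeholders.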