Data1984

A very useful and relevant blog post about data deletion in a data lake. Besides the suggested solution, I would also like to mention Delta Lake as an alternative. Finally, it would have been great if the author had mentioned cost considerations.
https://aws.amazon.com/blogs/big-data/how-to-delete-user-data-in-an-aws-data-lake/

Amazon

How to delete user data in an AWS data lake | Amazon Web Services

General Data Protection Regulation (GDPR) is an important aspect of today’s technology world, and processing data in compliance with GDPR is a necessity for those who implement solutions within the AWS public cloud. One article of GDPR is the “right to erasure”…

342 views19:19

🤩🤔

Data1984

Amazing statistics about data.
https://www.datanami.com/2020/09/04/10-big-data-statistics-that-will-blow-your-mind/?utm_source=rss&utm_medium=rss&utm_campaign=10-big-data-statistics-that-will-blow-your-mind

Datanami

10 Big Data Statistics That Will Blow Your Mind

They call it “big data” for a reason--it's really, really big. But getting your head wrapped around the growth of information digitization is not easy.

378 views07:00

🤩 2 🤔

Data1984

https://netflixtechblog.com/analytics-at-netflix-who-we-are-and-what-we-do-7d9c08fe6965?source=rss----2615bd06b42e---4

Medium

Analytics at Netflix: Who We Are and What We Do

An Introduction to Analytics and Visualization Engineering at Netflix

487 views07:00

🤩 1 🤔

Data1984

20x improvement compared to #Spark 2.4
https://techcommunity.microsoft.com/t5/azure-databricks/turbocharge-azure-databricks-with-photon-powered-delta-engine/ba-p/1694929

TECHCOMMUNITY.MICROSOFT.COM

Turbocharge Azure Databricks with Photon powered Delta Engine

Today we are excited to announce the preview of Photon powered Delta Engine on Azure Databricks – fast, easy, and collaborative Analytics and AI service. Built from scratch in C++ and fully compatible with Spark APIs, Photon is a vectorized query engine that…

556 views07:00

🤩 3 🤔

Data1984

A great thread on cool Python 3 features.
https://twitter.com/svpino/status/1308632185113579522?s=19

Twitter

Santiago 🎃

Are you taking full advantage of Python 3? Are you sure? Here are 10 Python 3 features that will change the way you are writing code today. 🧵👇

1.44K views07:00

🤩 1 🤔 1

Data1984

Most of the subscribers know why I've paused posting in the channel. I think most of you are busy now with other important issues. So I would like to create a poll to ask you whether you would like to see new posts or not yet. Thank you for understanding.

Anonymous Poll

54 voters467 views09:53

Data1984

🧵A curated list of Ultimate Python resources! 🧵

https://twitter.com/ayushi7rawat/status/1315651868891049984?s=19

Twitter

Ayushi Rawat 🐍

🧵A curated list of Ultimate Python resources! 🧵 (Getting started with #Python or a senior Python developer, you wouldn't wanna miss this) 😄 A thread. 🧵👇

362 views07:00

Data1984

Redshift now supports scheduled queries
https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-redshift-supports-scheduling-sql-queries-by-integrating-with-amazon-eventbridge/

Amazon

Amazon Redshift now supports the scheduling of SQL queries by integrating with Amazon EventBridge

344 views07:26

Data1984

New features in Python 3.9.
https://twitter.com/svpino/status/1313202874487312387?s=19

Twitter

Santiago 🎃

Python 3.9 🐍 is out! 🥳 Here are the 5 new features you care about. 🧵👇

324 views07:00
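
The thread's own five picks are not reproduced here. As a rough illustration only, the sketch below shows three additions documented for Python 3.9 (PEP 584, PEP 616, PEP 585); all variable and function names are made up for the example.

```python
# Three Python 3.9 additions (PEP 584, PEP 616, PEP 585).
# All names below are hypothetical and exist only for illustration.
defaults = {"host": "localhost", "port": 5439}
overrides = {"port": 5440, "user": "analyst"}

# PEP 584: "|" merges two dicts; the right-hand operand wins on duplicate keys.
settings = defaults | overrides

# PEP 616: strip a known prefix or suffix without slicing by hand.
table = "stg_events.parquet".removeprefix("stg_").removesuffix(".parquet")


# PEP 585: built-in collections can be used as generic types in annotations.
def column_lengths(names: list[str]) -> dict[str, int]:
    return {name: len(name) for name in names}
```

These run only on Python 3.9 or later; on older interpreters the `|` merge and `removeprefix` calls raise errors.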

Data1984

Whould you like to have comments in the channel?

Anonymous Poll

73%

Yes

27%

44 voters321 views18:06

Data1984

* Would

324 views18:16

Data1984

#AWS released an open-source Python connector for Redshift that supports the Python Database API Specification v2.0. By the way, the Redshift Data API was also announced recently.
https://github.com/aws/amazon-redshift-python-driver

GitHub

GitHub - aws/amazon-redshift-python-driver: Redshift Python Connector. It supports Python Database API Specification v2.0.


315 views07:55
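
Since the repository describes the connector as implementing the Python Database API Specification v2.0, code written against it should follow the usual connect / cursor / execute / fetch pattern. The sketch below shows that pattern only; it deliberately uses the standard library's `sqlite3` as a stand-in driver so it runs without a cluster, and the `fetch_rows` helper and `events` table are hypothetical. Connection arguments and the parameter placeholder style differ between drivers, so check the connector's README before adapting it.

```python
import sqlite3  # stand-in DB-API 2.0 driver, NOT the Redshift connector


def fetch_rows(conn, sql, params=()):
    """Run a query through the DB-API 2.0 cursor interface and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        cursor.close()


# Hypothetical in-memory table used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
setup = conn.cursor()
setup.execute("CREATE TABLE events (id INTEGER, name TEXT)")
setup.executemany("INSERT INTO events VALUES (?, ?)", [(1, "signup"), (2, "login")])
conn.commit()

# sqlite3 uses "?" placeholders; other DB-API drivers may use a different style.
rows = fetch_rows(conn, "SELECT id, name FROM events WHERE id >= ? ORDER BY id", (1,))
```

Because `fetch_rows` touches only the connection and cursor methods defined by the specification, the same helper should work with any compliant driver once the placeholder style is adjusted.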

Data1984

Data1984 pinned «Whould you like to have comments in the channel?»

07:55

Data1984

It seems that #AWS is improving #Redshift on a weekly basis. Here is another cool feature.
https://aws.amazon.com/about-aws/whats-new/2020/11/amazon-redshift-announces-automatic-refresh-and-query-rewrite-for-materialized-views/

Amazon Web Services, Inc.

Amazon Redshift announces automatic refresh and query rewrite for materialized views

332 views07:56

Data1984

A comparison of data version control tools.
https://dagshub.com/blog/data-version-control-tools/

DagsHub Blog

Comparing Data Version Control Tools - 2020

Data versioning is one of the keys to automating a team's machine learning model development. While it can be very complicated if your team attempts to develop its own system to manage the process, this doesn’t need to be the case.

1.1K views07:00

Data1984

A short series of articles from Lyft about the gevent #Python library.
https://eng.lyft.com/what-the-heck-is-gevent-4e87db98a8
https://eng.lyft.com/gevent-part-2-correctness-22e3b7998382
https://eng.lyft.com/gevent-part-3-performance-e64303fa102b
https://eng.lyft.com/applying-gevent-learnings-to-deliver-value-to-users-part-4-of-4-36ad932deea8

Medium

What the heck is gevent?

Overview

326 views07:00

Data1984

Introduction to Apache Pinot, a real-time distributed OLAP datastore from LinkedIn and Uber
https://docs.pinot.apache.org/

docs.pinot.apache.org

Introduction | Apache Pinot Docs

Apache Pinot is a real-time distributed OLAP datastore purpose-built for low-latency, high-throughput analytics, and perfect for user-facing analytical workloads.

322 views07:00

Data1984

Some important updates from #AWS:
✅ Amazon Kinesis Data Streams enables data stream retention up to one year.
✅ Now you can export your Amazon DynamoDB table data to your data lake in Amazon S3 to perform analytics at any scale.
✅ Amazon Redshift now supports modifying column compression encodings to optimize storage utilization and query performance
✅ Amazon Athena announces availability of engine version 2

Amazon

Amazon Kinesis Data Streams enables data stream retention up to one year

1.06K views07:00

Data1984

➡️ Discover the new syntax for implicits in #Scala 3.

➡️ Learn how to express extension methods, implicit parameters, implicit conversions, and typeclasses in #Scala 3!

https://t.co/BYFnTVc3yh

www.scala-lang.org

Explicit term inference with Scala 3

393 views07:00

Data1984

#AWS updates:
✅ Amazon EMR now provides up to 35% lower cost and up to 15% improved performance for Spark workloads on Graviton2-based instances
✅ AWS Glue Streaming ETL jobs support reading records in the Apache Avro format
✅ Control the evolution of data streams using the AWS Glue Schema Registry

Amazon

Amazon EMR now provides up to 35% lower cost and up to 15% improved performance for Spark workloads on Graviton2-based instances

316 views07:00

Data1984

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
https://github.com/donnemartin/system-design-primer

GitHub

GitHub - donnemartin/system-design-primer: Learn how to design large-scale systems. Prep for the system design interview. Includes…


331 views07:00
