Data1984

787 subscribers

44 photos

1 video

17 files

762 links

This channel is mostly about data-related topics, mainly #DataEngineering #SQL #Python #cloud.

Contact: @gorros

https://techcommunity.microsoft.com/t5/azure-data-explorer-blog/general-availability-adx-dashboards/ba-p/3749361

TECHCOMMUNITY.MICROSOFT.COM

General availability: ADX Dashboards

We are thrilled to announce the much-anticipated General Availability of ADX Dashboards!

379 views, 12:54

https://youtu.be/W_v05d_2RTo

8 Key Data Structures That Power Modern Databases

Weekly system design newsletter: https://bit.ly/3tfAlYD

Check out our bestselling System Design Interview books:
Volume 1: https://amzn.to/3Ou7gkd
Volume 2: https://amzn.to/3HqGozy

LSM tree video: https://www.youtube.com/watch?v=I6jB0nM9SKU

Other things…

664 views, edited 12:58

FirstMark | 2023 MAD (ML/AI/Data) Landscape
https://mad.firstmarkcap.com/

FirstMark | 2024 MAD (ML/AI/Data) Landscape

The 2024 MAD (ML/AI/Data) Landscape is the definitive market map of companies and products in machine learning, artificial intelligence and data, compiled by FirstMark.

397 views, 12:49

https://cloud.google.com/blog/products/data-analytics/building-streaming-data-pipelines/

Google Cloud Blog

Building streaming data pipelines on Google Cloud | Google Cloud Blog

This article reviews three approaches to building a streaming data pipeline on Google Cloud, using Pub/Sub and BigQuery.

448 views, 13:15

AWS Lambdas - Python vs Rust. Performance and Cost Savings. - Confessions of a Data Guy
https://www.confessionsofadataguy.com/aws-lambdas-python-vs-rust-performance-and-cost-savings/

Confessions of a Data Guy

Save money, save money!! Hear, hear! Someone on LinkedIn recently brought up the point that companies could save gobs of money by swapping out AWS Python lambdas for Rust ones. While it raised the ire of many a Python Data Engineer, I thought it sounded like…

👍2

436 views, 15:38

Guide to Partitions Calculation for Processing Data Files in Apache Spark - DZone
https://dzone.com/articles/guide-to-partitions-calculation-for-processing-dat

Guide to Partitions Calculation for Processing Data Files in Apache Spark

Get to know how Spark implicitly chooses the number of partitions when reading a set of data files into an RDD or a Dataset.

👍1

465 views, 13:26

Build a poor man’s data lake from scratch with DuckDB | Dagster Blog
https://dagster.io/blog/duckdb-data-lake

Build a Data Lake with DuckDB + Dagster

Use DuckDB, Python, and Dagster to build a lightweight data lake with SQL transforms and Parquet file support.

447 views, 16:35

https://aws.amazon.com/ru/blogs/publicsector/republic-of-armenias-ministry-high-tech-industry-aws-sign-memorandum-understanding-mou/

Republic of Armenia’s Ministry of High-Tech Industry and AWS sign Memorandum of Understanding (MoU) | Amazon Web Services

The Republic of Armenia’s Ministry of High-Tech Industry and AWS have signed a Memorandum of Understanding (MoU) with the aim of modernizing the technological infrastructure of the state and accelerating the adoption of cloud services in the public and the…

👍3 😁1

484 views, 10:35

Pandas 2.0 and its Ecosystem (Arrow, Polars, DuckDB) | Airbyte
https://airbyte.com/blog/pandas-2-0-ecosystem-arrow-polars-duckdb

Dive deeper into the power of Pandas and how leveraging it can benefit your organization. Explore a new way to work with data and unlock powerful insights!

850 views, 06:05

👍3

477 views, 15:58

Home - Apache Doris
https://doris.apache.org/

doris.apache.org

Apache Doris: Open source data warehouse for real time data analytics - Apache Doris

Apache Doris is an open-source database based on an MPP architecture, offering easier use and higher performance. As a modern data warehouse, Apache Doris powers your OLAP queries and database analytics.

429 views, 18:25

https://github.com/microsoft/semantic-kernel

GitHub - microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps

Integrate cutting-edge LLM technology quickly and easily into your apps - microsoft/semantic-kernel

393 views, 14:31

GitHub - MaterializeInc/datagen: Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.
https://github.com/MaterializeInc/datagen

Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. - MaterializeInc/datagen

452 views, 04:54

Welcome - Data With Rust
https://datawithrust.com/

424 views, 07:57

GitHub - eto-ai/lance: Modern columnar data format for ML implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming.
https://github.com/eto-ai/lance

GitHub - lancedb/lance: Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code…

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du...

👍1

469 views, 11:06

MotherDuck: Big Data is Dead
https://motherduck.com/blog/big-data-is-dead/

Big Data is Dead - MotherDuck Blog

Big data is dead. Long live easy data.

👍3

380 views, 18:40

Lightning fast aggregations by distributing DuckDB across AWS Lambda functions | by BoilingData.com | Medium
https://boilingdata.medium.com/lightning-fast-aggregations-by-distributing-duckdb-across-aws-lambda-functions-e4775931ab04

Lightning fast aggregations by distributing DuckDB across AWS Lambda functions

DuckDB is rapidly changing the way data scientists and engineers work. Its efficient and internally parallelised architecture means that a…

364 views, 05:01

https://vickiboykis.com/2021/06/06/the-humble-hash-aggregate/

★ Vicki Boykis ★

The humble hash aggregate

Data work has its own unique architecture we should be aware of

👍2

365 views, 16:54
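
For readers unfamiliar with the term in the post title above: a hash aggregate groups rows by key in a hash table and folds each row into a running per-key result in a single pass. The sketch below is only an illustration and is not taken from the linked article; the function name `hash_aggregate` and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def hash_aggregate(rows: Iterable[Tuple[Hashable, float]]) -> Dict[Hashable, float]:
    """Sum values per key in one pass, using a dict as the hash table."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for key, value in rows:
        # Hash the key to find (or create) its bucket, then update the running sum.
        totals[key] += value
    return dict(totals)


# Hypothetical sample data: (tag, count) pairs.
rows = [("sql", 2.0), ("python", 1.0), ("sql", 3.0)]
totals = hash_aggregate(rows)
print(totals)
```

The same shape (key, bucket, running state) is roughly what a SQL `GROUP BY` with `SUM` does when an engine picks a hash-based plan, assuming the groups fit in memory.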

Simplify Online Analytical Processing (OLAP) queries in Amazon Redshift using new SQL constructs such as ROLLUP, CUBE, and GROUPING SETS

Amazon Redshift is a fully managed, petabyte-scale, massively parallel data warehouse that makes it fast, simple, and cost-effective to analyze all your data using standard SQL and your existing business intelligence (BI) tools. We are continuously investing…

403 views, edited 20:18
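
As a rough illustration of what `ROLLUP` computes, the sketch below emulates `GROUP BY ROLLUP(region, product)` in plain Python rather than Redshift SQL: it sums a measure for every prefix of the key list, from the full key down to the grand total. The function name `rollup_sum`, the column names, and the sample rows are all hypothetical and do not come from the linked post.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Group = Tuple[Optional[object], ...]


def rollup_sum(rows: List[Row], keys: List[str], measure: str) -> Dict[Group, float]:
    """Emulate GROUP BY ROLLUP(keys): sum `measure` for every prefix of `keys`.

    Rolled-up columns are reported as None, much like NULL in SQL output.
    """
    totals: Dict[Group, float] = defaultdict(float)
    for row in rows:
        # depth = number of leading key columns kept; 0 is the grand total.
        for depth in range(len(keys), -1, -1):
            kept = tuple(row[k] for k in keys[:depth])
            group = kept + (None,) * (len(keys) - depth)
            totals[group] += float(row[measure])
    return dict(totals)


# Hypothetical sample data.
sales = [
    {"region": "EU", "product": "a", "amount": 10.0},
    {"region": "EU", "product": "b", "amount": 5.0},
    {"region": "US", "product": "a", "amount": 7.0},
]
result = rollup_sum(sales, ["region", "product"], "amount")
for group, total in sorted(result.items(), key=lambda kv: str(kv[0])):
    print(group, total)
```

`CUBE` would aggregate over every subset of the keys instead of only the prefixes, and `GROUPING SETS` over an explicit list of subsets. One simplification here: a genuine `None` in the data would be indistinguishable from a rolled-up column, which is the ambiguity SQL's `GROUPING` function exists to resolve.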

BigQuery under the hood: Behind the serverless storage and query optimizations that supercharge performance

Google Cloud Blog

Inside BigQuery’s storage and query optimizations | Google Cloud Blog

BigQuery’s serverless architecture features storage and query optimizations that deliver transformational data analytics performance.

459 views, edited 21:51

Build a real-time GDPR-aligned Apache Iceberg data lake.

Build a real-time GDPR-aligned Apache Iceberg data lake | Amazon Web Services

Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of a data lake design, data should be immutable once stored. But regulations such as the General Data Protection Regulation…

👍3

484 views, 08:16