NEW BOT Телеграм, страница - 808693872

Data1984

787 subscribers

44 photos

1 video

17 files

762 links

This channel is mostly about data related stuff, some of the main topics are #DataEngineering #SQL #Python #cloud .

Contact: @gorros

Download Telegram

About

Blog

Apps

Platform

787 subscribers

Build a poor man’s data lake from scratch with DuckDB | Dagster Blog
https://dagster.io/blog/duckdb-data-lake

Build a Data Lake with DuckDB + Dagster

Use DuckDB, Python, and Dagster to build a lightweight data lake with SQL transforms and Parquet file support.

447 views16:35

https://aws.amazon.com/ru/blogs/publicsector/republic-of-armenias-ministry-high-tech-industry-aws-sign-memorandum-understanding-mou/

Republic of Armenia’s Ministry of High-Tech Industry and AWS sign Memorandum of Understanding (MoU) | Amazon Web Services

The Republic of Armenia’s Ministry of High-Tech Industry and AWS have signed a Memorandum of Understanding (MoU) with the aim of modernizing the technological infrastructure of the state and accelerating the adoption of cloud services in the public and the…

👍3😁1

484 views10:35

Pandas 2.0 and its Ecosystem (Arrow, Polars, DuckDB) | Airbyte
https://airbyte.com/blog/pandas-2-0-ecosystem-arrow-polars-duckdb

Pandas 2.0 and its Ecosystem (Arrow, Polars, DuckDB) | Airbyte

Dive deeper into the power of Pandas and how leveraging it can benefit your organization. Explore a new way to work with data and unlock powerful insights!

850 views06:05

👍3

477 views15:58

Home - Apache Doris
https://doris.apache.org/

doris.apache.org

Apache Doris: Open source data warehouse for real time data analytics - Apache Doris

Apache Doris is an open-source database based on MPP architecture,with easier use and higher performance. As a modern data warehouse, apache doris empowers your Olap query and database analytics.

429 views18:25

https://github.com/microsoft/semantic-kernel

GitHub - microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps

Integrate cutting-edge LLM technology quickly and easily into your apps - microsoft/semantic-kernel

393 views14:31

GitHub - MaterializeInc/datagen: Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.
https://github.com/MaterializeInc/datagen

GitHub - MaterializeInc/datagen: Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka…

Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. - MaterializeInc/datagen

452 views04:54

Welcome - Data With Rust
https://datawithrust.com/

424 views07:57

GitHub - eto-ai/lance: Modern columnar data format for ML implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming..
https://github.com/eto-ai/lance

GitHub - lancedb/lance: Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code…

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du...

👍1

469 views11:06

MotherDuck: Big Data is Dead
https://motherduck.com/blog/big-data-is-dead/

Big Data is Dead - MotherDuck Blog

Big data is dead. Long live easy data.

👍3

380 views18:40

Lightning fast aggregations by distributing DuckDB across AWS Lambda functions | by BoilingData.com | Medium
https://boilingdata.medium.com/lightning-fast-aggregations-by-distributing-duckdb-across-aws-lambda-functions-e4775931ab04

Lightning fast aggregations by distributing DuckDB across AWS Lambda functions

DuckDB is rapidly changing the way data scientists and engineers work. It’s efficient and internally parallelised architecture means that a…

364 views05:01

https://vickiboykis.com/2021/06/06/the-humble-hash-aggregate/

★ Vicki Boykis ★

The humble hash aggregate

Data work has its own unique architecture we should be aware of

👍2

365 views16:54

Simplify Online Analytical Processing (OLAP) queries in Amazon Redshift using new SQL constructs such as ROLLUP, CUBE, and GROUPING SETS

Simplify Online Analytical Processing (OLAP) queries in Amazon Redshift using new SQL constructs such as ROLLUP, CUBE, and GROUPING…

Amazon Redshift is a fully managed, petabyte-scale, massively parallel data warehouse that makes it fast, simple, and cost-effective to analyze all your data using standard SQL and your existing business intelligence (BI) tools. We are continuously investing…

403 viewsedited 20:18

BigQuery under the hood: Behind the serverless storage and query optimizations that supercharge performance

Google Cloud Blog

Inside BigQuery’s storage and query optimizations | Google Cloud Blog

BigQuery’s serverless architecture features storage and query optimizations that deliver transformational data analytics performance.

459 viewsedited 21:51

Build a real-time GDPR-aligned Apache Iceberg data lake.

Build a real-time GDPR-aligned Apache Iceberg data lake | Amazon Web Services

Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of a data lake design, data should be immutable once stored. But regulations such as the General Data Protection Regulation…

👍3

484 views08:16

Implement slowly changing dimensions in a data lake using AWS Glue and Delta | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/implement-slowly-changing-dimensions-in-a-data-lake-using-aws-glue-and-delta/

Implement slowly changing dimensions in a data lake using AWS Glue and Delta | Amazon Web Services

In a data warehouse, a dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. To illustrate an example, in a typical sales domain, customer, time or product are dimensions and sales transactions…

440 views18:06

https://github.com/Azure/azure-data-labs-modules

GitHub - Azure/azure-data-labs-modules: A list of Terraform modules to build your Azure Data IaC templates.

A list of Terraform modules to build your Azure Data IaC templates. - Azure/azure-data-labs-modules

437 views18:17

https://towardsdatascience.com/anatomy-of-sql-window-functions-7256d8cf509a

Anatomy of SQL Window Functions

Back To Basics | SQL fundamentals for beginners

450 views05:53

https://www.linkedin.com/posts/mhalkjaer_azure-data-engineering-dp-203-notes-activity-7048574041762226176-X5zE?utm_source=share&utm_medium=member_android

Mathias Halkjær Petersen on LinkedIn: Azure Data Engineering (DP-203) Notes | 26 comments

𝗪𝗮𝗻’𝘁 𝘁𝗼 𝗯𝗲𝗰𝗼𝗺𝗲 𝗮𝗻 𝗔𝘇𝘂𝗿𝗲 𝗗𝗮𝘁𝗮 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿?

Check out this free resource! 🔥

The DP-203 exam is an excellent step in the journey… | 26 comments on LinkedIn

👍2

456 views12:07

Azure Data Engineering (DP-203) Notes.pdf

👍4

490 views12:08

Welcome to Marvin - Marvin
https://www.askmarvin.ai/

Marvin - Marvin

A powerful framework for building AI applications

519 views19:53