Data1984 – Telegram
This channel is mostly about data-related topics. The main ones are #DataEngineering #SQL #Python #cloud.

Contact: @gorros
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

https://arxiv.org/html/2404.07544v1
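
To make the idea in the title concrete, here is a minimal sketch of how numeric in-context examples could be laid out as a text prompt for an LLM. The "Feature i / Output" layout, the function name build_regression_prompt, and the toy data are my own assumptions for illustration, not something taken from the paper, and no model is called here.

```python
def build_regression_prompt(examples, query):
    """Format (features, target) pairs as in-context examples, ending with an open query."""
    blocks = []
    for features, target in examples:
        lines = [f"Feature {i}: {value}" for i, value in enumerate(features)]
        lines.append(f"Output: {target}")
        blocks.append("\n".join(lines))
    # The query has the same layout, but its output is left blank for the model to complete.
    query_lines = [f"Feature {i}: {value}" for i, value in enumerate(query)]
    query_lines.append("Output:")
    blocks.append("\n".join(query_lines))
    return "\n\n".join(blocks)


# Toy data (hypothetical), generated from y = x0 + 2 * x1.
examples = [((1.0, 2.0), 5.0), ((0.5, -1.0), -1.5)]
prompt = build_regression_prompt(examples, (2.0, 1.0))
print(prompt)
```

The resulting string would be sent to whatever model you use, and the number it writes after the final "Output:" is read back as the prediction.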
Not sure if this is a dbt alternative or just an abstraction layer 🤔
https://www.malloydata.dev/
DataFrames at Scale Comparison: TPC-H

https://docs.coiled.io/blog/tpch.html
It looks like Databricks is up to something. First it was Iceberg and now Hudi. It is not clear whether they are trying to converge on one open table format or just want to make sure Delta Lake becomes the one. In my opinion, Iceberg has more potential to become the one.
Build Data Products and a Data Mesh with dbt Cloud: A tutorial from Snowflake. This is very similar to a project I am currently working on.
AutoMQ is a cloud-first alternative to Kafka that offloads durability to S3 and EBS. It claims to be 10x more cost-effective, to autoscale in seconds, and to deliver single-digit ms latency.
A nice example of a hybrid solution with Kafka
Source and detailed analysis: https://x.com/BdKozlovski/status/1842912763607142579
I would also add Azure Data Explorer (Kusto) to the list. However, ADX is not open-source.
https://startree.ai/resources/a-tale-of-three-real-time-olap-databases
While US markets are panicking, you can try playing with DeepSeek by installing it locally or by using Cursor, where it is already available.
https://dev.to/lunaticprogrammer/using-deepseek-r1-in-visual-studio-code-for-free-2279