This use case puts Apache Doris into perspective. I don't know many alternatives to ClickHouse as an open-source data warehouse, but it looks like Doris is one of them.
Medium
Apache Doris speeds up data reporting, tagging, and data lake analytics
As much as we say Apache Doris is an all-in-one data platform that is capable of various analytics workloads, it is always compelling to…
👍3🤔1
Data1984
If you want to learn more about this project, here is a review.
DZone
Introduction to Apache Doris: A Next-Generation Real-Time Data Warehouse
This is a technical overview of Apache Doris, introducing how it enables fast query performance with its architectural design, features, and mechanisms.
❤1
Arroyo is an open-source stream processing engine, enabling users to transform, filter, aggregate, and join their data streams in real-time with SQL queries. It's designed to be easy enough for any SQL user to build correct, reliable, and scalable streaming pipelines.
https://www.arroyo.dev/blog/why-arrow-and-datafusion
www.arroyo.dev
We built a new SQL Engine on Arrow and DataFusion
Arroyo 0.10 has an entirely new SQL engine built with Apache Arrow and DataFusion. It's much faster, smaller, and easier to run. Read on for why and how we're making this change.
❤1
Devin, the first AI software engineer 🤯
X (formerly Twitter)
Cognition (@cognition) on X
Today we're excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs…
Interesting projects from Microsoft for building LLM applications:
- AICI: Prompts as (Wasm) Programs
- AutoGen: A programming framework for agentic AI
- Semantic Kernel: Integrate cutting-edge LLM technology quickly and easily into your apps
GitHub
GitHub - microsoft/aici: AICI: Prompts as (Wasm) Programs
AICI: Prompts as (Wasm) Programs. Contribute to microsoft/aici development by creating an account on GitHub.
👍1
TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.
GitHub
GitHub - microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.
A code-first agent framework for seamlessly planning and executing data analytics tasks. - GitHub - microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data an...
The Past, Present and Future of Stream Processing - Kai Waehner
https://www.kai-waehner.de/blog/2024/03/20/the-past-present-and-future-of-stream-processing/
Kai Waehner
The Past, Present and Future of Stream Processing
Stream Processing Journey with IBM, Apama, TIBCO StreamBase, Kafka Streams, Apache Flink, Streaming Databases, GenAI and Apache Iceberg.
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
https://arxiv.org/html/2404.07544v1
❤1
Not sure if this is a dbt alternative or just an abstraction layer 🤔
https://www.malloydata.dev/
www.malloydata.dev
A modern open source language for analyzing, transforming, and modeling data.
It looks like Databricks is up to something. First it was Iceberg and now Hudi. It is not clear whether they are trying to converge on one open table format or just want to make sure Delta Lake becomes the standard. In my opinion, Iceberg has more potential to become that standard.
TechCrunch
Databricks acquires Tabular to build a common data lakehouse standard
Databricks has acquired Tabular, a data management startup, in its quest to build a common standard for data lakehouses.
❤1