Arroyo is an open-source stream processing engine, enabling users to transform, filter, aggregate, and join their data streams in real-time with SQL queries. It's designed to be easy enough for any SQL user to build correct, reliable, and scalable streaming pipelines.
https://www.arroyo.dev/blog/why-arrow-and-datafusion
www.arroyo.dev
We built a new SQL Engine on Arrow and DataFusion
Arroyo 0.10 has an entirely new SQL engine built with Apache Arrow and DataFusion. It's much faster, smaller, and easier to run. Read on for why and how we're making this change.
Devin, the first AI software engineer 🤯
X (formerly Twitter)
Cognition (@cognition) on X
Today we're excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs…
Interesting projects from Microsoft for building LLM applications:
- AICI: Prompts as (Wasm) Programs
- AutoGen: A programming framework for agentic AI
- Semantic Kernel: Integrate cutting-edge LLM technology quickly and easily into your apps
GitHub
GitHub - microsoft/aici: AICI: Prompts as (Wasm) Programs
AICI: Prompts as (Wasm) Programs. Contribute to microsoft/aici development by creating an account on GitHub.
TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.
GitHub
GitHub - microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.
A code-first agent framework for seamlessly planning and executing data analytics tasks. - GitHub - microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data an...
The Past, Present and Future of Stream Processing - Kai Waehner
https://www.kai-waehner.de/blog/2024/03/20/the-past-present-and-future-of-stream-processing/
Kai Waehner
The Past, Present and Future of Stream Processing
Stream Processing Journey with IBM, Apama, TIBCO StreamBase, Kafka Streams, Apache Flink, Streaming Databases, GenAI and Apache Iceberg.
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
https://arxiv.org/html/2404.07544v1
Not sure if this is a dbt alternative or just an abstraction layer 🤔
https://www.malloydata.dev/
www.malloydata.dev
A modern open source language for analyzing, transforming, and modeling data.
It looks like Databricks is up to something. First it was Iceberg and now Hudi. It is not clear whether they are trying to converge on one open table format or just want to make sure Delta Lake becomes the standard. In my opinion, Iceberg has more potential to become that standard.
TechCrunch
Databricks acquires Tabular to build a common data lakehouse standard
Databricks has acquired Tabular, a data management startup, in its quest to build a common standard for data lakehouses.
Snowflake infrastructure as code.
https://github.com/Titan-Systems/titan
GitHub
GitHub - Titan-Systems/titan: Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage…
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool f...