The Past, Present and Future of Stream Processing - Kai Waehner
https://www.kai-waehner.de/blog/2024/03/20/the-past-present-and-future-of-stream-processing/
Kai Waehner
The Past, Present and Future of Stream Processing
Stream Processing Journey with IBM, Apama, TIBCO StreamBase, Kafka Streams, Apache Flink, Streaming Databases, GenAI and Apache Iceberg.
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
https://arxiv.org/html/2404.07544v1
Not sure if this is a dbt alternative or just an abstraction layer 🤔
https://www.malloydata.dev/
www.malloydata.dev
A modern open source language for analyzing, transforming, and modeling data.
It looks like Databricks is up to something. First it was Iceberg and now Hudi. It is not clear whether they are trying to converge on one open table format or just want to make sure Delta Lake becomes the standard. In my opinion, Iceberg has more potential to become that standard.
TechCrunch
Databricks acquires Tabular to build a common data lakehouse standard
Databricks has acquired Tabular, a data management startup, in its quest to build a common standard for data lakehouses.
Snowflake infrastructure as code.
https://github.com/Titan-Systems/titan
GitHub
GitHub - Titan-Systems/titan: Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage…
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool f...
AutoMQ is a cloud-first alternative to Kafka that offloads durability to S3 and EBS. It claims to be 10x more cost-effective, to autoscale in seconds, and to deliver single-digit ms latency.
Medium
How do we run Kafka 100% on the object storage?
Let’s see how AutoMQ makes this dream come true.
A nice example of a hybrid solution with Kafka
Source and detailed analysis: https://x.com/BdKozlovski/status/1842912763607142579
AWS announces Amazon Nova, a new generation of state-of-the-art foundation models (FMs) that deliver frontier intelligence and industry leading price performance, available exclusively in Amazon Bedrock.
https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
Amazon
Introducing Amazon Nova foundation models: Frontier intelligence and industry leading price performance | Amazon Web Services
Amazon Nova foundation models deliver frontier intelligence and industry leading price-performance, with support for text and multimodal intelligence, multimodal fine-tuning, and high-quality images and videos.
Introducing queryable object metadata for Amazon S3 buckets (preview) | AWS News Blog
https://aws.amazon.com/blogs/aws/introducing-queryable-object-metadata-for-amazon-s3-buckets-preview/
Amazon
Introducing queryable object metadata for Amazon S3 buckets (preview) | Amazon Web Services
Unlock S3 data insights effortlessly with AWS' rich metadata capture; query objects by key, size, tags, and more using Athena, Redshift, and Spark at scale.
New Amazon S3 Tables: Storage optimized for analytics workloads | AWS News Blog
https://aws.amazon.com/blogs/aws/new-amazon-s3-tables-storage-optimized-for-analytics-workloads/
Amazon
New Amazon S3 Tables: Storage optimized for analytics workloads | Amazon Web Services
Amazon S3 Tables optimize tabular data storage (like transactions and sensor readings) in Apache Iceberg, enabling high-performance, low-cost queries using Athena, EMR, and Spark.
I would also add Azure Data Explorer (Kusto) to the list. However, ADX is not open-source.
https://startree.ai/resources/a-tale-of-three-real-time-olap-databases
While US markets are panicking, you can try DeepSeek by installing it locally or by using Cursor, where it is already available.
https://dev.to/lunaticprogrammer/using-deepseek-r1-in-visual-studio-code-for-free-2279