[Book] Algorithms for decision making
https://algorithmsbook.com/?fbclid=IwAR0u0ETrO--9ECKroSuEzeputjRN2gcljFXZwSlzc9vZN0TKCD7BxZ_3D2I
[Article][Kubernetes]
NetworkPolicy Editor: Create, Visualize, and Share Kubernetes NetworkPolicies
https://cilium.io/blog/2021/02/10/network-policy-editor
NetworkPolicy Editor simplifies building secure Kubernetes network policies, addressing the steep learning curve from YAML syntax to ...
[MongoDB] Top 5 features of MongoDB
https://www.percona.com/blog/2021/02/15/top-5-features-developers-love-about-mongodb
Percona Database Performance Blog
Top 5 Features Developers Love About MongoDB - Percona Database Performance Blog
In this blog post, we discuss the top five features developers love about MongoDB, and what it does better than anyone else.
[DB] Twitter link: https://twitter.com/andy_pavlo/status/1366203417304121348?s=21
Twitter
Andy Pavlo
Vaccination DB Talk #3 - @KishoreBytes presents an overview of @ApachePinot and their custom star-tree index for JSON documents: https://t.co/1e3LNiOk37 This talk gave me a better understanding of how Pinot fits into the modern DBMS landscape.
[Data pipelines] Components of Modern Data Pipelines
https://softwareengineeringdaily.com/2020/04/30/components-of-modern-data-pipelines/
Software Engineering Daily
Components of Modern Data Pipelines - Software Engineering Daily
Figure 1 Data flows to and from systems through data pipelines. The motivations for data pipelines include the decoupling of systems, avoidance of performance hits where the data is being captured, and the ability to combine data from different systems. Pipelines…
[Article][Software testing]
Perception and Practices of Differential Testing
https://people.cs.vt.edu/~gulzar/assets/pdf/p71-gulzar.pdf
[Article] How to Tune RocksDB for Your Kafka Streams Application
https://cnfl.io/how-to-tune-rocksdb-kafka-streams-state-stores-performance-blog
Confluent
Performance Tuning RocksDB for Kafka Streams’ State Stores
In this guide, learn how RocksDB and Kafka Streams work, how to improve single node performance, easily identify setup issues, and operate state stores in a more robust manner.
[Book][ML]
Probabilistic Machine Learning: An Introduction
by Kevin Patrick Murphy.
https://probml.github.io/pml-book/book1.html
[Article][Event streaming]
Capturing Every Change From Shopify’s Sharded Monolith
https://shopify.engineering/capturing-every-change-shopify-sharded-monolith
Shopify
Capturing Every Change From Shopify’s Sharded Monolith - Shopify
Shopify’s data warehouse has gone through many iterations since the company's founding in 2004. Since then, the data warehouse has evolved and grown into a data lake, comprising multiple storage mechanisms, systems, and consumers.
[Kappa architecture]
Kappa Architecture is a software architecture pattern. Rather than using a relational (SQL) database or a key-value store like Cassandra, the canonical data store in a Kappa Architecture system is an append-only immutable log. From the log, data is streamed through a computational system and fed into auxiliary stores for serving.
Kappa Architecture is a simplification of Lambda Architecture. A Kappa Architecture system is like a Lambda Architecture system with the batch processing system removed. To replace batch processing, data is simply fed through the streaming system quickly.
Repository dedicated to Kappa Architecture.
http://milinda.pathirage.org/kappa-architecture.com/
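A rough sketch of the idea described above, in case it helps: one append-only log as the canonical store, and serving views that are derived only by streaming the log through a function. Every name here (`Event`, `AppendOnlyLog`, `build_view`, the two fold functions) is made up for illustration and does not come from the linked site or from any streaming framework; a real system would use a durable log such as Kafka rather than an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List


@dataclass(frozen=True)
class Event:
    """Immutable record stored in the log (hypothetical schema)."""
    key: str
    amount: int


class AppendOnlyLog:
    """Canonical store: records are appended, never updated or deleted."""

    def __init__(self) -> None:
        self._records: List[Event] = []

    def append(self, event: Event) -> int:
        self._records.append(event)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        # Iterate over a snapshot of the log starting at the given offset.
        return iter(self._records[offset:])


Fold = Callable[[Dict[str, int], Event], None]


def build_view(log: AppendOnlyLog, fold: Fold) -> Dict[str, int]:
    """Derive an auxiliary serving view by replaying the whole log."""
    view: Dict[str, int] = defaultdict(int)
    for event in log.read_from(0):
        fold(view, event)
    return dict(view)


def sum_per_key(view: Dict[str, int], event: Event) -> None:
    view[event.key] += event.amount


def count_per_key(view: Dict[str, int], event: Event) -> None:
    view[event.key] += 1


log = AppendOnlyLog()
for e in [Event("a", 5), Event("b", 2), Event("a", 3)]:
    log.append(e)

# Two different serving views, both computed from the same log.
totals = build_view(log, sum_per_key)
counts = build_view(log, count_per_key)
print(totals, counts)
```

Changing the processing logic means replaying the log with a new function (here, swapping `sum_per_key` for `count_per_key`), which is how the pattern gets by without a separate batch layer.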
[Paper][Data lake]
Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores
Delta Lake is an open source ACID table storage layer over cloud object stores initially developed at Databricks. Delta Lake uses a transaction log that is compacted into Apache Parquet format to provide ACID properties, time travel, and significantly faster metadata operations for large tabular datasets (e.g., the ability to quickly search billions of table partitions for those relevant to a query).
https://databricks.com/wp-content/uploads/2020/08/p975-armbrust.pdf
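A toy sketch of the transaction-log idea in the abstract above, under loud assumptions: this is not Delta Lake's API or on-disk format. `TableLog`, the `("add" | "remove", file_name)` action tuples, and the in-memory checkpoint dictionary are all hypothetical stand-ins. It only illustrates how an ordered log of commits can give time travel (read the table as of any version) and how a compacted checkpoint can shorten the replay.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical action format: ("add" | "remove", data_file_name)
Action = Tuple[str, str]


class TableLog:
    """Ordered list of commits; each commit adds or removes data files."""

    def __init__(self) -> None:
        self.commits: List[List[Action]] = []
        # version -> set of live files at that version (compacted state)
        self.checkpoints: Dict[int, Set[str]] = {}

    def commit(self, actions: List[Action]) -> int:
        """Append one atomic commit and return its version number."""
        self.commits.append(list(actions))
        return len(self.commits) - 1

    def checkpoint(self, version: int) -> None:
        """Store the full table state at `version` so later reads can skip older commits."""
        self.checkpoints[version] = self.snapshot(version)

    def snapshot(self, version: int) -> Set[str]:
        """Return the set of live data files as of `version` (time travel)."""
        if not 0 <= version < len(self.commits):
            raise ValueError(f"unknown version {version}")
        usable = [v for v in self.checkpoints if v <= version]
        if usable:
            start = max(usable)
            files = set(self.checkpoints[start])
            first = start + 1
        else:
            files = set()
            first = 0
        # Replay only the commits after the nearest checkpoint.
        for actions in self.commits[first:version + 1]:
            for op, name in actions:
                if op == "add":
                    files.add(name)
                else:
                    files.discard(name)
        return files


table = TableLog()
v0 = table.commit([("add", "part-0.parquet"), ("add", "part-1.parquet")])
v1 = table.commit([("remove", "part-0.parquet"), ("add", "part-2.parquet")])
table.checkpoint(v1)
v2 = table.commit([("add", "part-3.parquet")])

old_files = table.snapshot(v0)      # table as of the first commit
current_files = table.snapshot(v2)  # latest state, replayed from the v1 checkpoint
print(sorted(old_files), sorted(current_files))
```

Reading `v2` here replays one commit on top of the `v1` checkpoint instead of the whole history, which is the toy analogue of the faster metadata operations the abstract mentions.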
[Article]
READING GROUP. PROTEAN: VM ALLOCATION SERVICE AT SCALE
This paper from Microsoft is full of technical insights into how they operate their datacenters/regions at scale. In particular, the paper discusses one of the fundamental components of any cloud provider — the VM service. The system, called Protean, is an allocation service that handles VM allocation requests
http://charap.co/reading-group-protean-vm-allocation-service-at-scale/
[Article]
Federated Quantum Machine Learning
Distributed training across several quantum computers could significantly improve the training
time and if we could share the learned model, not the data, it could potentially improve the
data privacy as the training would happen where the data is located. However, to the best of
our knowledge, no work has been done in quantum machine learning (QML) in federation setting
yet. In this work, we present the federated training on hybrid quantum-classical machine learning
models although our framework could be generalized to pure quantum machine learning model…
https://arxiv.org/pdf/2103.12010v1.pdf