DE & ML Digest
@mlbigdata
120 subscribers, 9 photos, 1 video, 34.7K links
Collection of all articles on Data Engineering and Machine Learning
Contact: @luminousmen
ML
How to Train Time Series Forecasting Faster Using Ray, part 2 of 2
Medium
How to Train Time Series Forecasting Faster Using Ray, part 2 of 2
Time Series Forecasting using an LSTM version of RNN with PyTorch Forecasting and Torch Lightning
ML
Bring Your Childhood Drawings To Life Within Seconds— A Demo Of Meta’s Creative AI
Medium
Bring Your Childhood Drawings To Life Within Seconds— A Demo Of Meta’s Creative AI
Creative AI is on the rise. In a matter of literally just seconds, this brand-new tool by the Meta Demo Lab transforms a static drawing to…
ML
Top 5 Single-cell Genomics Papers of 2021
Medium
Top 5 Single-cell Genomics Papers of 2021
A look back at some exciting papers this year in the age of Big Data
ML
How to Structure your Data Science Notebook to be Easy to Follow
Medium
How to Structure your Data Science Notebook to be Easy to Follow
Clear steps to create an organized notebook, including examples
ML
Cutting Down Implementation Time by Integrating Jupyter and KNIME
KDnuggets
Cutting Down Implementation Time by Integrating Jupyter and KNIME - KDnuggets
Are you a KNIME fan or a Jupyter fan? Well, here you don’t have to choose.
ML
5 Things I Wish I Knew Before Becoming a Data Scientist
Medium
5 Key Things I Wish I Knew Before Becoming a Data Scientist
Things that I regret not doing earlier in my data science career.
Big Data
ACID vs BASE: Comparison of two Design Philosophies
Blog | iamluminousmen
ACID vs BASE: Comparison of two Design Philosophies
Discover the differences between ACID and BASE design philosophies - from strong consistency to eventual consistency. Find out which suits your project better!
Big Data
CAP and PACELC theorems in plain English
Blog | iamluminousmen
CAP and PACELC Theorems in Plain English
Understand the CAP and PACELC theorems in distributed systems. Learn how to navigate tradeoffs between consistency, availability, and partition tolerance for optimal system design.
Big Data
Architecturally Significant Requirements
Blog | iamluminousmen
Architecturally Significant Requirements
Discover the crucial Architecturally Significant Requirements (ASR) for distributed systems, including Availability, Durability, Resiliency, Reliability, and Scalability. Learn how these factors impact system design and performance.
Big Data
Explaining the mechanics of Spark caching
Blog | iamluminousmen
Explaining the mechanics of Spark caching
Caching... There is so much in that word - the pain of invalidation and the joy of reusing computation. In Spark, this is known as an optimization technique
Big Data
Machine Learning types
Blog | iamluminousmen
Machine Learning types
Machine Learning is based on the idea that analytic systems can learn to identify patterns and make decisions with minimal human involvement
Big Data
What is Serverless Architecture and what are its benefits?
Blog | iamluminousmen
What is Serverless Architecture and what are its benefits?
There is so much hype around serverless architectures, but what are they really bringing to the table for us? Is it the next standard in application development?
Big Data
Get Hive count in seconds
Big Data
Databricks’ Open Source Genomics Toolkit Outperforms Leading Tools
Databricks
How Glow Performs Genetic Association Studies 10x More Efficiently Than Hail
Learn more about Glow, the open-source toolkit for genomics data analytics that scales to population levels and the testing and benchmarking that shows it is up to 10x faster than competitors.
Big Data
Ray on Databricks
Databricks
How to Use Ray, a Distributed Python Framework, on Databricks
Learn how to use Ray on the Databricks Lakehouse Platform for reinforcement learning and custom distributed Python pipelines, enabling new use cases and optimizations.
Big Data
Things to consider while running Google Cloud Dataproc
Blog | iamluminousmen
Things to consider while running Google Cloud Dataproc
There are many pitfalls that inexperienced engineers may encounter when building pipelines based on Cloud Dataproc. Let's look into them.
Big Data
Data Stream Processing
dzone.com
Data Stream Processing - DZone Big Data
In this post, we'll explore unbounded data, what data stream processing is, its characteristics and workflow, and why we should leverage the cloud for it.
Big Data
Scaling With Presto on Spark
dzone.com
Scaling With Presto on Spark - DZone Big Data
Presto on Spark enables more use cases for data analytics, providing a unified SQL experience for both interactive and batch use cases.
Big Data
Spark tips. Caching
Blog | iamluminousmen
Spark Tips. Caching
Another batch of tips on Apache Spark usage, this time about caching and checkpointing data.
Big Data
HDFS vs Cloud-based Object storage(S3)
Blog | iamluminousmen
HDFS vs Cloud-based Object storage(S3)
I am very annoyed that all sorts of big data engineers confuse S3 and HDFS systems, assuming that S3 is the same as HDFS. That’s not true.
Big Data
Announcing Amazon SageMaker Ground Truth Plus – Create Training Datasets Without Code or In-house Resources
Amazon
Announcing Amazon SageMaker Ground Truth Plus – Create Training Datasets Without Code or In-house Resources | Amazon Web Services
Today, we’re pleased to announce the latest service in the Amazon SageMaker suite that will make labeling datasets easier than ever before. Ground Truth Plus is a turn-key service that uses an expert workforce to deliver high-quality training datasets fast…