DE & ML Digest
@mlbigdata
120 subscribers, 9 photos, 1 video, 34.7K links
Collection of all articles on Data Engineering and Machine Learning
Contact: @luminousmen
DE & ML Digest
Big Data
Explaining the mechanics of Spark caching
Blog | iamluminousmen
Explaining the mechanics of Spark caching
Caching... There is so much in that word - the pain of invalidation and the joy of reusing computation. In Spark, this is known as an optimization technique
DE & ML Digest
Big Data
HDFS vs Cloud-based Object storage(S3)
Blog | iamluminousmen
HDFS vs Cloud-based Object storage(S3)
I am very annoyed that all sorts of big data engineers confuse S3 and HDFS systems, assuming that S3 is the same as HDFS. That’s not true.
DE & ML Digest
Big Data
Get Hive count in seconds
DE & ML Digest
Big Data
What is Serverless Architecture and what are its benefits?
Blog | iamluminousmen
What is Serverless Architecture and what are its benefits?
There is so much hype around serverless architectures, but what do they really bring to the table for us? Is it the next standard in application development?
DE & ML Digest
Big Data
Spark tips. Caching
Blog | iamluminousmen
Spark Tips. Caching
Another batch of tips on using Apache Spark, this time about caching and checkpointing data.
DE & ML Digest
Big Data
Things to consider while running Google Cloud Dataproc
Blog | iamluminousmen
Things to consider while running Google Cloud Dataproc
There are many pitfalls that inexperienced engineers may encounter when building pipelines based on Cloud Dataproc. Let's look into them.
DE & ML Digest
Big Data
MLflow for Bayesian Experiment Tracking
Databricks
MLflow for Bayesian Experiment Tracking
Learn how to use MLflow for performing reproducible Bayesian experiments.
DE & ML Digest
Big Data
Introducing Apache Spark™ 3.2
Databricks
Introducing Apache Spark™ 3.2
Learn more about the latest release of Apache Spark™, version 3.2, including pandas API on Spark, Adaptive Query Execution, and ANSI mode, and how you can begin using it through Databricks Runtime 10.0.
DE & ML Digest
Big Data
GPU-accelerated Sentiment Analysis Using Pytorch and Huggingface on Databricks
Databricks
GPU-accelerated Sentiment Analysis Using Pytorch and Huggingface on Databricks
Learn more about the Pytorch-based GPU-accelerated sentiment analysis package from Huggingface and how it leverages the Databricks platform to simplify and scale the analysis of text for sentiment.
DE & ML Digest
Big Data
Management is not a promotion
Blog | iamluminousmen
Management is not a promotion
Leadership skills != subject matter expertise
DE & ML Digest
Big Data
Turning 2 Trillion Data Points of Traffic Intelligence into Critical Business Insights
Databricks
Simplifying Geospatial Data Analysis With Python Using Databricks
Learn how we optimized our workflow at Intelematics by creating a library to work with geospatial data using Python on the Databricks Lakehouse platform.
DE & ML Digest
Big Data
Moneyball 2.0: Real-time Decision Making With MLB’s Statcast Data
Databricks
Moneyball 2.0: Real-time Decision Making With MLB’s Statcast Data
The Oakland Athletic
DE & ML Digest
Big Data
New to Data Analysis- need advice
reddit
New to Data Analysis- need advice
22F here- working through my MBA while working full time as a senior purchasing specialist. My school (not an ivy league, but a good school...
DE & ML Digest
Big Data
Identity verification startup Incode raises $220M
VentureBeat
Identity verification startup Incode raises $220M
Incode, a company developing an AI-powered identity verification product, has raised $220 million in venture capital.
DE & ML Digest
Big Data
AI-powered customer engagement platform MoEngage raises $30M to grow operations
VentureBeat
AI-powered customer engagement platform MoEngage raises $30M to grow operations
MoEngage, a startup developing an AI-driven customer engagement platform, has closed a $30 million venture capital round.
DE & ML Digest
Big Data
[Translation] What is a feature store?
Habr
What is a feature store?
About the authors: Mike Del Balso, CEO and co-founder of Tecton; Willem Pienaar, creator of the Feast feature store. Data practitioners are gradually coming to understand that for...
DE & ML Digest
Big Data
VentureBeat
Dragonboat nabs $12M to help companies manage product development
Dragonboat, a startup developing a product management platform, has raised $12 million in venture capital funding.
DE & ML Digest
ML
PyCaret 2.3.5 Is Here! Learn What’s New
KDnuggets
PyCaret 2.3.5 Is Here! Learn What’s New - KDnuggets
Read about the new functionalities added in PyCaret’s recent release.
DE & ML Digest
ML
A Spreadsheet that Generates Python: The Mito JupyterLab Extension
KDnuggets
A Spreadsheet that Generates Python: The Mito JupyterLab Extension - KDnuggets
You can call Mito into your Jupyter Environment and each edit you make will generate the equivalent Python in the code cell below.
DE & ML Digest
ML
How to Build a Knowledge Graph with Neo4J and Transformers
KDnuggets
How to Build a Knowledge Graph with Neo4J and Transformers
Learn to use custom Named Entity Recognition and Relation Extraction models.
DE & ML Digest
ML
Sentiment Analysis with KNIME
KDnuggets
Sentiment Analysis with KNIME - KDnuggets
Check out this tutorial on how to approach sentiment classification with supervised machine learning algorithms.