DE & ML Digest
@mlbigdata
120 subscribers · 9 photos · 1 video · 34.7K links
Collection of all articles on Data Engineering and Machine Learning
Contact: @luminousmen
Big Data
Ray on Databricks
Databricks
How to Use Ray, a Distributed Python Framework, on Databricks
Learn how to use Ray on the Databricks Lakehouse Platform for reinforcement learning and custom distributed Python pipelines, opening up new use cases and optimizations.
Big Data
Databricks’ Open Source Genomics Toolkit Outperforms Leading Tools
Databricks
How Glow Performs Genetic Association Studies 10x More Efficiently Than Hail
Learn more about Glow, the open-source toolkit for genomics data analytics that scales to population levels and the testing and benchmarking that shows it is up to 10x faster than competitors.
Big Data
Thinking of Models as Graphs
dzone.com
Thinking of Models as Graphs - DZone Big Data
This tutorial shows the first step in big data visualization analysis: ingesting your data! Read more to find out how to do it yourself!
Big Data
Use Cases for Apache Kafka in the Public Sector
dzone.com
Use Cases for Apache Kafka in the Public Sector - DZone Big Data
This blog series is about data in motion with Apache Kafka in the Public Sector, including smart city, citizen services, energy, and national security.
Big Data
How Technical Operations Can Build on the Success of Data Science Notebooks
dzone.com
How Technical Operations Can Build on the Success of Data Science Notebooks - DZone Big Data
This article discusses data science notebooks, a popular document format for publishing code, results, and more. Find out more below!
Big Data
[Translation] Ways to Ensure Data Quality for Machine Learning
Habr
Ways to Ensure Data Quality for Machine Learning
Data is the soul of every machine learning model. In this article, we explain why the world's best machine learning teams spend more than 80% of their time improving...
Big Data
Building an Internal Metrics Dashboard
dzone.com
Building an Internal Metrics Dashboard - DZone Big Data
After you're done reading this, you'll be blown away by how simple it is. I'll have a dashboard ready in less than 10 minutes. Yes, this is what you'll get!
Big Data
Let's Take a Different Path: How the Direction of Data Flows Is Changing Right Now
Habr
Let's Take a Different Path: How the Direction of Data Flows Is Changing Right Now
Modern business cannot do without a constant influx of fresh information. But obtaining information is not enough; it also has to be processed and analyzed. Moreover, this must be done in the most...
Big Data
How Business Intelligence "Swims" in Data Lakes: The Foresight Platform in Practice
Habr
How Business Intelligence "Swims" in Data Lakes: The Foresight Platform in Practice
Part 1. Heterogeneous ROLAP cube technology. Hello everyone. In this post we begin to describe how our BI platform Foresight works with data. How the platform's interaction with the DBMS is organized...
Big Data
Apache Kafka in the Public Sector - Part 2: Smart City
dzone.com
Apache Kafka in the Public Sector - Part 2: Smart City - DZone IoT
Blog series about data in motion with Apache Kafka in the Public Sector. This is part 2: Use Cases and Architectures for a Smart City.
Big Data
Genomics. Informatics for Biologists
Habr
Genomics. Informatics for Biologists
By Lyudmila Khigerovich, biotechnologist and author in the Phanerozoic (Фанерозой) community. It is the twenty-first century, and information technology is rapidly taking over more and more areas of our lives, including science...
Big Data
How (and Why) to Move from Spark on YARN to Kubernetes
dzone.com
Moving From Spark on YARN to Kubernetes - DZone Big Data
Find out what makes Kubernetes a popular choice for managing containerized applications and learn how you can make the switch from YARN.
Big Data
Announcing Amazon SageMaker Ground Truth Plus
Amazon
Announcing Amazon SageMaker Ground Truth Plus – Create Training Datasets Without Code or In-house Resources | Amazon Web Services
Today, we’re pleased to announce the latest service in the Amazon SageMaker suite that will make labeling datasets easier than ever before. Ground Truth Plus is a turn-key service that uses an expert workforce to deliver high-quality training datasets fast…
Big Data
Scaling With Presto on Spark
dzone.com
Scaling With Presto on Spark - DZone Big Data
Presto on Spark enables more use cases for data analytics, providing a unified SQL experience for both interactive and batch use cases.
Big Data
New DynamoDB Table Class – Save Up To 60% in Your DynamoDB Costs
Amazon
New DynamoDB Table Class – Save Up To 60% in Your DynamoDB Costs | Amazon Web Services
Today we are announcing Amazon DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA), a new table class for DynamoDB that reduces storage costs by 60 percent compared to existing DynamoDB Standard tables, and that delivers the same performance, durability…
Big Data
Data Science "Over the Garden Wall"
Habr
Data Science "Over the Garden Wall"
Still from the animated series "Over the Garden Wall" (2014). The large number of courses on data analytics and Python creates the impression that "two months of courses, pandas in hand" and you are a data science specialist,...
Big Data
New – Amazon RDS Custom for SQL Server Is Generally Available
Amazon
New – Amazon RDS Custom for SQL Server Is Generally Available | Amazon Web Services
On October 26, 2021, we launched Amazon RDS Custom for Oracle, a managed database service for applications that require customization of the underlying operating system and database environment. RDS Custom lets you access and customize your database server…
Big Data
Data labeling will fuel the AI revolution
VentureBeat
Data labeling will fuel the AI revolution
Labeling data for AI using the right techniques helps companies make better decisions and has a measurable impact on business success.
Big Data
An Example of Pushdown Using SingleStore and Spark
dzone.com
An Example of Pushdown Using SingleStore and Spark - DZone Big Data
See an example of query Pushdown using the SingleStore Spark Connector and, in this first article, look at how to load some weather data into SingleStore.
ML
A Gentle Introduction to Vector Space Models
MachineLearningMastery.com
A Gentle Introduction to Vector Space Models - MachineLearningMastery.com
Vector space models consider the relationships between data represented as vectors. They are popular in information retrieval systems but also useful for other purposes. Generally, this allows us to compare the similarity of two vectors from a…
ML
Using Singular Value Decomposition to Build a Recommender System