😎Neural networks of the Cloud Mail.ru service help preserve memories
ML algorithms automatically find pictures taken on a specific day and display them as stories, generating a custom photo calendar that illustrates memorable events. Thanks to image recognition, only successful frames make it into the result, and if the user doesn't like a picture, it can be removed directly from the video story. The animated photo gallery can be shared with friends as a message or posted on VK and Instagram. The update is already available in the apps for iOS and Android. https://corp.mail.ru/ru/press/releases/10947/
💦3 main tools to build an ML pipeline
There are only 3 basic tools needed to build an effective machine learning pipeline:
• Feature Store to handle offline and online feature transformations. It supports version control and integration with data lakes and DWHs, and enables fast serving and rapid deployment of code to production. Examples: Tecton, Hopsworks, Michelangelo Palette, Zipline, and the Feature Stores from Amazon SageMaker and Databricks.
• Model Store as a central registry of models and experiments. It provides version reproducibility and tracks the history of ML models and related artifacts such as Git commits, pickle files, scores, regressions, etc. (see the MLflow sketch below). Examples: Weights and Biases, MLflow, Neptune.ai, EthicalML, and solutions from Amazon, Azure and Google.
• Evaluation Store for monitoring and improving model performance. It tracks performance metrics for each ML model in any environment, from training to production, and includes A/B testing tools and visual dashboards. Examples: Arize and Neptune.ai.
Additionally, data annotation platforms (Appen), ML model maintenance tools (Kubeflow, Algorithmia) and AI orchestration (Spell) are useful for all teams participating in MLOps processes.
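As a minimal illustration of the Model Store idea, here is a hedged sketch of logging a trained model, its hyperparameters and a metric with MLflow; the experiment name and the toy dataset are made up for the example.

```python
# Sketch: registering a run in MLflow's tracking / model store.
# Assumes `mlflow` and `scikit-learn` are installed; names are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("demo-model-store")
with mlflow.start_run():
    params = {"C": 1.0, "max_iter": 200}
    model = LogisticRegression(**params).fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))

    mlflow.log_params(params)                 # hyperparameters
    mlflow.log_metric("accuracy", acc)        # evaluation score
    mlflow.sklearn.log_model(model, "model")  # pickled model artifact
```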
https://towardsdatascience.com/the-only-3-ml-tools-you-need-1aa750778d33
😎The 10 most interesting DS conferences around the world in August 2021
09.08 – 2nd Workshop on Knowledge Guided Machine Learning (KGML2021). Online event by University of Minnesota https://sites.google.com/umn.edu/kgmlworkshop/workshop
09.08 – International Conference on Sports Analytics and Data Science. New York, United States. https://waset.org/sports-analytics-and-data-science-conference-in-august-2021-in-new-york
11.08 - ML Data Engineering Community online meetup by Tecton: feature stores, streaming architecture, MLOps and other DS topics. Free registration https://www.applyconf.com/
14.08 - KDD 2021, the premier interdisciplinary data science conference in Singapore. Online https://kdd.org/kdd2021/
14.08 - Fragile Earth 2021, a workshop on developing radically new technological foundations for advancing the Sustainable Development Goals. This annual online workshop is part of the Earth Day events at ACM's KDD 2021 conference on machine learning research and its applications. https://ai4good.org/fragile-earth-2021/
17.08 - Ai4 2021. An online conference that brings together business leaders and data practitioners to facilitate the adoption of AI and ML technology. https://ai4.io/2021/
19.08 - IJCAI-21: 30th International Joint Conference on Artificial Intelligence. Montreal-themed Virtual Reality, Online. https://ijcai-21.org/
25.08 – Data Science Salon, Applying ML and AI to Retail and Ecommerce. Online https://www.datascience.salon/retail-and-ecommerce/
25.08 – DataOps Virtual Event – Zaloni, the vendor of the Arena DataOps platform, invites CDOs and lead DataOps engineers from AWS, KPMG, PwC and others to share modern data management and engineering experience across different business areas. Free registration https://www.zaloni.com/dataops-virtual-event-second-annual/
26.08 – International Conference on Smart Technologies in Data Science and Communication. Paris, France. https://waset.org/smart-technologies-in-data-science-and-communication-conference-in-august-2021-in-paris
🙌🏻🚗On July 22, 2021, Yandex opened the world's largest dataset for self-driving vehicles: more than 1,600 hours of driving, divided into 600,000 annotated trip segments recorded on roads in Russia, Israel and the United States in different weather conditions. The dataset was published for the Shifts Challenge at the international conference NeurIPS 2021 in order to draw attention to the problem of distributional shift in machine learning and reduce the uncertainty of applying ML models in new conditions. All data are anonymized: the dataset contains high-precision route maps and tracks of all surrounding cars and pedestrians (their position, speed, acceleration, etc.) without personal data such as license plates or people's faces. Participants have to train ML algorithms on the provided data and evaluate how well they perform under distributional shift. The developers of the best-performing algorithms will receive cash prizes of $5,000, $3,000 and $1,000.
https://research.yandex.com/shifts
https://github.com/yandex-research/shifts
👆🏻What is the AUC-ROC curve and why is it so important for evaluating the quality of an ML model?
The Area Under the Receiver Operating Characteristic curve (AUC-ROC) is an evaluation metric used to check or visualize the performance of classification models.
The AUC-ROC curve measures classification performance across various threshold settings. ROC is a probability curve, and AUC represents the degree of separability: it tells how well the model can distinguish between classes. The higher the AUC, the better the model is at predicting class 0 as 0 and class 1 as 1 - for example, at distinguishing between patients with and without a disease.
An excellent model has an AUC near 1, which means it has a good measure of separability. A model with an AUC near 0 has the worst measure of separability: in fact, it is inverting the result, predicting 0s as 1s and 1s as 0s. And when AUC is 0.5, the model has no class-separation capacity whatsoever.
Sensitivity and specificity are inversely related: when sensitivity increases, specificity decreases, and vice versa. When we lower the threshold, we get more positive predictions, which increases sensitivity and decreases specificity. Similarly, when we raise the threshold, we get more negative predictions and therefore higher specificity and lower sensitivity.
For a multi-class model, we can plot N AUC-ROC curves for N classes using the one-vs-all methodology. For example, if you have three classes named X, Y and Z, you will have one ROC for X classified against Y and Z, another for Y classified against X and Z, and a third for Z classified against X and Y.
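A minimal scikit-learn sketch of both cases follows; the synthetic dataset and the logistic regression model are illustrative only.

```python
# Minimal sketch: ROC curve and AUC with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Binary case: AUC summarizes the ROC curve over all thresholds.
X, y = make_classification(n_samples=2000, n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = clf.predict_proba(X_te)[:, 1]          # scores for the positive class
fpr, tpr, thresholds = roc_curve(y_te, probs)  # one (FPR, TPR) point per threshold
print("binary AUC:", roc_auc_score(y_te, probs))

# Multi-class case: one-vs-rest averaging, as described above.
Xm, ym = make_classification(n_samples=2000, n_classes=3, n_informative=6, random_state=0)
Xm_tr, Xm_te, ym_tr, ym_te = train_test_split(Xm, ym, random_state=0)
clf_m = LogisticRegression(max_iter=1000).fit(Xm_tr, ym_tr)
print("multi-class AUC (OvR):",
      roc_auc_score(ym_te, clf_m.predict_proba(Xm_te), multi_class="ovr"))
```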
https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
🚗Yandex robots will deliver food to American students
On July 6, 2021, Yandex entered into a cooperation agreement with the American food delivery service Grubhub to deliver food on US student campuses using Rovers. Developed by Yandex, these autonomous courier robots are based on self-driving car technology and can operate in any weather 24/7. Rovers drive on sidewalks and cross roads at pedestrian crossings. Since the beginning of 2021, robots in Russia have delivered thousands of orders from Yandex.Food and Yandex.Lavka, and since April they have been delivering orders from restaurants in Ann Arbor, Michigan.
https://yandex.ru/company/press_releases/2021/07-06-2021
✈️The 2nd release of TF-Ranking by Google AI
In December 2018, Google AI introduced TF-Ranking, an open-source library based on TensorFlow for developing scalable neural ranking models (LTR, learning-to-rank) that help produce an ordered list of items in response to a user query. Unlike standard classification models, which classify one item at a time, LTR models take a complete list of items as input and look for an ordering that maximizes the usefulness of the entire list. LTR models are most common in search and recommendation systems, but TF-Ranking is also used in e-commerce and in building smart spaces and cities.
In May 2021, Google AI released the second major version of TF-Ranking, which provides full support for building LTR models natively with Keras, the high-level API of TensorFlow 2. The Keras ranking model has a new workflow design, including a flexible ModelBuilder and DatasetBuilder for customizing the training set and a pipeline for training the model. This version of TF-Ranking also supports RaggedTensors, the Orbit training library and many other improvements.
Thanks to an in-depth study of the capabilities of the TF-Ranking library, the Google AI team has also created the Data Augmented Self-Attentive Latent Cross (DASALC) model, which combines neural feature transformation with data augmentation, ensemble methods and ranking losses. DASALC addresses the weaknesses of neural LTR models relative to gradient boosted decision trees while retaining the advantages of both approaches.
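To make the listwise idea concrete, here is a from-scratch NumPy sketch of a softmax listwise loss: the model scores an entire list at once, and the loss rewards orderings where more relevant items get higher scores. This is an illustration of the concept, not the TF-Ranking API.

```python
# From-scratch sketch of a listwise (softmax) ranking loss; illustrative only.
import numpy as np

def softmax_listwise_loss(scores, relevance):
    """scores, relevance: arrays of shape (list_size,) for one query."""
    target = relevance / relevance.sum()              # target distribution over items
    log_probs = scores - np.log(np.exp(scores).sum()) # predicted log-distribution
    return -(target * log_probs).sum()                # cross-entropy between the two

# Toy example: 4 candidate documents for one query.
relevance = np.array([3.0, 0.0, 1.0, 0.0])      # graded relevance labels
good_scores = np.array([2.5, -1.0, 0.5, -0.8])  # ranks relevant items higher
bad_scores = np.array([-1.0, 2.5, -0.8, 0.5])   # ranks irrelevant items higher
print(softmax_listwise_loss(good_scores, relevance))  # lower loss
print(softmax_listwise_loss(bad_scores, relevance))   # higher loss
```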
https://ai.googleblog.com/2021/07/advances-in-tf-ranking.html
https://research.google/pubs/pub50030/
💦🏸What is multi-task machine learning?
Usually one ML model solves one problem, for example image classification or text synthesis. This is called single-task learning (STL). But some models can make several types of predictions on one sample, for example image classification and semantic segmentation. This is multi-task learning (MTL). The main advantages of MTL are as follows:
• a smaller training sample is needed for each individual task, because the overall data set is pooled;
• improved generalization - information from related tasks increases the model's ability to extract useful signal from the dataset and reduces overfitting;
• shorter training time - instead of spending time training multiple models for multiple problems, a single model is trained;
• reduced hardware requirements - ML models have many parameters that need to be stored in RAM, so for devices with limited computing power such as IoT hardware, it is better to have one MTL model with shared parameters rather than several STL models performing a set of related tasks.
The downside of these benefits is possible performance degradation. During MTL training, tasks can compete with each other. For example, when instance segmentation (a separate mask for each distinct object in an image) is trained along with semantic segmentation (classification of objects at the pixel level), the latter task often dominates unless a task-balancing mechanism is used.
In addition, the MTL loss function is more complex because it sums the individual task losses, which makes optimization harder. This is where the so-called negative transfer effect occurs when performing multiple tasks, and separate STL models can perform better than a single MTL model.
Looking ahead, multi-task learning holds great promise for natural language processing and medical research, but current implementations of the approach do not yet fully overcome these drawbacks.
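A minimal Keras sketch of the MTL idea: one shared encoder feeds two task-specific heads, and the combined loss is a weighted sum of the per-task losses. Layer sizes, task names and loss weights are illustrative assumptions.

```python
# Minimal multi-task sketch in Keras: a shared encoder with two task heads.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(64,), name="features")
shared = layers.Dense(128, activation="relu")(inputs)   # shared representation
shared = layers.Dense(64, activation="relu")(shared)

# Task 1: 10-class classification head.
cls_out = layers.Dense(10, activation="softmax", name="classification")(shared)
# Task 2: scalar regression head.
reg_out = layers.Dense(1, name="regression")(shared)

model = Model(inputs, [cls_out, reg_out])
model.compile(
    optimizer="adam",
    loss={"classification": "sparse_categorical_crossentropy",
          "regression": "mse"},
    loss_weights={"classification": 1.0, "regression": 0.5},  # simple task balancing
)

# Toy data just to show the training call.
X = np.random.rand(256, 64).astype("float32")
y_cls = np.random.randint(0, 10, size=(256,))
y_reg = np.random.rand(256, 1).astype("float32")
model.fit(X, {"classification": y_cls, "regression": y_reg}, epochs=1, verbose=0)
```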
https://thegradient.pub/how-to-do-multi-task-learning-intelligently/
🙌🏻Apache Superset - an open-source framework for BI and DS analysis
Having become an Apache project in 2017, Superset is a powerful BI tool for big data visualization that lets users quickly and easily create dashboards using simple code, a free visualization designer and an advanced SQL editor. For corporate use, its support for different authentication backends (OpenID, LDAP, OAuth, REMOTE_USER) and its integration with many SQL-based DBMSs through the SQLAlchemy library are especially important. Superset is based on Python, so to use it, you should first install the Anaconda distribution, which includes a set of required DS libraries. Airbnb, Netflix, Twitter, Yahoo! and many other companies include Superset in their DS projects. https://superset.apache.org/
🎯TOP 3 papers from the International Conference on Learning Representations 2021: a brief overview from Zeta Alpha
Using its own AI Research Navigator, Zeta Alpha compiled a selection from over 800 ICLR 2021 papers based on citations and author popularity.
1. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, et al.). Transformers applied directly to image patches and pretrained on large datasets work well for image classification and can outperform the best CNNs on large images (see the patch-embedding sketch below). https://openreview.net/forum?id=YicbFdNTTy
2. Rethinking Attention with Performers (Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, et al.). Performers are linear-attention Transformers that approximate full-rank attention with provable random-feature methods and work efficiently without relying on sparsity or low rank. The authors propose decomposing the self-attention matrix into lower-rank matrices whose combined complexity is linear in the sequence length L: O(Ld²log(d)) instead of O(L²d). https://openreview.net/forum?id=Ua6zuk0WRH
3. PMI-Masking: Principled masking of correlated spans (Yoav Levine et al.). Jointly masking correlated tokens significantly speeds up and improves BERT pre-training. Instead of masking tokens at random, the authors identify - using only corpus statistics - token spans that are highly correlated. To do this, they extend pointwise mutual information from pairs of tokens to spans of arbitrary length, and show that BERT pre-trained with this objective learns more efficiently than with alternatives such as uniform masking, whole-word masking and random-span masking. The strategy works by preventing the model from predicting masked words from the shallow correlations of words that often appear next to each other, forcing it to learn deeper correlations in natural language. https://openreview.net/forum?id=3Aoft6NWFej
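To make item 1 concrete, here is a minimal NumPy sketch of the Vision Transformer's first step: splitting an image into 16x16 patches and projecting each patch into a token embedding. The random projection stands in for the learned linear layer and the positional/[CLS] embeddings are omitted.

```python
# Sketch of ViT-style patch embedding: an image becomes a sequence of
# "16x16-word" tokens that a standard Transformer can consume.
import numpy as np

image = np.random.rand(224, 224, 3)        # H x W x C
patch = 16
d_model = 768

# Split into non-overlapping 16x16 patches and flatten each one.
patches = image.reshape(224 // patch, patch, 224 // patch, patch, 3)
patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * 3)
print(patches.shape)                       # (196, 768): 14 * 14 patches

# Linear projection to the Transformer's embedding dimension.
W = np.random.randn(patch * patch * 3, d_model) / np.sqrt(patch * patch * 3)
tokens = patches @ W                       # (196, d_model); ViT also prepends a [CLS] token
print(tokens.shape)
```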
The full overview of ICLR 2021 from Zeta Alpha is here: https://www.zeta-alpha.com/post/iclr-2021-10-papers-you-shouldn-t-miss
Forwarded from Big Data Science [RU]
Reminder!
Today! On August 12 at 18:00 the first meetup in the Citymobil Data Meet-up series will take place!
We will talk about logistics, urban data and smart-city technologies, and discuss the role of geodata and the problems that come with it.
Join us and let's figure it out together))
Speakers:
- Artem Soloukhin (Citymobil)
- Andrey Kritilin (CIAN)
- Fyodor Lavrentiev (Yandex Go)
Don't forget to prepare questions for the speakers - after the talks there will be time to chat and you'll have a chance to ask them 🙂
Link: https://tulu.la/chat/city-mobil-00002d/meetup-0002fv
How to tune hyperparameters to reliably improve ML model accuracy: a detailed guide
The ML model and its preprocessing are specific to each project: the hyperparameters depend on the data. For example, the logistic regression algorithm has several hyperparameters (solver, C, penalty), and different combinations of them give different results. Similarly, the support vector machine has tunable parameters such as gamma and C. These hyperparameters are documented on the site of the free Python library scikit-learn. However, a developer often has to build their own solution rather than rely on ready-made recommendations in order to reach high accuracy, which depends on finding the best combination of hyperparameters. The article covers testing various combinations via grid search with and without the scikit-learn library, validating the results with cross-validation, and drawing conclusions about CPU utilization efficiency. https://towardsdatascience.com/evaluating-all-possible-combinations-of-hyperparameter
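As a minimal illustration of what the article describes, the sketch below runs a grid search over logistic regression's solver, penalty and C with cross-validation in scikit-learn; the parameter grid and dataset are illustrative.

```python
# Minimal grid search over logistic regression hyperparameters with
# cross-validation; the parameter grid and dataset are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_grid = {
    "solver": ["liblinear"],          # supports both l1 and l2 penalties
    "penalty": ["l1", "l2"],
    "C": [0.01, 0.1, 1.0, 10.0],
}
search = GridSearchCV(
    LogisticRegression(max_iter=5000),
    param_grid,
    cv=5,                 # 5-fold cross-validation for each combination
    n_jobs=-1,            # use all available CPU cores
    scoring="accuracy",
)
search.fit(X, y)
print("best params:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 4))
```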
✍🏻SoundStream: An End-to-End Neural Audio Codec by Google AI
SoundStream is the first neural network codec to work on speech and music, while being able to run in real-time on a smartphone CPU. It is able to deliver state-of-the-art quality over a broad range of bitrates with a single trained model, which represents a significant advance in learnable codecs.
The main technical ingredient of SoundStream is a neural network consisting of an encoder, a decoder and a quantizer, all of which are trained end-to-end. The encoder converts the input audio stream into a coded signal, which is compressed by the quantizer and then converted back to audio by the decoder. SoundStream leverages state-of-the-art solutions in neural audio synthesis to deliver audio at high perceptual quality, by training a discriminator that computes a combination of adversarial and reconstruction losses that make the reconstructed audio sound like the uncompressed original input. Once trained, the encoder and decoder can run on separate clients to efficiently transmit high-quality audio over a network. Explore SoundStream and learn more about it here:
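The snippet below is a toy, untrained PyTorch sketch of the encoder → quantizer → decoder flow described above. The layer sizes, strides and single-codebook quantizer are illustrative assumptions and are much simpler than SoundStream's residual vector quantizer and adversarial training.

```python
# Toy sketch of a neural codec's encode -> quantize -> decode flow (PyTorch).
# Untrained and heavily simplified; only the shapes of the pipeline are shown.
import torch
import torch.nn as nn

class ToyCodec(nn.Module):
    def __init__(self, dim=64, codebook_size=256):
        super().__init__()
        # Strided 1D convolutions downsample the waveform into latent frames.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, dim, kernel_size=8, stride=4, padding=2), nn.ELU(),
            nn.Conv1d(dim, dim, kernel_size=8, stride=4, padding=2),
        )
        # A single learned codebook; quantization = nearest-codeword lookup.
        self.codebook = nn.Parameter(torch.randn(codebook_size, dim))
        # Transposed convolutions upsample latents back to a waveform.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(dim, dim, kernel_size=8, stride=4, padding=2), nn.ELU(),
            nn.ConvTranspose1d(dim, 1, kernel_size=8, stride=4, padding=2),
        )

    def forward(self, wav):                              # wav: (batch, 1, samples)
        z = self.encoder(wav)                            # (batch, dim, frames)
        flat = z.transpose(1, 2)                         # (batch, frames, dim)
        book = self.codebook.unsqueeze(0).expand(flat.size(0), -1, -1)
        codes = torch.cdist(flat, book).argmin(dim=-1)   # discrete indices to transmit
        zq = self.codebook[codes].transpose(1, 2)        # quantized latents
        return self.decoder(zq), codes

codec = ToyCodec()
audio = torch.randn(1, 1, 16000)                         # 1 second at 16 kHz
recon, codes = codec(audio)
print(recon.shape, codes.shape)
```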
https://ai.googleblog.com/2021/08/soundstream-end-to-end-neural-audio.html
✈️A new drone-control algorithm from MIT
Aerospace engineers at MIT have devised an algorithm that helps drones find the fastest route around obstacles without crashing. The new algorithm combines simulations of a drone flying through a virtual obstacle course with data from experiments of a real drone flying through the same course in a physical space.
The researchers found that a drone trained with their algorithm flew through a simple obstacle course up to 20 percent faster than a drone trained on conventional planning algorithms. Interestingly, the new algorithm didn’t always keep a drone ahead of its competitor throughout the course. In some cases, it chose to slow a drone down to handle a tricky curve, or save its energy in order to speed up and ultimately overtake its rival.
https://news.mit.edu/2021/drones-speed-route-system-0810
🏸FastMoE: A Fast Mixture-of-Expert Training System
Mixture-of-Experts (MoE) shows strong potential for scaling language models to trillions of parameters. However, training trillion-scale MoE models requires algorithm and system co-design for a well-tuned, high-performance distributed training system. Unfortunately, the only existing platform that meets the requirements depends heavily on Google's hardware (TPUs) and software (Mesh TensorFlow) stack, and is not open and available to the public, especially the GPU and PyTorch communities.
FastMoE is a distributed open-source MoE training system based on PyTorch with common accelerators. The system provides a hierarchical interface for both flexible model design and easy adaptation to different applications, such as Transformer-XL and Megatron-LM. Unlike a direct implementation of MoE models in PyTorch, training speed in FastMoE is highly optimized with sophisticated high-performance acceleration techniques. The system supports placing different experts on multiple GPUs across multiple nodes, so the number of experts can scale linearly with the number of GPUs.
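FastMoE's own API is not reproduced here; the snippet below is a generic, hedged PyTorch sketch of the Mixture-of-Experts idea itself: a gating network routes each token to one of several expert feed-forward networks.

```python
# Generic Mixture-of-Experts sketch in PyTorch (top-1 routing), only to
# illustrate the concept; it is not FastMoE's implementation or API.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, num_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)          # routing scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                     # x: (tokens, d_model)
        probs = self.gate(x).softmax(dim=-1)  # (tokens, num_experts)
        top1 = probs.argmax(dim=-1)           # expert index per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top1 == i                  # tokens routed to expert i
            if mask.any():
                # Scale by the gate probability so the gate stays differentiable.
                out[mask] = expert(x[mask]) * probs[mask, i].unsqueeze(-1)
        return out

moe = TinyMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)                      # torch.Size([10, 64])
```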
https://github.com/laekov/fastmoe
https://arxiv.org/abs/2103.13262
🔥Not only GPT-3: what is GPT-J-6B
OpenAI's powerful GPT-3 NLP model is not an open-source project, so other organizations offer alternative solutions. The most interesting of them at the moment is GPT-J from EleutherAI, with 6 billion parameters. The developers promise that GPT-J provides more flexible and faster inference than comparable TensorFlow + TPU solutions on various downstream tasks.
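For readers who want to try GPT-J locally, the weights are also published on the Hugging Face Hub; a minimal usage sketch (assuming a transformers version with GPT-J support and enough memory for a 6-billion-parameter model) might look like this:

```python
# Minimal sketch of loading GPT-J from the Hugging Face Hub. Assumes a
# recent `transformers` release with GPT-J support and roughly 24 GB of
# RAM/VRAM in float32 (half precision needs about half of that).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Machine learning pipelines consist of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```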
https://6b.eleuther.ai/
https://colab.research.google.com/github/kingoflolz/mesh-transformer-jax/blob/master/colab_demo.ipynb
https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b
https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/
https://minimaxir.com/2021/06/gpt-j-6b/
🌸News from MIT: A New AI-Powered Probabilistic Programming Language
It can impartially assess the "fairness" of AI algorithms more accurately and faster than existing alternatives. The Sum-Product Probabilistic Language (SPPL) is a probabilistic programming system - a new area at the intersection of programming languages and AI that simplifies the development of AI solutions using probabilistic models and explanations of observed data.
SPPL offers improved flexibility and robustness through the expressiveness of the language, its precise and simple semantics, and the speed and reliability of its exact symbolic inference engine. It avoids common pitfalls by restricting itself to a carefully designed class of AI models, which includes decision tree classifiers. SPPL works by compiling probabilistic programs into a specialized data structure called a sum-product expression. This approach cannot analyze neural networks, although it works faster than similar solutions. SPPL is an open-source, Python-based project.
https://news.mit.edu/2021/exact-symbolic-artificial-intelligence-faster-better-assessment-ai-fairness-0809
https://github.com/probcomp/sppl
👻What is AIOps and how does it differ from MLOps
MLOps is an interdisciplinary approach to managing machine learning methods as standalone products with their own life cycle, with a focus on developing, scaling, and applying ML algorithms on an ongoing basis.
MLOps aims to bridge the gap between creating ML models and maintaining them, while AIOps focuses on automating incident management and intelligent root cause analysis.
AIOps solutions use all tracking and reporting data and logs to detect events and apply machine learning and deep learning to notify IT operations of any issues or disruptions.
The goal of AIOps is to improve the efficiency of IT operations by automating the diagnosis of events and using machine learning to pinpoint root causes. These solutions give technical teams high-quality, easy-to-understand data by filtering out the noise generated by monitoring tools and reducing false positives, which lets teams focus on decision making. AIOps goes beyond preventing downtime to include cost containment, security, and AI-powered policy compliance to improve IT operations.
MLOps helps teams choose which tools, methodologies, and documentation will help their ML models go into production, and AIOps helps teams automate their technology lifecycles.
The greatest effect is provided by the combined use of MLOps and AIOps.
https://ai.plainenglish.io/whats-the-difference-between-aiops-and-mlops-15316cfa803d
👆🏻BYOL - Bootstrap Your Own Latent
BYOL is a new approach to self-supervised image representation learning with two neural networks that interact and learn from each other. The online network learns from the representation produced by the target network on the same image under different augmentations. The backbone of BYOL is an existing architecture such as ResNet-50. The input x is augmented into two views t and t', which are passed through the online and target networks separately.
The difference between the online and target networks is that the former has an additional MLP with two fully connected layers, with batch normalization and ReLU in between. The online network's representation learns to predict the representation generated by the target network. The online network is updated with a regression loss whose targets are produced by the target network, while the target network's parameters are updated as an exponential moving average of the online network's parameters, which lets the model accumulate more information and avoid representation collapse.
BYOL's performance is on par with state-of-the-art supervised-learning architectures. There is a slight performance degradation when only random cropping is used as image augmentation, but BYOL performs better than SimCLR under the linear-classifier protocol by iteratively learning from previous versions of its own outputs without using negative pairs. However, the BYOL approach has not yet been applied to text, video and audio processing tasks.
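The two core mechanics described above (the normalized regression loss and the exponential moving average update of the target network) can be sketched from scratch in a few lines of PyTorch; this is a hedged illustration with toy stand-in networks, not the paper's or the linked repository's full implementation.

```python
# From-scratch sketch of BYOL's two key mechanics: the normalized regression
# loss between online predictions and target projections, and the EMA update
# of the target network. Illustrative only.
import copy
import torch
import torch.nn.functional as F

def byol_loss(online_pred, target_proj):
    """MSE between L2-normalized vectors, i.e. 2 - 2 * cosine similarity."""
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj, dim=-1)
    return (2 - 2 * (p * z).sum(dim=-1)).mean()

@torch.no_grad()
def ema_update(target_net, online_net, tau=0.99):
    """target <- tau * target + (1 - tau) * online, parameter by parameter."""
    for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
        t_param.mul_(tau).add_((1 - tau) * o_param)

# Toy networks standing in for the real encoder + projector (+ predictor).
online = torch.nn.Sequential(torch.nn.Linear(32, 16), torch.nn.ReLU(), torch.nn.Linear(16, 8))
target = copy.deepcopy(online)                # initialized as a copy, then EMA-updated

x_view1, x_view2 = torch.randn(4, 32), torch.randn(4, 32)    # two augmented views
loss = byol_loss(online(x_view1), target(x_view2).detach())  # stop-gradient on target
loss.backward()
ema_update(target, online)
print(float(loss))
```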
https://www.youtube.com/watch?v=YPfUiOMYOEE
https://ai.plainenglish.io/byol-bootstrap-your-own-latent-dacee62a3dc8
https://arxiv.org/abs/2006.07733
https://arxiv.org/abs/2010.10241
https://github.com/lucidrains/byol-pytorch
💐TOP 15 most interesting DS conferences around the world in September 2021
6-7.09 (offline) and 13-15.09 (online) - AI & Big Data Expo Global, the leading Artificial Intelligence & Big Data Conference & Exhibition, at the Business Design Centre, London https://www.ai-expo.net/global/
9-10.09 – R Conference, New York, Online https://rstats.ai/nyr/
13-17.09 – Data Science Salon Miami Machine Learning & AI Meetup Week. Miami, FL, USA https://www.datascience.salon/miami-ml-meetup-week
14-16.09 - Insurance AI and Innovative Tech USA 2021 – Online Conference by Reuters https://reutersevents.com/events/analyticsusa/
15-16.09 - DATA festival #online https://datafestival.de/
15-16.09 - Open Data Science Conference, Online https://odsc.com/apac
20.09 – 1st Citizen Data Science Summit, Boston https://www.citizen-data-science.org/
20-21.09 - International Conference on Advances in Big Data and Data Sciences, Toronto, Canada https://waset.org/advances-in-big-data-and-data-sciences-conference-in-september-2021-in-toronto
21.09 – Data Champions Online, Canada https://dco-canada.coriniumintelligence.com/
22-23.09 - Big Data LDN, the UK's largest data & analytics event, Olympia London, UK https://bigdataldn.com/
22-23.09 - RE.WORK Deep Learning Summit https://www.re-work.co/events/deep-learning-summit-research and https://www.re-work.co/events/deep-learning-summit-applications
28-29.09 – Chief Data & Analytics Officer, Financial Services, Online https://cdao-fs-eu.coriniumintelligence.com/
28-30.09 – DataOps Summit Online https://www.dataopssummit-sf.com/about/
30.09 - Web Data Extraction Summit 2021 by Zyte https://www.extractsummit.io/
🏸What is AIOps
While we got used to MLOps, a new Ops phenomenon happened in IT, the need for which actually arose a long time ago. Meet AIOps - using AI to simplify IT operations management and accelerate and automate problem solving in today's complex IT environments. AIOps leverages the power of big data, analytics and machine learning for the following purposes:
• Collecting and aggregating huge and ever-growing volumes of operational data generated by many IT infrastructure components, applications and performance monitoring tools;
• Filtering useful signals from noise to reveal really important events and patterns related to the performance and availability of systems;
• Identifying root causes and responding quickly to problems, sometimes automatically without human intervention.
By replacing many separate tools for manual IT operations with a single intelligent and automated platform, AIOps lets you respond quickly and even proactively to slowdowns and system failures with much less effort. AIOps spans diverse, dynamic and complex IT landscapes without sacrificing application performance and availability. With more companies today moving from traditional IT infrastructure to a dynamic mix of on-premises clusters, private clouds and public clouds, AIOps is relevant for many enterprises.
https://medium.com/geekculture/aiops-6e463cbe617a