🎯MLOps tools save you time and effort when developing, testing and deploying Machine Learning models. MLflow is one of the most useful and popular of them. If you are interested in how to use it in practice, read this brief article: https://medium.com/hashmapinc/why-i-love-mlflow-951b8d1134be
Medium
Why I Love MLflow
And You Should Too! MLflow is the best tool out there in this space. You need to know what MLflow is and why you should be using it.
😁From now on, this channel will post interesting news and articles directly in English. For Russian-language publications and digests of local events, read here: https://news.1rj.ru/str/bdscience_ru
Telegram
Big Data Science [RU]
Big Data Science [RU] is a channel about life in Data Science.
For collaboration: a.chernobrovov@gmail.com
🌏 — https://news.1rj.ru/str/bdscience — Big Data Science channel (english version)
💼 — https://news.1rj.ru/str/bds_job — channel about Data Science jobs and career
👀How to find a medical image among billions of templates with Spark SQL: a scalable library for efficiently reading DICOM files into a DataFrame https://bd-practice.medium.com/dicom-read-library-apache-spark-third-party-contribution-e6cb269e5c3c
Medium
Dicom Read Library (Apache Spark Third Party Contribution)
Article by Nirali Gandhi, Big Data & Cloud Lead Developer
Forwarded from Machinelearning
🎨 Colorization Transformer by Google
💻 Github: https://github.com/google-research/google-research/tree/master/coltran
📝 Paper: https://arxiv.org/abs/2102.04432v1
🌐 Pre-trained model: https://console.cloud.google.com/storage/browser/gresearch/coltran;tab=objects?pli=1&prefix=&forceOnObjectsSortingFiltering=false
@ai_machinelearning_big_data
Why the Data Engineer is the best friend of the Data Scientist and Data Analyst: the shifting role in 2021, including tasks, responsibilities, salaries and prospects
https://palakdatascientist.medium.com/data-engineers-of-2021-the-shifting-role-35d13c9106f
Medium
Data Engineers of 2021: The Shifting Role
The role of big data engineer professionals is continuously evolving with the evolution of technologies and tools. Learn about the…
Typical use cases for the most popular ML algorithms in BigQuery, from regression to time series analysis
https://medium.com/cloudzone/try-62d6aeb4a5e1
Medium
5 Machine Learning Models You Can Deploy Using BigQuery
Deploying machine learning (ML) models requires multiple teams and coordination. Developing a statistical model or picking which one to…
How to streamline the implementation of reasoning systems with ReAgent from Facebook.
ReAgent is an end-to-end platform for applied Reinforcement Learning, designed for large-scale, distributed recommendation and optimization tasks where no simulator is available. Its main purpose is to make the development of and experimentation with deep reinforcement learning algorithms fast. ReAgent is written in Python and uses the PyTorch framework for modelling. It includes algorithms for data preprocessing, feature engineering, model training and evaluation, and optimized serving. It can handle high-dimensional datasets and provides an efficient production environment for model serving. https://analyticsindiamag.com/hands-on-to-reagent-end-to-end-platform-for-applied-reinforcement-learning/
Analytics India Magazine
Hands-on to ReAgent: End-to-End Platform for Applied Reinforcement Learning
Facebook ReAgent, previously known as Horizon is an end-to-end platform for using applied Reinforcement Learning in order to solve industrial
🌞A new look at Elasticsearch: how to use it as a Time Series Database and tune query performance for interactive data analytics, based on the experience of ThousandEyes https://medium.com/thousandeyes-engineering/what-we-learned-using-elasticsearch-as-a-time-series-database-bdbde38cdb64
Medium
What We Learned Using Elasticsearch as a Time Series Database
by Gaurav Mishra, Software Engineer II at ThousandEyes
💦Transparent interpretation of results and continuous learning in production, with a neural network that keeps adapting to new conditions and data
Liquid NN from MIT targets decision making in autonomous driving and medical diagnosis. It is inspired by the nervous system of a microscopic nematode with 302 neurons and is built for time series data analytics. The model edged out other state-of-the-art time series algorithms by a few percentage points in accurately predicting future values in datasets ranging from atmospheric chemistry to traffic patterns. By representing each neuron with differential equations, you can work with a small number of highly expressive neurons, peer into the “black box” of the network’s decision making and diagnose why the network made a certain characterization.
https://news.mit.edu/2021/machine-learning-adapts-0128
MIT News
“Liquid” machine-learning system adapts to changing conditions
MIT researchers developed a neural network that learns on the job, not just during training. The “liquid” network varies its equations’ parameters, enhancing its ability to analyze time series data. The advance could boost autonomous driving, medical diagnosis…
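To make the "neuron as a differential equation" idea concrete, here is a minimal sketch of a single liquid time-constant neuron integrated with explicit Euler steps. It assumes the commonly cited form dx/dt = -(1/tau + f) * x + f * a, and it simplifies the gate f to a sigmoid of the input only. The function names and the parameters `tau`, `a`, `w`, `b` are hypothetical and chosen for illustration; this is not the MIT implementation.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ltc_step(x, inp, dt, tau=1.0, a=1.0, w=1.0, b=0.0):
    """One explicit Euler step of dx/dt = -(1/tau + f) * x + f * a.

    Assumption: the gate f depends only on the input, f = sigmoid(w * inp + b).
    """
    f = sigmoid(w * inp + b)
    dx = -(1.0 / tau + f) * x + f * a
    return x + dt * dx


def simulate(inputs, dt=0.01, x0=0.0, **params):
    """Run the neuron over a sequence of inputs and return the state trace."""
    x = x0
    trace = []
    for inp in inputs:
        x = ltc_step(x, inp, dt, **params)
        trace.append(x)
    return trace


def effective_time_constant(inp, tau=1.0, w=1.0, b=0.0):
    """The input-dependent ("liquid") time constant tau / (1 + tau * f)."""
    return tau / (1.0 + tau * sigmoid(w * inp + b))


if __name__ == "__main__":
    # Under a constant input the state settles at f * a / (1/tau + f).
    print(simulate([0.0] * 3000)[-1])
    # A stronger input shortens the effective time constant.
    print(effective_time_constant(-5.0), effective_time_constant(5.0))
```

Because the time constant is an explicit function of the input, you can read off how quickly the neuron reacts in a given situation, which is one way such a small model stays inspectable.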
🌷Not only LightGBM and XGBoost: meet a newer probabilistic prediction algorithm, Natural Gradient Boosting (NGBoost). Released in 2019, NGBoost uses the natural gradient to address technical challenges that make generic probabilistic prediction hard with existing gradient boosting methods. The algorithm consists of three abstract modular components: a base learner, a parametric probability distribution, and a scoring rule. All three are treated as hyperparameters chosen before training. NGBoost makes it easier to do probabilistic regression with flexible tree-based models. Probabilistic classification, by contrast, has been possible for quite some time, since most classifiers already return probabilities over each class; logistic regression, for instance, outputs class probabilities. In this light NGBoost doesn’t add much new for classification, but experiments on several regression datasets showed competitive predictive performance on both uncertainty estimates and traditional metrics. On the other hand, its computing time is considerably longer than that of the other two algorithms, and it lacks some useful options, e.g. early stopping, showing intermediate results, setting a random state seed, and flexibility in choosing the base learner (only decision trees and Ridge regression are supported). Still, this modular ML algorithm for probabilistic prediction is quite competitive against other popular boosting methods. See more:
http://www.51anomaly.org/pdf/NGBOOST.pdf
https://medium.com/@ODSC/using-the-ngboost-algorithm-8d337b753c58
https://towardsdatascience.com/ngboost-explained-comparison-to-lightgbm-and-xgboost-fda510903e53
https://www.groundai.com/project/ngboost-natural-gradient-boosting-for-probabilistic-prediction/1
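The three components named above can be illustrated with a small sketch in plain Python. It assumes a Normal distribution parameterized by (mu, log sigma) and the log score (negative log-likelihood) as the scoring rule. The base learner is deliberately replaced by a constant model, so this shows only the natural gradient step and is not the NGBoost library or its API. All function names are hypothetical.

```python
import math


def normal_nll(y, mu, log_sigma):
    """Log score: negative log-likelihood of y under Normal(mu, exp(log_sigma))."""
    sigma2 = math.exp(2.0 * log_sigma)
    return 0.5 * math.log(2.0 * math.pi) + log_sigma + (y - mu) ** 2 / (2.0 * sigma2)


def ordinary_gradient(y, mu, log_sigma):
    """Gradient of the log score with respect to (mu, log_sigma)."""
    sigma2 = math.exp(2.0 * log_sigma)
    d_mu = -(y - mu) / sigma2
    d_log_sigma = 1.0 - (y - mu) ** 2 / sigma2
    return d_mu, d_log_sigma


def natural_gradient(y, mu, log_sigma):
    """Ordinary gradient rescaled by the inverse Fisher information.

    For a Normal in (mu, log_sigma) the Fisher information is diag(1/sigma^2, 2).
    """
    sigma2 = math.exp(2.0 * log_sigma)
    d_mu, d_log_sigma = ordinary_gradient(y, mu, log_sigma)
    return d_mu * sigma2, d_log_sigma / 2.0


def fit_constant_normal(ys, learning_rate=0.1, steps=500):
    """Fit one shared (mu, sigma) by natural gradient descent (no base learner)."""
    mu, log_sigma = 0.0, 0.0
    for _ in range(steps):
        grads = [natural_gradient(y, mu, log_sigma) for y in ys]
        mu -= learning_rate * sum(g[0] for g in grads) / len(ys)
        log_sigma -= learning_rate * sum(g[1] for g in grads) / len(ys)
    return mu, math.exp(log_sigma)


if __name__ == "__main__":
    print(fit_constant_normal([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Note that the natural gradient for mu no longer depends on sigma, so the step size for the mean stays stable even when the predicted variance is very small or very large.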
😂For those who skipped everything in 2020: Top 15 Machine Learning & AI Research Papers, from YOLOv4 to TensorFlow Quantum https://rubikscode.net/2020/12/21/2020s-top-15-machine-learning-ai-research-papers/
Rubix Code
2020’s Top 15 Machine Learning & AI Research Papers
Machine Learning, Deep Learning & AI Research Papers from 2020 that we think you should read.
ML for optimizing microchip architecture: the Apollo project from Google searches for the chip parameters that best fit a given neural network and deliver high computation speed
https://www.zdnet.com/article/googles-deep-learning-finds-a-critical-path-in-ai-chips
ZDNET
Google’s deep learning finds a critical path in AI chips
The work marks a beginning in using machine learning techniques to optimize the architecture of chips.
A deep dive into NGBoost and probabilistic regression: what probabilistic supervised learning is, how to deal with prediction intervals, and how to interpret the output of this ML algorithm correctly
https://towardsdatascience.com/interpreting-the-probabilistic-predictions-from-ngboost-868d6f3770b2
Medium
NGBoost and Prediction Intervals
What is probabilistic regression and how should you interpret probabilistic predictions?
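As a rough illustration of what a prediction interval means in probabilistic regression, the sketch below assumes the model returns a Normal distribution (mean and standard deviation) for each observation and derives a central interval from it. The helper `prediction_interval` and the numbers are hypothetical; only the standard library `statistics.NormalDist` is used, not the NGBoost API.

```python
from statistics import NormalDist


def prediction_interval(mu, sigma, coverage=0.95):
    """Central interval holding `coverage` probability mass of Normal(mu, sigma)."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    dist = NormalDist(mu, sigma)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


if __name__ == "__main__":
    # Hypothetical predicted distribution for one observation.
    low, high = prediction_interval(mu=10.0, sigma=2.0, coverage=0.95)
    print(f"95% prediction interval: [{low:.2f}, {high:.2f}]")
```

The interval describes where a single future observation is expected to fall under the predicted distribution, which is different from a confidence interval for the mean.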
Tensor holography creates real-time 3D holograms for virtual reality, 3D printing and medical visualization, and it could run on your smartphone. Meet the new AI method from MIT researchers https://news.mit.edu/2021/3d-holograms-vr-0310
MIT News
Using artificial intelligence to generate 3D holograms in real-time
MIT researchers developed a way to produce holograms almost instantly. The deep learning-based method is so efficient, it could run on a smartphone, they say.
🤓Deepfakes are not that simple: an interview with Belgian VFX specialist Chris Ume, creator of the viral fake Tom Cruise videos. Why an ML algorithm alone is not enough for a high-quality result and the video effects still need thorough manual tuning
https://www.theverge.com/2021/3/5/22314980/tom-cruise-deepfake-tiktok-videos-ai-impersonator-chris-ume-miles-fisher
The Verge
Tom Cruise deepfake creator says public shouldn’t be worried about ‘one-click fakes’
‘You can’t do it by just pressing a button.’
😜Not only Deep Learning: a new approach to building AI systems that work like the human brain. It applies the sparse coding principle to supply a series of local functions in synaptic learning rules and reduce the amount of data to be adjusted in the NN model. Nara Logics, a startup co-founded by an MIT alumnus, is trying to make AI more effective by mimicking the brain’s structure and function at the circuit level.
https://news.mit.edu/2021/nara-logics-ai-0312
MIT News
Artificial intelligence that more closely mimics the mind
Nara Logics, co-founded by MIT alumnus Nathan Wilson PhD ’05, is attempting to mimic the brain with an AI platform powered by an engine it calls Nara Logics Synaptic Intelligence.
💥Meet CLIP (Contrastive Language–Image Pre-training), a new neural network from OpenAI: it can be instructed in natural language to perform a great variety of classification benchmarks without directly optimizing for the benchmark’s performance, similar to the “zero-shot” capabilities of GPT-2 and GPT-3. CLIP builds on zero-shot transfer, natural language supervision, and multimodal learning to recognize a wide variety of visual concepts in images and associate them with their names. Read more about where you can use this ML model: https://openai.com/blog/clip/
OpenAI
CLIP: Connecting text and images
We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized,…
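The "just provide the names of the categories" idea can be sketched as follows: compare one image embedding with the embeddings of several text prompts by cosine similarity and turn the scaled similarities into probabilities with a softmax. The three-dimensional vectors, the prompts and the `temperature` value below are made-up assumptions for illustration. A real setup would obtain the embeddings from CLIP's image and text encoders, which are not called here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, temperature=100.0):
    """Return {prompt: probability} for one image against candidate text prompts."""
    prompts = list(text_embeddings)
    sims = [cosine(image_embedding, text_embeddings[p]) for p in prompts]
    probs = softmax([temperature * s for s in sims])
    return dict(zip(prompts, probs))


if __name__ == "__main__":
    # Toy embeddings (assumed values, not real CLIP outputs).
    image = [0.9, 0.1, 0.0]
    prompts = {
        "a photo of a dog": [1.0, 0.0, 0.0],
        "a photo of a cat": [0.0, 1.0, 0.0],
        "a photo of a car": [0.0, 0.0, 1.0],
    }
    result = zero_shot_classify(image, prompts)
    print(max(result, key=result.get))
```

Adding a new class then only requires one more text prompt, with no retraining of the classifier.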
👀 Why modern AI for Computer Vision should have Multimodal Neurons and how Faceted Feature Visualization raises the accuracy of predictions and classifications. New paper from OpenAI researchers: https://distill.pub/2021/multimodal-neurons/
Distill
Multimodal Neurons in Artificial Neural Networks
We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.
🤓How to assess the potential effectiveness of medical drugs: DeepBAR, a new method from MIT researchers, calculates the binding affinities between drug candidates and their targets. It is based on deep generative models that analyze molecular structures as images
https://news.mit.edu/2021/drug-discovery-binding-affinity-0315
MIT News
Faster drug discovery through machine learning
MIT researchers have developed DeepBAR, a machine learning technique that quickly calculates drug molecules’ binding affinity with target proteins. The advance could accelerate drug discovery and protein engineering.
🥺Looking for interesting reading? Take the TOP-21 books on Data Science, Engineering and Statistics that are must-reads in 2021 https://towardsdatascience.com/21-data-science-books-you-should-read-in-2021-db625e97feb6
And a short list of 5 items https://medium.com/curious/5-books-every-data-scientist-should-read-in-2021-206609d8593b
Medium
21 Data Science Books You Should Read in 2021
An Updated Collection of the Best Data Science Books to Read Right Now
😂If you do not want to study grammar and history, use ML to pass the exams! GPT-3 has done it with U.S. History, Research Methods, Creative Writing, and Law. In 3 to 20 minutes, the NN was able to mimic human writing in grammar, syntax, and word frequency and get the same feedback as the human writers
https://www.zdnet.com/article/ai-can-write-a-passing-college-paper-in-20-minutes/
ZDNet
AI can write a passing college paper in 20 minutes
Natural language processing is on the cusp of changing our relationship with machines forever.