Mike's ML Forge
Welcome to this channel! Here we dive deep into the world of Data Science and ML, plus a bit of my personal journey toward becoming someone who can say, "I designed the board, collected the data, trained the model, and deployed it."
time to seize the day
Gabriel... the Mighty
Peeps, happy holiday!
Forwarded from BeNN
The pain when you see the model is overfitting 💀
[GIF]
I’ve gone three whole days without water
GPT 5 announcement?
In 2.5 years, we went from high-school-student-level AI to PhD level. Anyone now has an expert-level assistant in their pocket, on demand. Just think about what could happen in the next 2.5 years. Wild.
Chimaev is chaos, born for war.
DDP is a storm of willpower, never afraid to bleed for victory.
[Photo]
This isn't just a fight; it's a collision of two warriors who refuse to fold.
Fucking Finally!!!
Happy New Year, guys 🎉
Forwarded from Tech Nerd
Something to experiment with in 2018 🔥

@selfmadecoder
Forwarded from Data 2 Pattern
Dimension reduction

Dimension reduction is the process of reducing the number of variables (dimensions) in a dataset while keeping its most important information. It is a powerful technique for simplifying complex data, offering benefits such as improved computational efficiency, better model performance, and easier data visualization.

Why reduce dimensions?

💡 Curse of dimensionality: When a dataset has too many dimensions relative to the number of data points, it can become sparse, making it difficult for machine learning models to find meaningful patterns (a short sketch after this list makes this concrete).
🔑 Eliminate redundancy and noise: Datasets often contain variables that are highly correlated or irrelevant, adding noise and complexity that can confuse models.
📊 Improve visualization: The human brain is limited to visualizing data in two or three dimensions. Dimensionality reduction allows you to represent high-dimensional data in a way that is easier for people to understand.
🎯 Increase efficiency: Fewer dimensions mean less computational time and resources are needed to process the data, which is especially important for large datasets.
⚡️ Prevent overfitting: By simplifying the dataset and removing noise, a model is less likely to learn the random fluctuations in the data and more likely to generalize well to new data.
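
To make the curse-of-dimensionality point above concrete, here is a minimal sketch in plain NumPy. The data is synthetic (uniform points in a hypercube), and the sample count and dimensions are illustrative assumptions, not from the original post; it measures how the contrast between the nearest and farthest neighbor collapses as dimensions grow:

```python
# Curse of dimensionality sketch: with a fixed number of random points,
# pairwise distances concentrate as the dimension grows, so "nearest" and
# "farthest" neighbors become nearly indistinguishable.
import numpy as np

rng = np.random.default_rng(0)

for d in (2, 10, 100, 1000):
    X = rng.random((500, d))                      # 500 points in the unit hypercube
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from point 0 to the rest
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:4d}  relative contrast = {contrast:.3f}")
```

The printed contrast shrinks steadily as d grows, which is why distance-based models struggle in high dimensions and why reducing dimensions first often helps.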

Common techniques
There are two primary approaches to dimensionality reduction:

1. Feature extraction
This method transforms the original variables into a new, smaller set of variables (components) that are combinations of the original ones.
👉 Principal Component Analysis (PCA): A popular unsupervised method that creates new, uncorrelated components, ordered by the amount of variance they explain (see the sketch after this list).
👉 Exploratory Factor Analysis (EFA): An unsupervised method used to identify underlying, unobserved (latent) factors that cause the correlations among the observed variables.
👉 t-SNE (t-Distributed Stochastic Neighbor Embedding): A nonlinear method especially useful for visualizing high-dimensional data by placing similar data points closer together in a lower-dimensional space.
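
Here is a minimal PCA sketch using scikit-learn, as referenced above. The data is synthetic, and the 95% variance threshold is an illustrative assumption, not from the original post:

```python
# PCA sketch: 10 observed features whose signal really lives in ~2 directions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))  # 2 hidden factors
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))

# A float n_components keeps the fewest components explaining >= 95% of variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print("original shape:", X.shape)         # (200, 10)
print("reduced shape:", X_reduced.shape)  # (200, 2) on this data
print("explained variance:", pca.explained_variance_ratio_.round(3))
```

Note that the components come out ordered by explained variance, exactly as described above, so keeping the first few preserves most of the information.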

2. Feature selection
This method selects a subset of the most relevant original variables, discarding the rest. It does not transform the variables.

Filter methods: Use statistical measures to score features and keep the best ones, for example, by filtering out low-variance or highly correlated variables (a minimal sketch follows this list).
Wrapper methods: Evaluate different subsets of features by training and testing a model with each subset to see which performs best.
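
Here is a minimal filter-method sketch combining scikit-learn's VarianceThreshold with a pandas correlation filter. The toy data, column names, and thresholds (0.01 variance, 0.9 correlation) are illustrative assumptions:

```python
# Filter-method sketch: drop near-constant features, then drop one feature
# from each highly correlated pair. No model is trained along the way
# (that is what makes it a filter method rather than a wrapper method).
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "useful":     rng.normal(size=100),
    "near_const": np.full(100, 3.0) + 1e-4 * rng.normal(size=100),
})
df["redundant"] = df["useful"] * 2 + 0.01 * rng.normal(size=100)

# 1) Filter out low-variance features
vt = VarianceThreshold(threshold=0.01)
vt.fit(df)
kept = df.columns[vt.get_support()]

# 2) Among the survivors, drop one feature from each highly correlated pair
corr = df[kept].corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

selected = [c for c in kept if c not in to_drop]
print("selected features:", selected)  # ['useful'] on this toy data
```

A wrapper method would instead loop over candidate feature subsets, train the model on each, and keep the subset with the best validation score, which is more thorough but far more expensive.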

https://medium.com/@souravbanerjee423/demystify-the-power-of-dimensionality-reduction-in-machine-learning-26b70b882571

@data_to_pattern