Mike's ML Forge

RandomizedSearchCV (Faster Alternative)

🎯 How it works:

    Randomly selects a subset of hyperparameter combinations instead of trying all.
    Still uses cross-validation to evaluate performance.
    Saves time by focusing on random but diverse samples.

✅ Pros:
✔️ Much faster than GridSearchCV.
✔️ Works well when there are many hyperparameters.

❌ Cons:
❌ Might not find the absolute best combination (since it’s random).
❌ Less exhaustive compared to GridSearchCV.
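
A minimal sketch of the idea above, assuming a random forest on the iris dataset. The model, the parameter ranges, and n_iter=5 are illustrative choices, not something from the original post.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Distributions to sample from (illustrative ranges, not tuned values)
param_distributions = {
    "n_estimators": randint(10, 50),
    "max_depth": randint(2, 8),
    "min_samples_split": randint(2, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,         # only 5 randomly sampled combinations are tried
    cv=3,             # each one is scored with 3-fold cross-validation
    random_state=42,  # makes the sampling reproducible
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The cost is set by n_iter, not by the size of the search space, which is why this scales better when there are many hyperparameters.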

138 views · Mike, 20:45

Mike's ML Forge

GridSearchCV (Exhaustive Search)

🔍 How it works:

    Tries every possible combination of hyperparameters from a predefined set.
    Uses cross-validation to evaluate each combination.
    Selects the best performing set.

✅ Pros:
✔️ Finds the best combination within the predefined grid, since it checks every option.
✔️ A practical choice when the search space is small.

❌ Cons:
❌ Very slow if there are many parameters and values.
❌ Computationally expensive.
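
A minimal sketch of the exhaustive search, assuming an SVC on the iris dataset. The model and the grid values are illustrative assumptions, not from the original post.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Predefined set of values: every combination will be tried
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
}

grid = GridSearchCV(SVC(), param_grid=param_grid, cv=3)
grid.fit(X, y)

# 3 values of C times 2 kernels gives 6 candidates, each fitted 3 times (cv=3)
n_candidates = len(grid.cv_results_["params"])
print(n_candidates)
print(grid.best_params_)
```

Adding one more parameter with 4 values would turn 6 candidates into 24, which is where the slowdown comes from.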

168 views · Mike, 20:46

Mike's ML Forge

And you can clearly see the difference in accuracy 🙂

⚡5

157 views · Mike, 20:46

Mike's ML Forge

Ofc I'm in the classroom 😁

🔥5🤣1

191 views · Mike, 06:22

Mike's ML Forge

I already filled in and cleaned the missing values, and I'm looking at some visualisations too. Mannn, this is so fun 😅

👍2

185 views · Mike, 06:48

Mike's ML Forge

Even the new passport got LED light 😁

😁4

250 views · Mike, 07:54

Mike's ML Forge

😁4

215 views · Mike, 20:30

Feature Encoding 101: Prepare Data For Machine Learning

Covers various feature encoding methods, which turn all sorts of features into meaningful numerical representations.
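
As a rough illustration of two common encodings (ordinal and one-hot), here is a small sketch on a made-up DataFrame. The column names and values are hypothetical, and the linked article may cover other methods.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy data with two categorical features
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["small", "large", "medium", "small"],
})

# Ordinal encoding: for categories with a natural order (small < medium < large)
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_encoded = ordinal.fit_transform(df[["size"]])

# One-hot encoding: for categories with no order, one 0/1 column per category
onehot = OneHotEncoder(handle_unknown="ignore")
color_encoded = onehot.fit_transform(df[["color"]]).toarray()

print(size_encoded.ravel())   # one number per row, following the given order
print(onehot.categories_[0])  # column order of the one-hot matrix
print(color_encoded)
```

One-hot avoids implying an order between colors, at the cost of one extra column per category.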

472 views · Mike, edited 17:10

Mike's ML Forge

Sun’s up, ideas loading… Let’s go!🙌🏽

239 views · Mike, 04:53

Mike's ML Forge

If you ever see me staring at a flower for too long… don’t interrupt. It’s a moment of deep appreciation stg😭

❤6

333 views · Mike, 15:34

Mike's ML Forge

What if the dataset is imbalanced?

184 views · Mike, 22:02

Mike's ML Forge

What I meant by an imbalanced dataset: suppose the target classes are unequally distributed (let's say 90% class A, 10% class B). A plain train_test_split() might then under-represent the rare class in the test set, making model evaluation unreliable. Stratified splitting prevents either set from ending up with too many samples of one class.
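
To make the 90/10 case concrete, here is a small sketch on synthetic labels (the data is made up for illustration). It compares a plain split with one that passes stratify=y to train_test_split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels: 90 samples of class "A", 10 of class "B"
y = np.array(["A"] * 90 + ["B"] * 10)
X = np.arange(100).reshape(-1, 1)

# Plain random split: the number of "B" samples in the test set depends on the shuffle
_, _, _, y_test_plain = train_test_split(X, y, test_size=0.2, random_state=0)

# Stratified split: class proportions are preserved in both sets
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

n_b_plain = int((y_test_plain == "B").sum())
n_b_strat = int((y_test_strat == "B").sum())
print("B in plain test set:", n_b_plain, "of", len(y_test_plain))
print("B in stratified test set:", n_b_strat, "of", len(y_test_strat))
```

With stratification the 20-sample test set always holds 2 "B" samples (10%), while the plain split can land above or below that depending on the seed.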

185 views · Mike, 22:06

Mike's ML Forge

StratifiedShuffleSplit in Scikit-Learn

The StratifiedShuffleSplit class in Scikit-Learn splits a dataset into training and test sets while keeping roughly the same proportion of each category (stratum) in both sets. It is particularly useful for imbalanced datasets, where it ensures that the train and test sets have a similar distribution of the target variable.

In the housing example below, it prevents the test set from being skewed toward high- or low-income areas, which could happen with a simple random split.

from sklearn.model_selection import StratifiedShuffleSplit
import pandas as pd

# `housing` is assumed to be an already-loaded DataFrame with a "median_income" column.
# Creating income categories to stratify on
housing["income_category"] = pd.cut(housing["median_income"],
                                    bins=[0., 1.5, 3.0, 4.5, 6., float("inf")],
                                    labels=[1, 2, 3, 4, 5])

# Stratified split based on income category (a single 80/20 split)
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

# split() yields positional indices, so select rows with .iloc rather than .loc
for train_idx, test_idx in split.split(housing, housing["income_category"]):
    strat_train_set = housing.iloc[train_idx]
    strat_test_set = housing.iloc[test_idx]
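
One way to check that the split did its job is to compare category proportions in the full data and in the test set. The sketch below is self-contained, so it uses randomly generated incomes as a stand-in for the real housing data (an assumption, as are the distribution parameters).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Synthetic stand-in for the housing data (hypothetical values)
rng = np.random.default_rng(42)
housing = pd.DataFrame(
    {"median_income": rng.lognormal(mean=1.2, sigma=0.5, size=1000)}
)
housing["income_category"] = pd.cut(housing["median_income"],
                                    bins=[0., 1.5, 3.0, 4.5, 6., float("inf")],
                                    labels=[1, 2, 3, 4, 5])

split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(split.split(housing, housing["income_category"]))
strat_test_set = housing.iloc[test_idx]

# Share of each income category overall vs. in the stratified test set
overall = housing["income_category"].value_counts(normalize=True).sort_index()
in_test = strat_test_set["income_category"].value_counts(normalize=True).sort_index()

max_gap = float(np.abs(overall.to_numpy() - in_test.to_numpy()).max())
print(pd.DataFrame({"overall": overall, "test": in_test}))
print("largest difference in proportion:", round(max_gap, 4))
```

The two columns should match to within rounding, since the test set can only hold whole samples per category.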

👍2

238 views · Mike, 22:09

Mike's ML Forge

Finally got myself a begena (በገና) 😭

❤5😭2🔥1

305 views · Mike, 16:58

Mike's ML Forge

Forwarded from Sincerely yours

248 views · Mike, 12:56

Mike's ML Forge

Python Data Cleaning Cookbook.pdf

3.4 MB