📚 Data Science Riddle
Which algorithm is most sensitive to feature scaling?
Anonymous Quiz
25%
Decision Tree
24%
Random Forest
36%
KNN
15%
Naive Bayes
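💡 Why KNN tops this poll: it ranks neighbors by distance, so a feature with big numbers can drown out the rest. Here is a minimal sketch with made-up data (the ages, incomes, and labels are all hypothetical) where the nearest neighbor flips once both features are min-max scaled. Tree splits, by contrast, compare one feature against a threshold, so rescaling a feature does not change them.

```python
import math

# Hypothetical toy data: (age in years, income in dollars) -> label.
train = [
    ((25, 50_000), "A"),
    ((60, 50_500), "B"),
    ((40, 90_000), "C"),
]
query = (27, 50_400)


def nearest_label(points, q):
    """Return the label of the training point closest to q (1-NN, Euclidean)."""
    return min(points, key=lambda p: math.dist(p[0], q))[1]


def min_max_scaler(rows):
    """Build a function that rescales each feature to [0, 1] using the rows' ranges."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1 for c in cols]
    return lambda row: tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))


# On raw values, income differences (hundreds) swamp age differences (tens).
raw_label = nearest_label(train, query)

# After scaling, both features contribute on a comparable footing.
scale = min_max_scaler([x for x, _ in train])
scaled_train = [(scale(x), y) for x, y in train]
scaled_label = nearest_label(scaled_train, scale(query))

print(raw_label, scaled_label)  # B A
```

Same data, same query, different neighbor. Whether scaling helps on a real dataset depends on which features actually carry signal.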
📚 Data Science Riddle
Why does bagging reduce variance?
Anonymous Quiz
13%
Uses deeper trees
50%
Averages multiple models
29%
Penalizes weights
9%
Learns sequentially
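💡 The "averages multiple models" answer can be checked with a small simulation. This is a rough sketch under stated assumptions, not a benchmark: the data generator, the 1-nearest-neighbor "model", and the numbers (20 points, 25 bootstrap models, 500 trials) are all hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(0)


def make_dataset(n=20):
    """Hypothetical noisy data: the true target is 0 everywhere, plus unit Gaussian noise."""
    return [(random.random(), random.gauss(0, 1)) for _ in range(n)]


def one_nn_predict(data, x0):
    """A deliberately high-variance model: copy the target of the nearest point."""
    return min(data, key=lambda p: abs(p[0] - x0))[1]


def bagged_predict(data, x0, n_models=25):
    """Average 1-NN predictions over bootstrap resamples of the data."""
    preds = []
    for _ in range(n_models):
        resample = random.choices(data, k=len(data))  # sample with replacement
        preds.append(one_nn_predict(resample, x0))
    return statistics.fmean(preds)


single, bagged = [], []
for _ in range(500):  # repeat over many fresh training sets
    data = make_dataset()
    single.append(one_nn_predict(data, 0.5))
    bagged.append(bagged_predict(data, 0.5))

var_single = statistics.pvariance(single)
var_bagged = statistics.pvariance(bagged)
print(f"single model variance: {var_single:.2f}")
print(f"bagged model variance: {var_bagged:.2f}")
```

The bagged prediction spreads its weight over several nearby points instead of betting on one, so it swings less from one training set to the next. The gain shrinks when the individual models are highly correlated.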
📊 Infographic Elements That Every Data Person Should Master 🚀
After years of working with data, I can tell you one thing:
👉 The chart you choose is as important as the data itself.
Here’s your quick visual toolkit 👇
🔹 Timelines
* Sequential ⏩ great for processes
* Scaled ⏳ best for real dates/events
🔹 Circular Charts
* Donut 🍩 & Pie 🥧 for proportions
* Radial 🌌 for progress or cycles
* Venn 🎯 when you want to show overlaps
🔹 Creative Comparisons
* Bubble 🫧 & Area 🔵 for impact by size
* Dot Matrix 🔴 for colorful distributions
* Pictogram 👥 when storytelling matters most
🔹 Classic Must-Haves
* Bar 📊 & Histogram 📏 (clear, reliable)
* Line 📈 for trends
* Area 🌊 & Stacked Area for the “big picture”
🔹 Advanced Tricks
* Stacked Bar 🏗 when categories add up
* Span 📐 for ranges
* Arc 🌈 for relationships
💡 Pro tip from experience:
If your audience doesn’t “get it” in 3 seconds, change the chart. The best visualizations speak louder than numbers.
📚 Data Science Riddle
Which metric is best for imbalanced classification?
Anonymous Quiz
20%
Accuracy
18%
Precision
18%
Recall
44%
F1-Score
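💡 A quick worked example of why accuracy can mislead here. The confusion-matrix counts below are made up for illustration (100 samples, only 5 positive); "best" still depends on what a false alarm or a missed case costs in your project.

```python
# Hypothetical confusion-matrix counts for 100 samples, only 5 of them positive.
tp, fn = 2, 3    # positives caught / positives missed
fp, tn = 1, 94   # false alarms / correct negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.96 precision=0.67 recall=0.40 f1=0.50
```

Accuracy looks great at 0.96 even though the model misses 3 of the 5 positives. F1 sits at 0.50 because it only looks at how the positive class is handled.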
📚 Data Science Riddle
A dataset has 20% missing values in a critical column. What's the most practical choice?
Anonymous Quiz
6%
Drop all rows
49%
Fill with mean/median
41%
Use model-based imputation
5%
Ignore missing data
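💡 The simplest of these options, filling with the mean or median, can be sketched in a few lines. The `ages` column below is hypothetical (2 of 10 values missing, plus one outlier) and is only meant to show how the two fill values differ. Model-based imputation is not shown here.

```python
import statistics

# Hypothetical column with 2 of 10 values missing (20%) and one outlier (95).
ages = [22, 25, None, 27, 30, 31, None, 29, 26, 95]

observed = [v for v in ages if v is not None]
mean_fill = statistics.fmean(observed)      # pulled upward by the outlier
median_fill = statistics.median(observed)   # robust to the outlier

# Replace each missing entry with the median of the observed values.
filled_with_median = [median_fill if v is None else v for v in ages]

print(f"mean={mean_fill:.1f} median={median_fill}")
print(filled_with_median)
```

With a skewed column like this one, the median is usually the safer default fill. Either way, every missing row gets the same value, which shrinks the column's spread.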
ML models don’t all think alike 🤖
❇️ Naive Bayes = probability
❇️ KNN = proximity
❇️ Discriminant Analysis = decision boundaries
Different paths, same goal: accurate classification.
Which one do you reach for first?
📚 Data Science Riddle
In a medical diagnosis project, what's more important?
Anonymous Quiz
34%
High precision
15%
High recall
37%
High accuracy
14%
High F1-score
Important LLM Terms
🔹 Transformer Architecture
🔹 Attention Mechanism
🔹 Pre-training
🔹 Fine-tuning
🔹 Parameters
🔹 Self-Attention
🔹 Embeddings
🔹 Context Window
🔹 Masked Language Modeling (MLM)
🔹 Causal Language Modeling (CLM)
🔹 Multi-Head Attention
🔹 Tokenization
🔹 Zero-Shot Learning
🔹 Few-Shot Learning
🔹 Transfer Learning
🔹 Overfitting
🔹 Inference
🔹 Language Model Decoding
🔹 Hallucination
🔹 Latency
Why is Kafka Called Kafka❔
Here’s a fun fact that surprises a lot of people.
The “Kafka” you use for real-time data pipelines is… named after the novelist Franz Kafka.
Why? Jay Kreps (one of Kafka's co-creators) once explained it simply:
- He liked the name.
- It sounded mysterious.
- And Kafka (the author) wrote a lot.
That last part is key.
Because Apache Kafka is all about writing: streams of events, logs, and data in motion.
So the name stuck.
Today, millions of engineers across the globe talk about “Kafka” every single day… and most don’t realize they’re also invoking a 20th-century novelist.
It's funny how small choices like naming your project can shape how the world remembers it.