Data science/ML/AI – Telegram
Data science and machine learning hub

Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources.

For beginners, data scientists and ML engineers
👉 https://rebrand.ly/bigdatachannels

DMCA: @disclosure_bds
Contact: @mldatascientist
📚 Data Science Riddle

Which algorithm is most sensitive to feature scaling?
Anonymous Quiz
25%
Decision Tree
24%
Random Forest
36%
KNN
15%
Naive Bayes
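
One way to see why distance-based methods such as KNN react to feature scaling, while tree splits do not, is to compare nearest neighbours before and after rescaling. The sketch below uses only the Python standard library; the age/income rows and the helper names `nearest` and `min_max_scale` are made up for illustration.

```python
import math

def nearest(query, points):
    """Return the index of the point closest to query (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: math.dist(query, points[i]))

def min_max_scale(rows):
    """Rescale every column to the [0, 1] range."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]

# Hypothetical rows: (age in years, income in dollars)
people = [(25, 50_400), (60, 50_900), (27, 52_000)]
query = (25, 51_000)

# Raw features: income differences are numerically huge, so they dominate the distance.
raw_idx = nearest(query, people)

# Scaled features: age and income contribute on a comparable footing.
scaled = min_max_scale(people + [query])
scaled_idx = nearest(scaled[-1], scaled[:-1])

print(raw_idx, scaled_idx)
```

On raw values the closest match is the 60-year-old with a similar income; after scaling it is the person of the same age. A tree that splits on one feature at a time would pick the same partitions either way.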
Great Packages for R
Big Data 5V
📚 Data Science Riddle

Why does bagging reduce variance?
Anonymous Quiz
13%
Uses deeper trees
50%
Averages multiple models
29%
Penalizes weights
9%
Learns sequentially
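
The averaging effect behind bagging can be illustrated with a small simulation. This is not a full bagging implementation (no bootstrap resampling, no real learners); it assumes each "model" is just the true value plus independent Gaussian noise, which is a best case. Real bagged models are correlated, so the reduction is smaller in practice.

```python
import random
import statistics

def noisy_model(rng):
    """Stand-in for one high-variance model: true value 10 plus Gaussian noise."""
    return 10.0 + rng.gauss(0, 2.0)

def averaged_models(rng, n_models=25):
    """Average the predictions of n_models independent noisy models."""
    return statistics.fmean(noisy_model(rng) for _ in range(n_models))

rng = random.Random(0)  # fixed seed so the run is repeatable
single = [noisy_model(rng) for _ in range(2000)]
averaged = [averaged_models(rng) for _ in range(2000)]

var_single = statistics.pvariance(single)
var_averaged = statistics.pvariance(averaged)
print(f"single model variance:   {var_single:.2f}")
print(f"averaged model variance: {var_averaged:.2f}")
```

With independent errors, averaging n models divides the variance by roughly n while leaving the expected prediction unchanged.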
📊 Infographic Elements That Every Data Person Should Master 🚀

After years of working with data, I can tell you one thing:
👉 The chart you choose is as important as the data itself.

Here’s your quick visual toolkit 👇

🔹 Timelines

* Sequential: great for processes
* Scaled: best for real dates/events

🔹 Circular Charts

* Donut 🍩 & Pie 🥧 for proportions
* Radial 🌌 for progress or cycles
* Venn 🎯 when you want to show overlaps

🔹 Creative Comparisons

* Bubble 🫧 & Area 🔵 for impact by size
* Dot Matrix 🔴 for colorful distributions
* Pictogram 👥 when storytelling matters most

🔹 Classic Must-Haves

* Bar 📊 & Histogram 📏 (clear, reliable)
* Line 📈 for trends
* Area 🌊 & Stacked Area for the “big picture”

🔹 Advanced Tricks

* Stacked Bar 🏗 when categories add up
* Span 📐 for ranges
* Arc 🌈 for relationships

💡 Pro tip from experience:
If your audience doesn’t “get it” in 3 seconds, change the chart. The best visualizations speak louder than numbers.
Most Common Data Science Skills in Job Postings
Machine Learning Cheatsheet
📚 Data Science Riddle

Which metric is best for imbalanced classification?
Anonymous Quiz
20%
Accuracy
18%
Precision
18%
Recall
44%
F1-Score
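
A quick way to see why accuracy misleads on imbalanced data is to score a classifier that never predicts the rare class. The sketch below is plain Python; the 5%-positive labels and both sets of predictions are invented for illustration.

```python
def scores(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

# Classifier A always predicts the majority class.
always_negative = [0] * 100
# Classifier B finds 4 of the 5 positives at the cost of 6 false alarms.
detector = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

result_a = scores(y_true, always_negative)
result_b = scores(y_true, detector)
print(result_a)
print(result_b)
```

Classifier A wins on accuracy (0.95 vs 0.93) while catching no positives at all; F1 ranks the two the other way round (0.0 vs about 0.53).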
SQL JOINS
Introduction To Linear Regression
📚 Data Science Riddle

A dataset has 20% missing values in a critical column. What's the most practical choice?
Anonymous Quiz
6%
Drop all rows
49%
Fill with mean/median
41%
Use model-based imputation
5%
Ignore missing data
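
The two popular answers can be compared side by side. The sketch below fills a gap with the median and, alternatively, with a prediction from a least-squares line fitted on a related column. The columns `years_experience` and `salary_k` and both helper functions are hypothetical; a real project might use a library imputer instead.

```python
import statistics

def fill_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def fill_linear(xs, ys):
    """Replace None entries in ys with predictions from a line fitted on complete pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mx = statistics.fmean(x for x, _ in pairs)
    my = statistics.fmean(y for _, y in pairs)
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    intercept = my - slope * mx
    return [intercept + slope * x if y is None else y for x, y in zip(xs, ys)]

# Hypothetical data: salary (in thousands) grows with experience; one value is missing.
years_experience = [1, 2, 3, 4, 5]
salary_k = [40, None, 60, 70, 80]

median_filled = fill_median(salary_k)
model_filled = fill_linear(years_experience, salary_k)
print(median_filled)
print(model_filled)
```

The median fill ignores the other column and inserts 65 for someone with 2 years of experience, while the model-based fill uses the trend and inserts 50. Which one is more practical depends on how strongly other columns predict the missing one.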
ML models don’t all think alike 🤖

❇️ Naive Bayes = probability
❇️ KNN = proximity
❇️ Discriminant Analysis = decision boundaries

Different paths, same goal: accurate classification.

Which one do you reach for first?
📚 Data Science Riddle

In a medical diagnosis project, what's more important?
Anonymous Quiz
34%
High precision
15%
High recall
37%
High accuracy
14%
High F1-score
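
In screening-style diagnosis, a missed case is often costlier than a false alarm, which is why recall tends to get the attention. The sketch below shows the trade-off by moving a decision threshold over made-up risk scores; the labels, scores and thresholds are all hypothetical.

```python
def precision_recall(y_true, risk_scores, threshold):
    """Flag a patient as positive when the risk score reaches the threshold."""
    preds = [s >= threshold for s in risk_scores]
    tp = sum(p and t == 1 for p, t in zip(preds, y_true))
    fp = sum(p and t == 0 for p, t in zip(preds, y_true))
    fn = sum((not p) and t == 1 for p, t in zip(preds, y_true))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical patients: 1 = has the condition, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
risk = [0.9, 0.8, 0.55, 0.3, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05]

strict = precision_recall(y_true, risk, threshold=0.7)
lenient = precision_recall(y_true, risk, threshold=0.25)
print("strict :", strict)
print("lenient:", lenient)
```

The strict threshold flags only sure cases (precision 1.0) but misses half the patients (recall 0.5). The lenient threshold catches every patient (recall 1.0) and accepts more false alarms (precision 4/7).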
Important LLM Terms

🔹 Transformer Architecture
🔹 Attention Mechanism
🔹 Pre-training
🔹 Fine-tuning
🔹 Parameters
🔹 Self-Attention
🔹 Embeddings
🔹 Context Window
🔹 Masked Language Modeling (MLM)
🔹 Causal Language Modeling (CLM)
🔹 Multi-Head Attention
🔹 Tokenization
🔹 Zero-Shot Learning
🔹 Few-Shot Learning
🔹 Transfer Learning
🔹 Overfitting
🔹 Inference

🔹 Language Model Decoding
🔹 Hallucination
🔹 Latency
Cheatsheet: Bayes' Theorem and Classifier
Why is Kafka Called Kafka?

Here’s a fun fact that surprises a lot of people.

The “Kafka” you use for real-time data pipelines is… named after the novelist Franz Kafka.

Why? Jay Kreps (one of its creators) once explained it simply:

- He liked the name.
- It sounded mysterious.
- And Kafka (the author) wrote a lot.

That last part is key.
Because Apache Kafka is all about writing: streams of events, logs, and data in motion.
So the name stuck.

Today, millions of engineers across the globe talk about “Kafka” every single day… and most don’t realize they’re also invoking a 20th-century novelist.

It's funny how small choices like naming your project can shape how the world remembers it.
📚 Data Science Riddle

Why do CNNs use pooling layers?
Anonymous Quiz
51%
Reduce dimensionality
16%
Increase non-linearity
13%
Normalize activations
21%
Improve learning rate
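
The dimensionality-reduction effect of pooling is easy to see on a toy feature map. The sketch below implements 2x2 max pooling with stride 2 in plain Python; the 4x4 values are arbitrary, and real frameworks provide this as a built-in layer.

```python
def max_pool_2x2(grid):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]) - 1, 2)]
        for r in range(0, len(grid) - 1, 2)
    ]

# Arbitrary 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Sixteen activations shrink to four, so later layers have less to compute, and a small shift of a strong activation inside its window leaves the output unchanged.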
Data Analyst 🆚 Data Engineer: Key Differences

Confused about the roles of a Data Analyst and Data Engineer? 🤔 Here's a breakdown:

👨‍💻 Data Analyst:

🎯 Role: Analyzes, interprets, & visualizes data to extract insights for business decisions.

👍 Best For: Those who enjoy finding patterns, trends, & actionable insights.

🔑 Responsibilities:
  🧹 Cleaning & organizing data.
  📊 Using tools like Excel, Power BI, Tableau & SQL.
  📝 Creating reports & dashboards.
  🤝 Collaborating with business teams.

Skills: Analytical skills, SQL, Excel, reporting tools, statistical analysis, business intelligence.

Outcome: Guides decision-making in business, marketing, finance, etc.

⚙️ Data Engineer:

🏗️ Role: Designs, builds, & maintains data infrastructure.

👍 Best For: Those who enjoy technical data management & architecture for large-scale analysis.

🔑 Responsibilities:
  🗄️ Managing databases & data pipelines.
  🔄 Developing ETL processes.
  🔒 Ensuring data quality & security.
  ☁️ Working with big data technologies like Hadoop, Spark, AWS, Azure & Google Cloud.

Skills: Python, Java, Scala, database management, big data tools, data architecture, cloud technologies.

Outcome: Creates infrastructure & pipelines for efficient data flow for analysis.

In short: Data Analysts extract insights, while Data Engineers build the systems for data storage, processing, & analysis. Data Analysts focus on business outcomes, while Data Engineers focus on the technical foundation.