Data Science Jupyter Notebooks – Telegram
12.5K subscribers
315 photos
49 videos
9 files
1.07K links
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
Top 100 Data Science Interview Questions

Data Science Basics
1. What is data science and how is it different from data analytics?
2. What are the key steps in a data science lifecycle?
3. What types of problems does data science solve?
4. What skills does a data scientist need in real projects?
5. What is the difference between structured and unstructured data?
6. What is exploratory data analysis and why do you do it first?
7. What are common data sources in real companies?
8. What is feature engineering?
9. What is the difference between supervised and unsupervised learning?
10. What is bias in data and how does it affect models?

Statistics and Probability
11. What is the difference between mean, median, and mode?
12. What is standard deviation and variance?
13. What is a probability distribution?
14. What is the normal distribution and where is it used?
15. What is skewness and kurtosis?
16. What is correlation vs causation?
17. What is hypothesis testing?
18. What are Type I and Type II errors?
19. What is a p-value?
20. What is a confidence interval?
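
For questions 11 and 12, a small worked example may help. The sketch below uses only Python's standard `statistics` module; the salary figures are made up for illustration and are not from any real dataset.

```python
import statistics

# Hypothetical sample: nine typical salaries (in thousands) plus one extreme value.
salaries = [40, 42, 42, 45, 47, 50, 52, 55, 58, 400]

mean = statistics.mean(salaries)      # arithmetic average, pulled upward by the outlier
median = statistics.median(salaries)  # middle of the sorted values, robust to the outlier
mode = statistics.mode(salaries)      # most frequent value

# Population variance and standard deviation (std_dev is the square root of variance).
variance = statistics.pvariance(salaries)
std_dev = statistics.pstdev(salaries)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.1f}, std_dev={std_dev:.1f}")
```

The single extreme value moves the mean well above the median, which is why the median is often preferred for skewed data.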

Data Cleaning and Preprocessing
21. How do you handle missing values?
22. How do you treat outliers?
23. What is data normalization and standardization?
24. When do you use Min-Max scaling vs Z-score?
25. How do you handle imbalanced datasets?
26. What is one-hot encoding?
27. What is label encoding?
28. How do you detect data leakage?
29. What is duplicate data and how do you handle it?
30. How do you validate data quality?
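
Questions 23 and 24 can be illustrated with a minimal sketch, assuming a small made-up list of ages. The helper names `min_max_scale` and `z_score_scale` are hypothetical, not library functions.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

ages = [18, 22, 30, 45, 60]  # hypothetical feature column
scaled = min_max_scale(ages)
standardized = z_score_scale(ages)

print(scaled)
print(standardized)
```

Min-Max scaling keeps a bounded range but is sensitive to extreme values, while Z-score standardization centers the data and expresses each value in units of standard deviation.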

Python for Data Science
31. Why is Python popular in data science?
32. Difference between list, tuple, set, and dictionary?
33. What is NumPy and why is it fast?
34. What is Pandas and where do you use it?
35. Difference between loc and iloc?
36. What are vectorized operations?
37. What is a lambda function?
38. What is a list comprehension?
39. How do you handle large datasets in Python?
40. What are common Python libraries used in data science?

Data Visualization
41. Why is data visualization important?
42. Difference between bar chart and histogram?
43. When do you use box plots?
44. What does a scatter plot show?
45. What are common mistakes in data visualization?
46. Difference between Seaborn and Matplotlib?
47. What is a heatmap used for?
48. How do you visualize distributions?
49. What is dashboarding?
50. How do you choose the right chart?

Machine Learning Basics
51. What is machine learning?
52. Difference between regression and classification?
53. What is overfitting and underfitting?
54. What is a train-test split?
55. What is cross-validation?
56. What is the bias-variance tradeoff?
57. What is feature selection?
58. What is model evaluation?
59. What is a baseline model?
60. How do you choose a model?

Supervised Learning
61. How does linear regression work?
62. Assumptions of linear regression?
63. What is logistic regression?
64. What is a decision tree?
65. What is a random forest?
66. What is KNN and when do you use it?
67. What is SVM?
68. How does Naive Bayes work?
69. What are ensemble methods?
70. How do you tune hyperparameters?

Unsupervised Learning
71. What is clustering?
72. Difference between K-means and hierarchical clustering?
73. How do you choose the value of K?
74. What is PCA?
75. Why is dimensionality reduction needed?
76. What is anomaly detection?
77. What is association rule mining?
78. What is DBSCAN?
79. What is cosine similarity?
80. Where is unsupervised learning used?
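
For question 79, cosine similarity is the dot product of two vectors divided by the product of their lengths. The function below is a plain-Python sketch with a hypothetical name; in practice a numerical library would usually be used.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])  # parallel vectors
orthogonal = cosine_similarity([1, 0], [0, 1])            # perpendicular vectors
opposite = cosine_similarity([1, 0], [-1, 0])             # opposite directions

print(same_direction, orthogonal, opposite)
```

The result depends only on direction, not magnitude, which is why it is common for comparing text or embedding vectors of different lengths.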

Model Evaluation Metrics
81. What is accuracy and when is it misleading?
82. What is precision and recall?
83. What is the F1 score?
84. What is an ROC curve?
85. What is AUC?
86. Difference between confusion matrix metrics?
87. What is log loss?
88. What is RMSE?
89. What metric do you use for imbalanced data?
90. How do business metrics link to ML metrics?
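
Questions 81 to 83 and 89 are easier to discuss with numbers. The sketch below uses made-up labels (95 negatives, 5 positives) and a model that always predicts the majority class; the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical imbalanced dataset: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a "model" that always predicts the majority class

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy}, precision={precision}, recall={recall}, f1={f1}")
```

Accuracy looks high here even though the model never finds a single positive case, so recall and F1 are both zero. That gap is why accuracy alone is misleading on imbalanced data.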
🔥 Trending Repository: qui

📝 Description: A fast, single-binary qBittorrent web UI: manage multiple instances, automate torrent workflows, and cross-seed across trackers.

🔗 Repository URL: https://github.com/autobrr/qui

🌐 Website: https://getqui.com

📖 Readme: https://github.com/autobrr/qui#readme

📊 Statistics:
🌟 Stars: 2.6K stars
👀 Watchers: 8
🍴 Forks: 74 forks

💻 Programming Languages: Go - TypeScript - CSS - Python - Makefile - HTML

🏷️ Related Topics:
#go #golang #qbittorrent #libtorrent #workflows #qbit #cross_seed #cross_seeding


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: nanochat

📝 Description: The best ChatGPT that $100 can buy.

🔗 Repository URL: https://github.com/karpathy/nanochat

📖 Readme: https://github.com/karpathy/nanochat#readme

📊 Statistics:
🌟 Stars: 41.4K stars
👀 Watchers: 289
🍴 Forks: 5.4K forks

💻 Programming Languages: Python - Jupyter Notebook - HTML - Shell

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: rag-from-scratch

📝 Description: No description available

🔗 Repository URL: https://github.com/langchain-ai/rag-from-scratch

📖 Readme: https://github.com/langchain-ai/rag-from-scratch#readme

📊 Statistics:
🌟 Stars: 6.8K stars
👀 Watchers: 60
🍴 Forks: 1.8K forks

💻 Programming Languages: Jupyter Notebook

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: review-prompts

📝 Description: AI review prompts

🔗 Repository URL: https://github.com/masoncl/review-prompts

📖 Readme: https://github.com/masoncl/review-prompts#readme

📊 Statistics:
🌟 Stars: 192 stars
👀 Watchers: 9
🍴 Forks: 29 forks

💻 Programming Languages: Python - Shell

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: skills

📝 Description: Skills Catalog for Codex

🔗 Repository URL: https://github.com/openai/skills

📖 Readme: https://github.com/openai/skills#readme

📊 Statistics:
🌟 Stars: 2.6K stars
👀 Watchers: 26
🍴 Forks: 166 forks

💻 Programming Languages: Python - Shell - JavaScript

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: ccpm

📝 Description: Project management system for Claude Code using GitHub Issues and Git worktrees for parallel agent execution.

🔗 Repository URL: https://github.com/automazeio/ccpm

🌐 Website: https://automaze.io/ccpm

📖 Readme: https://github.com/automazeio/ccpm#readme

📊 Statistics:
🌟 Stars: 6.5K stars
👀 Watchers: 39
🍴 Forks: 684 forks

💻 Programming Languages: Shell - Batchfile

🏷️ Related Topics:
#project_management #ai_agents #claude #ai_coding #vibe_coding #claude_code


==================================
🔥 Trending Repository: vm0

📝 Description: the easiest way to run natural language-described workflows automatically

🔗 Repository URL: https://github.com/vm0-ai/vm0

🌐 Website: https://vm0.ai

📖 Readme: https://github.com/vm0-ai/vm0#readme

📊 Statistics:
🌟 Stars: 522 stars
👀 Watchers: 1
🍴 Forks: 20 forks

💻 Programming Languages: TypeScript - MDX - Shell - CSS - Rust - JavaScript

🏷️ Related Topics:
#react #cli #typescript #containers #sandbox #cloudflare #codex #dev_tools #ai_agent #ai_runtime #gemini_cli #agentic_workflow #claude_code #context_engineer #ai_sandbox


==================================
🔥 Trending Repository: claude-code-hooks-mastery

📝 Description: Master Claude Code Hooks

🔗 Repository URL: https://github.com/disler/claude-code-hooks-mastery

📖 Readme: https://github.com/disler/claude-code-hooks-mastery#readme

📊 Statistics:
🌟 Stars: 2.3K stars
👀 Watchers: 52
🍴 Forks: 498 forks

💻 Programming Languages: Python - TypeScript

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: anki

📝 Description: Anki is a smart spaced repetition flashcard program

🔗 Repository URL: https://github.com/ankitects/anki

🌐 Website: https://apps.ankiweb.net

📖 Readme: https://github.com/ankitects/anki#readme

📊 Statistics:
🌟 Stars: 26.1K stars
👀 Watchers: 349
🍴 Forks: 2.8K forks

💻 Programming Languages: Rust - Python - Svelte - TypeScript - SCSS - Shell

🏷️ Related Topics: Not available

==================================
🔥 Trending Repository: opentelemetry-collector-contrib

📝 Description: Contrib repository for the OpenTelemetry Collector

🔗 Repository URL: https://github.com/open-telemetry/opentelemetry-collector-contrib

🌐 Website: https://opentelemetry.io

📖 Readme: https://github.com/open-telemetry/opentelemetry-collector-contrib#readme

📊 Statistics:
🌟 Stars: 4.3K stars
👀 Watchers: 62
🍴 Forks: 3.3K forks

💻 Programming Languages: Go - Makefile - Go Template - Shell - Dockerfile - Jinja

🏷️ Related Topics:
#opentelemetry #open_telemetry


==================================
🔥 Trending Repository: likec4

📝 Description: Visualize, collaborate, and evolve the software architecture with always actual and live diagrams from your code

🔗 Repository URL: https://github.com/likec4/likec4

🌐 Website: https://likec4.dev

📖 Readme: https://github.com/likec4/likec4#readme

📊 Statistics:
🌟 Stars: 1.4K stars
👀 Watchers: 17
🍴 Forks: 117 forks

💻 Programming Languages: TypeScript - MDX - Astro - JavaScript - CSS - Langium

🏷️ Related Topics:
#architecture #diagrams #c4 #architecture_as_code


==================================
🤖 A tool that builds ML models from a text description

An agent system inside automates the whole ML development cycle, from idea to finished solution, with no manual fiddling with architectures or pipelines.

How it works:
You describe the task in plain text and provide the data. If needed, the system infers the schema itself.

Under the hood, a group of AI agents works together: one designs the model, a second writes the code, and a third evaluates quality and fixes errors.

If data is scarce, the system can generate a synthetic dataset for testing.

Ray is supported for parallel model exploration and scaling across cores or clusters.

It connects to cloud or local models via LiteLLM.


It's ideal for rapid prototyping and experiments where getting a working result quickly matters. Get it here:
https://github.com/plexe-ai/plexe
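
The design, code, and evaluate roles described in this post can be pictured with a toy loop. This is only a sketch of the general pattern: it is not plexe's API or implementation, every function name is hypothetical, the "agents" are plain functions rather than LLM calls, and the error-correction step is left out.

```python
from statistics import mean

# Toy stand-ins for the three agent roles; none of this is plexe's actual API.

def design_agent():
    """Propose candidate model designs (here, names of trivial predictors)."""
    return ["predict_mean", "predict_last"]

def coding_agent(design):
    """Turn a design name into a callable that predicts the next value."""
    builders = {
        "predict_mean": lambda history: mean(history),
        "predict_last": lambda history: history[-1],
    }
    return builders[design]

def evaluation_agent(model, series):
    """Score a model by mean absolute error of one-step-ahead predictions."""
    errors = [abs(model(series[:i]) - series[i]) for i in range(1, len(series))]
    return mean(errors)

def build_model(series):
    """Run the design -> code -> evaluate loop and keep the best candidate."""
    scored = []
    for design in design_agent():
        model = coding_agent(design)
        scored.append((evaluation_agent(model, series), design))
    best_error, best_design = min(scored)
    return best_design, best_error

series = [10, 12, 14, 16, 18, 20]  # made-up, steadily increasing data
best_design, best_error = build_model(series)
print(best_design, best_error)
```

On this steadily increasing series the last-value predictor has the lower error, so the loop selects it.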

tags: #useful

@DataScienceN
Data Science Project Ideas

1️⃣ Beginner Friendly Projects
• Exploratory Data Analysis (EDA) on CSV datasets
• Student Marks Analysis
• COVID / Weather Data Analysis
• Simple Data Visualization Dashboard
• Basic Recommendation System (rule-based)

2️⃣ Python for Data Science
• Sales Data Analysis using Pandas
• Web Scraping + Analysis (BeautifulSoup)
• Data Cleaning & Preprocessing Project
• Movie Rating Analysis
• Stock Price Analysis (historical data)

3️⃣ Machine Learning Projects
• House Price Prediction
• Spam Email Classifier
• Loan Approval Prediction
• Customer Churn Prediction
• Iris / Titanic Dataset Classification

4️⃣ Data Visualization Projects
• Interactive Dashboard using Matplotlib/Seaborn
• Sales Performance Dashboard
• Social Media Analytics Dashboard
• COVID Trends Visualization
• Country-wise GDP Analysis

5️⃣ NLP (Text & Language) Projects
• Sentiment Analysis on Reviews
• Resume Screening System
• Fake News Detection
• Chatbot (Rule-based → ML-based)
• Topic Modeling on Articles

6️⃣ Advanced ML / AI Projects
• Recommendation System (Collaborative Filtering)
• Credit Card Fraud Detection
• Image Classification (CNN basics)
• Face Mask Detection
• Speech-to-Text Analysis

7️⃣ Data Engineering / Big Data
• ETL Pipeline using Python
• Data Warehouse Design (Star Schema)
• Log File Analysis
• API Data Ingestion Project
• Batch Processing with Large Datasets

8️⃣ Real-World / Portfolio Projects
• End-to-End Data Science Project
• Business Problem → Data → Model → Insights
• Kaggle Competition Project
• Open Dataset Case Study
• Automated Data Reporting Tool
🔥 Trending Repository: cognee

📝 Description: Memory for AI Agents in 6 lines of code

🔗 Repository URL: https://github.com/topoteretes/cognee

🌐 Website: https://www.cognee.ai

📖 Readme: https://github.com/topoteretes/cognee#readme

📊 Statistics:
🌟 Stars: 11.7K stars
👀 Watchers: 59
🍴 Forks: 1.2K forks

💻 Programming Languages: Python - TypeScript - Shell - Dockerfile - CSS - Mako

🏷️ Related Topics:
#open_source #ai #knowledge #neo4j #knowledge_graph #openai #help_wanted #graph_database #ai_agents #contributions_welcome #cognitive_architecture #good_first_issue #rag #good_first_pr #vector_database #graph_rag #ai_memory #cognitive_memory #graphrag #context_engineering


==================================
🔥 Trending Repository: fish-shell

📝 Description: The user-friendly command line shell.

🔗 Repository URL: https://github.com/fish-shell/fish-shell

🌐 Website: https://fishshell.com

📖 Readme: https://github.com/fish-shell/fish-shell#readme

📊 Statistics:
🌟 Stars: 32.3K stars
👀 Watchers: 279
🍴 Forks: 2.2K forks

💻 Programming Languages: Rust - Shell - Python - HTML - JavaScript - CMake

🏷️ Related Topics:
#shell #rust #fish #terminal


==================================
🔥 Trending Repository: prompt-optimizer

📝 Description: A prompt optimizer that helps you write high-quality prompts

🔗 Repository URL: https://github.com/linshenkx/prompt-optimizer

🌐 Website: https://prompt.always200.com

📖 Readme: https://github.com/linshenkx/prompt-optimizer#readme

📊 Statistics:
🌟 Stars: 19.2K stars
👀 Watchers: 77
🍴 Forks: 2.4K forks

💻 Programming Languages: TypeScript - Vue - JavaScript - Shell - CSS - Dockerfile

🏷️ Related Topics:
#prompt #prompt_toolkit #prompt_tuning #llm #prompt_engineering #prompt_optimization


==================================
🔥 Trending Repository: anet

📝 Description: Simple Rust VPN Client / Server

🔗 Repository URL: https://github.com/ZeroTworu/anet

📖 Readme: https://github.com/ZeroTworu/anet#readme

📊 Statistics:
🌟 Stars: 268 stars
👀 Watchers: 15
🍴 Forks: 20 forks

💻 Programming Languages: Rust - Inno Setup - Shell - Makefile

🏷️ Related Topics:
#rust #vpn


==================================
🔥 Trending Repository: data-engineer-handbook

📝 Description: This is a repo with links to everything you'd ever want to learn about data engineering

🔗 Repository URL: https://github.com/DataExpert-io/data-engineer-handbook

📖 Readme: https://github.com/DataExpert-io/data-engineer-handbook#readme

📊 Statistics:
🌟 Stars: 39.7K stars
👀 Watchers: 466
🍴 Forks: 7.6K forks

💻 Programming Languages: Jupyter Notebook - Python - Makefile - Dockerfile - Shell

🏷️ Related Topics:
#data #awesome #sql #bigdata #dataengineering #apachespark


==================================