Data Science Jupyter Notebooks – Telegram
12.3K subscribers
307 photos
48 videos
9 files
965 links
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
🔥 Trending Repository: UltraRAG

📝 Description: UltraRAG v3: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

🔗 Repository URL: https://github.com/OpenBMB/UltraRAG

🌐 Website: https://ultrarag.openbmb.cn/

📖 Readme: https://github.com/OpenBMB/UltraRAG#readme

📊 Statistics:
🌟 Stars: 2.7K stars
👀 Watchers: 27
🍴 Forks: 231 forks

💻 Programming Languages: Python - JavaScript - CSS - HTML - Jinja - Shell - Dockerfile

🏷️ Related Topics:
#flask #demo #ui #mcp #openai #easy #gpt #embedding #vlm #multimodal #rag #sentence_transformers #huggingface_transformers #llm #vllm #qwen #deepseek


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: airllm

📝 Description: AirLLM 70B inference with single 4GB GPU

🔗 Repository URL: https://github.com/lyogavin/airllm

📖 Readme: https://github.com/lyogavin/airllm#readme

📊 Statistics:
🌟 Stars: 8.4K stars
👀 Watchers: 150
🍴 Forks: 771 forks

💻 Programming Languages: Jupyter Notebook - Python - Shell

🏷️ Related Topics:
#open_source #chinese_nlp #llama #lora #instruction_set #finetune #open_source_models #open_models #llm #generative_ai #instruct_gpt #qlora #chinese_llm


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
These Google Colab notebooks help you implement machine learning algorithms from scratch 🤯

Repo: https://udlbook.github.io/udlbook/


👉 @codeprogrammer
🔥 Trending Repository: mlx-audio

📝 Description: A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

🔗 Repository URL: https://github.com/Blaizzy/mlx-audio

📖 Readme: https://github.com/Blaizzy/mlx-audio#readme

📊 Statistics:
🌟 Stars: 3.4K stars
👀 Watchers: 32
🍴 Forks: 285 forks

💻 Programming Languages: Python - TypeScript

🏷️ Related Topics:
#text_to_speech #transformers #speech_synthesis #speech_recognition #speech_to_text #audio_processing #mlx #multimodal #apple_silicon


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: res-downloader

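📝 Description: Download common network resources such as WeChat Channels, mini programs, Douyin, Kuaishou, Xiaohongshu, live streams, m3u8, KuGou Music, QQ Music, and more!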

🔗 Repository URL: https://github.com/putyy/res-downloader

🌐 Website: https://github.com/putyy/res-downloader

📖 Readme: https://github.com/putyy/res-downloader#readme

📊 Statistics:
🌟 Stars: 14.2K stars
👀 Watchers: 85
🍴 Forks: 1.8K forks

💻 Programming Languages: Go - Vue - NSIS - TypeScript - JavaScript - CSS - HTML

🏷️ Related Topics:
#wechat #kuaishou #douyin #xiaohongshu #wechat_video #res_downloader


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: FinRobot

📝 Description: FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀

🔗 Repository URL: https://github.com/AI4Finance-Foundation/FinRobot

🌐 Website: https://finrobot.ai

📖 Readme: https://github.com/AI4Finance-Foundation/FinRobot#readme

📊 Statistics:
🌟 Stars: 5K stars
👀 Watchers: 73
🍴 Forks: 925 forks

💻 Programming Languages: Jupyter Notebook - Python

🏷️ Related Topics:
#finance #multimodal_deep_learning #robo_advisor #large_language_models #prompt_engineering #chatgpt #fingpt #aiagent


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: czkawka

📝 Description: Multi-functional app to find duplicates, empty folders, similar images, etc.

🔗 Repository URL: https://github.com/qarmin/czkawka

📖 Readme: https://github.com/qarmin/czkawka#readme

📊 Statistics:
🌟 Stars: 28.3K stars
👀 Watchers: 147
🍴 Forks: 925 forks

💻 Programming Languages: Rust - Fluent - Slint - Python - Just - Nix

🏷️ Related Topics:
#rust #gtk_rs #duplicates #cleaner #multiplatform #similar_images #similar_music #similar_videos


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: conduit

📝 Description: Conduit React Native app

🔗 Repository URL: https://github.com/Psiphon-Inc/conduit

📖 Readme: https://github.com/Psiphon-Inc/conduit#readme

📊 Statistics:
🌟 Stars: 78 stars
👀 Watchers: 8
🍴 Forks: 37 forks

💻 Programming Languages: TypeScript - Java - Swift - Go - JavaScript - Makefile

🏷️ Related Topics: Not available

==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: video2x

📝 Description: A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.

🔗 Repository URL: https://github.com/k4yt3x/video2x

🌐 Website: https://docs.video2x.org

📖 Readme: https://github.com/k4yt3x/video2x#readme

📊 Statistics:
🌟 Stars: 17.9K stars
👀 Watchers: 173
🍴 Forks: 1.6K forks

💻 Programming Languages: C++ - CMake - Just - Python - Dockerfile - Shell - C

🏷️ Related Topics:
#machine_learning #vulkan #neural_networks #frame_interpolation #anime4k #rife #upscale_video #realcugan #realesrgan #super_resoluion


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
🔥 Trending Repository: ai-data-science-team

📝 Description: An AI-powered data science team of agents to help you perform common data science tasks 10X faster.

🔗 Repository URL: https://github.com/business-science/ai-data-science-team

📖 Readme: https://github.com/business-science/ai-data-science-team#readme

📊 Statistics:
🌟 Stars: 3.9K stars
👀 Watchers: 78
🍴 Forks: 741 forks

💻 Programming Languages: Python

🏷️ Related Topics:
#data_science #machine_learning #ai #openai #gpt #copilot #agents #data_scientist #ml_engineering #ai_engineer #ai_engineering #ml_engineer #generative_ai


==================================
🧠 By: https://news.1rj.ru/str/DataScienceM
📦 PyInstaller: packaging a console application for a 'bare' server

Do you want to deploy your crappy utility on a production server where there's not even X11?
A proper build with PyInstaller will save you from dancing with the devil.

1. The essence of the problem

Building for Windows/Mac with a GUI is easy. But when building for Linux without a graphical environment, PyInstaller might drag in unnecessary dependencies, and your binary won't run.

2. The key command

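pyinstaller --onefile --clean --noupx your_script.py


Pay special attention to the flags:


--noconsole - hides the console window for GUI apps (Windows and macOS only); do not use it for a CLI tool, which is why it is left out of the command above.

--noupx - disables UPX compression (often a source of problems on bare servers).

--clean - clears the PyInstaller cache and temporary files before the build.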

3. Why this is needed


🟢Background daemons and system services

🟢Automation in CI/CD pipelines

🟢Utilities for administration

🟢Microservices in containers

🟢Scripts on 'bare' VPS


4. Important nuances


🟡All dependencies must be explicitly specified in requirements.txt or pyproject.toml.

🟡Test the build in a minimalist Docker image (for example, python:3.11-slim).

🟡Dynamic imports may not be found - specify hidden imports via --hidden-import.
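
A minimal sketch of how the build command could be assembled from Python, for example inside a CI job. The helper name `build_pyinstaller_args`, the script name `your_script.py`, and the hidden module `yaml` are illustrative assumptions and not part of the original post; the flags themselves (`--onefile`, `--clean`, `--noupx`, `--hidden-import`) are standard PyInstaller options.

```python
import sys


def build_pyinstaller_args(script, hidden_imports=()):
    """Assemble a PyInstaller command line for a console (CLI) binary.

    --noconsole is deliberately left out: it is meant for GUI apps.
    """
    args = [sys.executable, "-m", "PyInstaller", "--onefile", "--clean", "--noupx"]
    for module in hidden_imports:
        # One --hidden-import flag per dynamically imported module.
        args += ["--hidden-import", module]
    args.append(script)
    return args


# Hypothetical script and module names, for illustration only.
cmd = build_pyinstaller_args("your_script.py", hidden_imports=["yaml"])
print(" ".join(cmd))

# To actually run the build (requires PyInstaller to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Running the same command inside a slim Docker image, as suggested above, is a quick way to check that nothing depends on libraries that exist only on the build machine.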


Is it useful?

❤️ - Yes
👍 - I already use it

👩‍💻 @DataSciencen
Data Science Interview Questions with Answers Part-1

1. What is data science and how is it different from data analytics?
Data science focuses on building predictive and decision-making systems using data. It uses statistics, machine learning, and domain knowledge to forecast outcomes or automate actions. Data analytics focuses on analyzing historical and current data to understand trends and performance. Analytics explains what happened and why. Data science focuses on what will happen next and what decision should be taken.

2. What are the key steps in a data science lifecycle?
A data science lifecycle starts with clearly defining the business problem in measurable terms. Data is then collected from relevant sources and cleaned to handle missing values, errors, and inconsistencies. Exploratory data analysis is performed to understand patterns and relationships. Features are engineered to improve model performance. Models are trained and evaluated using suitable metrics. The best model is deployed and continuously monitored to handle data changes and performance drift.

3. What types of problems does data science solve?
Data science solves prediction, classification, recommendation, optimization, and anomaly detection problems. Examples include predicting customer churn, detecting fraud, recommending products, forecasting demand, and optimizing pricing. These problems usually involve large data, uncertainty, and the need to make data-driven decisions at scale.

4. What skills does a data scientist need in real projects?
A data scientist needs strong skills in statistics, probability, and machine learning. Programming skills in Python or similar languages are required for data processing and modeling. Data cleaning, feature engineering, and model evaluation are critical. Business understanding and communication skills are equally important to translate results into actionable insights.

5. What is the difference between structured and unstructured data?
Structured data is organized in rows and columns with a fixed schema, such as tables in databases. Examples include sales records and customer data. Unstructured data does not follow a predefined format. Examples include text, images, audio, and videos. Structured data is easier to analyze, while unstructured data requires additional processing techniques.

6. What is exploratory data analysis and why do you do it first?
Exploratory data analysis is the process of understanding data using summaries, statistics, and visual checks. It helps identify patterns, trends, outliers, and data quality issues. It is done first to avoid incorrect assumptions and to guide feature engineering and model selection. Good EDA reduces modeling errors later.
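
As a small illustration of the summaries and checks described in this answer, here is a sketch that uses only Python's standard library. The `orders` values are made-up sample data, and the 1.5 × IQR rule is just one common outlier heuristic chosen for the example; real projects would typically use pandas for this step.

```python
import statistics

# Made-up sample data: order amounts with one missing value and one extreme value.
orders = [20.0, 22.5, 19.0, None, 21.0, 23.5, 250.0, 20.5]


def summarize(values):
    """Return basic EDA facts: missing count, center, and IQR-based outliers."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "outliers": [v for v in present if v < low or v > high],
    }


summary = summarize(orders)
print(summary)
```

The large gap between the mean and the median hints at skew, which the outlier check then confirms. Catching this before modeling is the kind of data quality issue EDA is meant to surface.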

7. What are common data sources in real companies?
Common data sources include relational databases, data warehouses, log files, APIs, third-party vendors, spreadsheets, and cloud storage systems. Companies also use data from applications, sensors, user interactions, and external platforms such as payment gateways or marketing tools.

8. What is feature engineering?
Feature engineering is the process of creating new input variables from raw data to improve model performance. This includes transformations, aggregations, encoding categorical values, and creating time-based or behavioral features. Good features often have more impact on results than complex algorithms.
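
The techniques listed in this answer can be sketched with the standard library alone. The transaction rows, the column meanings, and the helper names `row_features` and `total_spend` are invented for illustration; in practice this would usually be done with pandas or a feature store.

```python
import math
from collections import defaultdict
from datetime import datetime

# Made-up raw transactions: (customer_id, timestamp, amount, channel).
raw = [
    ("c1", "2024-01-06 09:15", 120.0, "web"),
    ("c1", "2024-01-08 18:40", 30.0, "app"),
    ("c2", "2024-01-07 12:05", 75.0, "web"),
]

CHANNELS = ["app", "web"]  # fixed category order for one-hot encoding


def row_features(timestamp, amount, channel):
    """Build per-row features: a transformation, time-based flags, one-hot encoding."""
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    features = {
        "log_amount": math.log1p(amount),      # transformation that compresses large values
        "hour": ts.hour,                       # time-based feature
        "is_weekend": int(ts.weekday() >= 5),  # Saturday is 5, Sunday is 6
    }
    for name in CHANNELS:
        features[f"channel_{name}"] = int(channel == name)  # one-hot encoding
    return features


def total_spend(rows):
    """Aggregate a behavioral feature: total spend per customer."""
    totals = defaultdict(float)
    for customer_id, _, amount, _ in rows:
        totals[customer_id] += amount
    return dict(totals)


features = [row_features(ts, amount, channel) for _, ts, amount, channel in raw]
spend = total_spend(raw)
print(features[0])
print(spend)
```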

9. What is the difference between supervised and unsupervised learning?
Supervised learning uses labeled data where the target outcome is known. It is used for prediction and classification tasks such as churn prediction or spam detection. Unsupervised learning works with unlabeled data and focuses on finding patterns or structure. It is used for clustering, segmentation, and anomaly detection.
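
To make the contrast concrete, here is a toy sketch in plain Python. The data (monthly logins per user), the labels, and the function names are made up for illustration; a nearest-neighbor classifier stands in for supervised learning and a basic two-cluster k-means loop stands in for unsupervised learning. Real work would normally use a library such as scikit-learn.

```python
# Made-up 1-D data: monthly logins per user.
labeled = [(1, "churn"), (2, "churn"), (3, "churn"), (10, "stay"), (11, "stay"), (12, "stay")]
unlabeled = [1, 2, 3, 10, 11, 12]


def predict_1nn(train, x):
    """Supervised: return the label of the closest labeled example."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def two_means(values, iterations=10):
    """Unsupervised: split values into two clusters with a basic k-means loop (k=2)."""
    centers = [min(values), max(values)]  # simple deterministic initialization
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the nearest center.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[idx].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters, centers


print(predict_1nn(labeled, 4))  # uses the known labels
clusters, centers = two_means(unlabeled)
print(clusters, centers)        # finds groups without any labels
```

The classifier can only answer because someone supplied the "churn" and "stay" labels. The clustering step recovers the same two groups from the raw numbers alone, but it is left to the analyst to decide what each group means.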