import ChatTTS
from IPython.display import Audio
chat = ChatTTS.Chat()
chat.load_models()
texts = ["<PUT YOUR TEXT HERE>",]
wavs = chat.infer(texts, use_decoder=True)
Audio(wavs[0], rate=24_000, autoplay=True)
ChatTTS is a text-to-speech model designed specifically for conversational scenarios such as LLM assistants.
ChatTTS supports both English and Chinese.
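The snippet above plays the audio inline in a notebook. Outside a notebook, the samples could be written to a WAV file instead. The sketch below is not part of ChatTTS: `save_wav` is a hypothetical helper, and it assumes the samples are a flat sequence of floats in [-1.0, 1.0] at 24 kHz (the rate passed to `Audio` above). A 2-D array would need flattening first. A generated tone stands in for `wavs[0]`.

```python
import math
import os
import struct
import tempfile
import wave


def save_wav(samples, path, rate=24_000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(bytes(frames))
    return len(samples)


# Stand-in for wavs[0]: one second of a 440 Hz tone at 24 kHz.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / 24_000) for n in range(24_000)]
wav_path = os.path.join(tempfile.gettempdir(), "chattts_demo.wav")
n_written = save_wav(tone, wav_path)
```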
🤗 Play Hugging Face
🌟 DeepSearcher: AI Harvester for Your Data.
🌐 GitHub: https://github.com/zilliztech/deep-searcher
The project combines LLMs and vector databases to perform search, evaluation, and reasoning over the data you provide (files, text, sources).
The developers position it as a tool for enterprise knowledge management, intelligent QA systems, and information retrieval scenarios.
DeepSearcher can pull information from the Internet when needed and is compatible with the Milvus vector database and its managed service Zilliz Cloud, as well as Pymilvus, OpenAI, and VoyageAI embeddings. DeepSeek and OpenAI LLMs can be connected directly via API or through TogetherAI and SiliconFlow.
Loading local files and connecting the FireCrawl, Crawl4AI, and Jina Reader web crawlers are supported.
The developers' immediate plans include adding a web clipper feature, expanding the list of supported vector databases, and creating a RESTful API.
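To make the role of the vector database concrete, here is a minimal sketch of the retrieval step such a pipeline relies on: rank stored texts by cosine similarity to a query embedding. It does not use the DeepSearcher or Milvus APIs. The `cosine` and `top_k` helpers and the three-dimensional toy vectors are assumptions for illustration; real embeddings come from an embedding model and have hundreds of dimensions or more.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the k stored texts most similar to the query vector."""
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings" (hypothetical values, not produced by a real model).
store = [
    ("Quarterly revenue report", [0.9, 0.1, 0.0]),
    ("Office seating plan", [0.0, 0.2, 0.9]),
    ("Annual financial summary", [0.8, 0.3, 0.1]),
]
hits = top_k([1.0, 0.2, 0.0], store, k=2)
```

In a full pipeline the retrieved texts would then be passed to the LLM as context for answering the question.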
▶️ Local installation and launch:
# Clone the repository
git clone https://github.com/zilliztech/deep-searcher.git
# Create a Python venv
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
cd deep-searcher
pip install -e .
# Quick start demo
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here
config.set_provider_config("llm", "OpenAI", {"model": "gpt-4o-mini"})
init_config(config=config)
# Load your local data
from deepsearcher.offline_loading import load_from_local_files
load_from_local_files(paths_or_directory=your_local_path)
# (Optional) Load from web crawling (FIRECRAWL_API_KEY env variable required)
from deepsearcher.offline_loading import load_from_website
load_from_website(urls=website_url)
# Query
result = query("Write a report about xxx.")  # Your question here
Resume keywords for a data scientist role, explained in points:
1. Data Analysis:
- Proficient in extracting, cleaning, and analyzing data to derive insights.
- Skilled in using statistical methods and machine learning algorithms for data analysis.
- Experience with tools such as Python, R, or SQL for data manipulation and analysis.
2. Machine Learning:
- Strong understanding of machine learning techniques such as regression, classification, clustering, and neural networks.
- Experience in model development, evaluation, and deployment.
- Familiarity with libraries like TensorFlow, scikit-learn, or PyTorch for implementing machine learning models.
3. Data Visualization:
- Ability to present complex data in a clear and understandable manner through visualizations.
- Proficiency in tools like Matplotlib, Seaborn, or Tableau for creating insightful graphs and charts.
- Understanding of best practices in data visualization for effective communication of findings.
4. Big Data:
- Experience working with large datasets using technologies like Hadoop, Spark, or Apache Flink.
- Knowledge of distributed computing principles and tools for processing and analyzing big data.
- Ability to optimize algorithms and processes for scalability and performance.
5. Problem-Solving:
- Strong analytical and problem-solving skills to tackle complex data-related challenges.
- Ability to formulate hypotheses, design experiments, and iterate on solutions.
- Aptitude for identifying opportunities for leveraging data to drive business outcomes and decision-making.
Resume keywords for a data analyst role
1. SQL (Structured Query Language):
- SQL is a query language used for managing and querying relational databases.
- Data analysts often use SQL to extract, manipulate, and analyze data stored in databases, making it a fundamental skill for the role.
2. Python/R:
- Python and R are popular programming languages used for data analysis and statistical computing.
- Proficiency in Python or R allows data analysts to perform various tasks such as data cleaning, modeling, visualization, and machine learning.
3. Data Visualization:
- Data visualization involves presenting data in graphical or visual formats to communicate insights effectively.
- Data analysts use tools like Tableau, Power BI, or Python libraries like Matplotlib and Seaborn to create visualizations that help stakeholders understand complex data patterns and trends.
4. Statistical Analysis:
- Statistical analysis involves applying statistical methods to analyze and interpret data.
- Data analysts use statistical techniques to uncover relationships, trends, and patterns in data, providing valuable insights for decision-making.
5. Data-driven Decision Making:
- Data-driven decision making is the process of making decisions based on data analysis and evidence rather than intuition or gut feelings.
- Data analysts play a crucial role in helping organizations make informed decisions by analyzing data and providing actionable insights that drive business strategies and operations.
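As a small illustration of the SQL and statistical-analysis points above, the sketch below aggregates a table with SQL and then computes a summary statistic in Python. The `sales` table and its values are invented for the example. It uses only the standard-library `sqlite3` and `statistics` modules.

```python
import sqlite3
import statistics

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, units INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", 10, 100.0),
        ("north", 20, 210.0),
        ("south", 5, 40.0),
        ("south", 15, 160.0),
    ],
)

# Extract and aggregate with SQL: total revenue per region.
rows = conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Pull the raw columns and compute a summary statistic in Python.
units, revenue = zip(*conn.execute("SELECT units, revenue FROM sales"))
mean_revenue = statistics.mean(revenue)
conn.close()
```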
The "Deepdive Llama3 from scratch" project is an extended fork of a guide repository that builds Llama 3 from scratch, step by step.
The original project has been reworked, updated, improved, and optimized to help everyone understand and master the implementation principles and the detailed reasoning process of the Llama 3 model.
olmOCR is a project designed to convert PDF files and document images into structured Markdown text. It can handle equations, tables, and handwritten text, preserving the correct reading order even in complex multi-column layouts.
olmOCR uses heuristics to handle common parsing and metadata errors and supports SGLang and vLLM, scaling from one to hundreds of GPUs, which makes it well suited to large-scale tasks.
A key advantage of olmOCR is its cost-effectiveness: processing 1 million PDF pages costs about $190 (including GPU rental), roughly 1/32 of the cost of processing the same volume through the GPT-4o API.
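The arithmetic behind these figures can be checked with a few lines. The two constants come from the text above. The GPT-4o figure is only what the 1/32 ratio implies, not a quoted price, and the 250,000-page batch is an arbitrary example.

```python
OLMOCR_COST_PER_MILLION = 190.0  # USD per 1M pages, figure quoted above
GPT4O_COST_RATIO = 32            # olmOCR is said to cost about 1/32 as much


def cost_for_pages(pages, cost_per_million):
    """Linear cost estimate in USD for a given number of pages."""
    return pages * cost_per_million / 1_000_000


olmocr_per_page = cost_for_pages(1, OLMOCR_COST_PER_MILLION)
implied_gpt4o_per_million = OLMOCR_COST_PER_MILLION * GPT4O_COST_RATIO
olmocr_for_250k = cost_for_pages(250_000, OLMOCR_COST_PER_MILLION)

print(f"olmOCR per page: ${olmocr_per_page:.5f}")
print(f"Implied GPT-4o cost per 1M pages: ${implied_gpt4o_per_million:,.0f}")
print(f"olmOCR for 250k pages: ${olmocr_for_250k:.2f}")
```

Under these numbers a single page costs about $0.00019 with olmOCR, and the implied GPT-4o cost for a million pages is about $6,080.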
The development team created a method called "document anchoring" to improve the quality of the extracted text. It uses the text and metadata embedded in PDF files to improve processing accuracy: image regions and text blocks are extracted, concatenated, and inserted into the model prompt. When the VLM is asked for a plain-text version of the document, this "anchored" text is supplied along with the rasterized page image.
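The idea can be sketched roughly as follows. This is not olmOCR's actual code or prompt format: the `Element` class, the `build_anchored_prompt` function, the bracketed coordinate notation, and the prompt wording are all assumptions made for illustration. The sketch only shows the step described above, where extracted text blocks and image regions are concatenated into the prompt text. The rasterized page image would be attached to the VLM request separately.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A page element pulled from the PDF (hypothetical structure)."""
    kind: str          # "text" or "image"
    x: float           # horizontal position on the page
    y: float           # vertical position on the page
    content: str = ""  # text content; empty for image regions


def build_anchored_prompt(elements, max_chars=500):
    """Concatenate element positions and text into a prompt string."""
    lines = []
    for el in elements:
        if el.kind == "image":
            lines.append(f"[Image {el.x:.0f}x{el.y:.0f}]")
        else:
            lines.append(f"[{el.x:.0f}x{el.y:.0f}]{el.content}")
    anchors = "\n".join(lines)[:max_chars]  # cap the anchor text length
    return (
        "Raw text and image positions extracted from the PDF:\n"
        f"{anchors}\n"
        "Return the plain-text version of this page."
    )


page = [
    Element("text", 72, 90, "Annual Report"),
    Element("image", 72, 150),
    Element("text", 72, 400, "Revenue grew in every region."),
]
prompt = build_anchored_prompt(page)
```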
In tests, olmOCR outperformed Marker, MinerU, and GOT-OCR 2.0: it was preferred in 61.3% of cases against Marker, 58.6% against GOT-OCR, and 71.4% against MinerU.
Requirements: poppler-utils, and sglang with flashinfer for GPU inference.
# Install dependencies
sudo apt-get update
sudo apt-get install poppler-utils ttf-mscorefonts-installer msttcorefonts fonts-crosextra-caladea fonts-crosextra-carlito gsfonts lcdf-typetools
# Set up a conda env
conda create -n olmocr python=3.11
conda activate olmocr
git clone https://github.com/allenai/olmocr.git
cd olmocr
pip install -e .
# Convert a Single PDF
python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/test.pdf
# Convert Multiple PDFs
python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/*.pdf