Data science/ML/AI – Telegram
Data science/ML/AI
13K subscribers
510 photos
1 video
98 files
314 links
Data science and machine learning hub

Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources.

For beginners, data scientists and ML engineers
👉 https://rebrand.ly/bigdatachannels

DMCA: @disclosure_bds
Contact: @mldatascientist
Going Denser with Open-Vocabulary Part Segmentation

Publication date:
18 May 2023

Topic: Object detection

Paper: https://arxiv.org/pdf/2305.11173v1.pdf

GitHub: https://github.com/facebookresearch/vlpart

Description:

Object detection has been expanded from a limited number of categories to open vocabulary. Moving forward, a complete intelligent vision system requires understanding more fine-grained object descriptions, object parts. In this work, we propose a detector with the ability to predict both open-vocabulary objects and their part segmentation. This ability comes from two designs:

🔹 We train the detector on the joint of part-level, object-level and image-level data.
🔹 We parse the novel object into its parts by its dense semantic correspondence with the base object.
Self guide to become a data analyst
Cloud Engineer Roadmap
1700202599352.pdf
10.1 MB
WHICH CHART WHEN?
The data Analyst's guide to choosing the right charts
Data Science Techniques
Create your own roadmap to succeed as a Data Engineer. 😉

▶️In the ever-evolving field of data engineering, staying up to date with the latest technologies and best practices is crucial, as industries rely heavily on data-driven decision-making.

👉As we approach 2024, data engineering continues to evolve, bringing new challenges and opportunities. Key areas to focus on:

📌Programming languages: Python, Scala and Java are among the most popular programming languages for data engineers.

📌Databases: SQL databases such as SQL Server, MySQL and PostgreSQL, and NoSQL databases such as MongoDB and Cassandra, are popular choices.

📌Data modeling: The process of creating a blueprint for a database. It helps ensure that the database is designed to meet the needs of the business.

📌Cloud computing: AWS, Azure, and GCP are the three major cloud computing platforms that can be used to build and deploy data engineering solutions.

📌Big data technologies: Apache Spark, Kafka, Beam and Hadoop are some of the most popular big data technologies to process and analyze large datasets.

📌Data warehousing: Snowflake, Databricks, BigQuery and Redshift are popular data warehousing platforms used to store and analyze large datasets for business intelligence purposes.

📌Data streaming: Apache Kafka and Spark are popular data streaming platforms used to process and analyze data in real time.

📌Data lakes and data meshes: Two emerging data management architectures. Data lakes are centralized repositories for all types of data, while data meshes are decentralized architectures that distribute data ownership across multiple domains or teams.

📌Orchestration: Pipelines are orchestrated with tools such as Airflow, Dagster or Mage, which schedule and monitor workflows.

📌Data quality, data observability, and data governance: Together these make data reliable and trustworthy. Data quality keeps data accurate, complete, and consistent. Data observability helps to monitor and understand data systems. Data governance is the process of establishing policies and procedures for managing data.

📌Data visualization: Tableau, Power BI, and Looker are three popular data visualization tools to create charts and graphs that can be used to communicate data insights to stakeholders.

📌DevOps and DataOps: Two sets of practices used to automate and streamline the development and deployment of data engineering solutions.

🔰Developing good communication and collaboration skills is equally important, as is understanding the business aspects of data engineering, such as project management and stakeholder engagement.

♐️Stay updated and relevant with emerging trends such as AI/ML and IoT, which are used to develop intelligent data pipelines and data warehouses.

➠Data engineers who want to be successful in 2023-2024 and beyond should focus on developing their skills and experience in the areas listed above.
Steps to become a successful data scientist
Data Science and Machine Learning Projects with source code

This repository contains articles, GitHub repos and Kaggle kernels that provide data science and machine learning projects with code.

Creator: Durgesh Samariya
Stars ⭐️: 125
Forked By: 34
https://github.com/durgeshsamariya/Data-Science-Machine-Learning-Project-with-Source-Code

#machine #learning #datascience

Join @datascience_bds for more cool repositories.
*This channel belongs to @bigdataspecialist group
What is Data Science?

If you have absolutely no idea what Data Science is and are looking for a very quick non-technical introduction to Data Science, this course will help you get started on the fundamental concepts underlying Data Science.

If you are an experienced Data Science professional, attending this course will give you some idea of how to explain your profession to an absolute lay person.

Rating ⭐️: 4.2 out of 5
Students 👨‍🎓 : 24,071
Duration : 40min of on-demand video
Created by 👨‍🏫: Gopinath Ramakrishnan

🔗 Course Link


#datascience #data_science

👉Join @datascience_bds for more👈
In Data Science you can find multiple data distributions...

But where are they typically found?

Check examples of 4 common distributions:

1️⃣ Normal Distribution:
Often found in natural and social phenomena where many factors contribute to an outcome. Examples include heights of adults in a population, test scores, measurement errors, and blood pressure readings.

2️⃣ Uniform Distribution:
This appears when every outcome in a range is equally likely. Examples include rolling a fair die (each number has an equal chance of appearing) and selecting a random number within a fixed range.

3️⃣ Binomial Distribution:
Used when you're dealing with a fixed number of trials or experiments, each of which has only two possible outcomes (success or failure), like flipping a coin a set number of times, or the number of defective items in a batch.

4️⃣ Poisson Distribution:
Common in scenarios where you're counting the number of times an event happens over a specific interval of time or space. Examples include the number of phone calls received by a call centre in an hour or the number of taxis arriving at a stand in a given period.


Each distribution offers insights into the underlying processes of the data and is useful for different kinds of statistical analysis and prediction.
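
To get a feel for the four distributions above, here is a minimal simulation sketch using only the Python standard library. All parameters are illustrative assumptions, not values from the post: heights with mean 170 and standard deviation 10, a six-sided die, 10 coin flips per trial, and an average of 4 calls per hour. The helper names (sample_binomial, sample_poisson, simulate) are hypothetical.

```python
import math
import random
import statistics


def sample_binomial(rng, n, p):
    """Count successes in n independent trials with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(num_samples=10_000, seed=0):
    """Draw samples from the four distributions (parameters are assumptions)."""
    rng = random.Random(seed)
    return {
        # Normal: e.g. adult heights in cm
        "normal": [rng.gauss(170.0, 10.0) for _ in range(num_samples)],
        # Uniform (discrete): rolling a fair die
        "uniform": [rng.randint(1, 6) for _ in range(num_samples)],
        # Binomial: number of heads in 10 fair coin flips
        "binomial": [sample_binomial(rng, 10, 0.5) for _ in range(num_samples)],
        # Poisson: calls per hour with an average rate of 4
        "poisson": [sample_poisson(rng, 4.0) for _ in range(num_samples)],
    }


samples = simulate()
for name, values in samples.items():
    mean = statistics.mean(values)
    variance = statistics.pvariance(values)
    print(f"{name:8s} mean={mean:.2f} variance={variance:.2f}")
```

The sample means should land close to the theoretical ones (170, 3.5, 5 and 4). For the Poisson samples the variance should also be close to the mean, which is a quick way to check whether count data plausibly follows a Poisson distribution.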
Data Analytics and Hypothesis Testing.pdf
1.9 MB
Data Analytics and Hypothesis Testing
Neural Networks and Deep Learning
Neural networks and deep learning are integral parts of artificial intelligence (AI) and machine learning (ML). Here's an overview:

1. Neural Networks: Neural networks are computational models inspired by the human brain's structure and functioning. They consist of interconnected nodes (neurons) organized in layers: input layer, hidden layers, and output layer.

Each neuron receives input, processes it through an activation function, and passes the output to the next layer. Neurons in subsequent layers perform more complex computations based on previous layers' outputs.

Neural networks learn by adjusting weights and biases associated with connections between neurons through a process called training. This is typically done using optimization techniques like gradient descent and backpropagation.

2. Deep Learning: Deep learning is a subset of ML that uses neural networks with multiple layers (hence the term "deep"), allowing them to learn hierarchical representations of data.

These networks can automatically discover patterns, features, and representations in raw data, making them powerful for tasks like image recognition, natural language processing (NLP), speech recognition, and more.

Deep learning architectures such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Transformer models have demonstrated exceptional performance in various domains.

3. Applications:

Computer Vision: Object detection, image classification, facial recognition, etc., leveraging CNNs.

Natural Language Processing (NLP): Language translation, sentiment analysis, chatbots, etc., utilizing RNNs, LSTMs, and Transformers.

Speech Recognition: Speech-to-text systems using deep neural networks.

4. Challenges and Advancements: Training deep neural networks often requires large amounts of data and computational resources. Techniques like transfer learning, regularization, and optimization algorithms aim to address these challenges.

Advancements in hardware (GPUs, TPUs), algorithms (improved architectures like GANs - Generative Adversarial Networks), and techniques (attention mechanisms) have significantly contributed to the success of deep learning.

5. Frameworks and Libraries: There are various open-source libraries and frameworks (TensorFlow, PyTorch, Keras, etc.) that provide tools and APIs for building, training, and deploying neural networks and deep learning models.
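
The training process described in point 1 can be sketched in a few lines of plain Python. This is a deliberately tiny illustration, not a deep network: a single sigmoid neuron learns the logical AND function with batch gradient descent. The dataset, learning rate, epoch count and function names are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Activation function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Forward pass: weighted sum of the inputs, then the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(data, learning_rate=1.0, epochs=2000):
    """Fit one neuron with batch gradient descent on cross-entropy loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for inputs, target in data:
            output = predict(weights, bias, inputs)
            # For sigmoid + cross-entropy, the gradient with respect to the
            # weighted sum simplifies to (output - target).
            error = output - target
            grad_w = [g + error * x for g, x in zip(grad_w, inputs)]
            grad_b += error
        # Move weights and bias against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Truth table for logical AND: (inputs, expected output)
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_neuron(AND_DATA)
for inputs, target in AND_DATA:
    print(inputs, "target:", target, "prediction:", round(predict(weights, bias, inputs), 3))
```

In a multi-layer network the same idea applies, except that backpropagation passes the error backwards through each layer via the chain rule. Frameworks such as TensorFlow and PyTorch compute those gradients automatically.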
Python Roadmap for Data Science in 2024
Data Science Interview Questions.pdf
1.8 MB
Data Science Interview Questions
transaction-fraud-detection

A data science project to predict whether a transaction is a fraud or not.

Creator: juniorcl
Stars ⭐️: 103
Forked By: 53
https://github.com/juniorcl/transaction-fraud-detection

#machine #learning #datascience

Join @datascience_bds for more cool repositories.
*This channel belongs to @bigdataspecialist group
Learn Data Cleaning with Python

Perform Data Cleaning Techniques with the Python Programming Language. Practice and Solution Notebooks included.

Rating ⭐️: 4.1 out of 5
Students 👨‍🎓 : 10,171
Duration : 50min of on-demand video
Created by 👨‍🏫: Valentine Mwangi

🔗 Course Link


#datascience #data_cleaning #python

👉Join @datascience_bds for more👈
Machine Intelligence - an Introductory Course

Learn the cutting-edge Algorithms in the field of Machine Learning, Deep Learning, Artificial Intelligence, and more!

Rating ⭐️: 4.1 out of 5
Students 👨‍🎓 : 14,063
Duration : 40min of on-demand video
Created by 👨‍🏫: Taimur Zahid

🔗 Course Link


#datascience #machinelearning

👉Join @datascience_bds for more👈
Deep Learning CNN Project.pdf
3.8 MB
🚀 Deep Learning CNN Project: Cat vs Dog Classification

🔍 Key Highlights:
📸 25,000 training images, 12,500 testing images
🧠 Custom fully connected layers
➡️ Binary Cross-Entropy loss function
⚙️ Exponential decay and learning rate schedule

🛠 Tools & Libraries:
📊 TensorFlow & Keras
📈 NumPy, OpenCV, Matplotlib
📉 Learning rate scheduling
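
Two of the highlights above, the binary cross-entropy loss and the exponential decay schedule, are easy to sketch in plain Python. The settings used in the PDF are not given here, so the numbers below (initial rate 0.001, decay rate 0.5, 1000 decay steps) are purely hypothetical. The decay function follows, to my understanding, the same formula as the Keras ExponentialDecay schedule: initial_lr * decay_rate ^ (step / decay_steps).

```python
import math


def exponential_decay(initial_lr, decay_rate, decay_steps, step, staircase=False):
    """Learning rate after `step` training steps under exponential decay."""
    exponent = step / decay_steps
    if staircase:
        # Decay in discrete jumps instead of continuously.
        exponent = math.floor(exponent)
    return initial_lr * decay_rate ** exponent


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in (0, 1)."""
    total = 0.0
    for target, prob in zip(y_true, y_pred):
        # Clip to avoid log(0) for overconfident predictions.
        prob = min(max(prob, eps), 1.0 - eps)
        total += -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))
    return total / len(y_true)


# Hypothetical schedule: halve the learning rate every 1000 steps.
for step in (0, 500, 1000, 2000):
    print(step, exponential_decay(0.001, 0.5, 1000, step))

# Labels: 1 = dog, 0 = cat (an assumed encoding). Confident correct
# predictions give a lower loss than hesitant ones.
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
print(binary_cross_entropy([1, 0], [0.6, 0.4]))
```

A decaying learning rate lets training take large steps early on and smaller, more careful steps as the model approaches a good solution.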