Data Science Portfolio - Kaggle Datasets & AI Projects | Artificial Intelligence – Telegram
Free Datasets For Data Science Projects & Portfolio

Free Datasets to practice data science projects

1. Enron Email Dataset

Data Link: https://www.cs.cmu.edu/~enron/

2. Chatbot Intents Dataset

Data Link: https://github.com/katanaml/katana-assistant/blob/master/mlbackend/intents.json

3. Flickr 30k Dataset

Data Link: https://www.kaggle.com/hsankesara/flickr-image-dataset

4. Parkinson Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/parkinsons

5. Iris Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/Iris

6. ImageNet dataset

Data Link: http://www.image-net.org/

7. Mall Customers Dataset

Data Link: https://www.kaggle.com/shwetabh123/mall-customers

8. Google Trends Data Portal

Data Link: https://trends.google.com/trends/

9. The Boston Housing Dataset

Data Link: https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html

10. Uber Pickups Dataset

Data Link: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city

11. Recommender Systems Dataset

Data Link: https://cseweb.ucsd.edu/~jmcauley/datasets.html

Source Code: https://bit.ly/37iBDEp

12. UCI Spambase Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/Spambase

13. GTSRB (German traffic sign recognition benchmark) Dataset

Data Link: http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset

Source Code: https://bit.ly/39taSyH

14. Cityscapes Dataset

Data Link: https://www.cityscapes-dataset.com/

15. Kinetics Dataset

Data Link: https://deepmind.com/research/open-source/kinetics

16. IMDB-Wiki dataset

Data Link: https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/


17. Color Detection Dataset

Data Link: https://github.com/codebrainz/color-names/blob/master/output/colors.csv


18. Urban Sound 8K dataset

Data Link: https://urbansounddataset.weebly.com/urbansound8k.html

19. Librispeech Dataset

Data Link: http://www.openslr.org/12

20. Breast Histopathology Images Dataset

Data Link: https://www.kaggle.com/paultimothymooney/breast-histopathology-images

21. Youtube 8M Dataset

Data Link: https://research.google.com/youtube8m/

Join for more -> https://news.1rj.ru/str/addlist/ID95piZJZa0wYzk5

ENJOY LEARNING 👍👍
Data Cleaning Checklist:

If you're just starting out in the world of data analytics, hopefully this checklist helps demystify the concept of "data cleaning"...

Missing data - Decide if you’re going to omit the datapoint, mathematically estimate the missing data using statistical methods, or use an external source to fill in the missing data.

Duplicate data - Identify duplicate data and what it means in context. Is the duplicate an error that needs to be deleted? Or is it possible that you could have two of the same data point?

Formatting errors - Ensure all data is rounded to the correct decimal place, all data is aligned correctly, and the data format is consistent within columns.

Incorrect data types - Ensure all of your data is pulled as the correct data type (e.g., making sure that integers are not used for money values).

Outliers - Identify data points that lie more than 2 standard deviations from the mean, and double-check that these values are correct. If they are correct, they may still warrant further investigation.
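
A minimal sketch of the checklist above, using only the Python standard library. The records, the field names (`order_id`, `amount`), the `clean` helper, and the choice of median imputation are all illustrative assumptions rather than part of any dataset listed here; in a real project a dataframe library would likely replace these loops.

```python
from statistics import mean, median, stdev

# Hypothetical raw records: every field arrives as text, as it would from a CSV.
raw_rows = [
    {"order_id": "1", "amount": "19.99"},
    {"order_id": "2", "amount": "21.5"},
    {"order_id": "2", "amount": "21.5"},      # duplicate record
    {"order_id": "3", "amount": ""},          # missing value
    {"order_id": "4", "amount": "18.75"},
    {"order_id": "5", "amount": "20.25"},
    {"order_id": "6", "amount": "22.00"},
    {"order_id": "7", "amount": "19.50"},
    {"order_id": "8", "amount": "20.75"},
    {"order_id": "9", "amount": "21.00"},
    {"order_id": "10", "amount": "500.00"},   # suspicious value
]


def clean(rows, z_threshold=2.0):
    """Apply the checklist steps and return typed rows with an 'outlier' flag."""
    # Duplicate data: keep only the first occurrence of each identical record.
    seen, unique = set(), []
    for row in rows:
        key = (row["order_id"], row["amount"])
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Incorrect data types: ids become int, money becomes float (None if missing).
    typed = [
        {
            "order_id": int(r["order_id"]),
            "amount": float(r["amount"]) if r["amount"].strip() else None,
        }
        for r in unique
    ]

    # Missing data: here we estimate the gap with the median of observed values.
    observed = [r["amount"] for r in typed if r["amount"] is not None]
    fill_value = median(observed)
    for r in typed:
        if r["amount"] is None:
            r["amount"] = fill_value
        # Formatting errors: keep money values at two decimal places.
        r["amount"] = round(r["amount"], 2)

    # Outliers: flag values more than z_threshold standard deviations from the mean.
    amounts = [r["amount"] for r in typed]
    mu, sigma = mean(amounts), stdev(amounts)
    for r in typed:
        r["outlier"] = sigma > 0 and abs(r["amount"] - mu) > z_threshold * sigma
    return typed


cleaned = clean(raw_rows)
flagged = [r["order_id"] for r in cleaned if r["outlier"]]
print(len(cleaned), flagged)  # prints: 10 [10]
```

The flagged row is only marked, not deleted, which matches the checklist's advice to double-check outliers before deciding what to do with them.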
5 Handy Tips to master Data Science ⬇️


1️⃣ Begin with introductory projects that cover the fundamental concepts of data science, such as data exploration, cleaning, and visualization. These projects will help you get familiar with common data science tools and libraries like Python (Pandas, NumPy, Matplotlib), R, SQL, and Excel.

2️⃣ Look for publicly available datasets from sources like Kaggle and the UCI Machine Learning Repository. Working with real-world data will expose you to the challenges of messy, incomplete, and heterogeneous data, which are common in practical scenarios.

3️⃣ Explore various data science techniques like regression, classification, clustering, and time series analysis. Apply these techniques to different datasets and domains to gain a broader understanding of their strengths, weaknesses, and appropriate use cases.

4️⃣ Work on projects that involve the entire data science lifecycle, from data collection and cleaning to model building, evaluation, and deployment. This will help you understand how different components of the data science process fit together.

5️⃣ Consistent practice is key to mastering any skill. Set aside dedicated time to work on data science projects, and gradually increase the complexity and scope of your projects as you gain more experience.
🚀Here are 5 fresh Project ideas for Data Analysts 👇

🎯 𝗔𝗶𝗿𝗯𝗻𝗯 𝗢𝗽𝗲𝗻 𝗗𝗮𝘁𝗮 🏠
https://www.kaggle.com/datasets/arianazmoudeh/airbnbopendata

💡This dataset describes the listing activity of homestays in New York City

🎯 𝗧𝗼𝗽 𝗦𝗽𝗼𝘁𝗶𝗳𝘆 𝘀𝗼𝗻𝗴𝘀 𝗳𝗿𝗼𝗺 𝟮𝟬𝟭𝟬-𝟮𝟬𝟭𝟵 🎵

https://www.kaggle.com/datasets/leonardopena/top-spotify-songs-from-20102019-by-year

🎯𝗪𝗮𝗹𝗺𝗮𝗿𝘁 𝗦𝘁𝗼𝗿𝗲 𝗦𝗮𝗹𝗲𝘀 𝗙𝗼𝗿𝗲𝗰𝗮𝘀𝘁𝗶𝗻𝗴 📈

https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data
💡Use historical markdown data to predict store sales

🎯 𝗡𝗲𝘁𝗳𝗹𝗶𝘅 𝗠𝗼𝘃𝗶𝗲𝘀 𝗮𝗻𝗱 𝗧𝗩 𝗦𝗵𝗼𝘄𝘀 📺

https://www.kaggle.com/datasets/shivamb/netflix-shows
💡Listings of movies and TV shows on Netflix - Regularly Updated

🎯𝗟𝗶𝗻𝗸𝗲𝗱𝗜𝗻 𝗗𝗮𝘁𝗮 𝗔𝗻𝗮𝗹𝘆𝘀𝘁 𝗷𝗼𝗯𝘀 𝗹𝗶𝘀𝘁𝗶𝗻𝗴𝘀 💼

https://www.kaggle.com/datasets/cedricaubin/linkedin-data-analyst-jobs-listings
💡More than 8,400 rows of data analyst jobs from the USA, Canada, and Africa.

ENJOY LEARNING 👍👍
🔒 Dataset Name: Spotify Songs Album

🔍 This dataset provides concise details about music tracks and their performance across various platforms. It includes essential information like track name, artist(s), release date, and presence in popular playlists and charts on platforms like Spotify, Apple Music, Deezer, and Shazam. Additionally, it features metrics such as BPM, key, mode, danceability, valence, energy, acousticness, instrumentalness, liveness, and speechiness, which offer insights into the musical characteristics and appeal of each track.

💡 With this data, analysts can evaluate the popularity, genre, and audience engagement of different music offerings across multiple streaming services.

🤌 From: Kaggle

🤖 Size: 47.1 kB
🔒 Dataset Name: Employee Data Analysis

🔍 Unlocking Insights for a Thriving Workplace

🚀 Our extensive collection of datasets provides a deep dive into different aspects of employee engagement and organizational dynamics.

🤌 From: Kaggle

🤖 Size: 120 kB
cryptos historical data.zip
26.5 MB
Dataset Name: top 1000 cryptos historical data ( Daily updates )
Instagram fake spammer genuine accounts.zip
6.8 KB
Dataset Name: Instagram fake spammer genuine accounts
    
Don't forget to check these 10 SQL projects with corresponding datasets that you could use to practice your SQL skills:

1. Analysis of Sales Data:

(https://www.kaggle.com/kyanyoga/sample-sales-data)

2. HR Analytics:

(https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset)

3. Social Media Analytics:

(https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels)

4. Financial Data Analysis:

(https://www.kaggle.com/datasets/nitindatta/finance-data)

5. Healthcare Data Analysis:

(https://www.kaggle.com/cdc/mortality)

6. Customer Relationship Management:

(https://www.kaggle.com/pankajjsh06/ibm-watson-marketing-customer-value-data)

7. Web Analytics:

(https://www.kaggle.com/zynicide/wine-reviews)

8. E-commerce Analysis:

(https://www.kaggle.com/olistbr/brazilian-ecommerce)

9. Supply Chain Management:

(https://www.kaggle.com/datasets/harshsingh2209/supply-chain-analysis)

10. Inventory Management:

(https://www.kaggle.com/datasets?search=inventory+management)
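
One low-setup way to practice SQL on a downloaded CSV is to load it into an in-memory SQLite database with Python's standard library. The sketch below is only an illustration: the inline sample data, the `sales` table, its column names, and the `load_csv` helper are hypothetical and do not reflect the schema of any dataset linked above.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a downloaded CSV file; a real file would be
# opened with open(path, newline="") instead of io.StringIO.
SAMPLE_CSV = """order_id,product_line,quantity,unit_price
1,Classic Cars,2,50.0
2,Motorcycles,1,30.0
3,Classic Cars,1,50.0
4,Trucks,3,20.0
"""


def load_csv(conn, csv_file):
    """Create a 'sales' table and fill it from a CSV file object."""
    conn.execute(
        """
        CREATE TABLE sales (
            order_id     INTEGER PRIMARY KEY,
            product_line TEXT    NOT NULL,
            quantity     INTEGER NOT NULL,
            unit_price   REAL    NOT NULL
        )
        """
    )
    # DictReader yields one dict per row, which matches the named placeholders.
    # SQLite's column affinity converts the numeric text to INTEGER / REAL.
    reader = csv.DictReader(csv_file)
    conn.executemany(
        "INSERT INTO sales VALUES (:order_id, :product_line, :quantity, :unit_price)",
        reader,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_csv(conn, io.StringIO(SAMPLE_CSV))

# A typical practice query: revenue per product line, highest first.
revenue = conn.execute(
    """
    SELECT product_line, SUM(quantity * unit_price) AS revenue
    FROM sales
    GROUP BY product_line
    ORDER BY revenue DESC, product_line
    """
).fetchall()
print(revenue)
# prints: [('Classic Cars', 150.0), ('Trucks', 60.0), ('Motorcycles', 30.0)]
```

Swapping `":memory:"` for a file path keeps the database between sessions, so the same table can be queried again later.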

Share this channel with your friends 🤝🤩

ENJOY LEARNING 👍👍
The key to starting your data analysis career:

It's not your education
It's not your experience

It's how you apply these principles:

1. Learn the job through "doing"
2. Build a portfolio
3. Make yourself known

No one starts as an expert, but everyone can become one.

If you're looking for a career in data analysis, start by:

⟶ Watching videos
⟶ Reading experts' advice
⟶ Doing internships
⟶ Building a portfolio
⟶ Learning from seniors

You'll be amazed at how fast you'll learn and how quickly you'll become an expert.

So, start today and let your data analysis career begin.
Here is a list of a few projects (found on Kaggle). They cover the basics of Python, advanced statistics, supervised learning (regression and classification problems), and data science.

Please also check the discussions and notebook submissions for different approaches and solutions after you have tried the problems yourself.

1. Basic python and statistics

Pima Indians :- https://www.kaggle.com/uciml/pima-indians-diabetes-database
Cardio Good Fitness :- https://www.kaggle.com/saurav9786/cardiogoodfitness
Automobile :- https://www.kaggle.com/toramky/automobile-dataset

2. Advanced Statistics

Game of Thrones :- https://www.kaggle.com/mylesoneill/game-of-thrones
World University Ranking :- https://www.kaggle.com/mylesoneill/world-university-rankings
IMDB Movie Dataset:- https://www.kaggle.com/carolzhangdc/imdb-5000-movie-dataset

3. Supervised Learning

a) Regression Problems

How much did it rain :- https://www.kaggle.com/c/how-much-did-it-rain-ii/overview
Inventory Demand:- https://www.kaggle.com/c/grupo-bimbo-inventory-demand
Property Inspection prediction:- https://www.kaggle.com/c/liberty-mutual-group-property-inspection-prediction
Restaurant Revenue prediction:- https://www.kaggle.com/c/restaurant-revenue-prediction/data
TMDB Box Office Prediction:- https://www.kaggle.com/c/tmdb-box-office-prediction/overview

b) Classification problems

Employee Access challenge :- https://www.kaggle.com/c/amazon-employee-access-challenge/overview
Titanic :- https://www.kaggle.com/c/titanic
San Francisco crime:- https://www.kaggle.com/c/sf-crime
Customer satisfaction:- https://www.kaggle.com/c/santander-customer-satisfaction
Trip type classification:- https://www.kaggle.com/c/walmart-recruiting-trip-type-classification
Categorize cuisine:- https://www.kaggle.com/c/whats-cooking

4. Some helpful Data science projects for beginners

https://www.kaggle.com/c/house-prices-advanced-regression-techniques

https://www.kaggle.com/c/digit-recognizer

https://www.kaggle.com/c/titanic

5. Intermediate Level Data science Projects

Black Friday Data : https://www.kaggle.com/sdolezel/black-friday

Human Activity Recognition Data : https://www.kaggle.com/uciml/human-activity-recognition-with-smartphones

Trip History Data : https://www.kaggle.com/pronto/cycle-share-dataset

Million Song Data : https://www.kaggle.com/c/msdchallenge

Census Income Data : https://www.kaggle.com/c/census-income/data

Movie Lens Data : https://www.kaggle.com/grouplens/movielens-20m-dataset

Twitter Classification Data : https://www.kaggle.com/c/twitter-sentiment-analysis2

Share with credits: https://news.1rj.ru/str/sqlproject

ENJOY LEARNING 👍👍
𝐒𝐐𝐋 𝐂𝐚𝐬𝐞 𝐒𝐭𝐮𝐝𝐢𝐞𝐬 𝐟𝐨𝐫 𝐈𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰:

Join for more: https://news.1rj.ru/str/sqlanalyst

1. Danny’s Diner:
Restaurant analytics to understand customer order patterns.
Link: https://8weeksqlchallenge.com/case-study-1/

2. Pizza Runner
Pizza shop analytics to optimize the efficiency of the operation
Link: https://8weeksqlchallenge.com/case-study-2/

3. Foodie-Fi
Subscription-based food content platform
Link: https://lnkd.in/gzB39qAT

4. Data Bank: That’s money
Analytics based on customer activities with the digital bank
Link: https://lnkd.in/gH8pKPyv

5. Data Mart: Fresh is Best
Analytics on Online supermarket
Link: https://lnkd.in/gC5bkcDf

6. Clique Bait: Attention capturing
Analytics on the seafood industry
Link: https://lnkd.in/ggP4JiYG

7. Balanced Tree: Clothing Company
Analytics on the sales performance of clothing store
Link: https://8weeksqlchallenge.com/case-study-7

8. Fresh Segments: Extract maximum value
Analytics on online advertising
Link: https://8weeksqlchallenge.com/case-study-8