Data Science Portfolio - Kaggle Datasets & AI Projects | Artificial Intelligence – Telegram
Free Datasets For Data Science Projects & Portfolio

Top🔥10 Computer Vision 🔥Project Ideas 🔥

1. Edge Detection
2. Photo Sketching
3. Detecting Contours
4. Collage Mosaic Generator
5. Barcode and QR Code Scanner
6. Face Detection
7. Blur the Face
8. Image Segmentation
9. Human Counting with OpenCV
10. Colour Detection
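
To give a feel for the first idea on the list, here is a minimal sketch of Sobel edge detection in plain Python. It is only an illustration: the toy image, the function name `sobel_edges`, and the threshold of 100 are assumptions made for this example, and a real project would more likely use a library such as OpenCV.

```python
def sobel_edges(image, threshold=100):
    """Return a binary edge map for a grayscale image given as a list of rows."""
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    # Skip the one-pixel border, where the 3x3 window would fall off the image.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges


# Toy 5x6 image (made up for this sketch): dark left half, bright right half.
toy_image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(toy_image)
for row in edge_map:
    print(row)
```

The interior rows mark the two columns next to the dark-to-bright boundary as edges, while the border rows stay at 0 because the sketch does not pad the image.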
Free Datasets to work on Power BI + SQL projects 👇👇

1. AdventureWorks Sample Database:
- Link: [AdventureWorks Sample Database](https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15)
- Description: A sample database provided by Microsoft, containing sales, products, customers, and other related data.

2. Online Retail Dataset:
- Link: [UCI Machine Learning Repository - Online Retail Dataset](https://archive.ics.uci.edu/ml/datasets/online+retail)
- Description: Transactional data from an online retail store, suitable for customer segmentation and sales analysis.

3. Supermarket Sales Dataset:
- Link: [Supermarket Sales Dataset](https://www.kaggle.com/aungpyaeap/supermarket-sales)
- Description: Sales data from a supermarket, useful for inventory management and sales performance analysis.

4. Yahoo Finance (Historical Stock Data):
- Link: [Yahoo Finance](https://finance.yahoo.com/)
- Description: Historical stock data for various companies, suitable for financial analysis and visualization.

5. Human Resources Analytics: Employee Attrition and Performance:
- Link: [Kaggle HR Analytics Dataset](https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset)
- Description: Employee data including demographics, performance, and attrition information, suitable for employee performance analysis.

Bonus open-source resources: https://news.1rj.ru/str/DataPortfolio/16

These datasets are freely available for practicing Power BI and SQL skills. You can download them from the links above and import them into your SQL database management system (e.g., MySQL, SQL Server, PostgreSQL) for hands-on practice ☺️💪
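
As a rough sketch of the import step, the snippet below loads CSV rows into a SQL table and runs one aggregate query. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, SQL Server, or PostgreSQL; the table name `sales`, the columns, and the three sample rows are made up for illustration and do not come from any of the datasets listed above.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded CSV file; these columns and rows are hypothetical.
sample_csv = io.StringIO(
    "invoice_id,product_line,quantity,unit_price\n"
    "A001,Electronics,2,19.99\n"
    "A002,Groceries,5,3.50\n"
    "A003,Electronics,1,199.00\n"
)

conn = sqlite3.connect(":memory:")  # pass a file path instead to keep the database
conn.execute(
    "CREATE TABLE sales ("
    "invoice_id TEXT PRIMARY KEY, product_line TEXT, "
    "quantity INTEGER, unit_price REAL)"
)

# Convert each CSV row to typed values before inserting.
rows = [
    (r["invoice_id"], r["product_line"], int(r["quantity"]), float(r["unit_price"]))
    for r in csv.DictReader(sample_csv)
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Revenue per product line, highest first.
revenue_by_line = conn.execute(
    "SELECT product_line, ROUND(SUM(quantity * unit_price), 2) AS revenue "
    "FROM sales GROUP BY product_line ORDER BY revenue DESC"
).fetchall()
print(revenue_by_line)
```

With a real download, replace `sample_csv` with `open("your_file.csv", newline="")` and adjust the column names and types to match the file.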
FitbitFitness Tracker Data.zip
4.2 MB
📦 Dataset name: Fitbit Fitness Tracker Data: Capstone Project

🌸 This dataset contains personal fitness tracker data from thirty-three eligible Fitbit users. It was generated by respondents to a survey distributed via Amazon Mechanical Turk between 12 April 2016 and 12 May 2016.
The data has been cleaned and formatted, with the date and time split into two columns (one for the date, one for the time in 24-hour format), to prepare it for analysis in SQL and visualisation in Tableau.


🌐 Format: CSV file

🔐 From: Kaggle
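
The date and time split described for the Fitbit data can be sketched as below. This is not the dataset author's actual script: the input format (`month/day/year hour:minute:second AM/PM`) and the helper name `split_timestamp` are assumptions, so check the raw column before reusing the format string.

```python
from datetime import datetime


def split_timestamp(raw, fmt="%m/%d/%Y %I:%M:%S %p"):
    """Split a combined timestamp string into (date, 24-hour time) strings."""
    parsed = datetime.strptime(raw, fmt)
    return parsed.strftime("%Y-%m-%d"), parsed.strftime("%H:%M:%S")


# Hypothetical sample values in the assumed 12-hour input format.
afternoon = split_timestamp("4/12/2016 1:00:00 PM")
midnight = split_timestamp("5/12/2016 12:30:00 AM")
print(afternoon)  # ('2016-04-12', '13:00:00')
print(midnight)   # ('2016-05-12', '00:30:00')
```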
Metaverse Financial Transactions.zip
5.2 MB
📦 Dataset name: Metaverse Financial Transactions


🌸 This dataset provides blockchain financial transactions within the Open Metaverse, aiming to provide a rich, diverse, and realistic set of data for developing and testing anomaly detection models, fraud analysis, and predictive analytics in virtual environments. With a focus on applicability, this dataset captures various transaction types, user behaviors, and risk profiles across a global network.


🌐 Format: CSV file

🔐 From: Kaggle
Don't forget to check these 10 SQL projects with corresponding datasets that you could use to practice your SQL skills:

1. Analysis of Sales Data:

(https://www.kaggle.com/kyanyoga/sample-sales-data)

2. HR Analytics:

(https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset)

3. Social Media Analytics:

(https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels)

4. Financial Data Analysis:

(https://www.kaggle.com/datasets/nitindatta/finance-data)

5. Healthcare Data Analysis:

(https://www.kaggle.com/cdc/mortality)

6. Customer Relationship Management:

(https://www.kaggle.com/pankajjsh06/ibm-watson-marketing-customer-value-data)

7. Web Analytics:

(https://www.kaggle.com/zynicide/wine-reviews)

8. E-commerce Analysis:

(https://www.kaggle.com/olistbr/brazilian-ecommerce)

9. Supply Chain Management:

(https://www.kaggle.com/datasets/harshsingh2209/supply-chain-analysis)

10. Inventory Management:

(https://www.kaggle.com/datasets?search=inventory+management)

Share this channel with your friends 🤝🤩

Join for more -> https://news.1rj.ru/str/addlist/ID95piZJZa0wYzk5

ENJOY LEARNING 👍👍
Free Python course with a completion certificate from Kaggle (a Google company) that you should not miss in 2024.

Link: https://www.kaggle.com/learn/python
Free Datasets to practice data science projects

1. Enron Email Dataset

Data Link: https://www.cs.cmu.edu/~enron/

2. Chatbot Intents Dataset

Data Link: https://github.com/katanaml/katana-assistant/blob/master/mlbackend/intents.json

3. Flickr 30k Dataset

Data Link: https://www.kaggle.com/hsankesara/flickr-image-dataset

4. Parkinson Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/parkinsons

5. Iris Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/Iris

6. ImageNet dataset

Data Link: http://www.image-net.org/

7. Mall Customers Dataset

Data Link: https://www.kaggle.com/shwetabh123/mall-customers

8. Google Trends Data Portal

Data Link: https://trends.google.com/trends/

9. The Boston Housing Dataset

Data Link: https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html

10. Uber Pickups Dataset

Data Link: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city

11. Recommender Systems Dataset

Data Link: https://cseweb.ucsd.edu/~jmcauley/datasets.html

Source Code: https://bit.ly/37iBDEp

12. UCI Spambase Dataset

Data Link: https://archive.ics.uci.edu/ml/datasets/Spambase

13. GTSRB (German traffic sign recognition benchmark) Dataset

Data Link: http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset

Source Code: https://bit.ly/39taSyH

14. Cityscapes Dataset

Data Link: https://www.cityscapes-dataset.com/

15. Kinetics Dataset

Data Link: https://deepmind.com/research/open-source/kinetics

16. IMDB-Wiki dataset

Data Link: https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/


17. Color Detection Dataset

Data Link: https://github.com/codebrainz/color-names/blob/master/output/colors.csv


18. Urban Sound 8K dataset

Data Link: https://urbansounddataset.weebly.com/urbansound8k.html

19. Librispeech Dataset

Data Link: http://www.openslr.org/12

20. Breast Histopathology Images Dataset

Data Link: https://www.kaggle.com/paultimothymooney/breast-histopathology-images

21. Youtube 8M Dataset

Data Link: https://research.google.com/youtube8m/

Data Cleaning Checklist:

If you're just starting out in the world of data analytics, hopefully this checklist helps demystify the concept of "data cleaning"...

Missing data - Decide if you’re going to omit the datapoint, mathematically estimate the missing data using statistical methods, or use an external source to fill in the missing data.

Duplicate data - Identify duplicate data and what it means in context. Is the duplicate an error that needs to be deleted? Or is it possible that you could have two of the same data point?

Formatting errors - Ensure all data is rounded to the correct decimal place, all data is aligned correctly, and the data format is consistent within columns.

Incorrect data types - Ensure all of your data is pulled as the correct data type (e.g., making sure that integers are not used for money values).

Outliers - Identify data points that lie more than 2 standard deviations from the mean, and double-check that these values are correct. If they are correct, they may require further investigation.
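
Three of the checklist items (duplicates, missing data, outliers) can be sketched in a few lines of standard-library Python. The records below are invented for illustration, and filling gaps with the median is just one of the options the checklist mentions, not a recommendation for every dataset.

```python
from statistics import mean, median, stdev

# Hypothetical records: (order_id, amount); None marks a missing amount.
records = [
    (1, 20.0), (2, 22.0), (3, None), (4, 21.0), (4, 21.0),
    (5, 19.0), (6, 23.0), (7, 20.0), (8, 200.0),
]

# Duplicate data: keep the first occurrence of each identical record.
deduped = list(dict.fromkeys(records))

# Missing data: estimate the gap with the median of the known amounts.
known = [amount for _, amount in deduped if amount is not None]
fill_value = median(known)
cleaned = [
    (order_id, fill_value if amount is None else amount)
    for order_id, amount in deduped
]

# Outliers: flag amounts more than 2 standard deviations from the mean.
amounts = [amount for _, amount in cleaned]
mu, sigma = mean(amounts), stdev(amounts)
outliers = [
    (order_id, amount)
    for order_id, amount in cleaned
    if abs(amount - mu) > 2 * sigma
]
print(outliers)  # [(8, 200.0)]
```

A flagged value is only a candidate for review. As the checklist says, it may be correct and simply worth a closer look.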
5 Handy Tips to master Data Science ⬇️


1️⃣ Begin with introductory projects that cover the fundamental concepts of data science, such as data exploration, cleaning, and visualization. These projects will help you get familiar with common data science tools and libraries like Python (Pandas, NumPy, Matplotlib), R, SQL, and Excel.

2️⃣ Look for publicly available datasets from sources like Kaggle and the UCI Machine Learning Repository. Working with real-world data will expose you to the challenges of messy, incomplete, and heterogeneous data, which is common in practical scenarios.

3️⃣ Explore various data science techniques like regression, classification, clustering, and time series analysis. Apply these techniques to different datasets and domains to gain a broader understanding of their strengths, weaknesses, and appropriate use cases.

4️⃣ Work on projects that involve the entire data science lifecycle, from data collection and cleaning to model building, evaluation, and deployment. This will help you understand how different components of the data science process fit together.

5️⃣ Consistent practice is key to mastering any skill. Set aside dedicated time to work on data science projects, and gradually increase the complexity and scope of your projects as you gain more experience.
🚀Here are 5 fresh Project ideas for Data Analysts 👇

🎯 𝗔𝗶𝗿𝗯𝗻𝗯 𝗢𝗽𝗲𝗻 𝗗𝗮𝘁𝗮 🏠
https://www.kaggle.com/datasets/arianazmoudeh/airbnbopendata

💡This dataset describes the listing activity of homestays in New York City

🎯 𝗧𝗼𝗽 𝗦𝗽𝗼𝘁𝗶𝗳𝘆 𝘀𝗼𝗻𝗴𝘀 𝗳𝗿𝗼𝗺 𝟮𝟬𝟭𝟬-𝟮𝟬𝟭𝟵 🎵

https://www.kaggle.com/datasets/leonardopena/top-spotify-songs-from-20102019-by-year

🎯𝗪𝗮𝗹𝗺𝗮𝗿𝘁 𝗦𝘁𝗼𝗿𝗲 𝗦𝗮𝗹𝗲𝘀 𝗙𝗼𝗿𝗲𝗰𝗮𝘀𝘁𝗶𝗻𝗴 📈

https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data
💡Use historical markdown data to predict store sales

🎯 𝗡𝗲𝘁𝗳𝗹𝗶𝘅 𝗠𝗼𝘃𝗶𝗲𝘀 𝗮𝗻𝗱 𝗧𝗩 𝗦𝗵𝗼𝘄𝘀 📺

https://www.kaggle.com/datasets/shivamb/netflix-shows
💡Listings of movies and TV shows on Netflix - Regularly Updated

🎯𝗟𝗶𝗻𝗸𝗲𝗱𝗜𝗻 𝗗𝗮𝘁𝗮 𝗔𝗻𝗮𝗹𝘆𝘀𝘁 𝗷𝗼𝗯𝘀 𝗹𝗶𝘀𝘁𝗶𝗻𝗴𝘀 💼

https://www.kaggle.com/datasets/cedricaubin/linkedin-data-analyst-jobs-listings
💡More than 8400 rows of data analyst jobs from the USA, Canada and Africa.

🔒 Dataset Name: Spotify Songs Album

🔍 This dataset provides concise details about music tracks and their performance across various platforms. It includes essential information like track name, artist(s), release date, and presence in popular playlists and charts on platforms like Spotify, Apple Music, Deezer, and Shazam. It also features metrics such as BPM, key, mode, danceability, valence, energy, acousticness, instrumentalness, liveness, and speechiness, which offer insights into the musical characteristics and appeal of each track.

💡 With this data, analysts can evaluate the popularity, genre, and audience engagement of different music offerings across multiple streaming services.

🤌 From: Kaggle

🤖 Size: 47.1 kB
🔒 Dataset Name: Employee Data Analysis

🔍 Unlocking Insights for a Thriving Workplace

🚀 Our extensive collection of datasets provides a deep dive into different aspects of employee engagement and organizational dynamics.


🤌 From: Kaggle

🤖 Size: 120 kB
cryptos historical data.zip
26.5 MB
Dataset Name: Top 1000 Cryptos Historical Data (daily updates)
Instagram fake spammer genuine accounts.zip
6.8 KB
Dataset Name: Instagram fake spammer genuine accounts
    