Kaggle Data Hub – Telegram
28.9K subscribers
811 photos
13 videos
309 files
1.07K links
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.

Admin: @HusseinSheikho || @Hussein_Sheikho
archive.zip.003
1.3 GB
ISIC 2019 Skin Lesion images for classification

https://news.1rj.ru/str/datasets1 👑

Tuberculosis (TB) Prediction(Top 75 Countries)

About Dataset

This dataset includes 400,000 records with 22 variables that capture demographic, health, and socioeconomic factors influencing tuberculosis incidence across 70 countries. The data is designed to resemble real-world patterns observed in tuberculosis prevalence and healthcare indicators. It can be used for tasks such as descriptive analysis, machine learning, and public health research.

archive.zip
62.2 MB
Tuberculosis (TB) Prediction(Top 75 Countries)

STL-10 Image Recognition Dataset

Train models to recognize different animals and vehicles

Context

STL-10 is an image recognition dataset inspired by the CIFAR-10 dataset, with some improvements. With a corpus of 100,000 unlabeled images and 500 labeled training images per class, it is well suited to developing unsupervised feature learning, deep learning, and self-taught learning algorithms. Unlike CIFAR-10, the images have a higher resolution, which makes the dataset a challenging benchmark for developing more scalable unsupervised learning methods.

Content

Data overview:

There are three files: train_images.zip, test_images.zip and unlabeled_images.zip
10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck
Images are 96x96 pixels, color
500 training images (10 pre-defined folds), 800 test images per class
100,000 unlabeled images for unsupervised learning. These examples are extracted from a similar but broader distribution of images. For instance, it contains other types of animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.) in addition to the ones in the labeled set
Images were acquired from labeled examples on ImageNet
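
The counts in the overview above imply the following totals. This is only a small arithmetic sketch in Python; the constant names are illustrative, and reading "500 training images" as a per-class figure is an assumption based on the "per class" wording in the list.

```python
# Class names as listed in the data overview.
CLASSES = ["airplane", "bird", "car", "cat", "deer",
           "dog", "horse", "monkey", "ship", "truck"]

TRAIN_PER_CLASS = 500  # assumption: the 500 figure is per class
TEST_PER_CLASS = 800
SIDE = 96              # images are 96x96 pixels
CHANNELS = 3           # color images

train_total = len(CLASSES) * TRAIN_PER_CLASS   # labeled training images
test_total = len(CLASSES) * TEST_PER_CLASS     # labeled test images
values_per_image = SIDE * SIDE * CHANNELS      # raw values in one image

print(train_total, test_total, values_per_image)  # 5000 8000 27648
```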

archive.zip
1.9 GB
STL-10 Image Recognition Dataset

Skin Cancer MNIST: HAM10000

a large collection of multi-source dermatoscopic images of pigmented lesions

Overview
A dataset more interesting than digit classification, intended to get biology and medicine students more excited about machine learning and image processing.

The California Wildfire Data 🔥🔥🔥🔥

Structures Impacted by Wildland Fires in California!

Column Descriptions:
OBJECTID: A unique identifier for each record in the dataset.
DAMAGE: Indicates the level of fire damage to the structure (e.g., "No Damage", "Affected (1-9%)").
STREETNUMBER: The street number of the impacted structure.
STREETNAME: The name of the street where the impacted structure is located.
STREETTYPE: The type of street (e.g., "Road", "Lane").
STREETSUFFIX: Additional address information, such as apartment or building numbers (if applicable).
CITY: The city where the impacted structure is located.
STATE: The state abbreviation (e.g., "CA" for California).
ZIPCODE: The postal code of the impacted structure.
CALFIREUNIT: The CAL FIRE unit responsible for the area.
COUNTY: The county where the impacted structure is located.
COMMUNITY: The community or neighborhood of the structure.
INCIDENTNAME: The name of the fire incident that impacted the structure.
APN: The Assessor’s Parcel Number (APN) of the property.
ASSESSEDIMPROVEDVALUE: The assessed value of the improved property (e.g., structures, not just land).
YEARBUILT: The year the structure was built.
SITEADDRESS: The full address of the property, including city, state, and ZIP code.
GLOBALID: A globally unique identifier for each record.
Latitude: The latitude coordinate of the structure’s location.
Longitude: The longitude coordinate of the structure’s location.
UTILITYMISCSTRUCTUREDISTANCE: The distance between the main structure and any utility or miscellaneous structures (if recorded).
FIRENAME: An alternative or secondary name for the fire incident.
geometry: A geospatial representation of the location in a point format (e.g., "POINT (-13585927.697 4646740.750)").
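
The sample geometry value has magnitudes in the millions, which is typical of Web Mercator (EPSG:3857) metres rather than degrees. The post does not name the projection, so treat that as an assumption. Under it, a point can be parsed and converted to longitude/latitude roughly as follows (the function names are hypothetical, not part of the dataset):

```python
import math
import re

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)

def parse_wkt_point(wkt):
    """Extract (x, y) from a WKT string such as 'POINT (x y)'."""
    match = re.fullmatch(r"\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*", wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))

def web_mercator_to_lonlat(x, y):
    """Inverse spherical Mercator: metres to degrees (assumes EPSG:3857)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / EARTH_RADIUS_M)))
    return lon, lat

# The example value quoted in the geometry column description.
x, y = parse_wkt_point("POINT (-13585927.697 4646740.750)")
lon, lat = web_mercator_to_lonlat(x, y)
print(lon, lat)
```

If the assumption holds, the result should agree with the record's own Latitude and Longitude columns, which makes for an easy sanity check.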

Stress Non stress Images

Emotion-Based Stress Classification: Categorization of Stress and Non-Stress Sta

The dataset contains images categorized based on a person's emotional state, classified into the following groups:

Non-Stress: Includes emotions such as happy and neutral.
Stress: Includes emotions such as sad and angry.
This categorization facilitates the analysis of emotional states in relation to stress levels.
The images originally come from the CK+ and TFEID datasets, which are available on their official websites.
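
The grouping described above can be written as a small lookup. This is a sketch only: the post says "such as", so the four emotions listed may not be the full set, and the names used here are illustrative.

```python
# Emotion-to-group mapping taken from the two groups named in the post.
EMOTION_TO_GROUP = {
    "happy": "Non-Stress",
    "neutral": "Non-Stress",
    "sad": "Stress",
    "angry": "Stress",
}

def stress_group(emotion):
    """Return 'Stress' or 'Non-Stress' for a known emotion label."""
    key = emotion.strip().lower()
    if key not in EMOTION_TO_GROUP:
        # Emotions not listed in the post are left unmapped on purpose.
        raise KeyError(f"no stress group known for emotion: {emotion!r}")
    return EMOTION_TO_GROUP[key]

print(stress_group("Happy"), stress_group("angry"))
```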

#Datasets #Kaggle #MachineLearning #Python #ML #LLM #NLP #ComputerVision #GPT4

archive.zip
573 MB
Stress Non stress Images

Crime Data

Crime Data from 2020 to Present

Description:
This dataset contains detailed records of crimes reported across various regions from 2020 to the present. It provides valuable insights into crime trends, patterns, and changes in crime rates over time. The data is suitable for researchers, data analysts, law enforcement agencies, and policymakers looking to analyze crime dynamics or develop predictive models to enhance public safety measures.

CICIDS2017: Cleaned & Preprocessed

Cleaned and Preprocessed CICIDS2017 Data for Machine Learning

This dataset provides a cleaned and preprocessed version of the original CICIDS2017 network intrusion detection dataset, prepared for machine learning. It includes the following CSV file:

cicids2017_cleaned.csv: Contains the raw, unscaled feature values after cleaning and preprocessing, ready for further treatment (such as scaling and sampling) after train/test split.
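
The note about scaling after the train/test split matters because statistics computed on the full data would leak information from the test rows. A minimal standard-library sketch of the order of operations follows; the toy rows stand in for numeric feature columns and are not taken from cicids2017_cleaned.csv, and the helper names are hypothetical.

```python
import random
import statistics

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split off a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_scaler(train_rows):
    """Compute per-column mean and standard deviation on training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]  # avoid divide-by-zero
    return means, stds

def transform(rows, means, stds):
    """Standardize rows with statistics learned from the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical toy data: two numeric features per row.
rows = [[float(i), float(i * i)] for i in range(1, 9)]

train, test = train_test_split(rows)        # 1. split first
means, stds = fit_scaler(train)             # 2. fit on train only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)  # 3. reuse the train statistics
```

Sampling steps such as class rebalancing would follow the same pattern: apply them to the training portion only.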

Global Terrorism Database

More than 180,000 terrorist attacks worldwide, 1970-2017

Context

Information on more than 180,000 Terrorist Attacks

The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2017. The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 180,000 attacks. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland.

International football results from 1872 to 2024

An up-to-date dataset of over 47,000 international football results

Context

Well, what happened was that I was looking for a semi-definite easy-to-read list of international football matches and couldn't find anything decent. So I took it upon myself to collect it for my own use. I might as well share it.

Content

This dataset includes 47,917 results of international football matches starting from the very first official match in 1872 up to 2024. The matches range from FIFA World Cup to FIFI Wild Cup to regular friendly matches. The matches are strictly men's full internationals and the data does not include Olympic Games or matches where at least one of the teams was the nation's B-team, U-23 or a league select team.

results.csv includes the following columns:

date - date of the match
home_team - the name of the home team
away_team - the name of the away team
home_score - full-time home team score including extra time, not including penalty-shootouts
away_score - full-time away team score including extra time, not including penalty-shootouts
tournament - the name of the tournament
city - the name of the city/town/administrative unit where the match was played
country - the name of the country where the match was played
neutral - TRUE/FALSE column indicating whether the match was played at a neutral venue

shootouts.csv includes the following columns:

date - date of the match
home_team - the name of the home team
away_team - the name of the away team
winner - winner of the penalty-shootout
first_shooter - the team that went first in the shootout

goalscorers.csv includes the following columns:

date - date of the match
home_team - the name of the home team
away_team - the name of the away team
team - name of the team scoring the goal
scorer - name of the player scoring the goal
own_goal - whether the goal was an own-goal
penalty - whether the goal was a penalty
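
All three files share the date, home_team and away_team columns, so they can be linked on that triple. The sketch below uses made-up sample rows (not real matches) and assumes the triple identifies a match uniquely, which the post does not state. It decides a winner from results.csv and falls back to shootouts.csv when the score is level.

```python
import csv
import io

# Hypothetical sample rows using the column layout described above.
RESULTS_CSV = """date,home_team,away_team,home_score,away_score,tournament,city,country,neutral
2000-01-01,Team A,Team B,2,1,Friendly,City X,Country A,FALSE
2000-01-02,Team C,Team D,1,1,Example Cup,City Y,Country E,TRUE
"""
SHOOTOUTS_CSV = """date,home_team,away_team,winner,first_shooter
2000-01-02,Team C,Team D,Team D,Team C
"""

def match_key(row):
    """Assumed join key shared by results, shootouts and goalscorers."""
    return (row["date"], row["home_team"], row["away_team"])

def decide_winner(result, shootout_winners):
    """Return the winning team, or None for a draw with no shootout."""
    home, away = int(result["home_score"]), int(result["away_score"])
    if home > away:
        return result["home_team"]
    if away > home:
        return result["away_team"]
    return shootout_winners.get(match_key(result))

results = list(csv.DictReader(io.StringIO(RESULTS_CSV)))
shootout_winners = {
    match_key(row): row["winner"]
    for row in csv.DictReader(io.StringIO(SHOOTOUTS_CSV))
}
winners = [decide_winner(row, shootout_winners) for row in results]
print(winners)  # ['Team A', 'Team D']
```

With the real files, the io.StringIO wrappers would be replaced by open("results.csv", newline="") and open("shootouts.csv", newline="").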
