Implementations of 15 NLP research papers using Keras, TensorFlow, and scikit-learn.
https://github.com/GauravBh1010tt/DeepLearn
Official release of Taiga 2.0. A brief summary is below; the link has more detail:
We have gathered resources for popular NLP problems:
topic modelling - news with topic tags, and all sites that provide rubrics (news, poems, prose)
text readability - the popular science magazine NPlus1 has an editor-assigned readability metric for each text
NER and fact extraction - news with references to the mentioned person’s page or wiki information, news with personalia tags
keyword extraction - news with keyword tags, hashtags on social media
authorship attribution - all texts with author information: magazines, news and, more importantly, social media, with gender, age, city, time and education mark-up
chat-bot training - open-source film subtitles
text generation - any resource, depending on genre
rare-word studies, frequency dictionaries - literary magazines, social media
morphological and syntactic parsers - any resource, depending on genre
Taiga corpus is an ambitious project to become the largest fully available web corpus constructed from open text sources. Taiga corpus is:
open source, CC BY-SA 3.0
big - about 5 billion words by now
sorted into datasets applicable to different machine learning tasks
made by linguists experienced in text crawling, parsing and filtering
rich in meta-information
POS-tagged and syntactically tagged in Universal Dependencies
https://tatianashavrina.github.io/taiga_site/
Creators:
Tatiana Shavrina (rybolos@gmail.com)
Yana Kurmacheva (yana.kurmacheva@gmail.com)
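The corpus is described above as POS-tagged and syntactically tagged in Universal Dependencies. UD treebanks are normally exchanged in the tab-separated, 10-column CoNLL-U format. Assuming the Taiga files follow it (not confirmed here), a minimal reader could look like the sketch below. The sample sentence and the `parse_conllu` helper are made up for illustration.

```python
# Hypothetical two-token sample in CoNLL-U; not taken from the Taiga corpus.
SAMPLE = (
    "# text = Тайга растёт\n"
    "1\tТайга\tтайга\tNOUN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\tрастёт\tрасти\tVERB\t_\tNumber=Sing|Person=3\t0\troot\t_\t_\n"
    "\n"
)

# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Yield each sentence as a list of token dicts keyed by FIELDS."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment / metadata
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            continue  # skip malformed rows instead of guessing
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


sentences = list(parse_conllu(SAMPLE))
pos_tags = [(tok["form"], tok["upos"]) for tok in sentences[0]]
print(pos_tags)
```

Each token dict keeps the raw string columns, so `head` and `feats` would still need their own parsing for real use.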
Python module to easily generate text using a pretrained character-based recurrent neural network.
https://github.com/minimaxir/textgenrnn?reddit=1
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Which of the Hollywood stars is most similar to my voice?
https://github.com/andabi/voice-vector
Deep neural networks for getting text-independent speaker embeddings, written in TensorFlow.
Neural Style Transfer: A Review
https://github.com/ycjing/Neural-Style-Transfer-Papers
Image segmentation tasks with the Unet neural network
http://blog.datalytica.ru/2018/03/unet.html
Data Augmentation | How to use Deep Learning when you have Limited Data
https://medium.com/nanonets/how-to-use-deep-learning-when-you-have-limited-data-part-2-data-augmentation-c26971dc8ced
This article is a comprehensive review of Data Augmentation techniques for Deep Learning, specific to images. This is Part 2 of How to use…
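Flips and right-angle rotations are among the most common image augmentations. The sketch below is not taken from the article; it only illustrates the idea on a tiny image stored as nested lists, and all function names are made up for this example.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img):
    """Return the original image plus three label-preserving variants."""
    return [
        img,
        flip_horizontal(img),
        flip_vertical(img),
        rotate_90_clockwise(img),
    ]


# A 2x3 "image" whose pixel values make the transformations easy to follow.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

variants = augment(image)
for variant in variants:
    print(variant)
```

In a real pipeline the same transformations would be applied to arrays, and only where they keep the label valid (a flipped digit "6" is no longer a "6", for instance).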
Simple GUI app to collect mouse X,Y data for modelling or analysis
https://github.com/oist-cnru/mouse_drawing_app
Full-stack machine learning solutions | Hive
https://thehive.ai/blog/simple-ml-serving
Machine learning materials digest No. 3 (April 16-23, 2018)
https://habrahabr.ru/post/354124/
Good afternoon! This is the third digest of materials on machine learning and data analysis, appearing after a long break. Events of the coming week: 1. Data science breakfast, April 25, from 9:30 to...
How we took part in a hackathon by OpenData
https://habrahabr.ru/company/spbau/blog/354150/
Hi everyone, in this article I want to talk about Why So Serious Hack: what brought us there in the first place, how hackathons in the classical sense differ from...
Applying recurrent layers to solve multi-move problems
https://habr.com/post/354220/
History. Recurrent layers were invented back in the 1980s by John Hopfield. They formed the basis of the artificial associative neural networks he developed (Hopfield networks). Today recurrent networks...
PyTorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"
https://github.com/atulkum/pointer_summarizer
Announcing PyTorch 1.0 for both research and production
https://code.facebook.com/posts/172423326753505/
We're announcing the next version of our open source AI framework, PyTorch 1.0, which integrates capabilities from Caffe2 and ONNX to provide a fast path from AI research to production.
Comparing Sentence Similarity Methods
http://nlp.town/blog/sentence-similarity/
Identifying Natural Language with 99% accuracy using Machine Learning (Python and Scikit-Learn)
https://youtu.be/5Bc6-uDcnqg
Graduate project for Harvard's Python for Data Science (CSCI E-29)
In this project, I pulled text data from European Parliament Proceedings in 21 languages. Using Scikit-Learn, I transformed the raw text into a numerical feature matrix, and trained a Multinomial…
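The pipeline described above (raw text, then a numerical feature matrix, then a classifier) can be sketched in a few lines. The description is cut off after "Multinomial", so the multinomial naive Bayes model below is an assumption, not the project's confirmed classifier. It is written with the standard library only rather than Scikit-Learn, and the six training sentences are invented for the example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams (padded with spaces)."""
    text = f" {text.lower()} "
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-gram counts (toy version)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n
        self.alpha = alpha                  # Laplace smoothing constant
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter()               # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts[label].update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            denom = sum(counter.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.docs[label] / total_docs)
            for gram in grams:
                score += math.log((counter[gram] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set; a real model needs far more text per language.
TRAIN = [
    ("the committee will vote on the proposal tomorrow", "en"),
    ("we thank the president for this report", "en"),
    ("der ausschuss wird morgen über den vorschlag abstimmen", "de"),
    ("wir danken dem präsidenten für diesen bericht", "de"),
    ("la commission votera demain sur la proposition", "fr"),
    ("nous remercions le président pour ce rapport", "fr"),
]

model = NgramNaiveBayes().fit(*zip(*TRAIN))
print(model.predict("wir danken dem ausschuss"))
```

With Scikit-Learn, the same idea would typically be a character-level `CountVectorizer` followed by `MultinomialNB`, but that pairing is likewise an assumption about the project.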
torchplus — implements the + operator on pytorch layers, returning nn.Sequential
https://github.com/knighton/torchplus
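As a rough illustration of the idea (this is not torchplus's actual code), overloading `+` so that it chains layers can be sketched without PyTorch. `Layer`, `Sequential`, `Scale` and `Shift` below are hypothetical stand-ins for `nn.Module`, `nn.Sequential` and real layers.

```python
class Layer:
    """Hypothetical stand-in for nn.Module: a callable that supports `+`."""

    def __call__(self, x):
        raise NotImplementedError

    def __add__(self, other):
        # Flatten operands so (a + b) + c gives one Sequential of three layers.
        left = self.layers if isinstance(self, Sequential) else [self]
        right = other.layers if isinstance(other, Sequential) else [other]
        return Sequential(*left, *right)


class Sequential(Layer):
    """Applies its layers one after another, like nn.Sequential."""

    def __init__(self, *layers):
        self.layers = list(layers)

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


class Scale(Layer):
    """Toy layer: multiply the input by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class Shift(Layer):
    """Toy layer: add a constant to the input."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset


model = Scale(2) + Shift(3) + Scale(10)
result = model(1)
print(type(model).__name__, len(model.layers), result)
```

The flattening step is a design choice of this sketch: without it, repeated `+` would build nested containers instead of one flat pipeline.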
Fountain - Natural Language Data Augmentation Tool for Conversational Systems
https://github.com/tzano/fountain