YOLOv8 is the newest state-of-the-art YOLO model that can be used for object detection, image classification, and instance segmentation tasks. YOLOv8 includes numerous architectural and developer experience changes and improvements over YOLOv5.
Code:
https://github.com/ultralytics/ultralytics
What's New in YOLOv8?
https://blog.roboflow.com/whats-new-in-yolov8/
YOLOv8 Instance Segmentation (ONNX):
https://github.com/ibaiGorordo/ONNX-YOLOv8-Instance-Segmentation
👉 @computer_science_and_programming
👍165👎5
Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution
BoxInstSeg is a toolbox that aims to provide state-of-the-art box-supervised instance segmentation algorithms. It supports instance segmentation with only box annotations.
Github:
https://github.com/LiWentomng/BoxInstSeg
Paper:
https://arxiv.org/pdf/2212.01579.pdf
👉@computer_science_and_programming
👍118👎6
GLIGEN: Open-Set Grounded Text-to-Image Generation.
GLIGEN (Grounded-Language-to-Image Generation) is a novel approach that builds upon and extends existing pre-trained text-to-image diffusion models by enabling them to also be conditioned on grounding inputs.
Project page:
https://gligen.github.io/
Paper:
https://arxiv.org/abs/2301.07093
Github (coming soon):
https://github.com/gligen/GLIGEN
Demo:
https://huggingface.co/spaces/gligen/demo
👉@computer_science_and_programming
👍110👎5
Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Cut-and-LEaRn (CutLER) is a simple approach for training object detection and instance segmentation models without human annotations. It outperforms previous SOTA by 2.7 times for AP50 and 2.6 times for AR on 11 benchmarks.
Paper:
https://arxiv.org/pdf/2301.11320.pdf
Github:
https://github.com/facebookresearch/CutLER
Demo:
https://colab.research.google.com/drive/1NgEyFHvOfuA2MZZnfNPWg1w5gSr3HOBb?usp=sharing
👉@computer_science_and_programming
👍99👎1
Audio AI Timeline
Here we will keep track of the latest AI models for audio generation, starting in 2023!
▪️SingSong: Generating musical accompaniments from singing
- Paper
▪️AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
- Paper
- Code
▪️Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
- Paper
- Code
▪️Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
- Paper
▪️Noise2Music
▪️RAVE2
- Paper
- Code
▪️MusicLM: Generating Music From Text
- Paper
▪️Msanii: High Fidelity Music Synthesis on a Shoestring Budget
- Paper
- Code
- HuggingFace
▪️ArchiSound: Audio Generation with Diffusion
- Paper
- Code
▪️VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
- Paper
👉@computer_science_and_programming
👍174👎4❤2
Gen-1: The Next Step Forward for Generative AI
Use words and images to generate new videos out of existing ones.
Introducing Gen-1: a new AI model that uses language and images to generate new videos out of existing ones.
⭐️ Project:
https://research.runwayml.com/gen1
✅ Paper:
https://arxiv.org/abs/2302.03011
📌Request form:
https://docs.google.com/forms/d/e/1FAIpQLSfU0O_i1dym30hEI33teAvCRQ1i8UrGgXd4BPrvBWaOnDgs9g/viewform
👉@computer_science_and_programming
👍154👎7❤1
YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection
Spatio-temporal action detection (STAD) aims to detect action instances in the current frame. It has been widely applied in areas such as video surveillance and somatosensory games.
Paper:
https://arxiv.org/pdf/2302.06848.pdf
Github:
https://github.com/yjh0410/YOWOv2
Dataset:
https://drive.google.com/file/d/1Dwh90pRi7uGkH5qLRjQIFiEmMJrAog5J/view?usp=sharing
👉@computer_science_and_programming
👍131👎4
3D-aware Conditional Image Synthesis (pix2pix3D)
Pix2pix3D synthesizes 3D objects (neural fields) given a 2D label map, such as a segmentation map or an edge map.
Github:
https://github.com/dunbar12138/pix2pix3D
Paper:
https://arxiv.org/abs/2302.08509
Project:
https://www.cs.cmu.edu/~pix2pix3D/
Datasets:
CelebAMask, AFHQ-Cat-Seg, Shapenet-Car-Edge
👉@computer_science_and_programming
👍192👎6
Efficient Teacher: Semi-Supervised Object Detection for YOLOv5
✅ Efficient Teacher brings semi-supervised object detection into practical applications, enabling users to obtain strong generalization with only a small amount of labeled data and a large amount of unlabeled data.
✅ Efficient Teacher provides category and custom uniform sampling, which can quickly improve the network performance in actual business scenarios.
Paper:
https://arxiv.org/abs/2302.07577
Github:
https://github.com/AlibabaResearch/efficientteacher
👉@computer_science_and_programming
👍174👎2
Multivariate Probabilistic Time Series Forecasting with Informer
Informer is an efficient Transformer-based model for long sequence time-series forecasting (LSTF).
The method introduces a probabilistic (ProbSparse) attention mechanism that selects the “active” queries rather than the “lazy” ones, yielding a sparse Transformer that mitigates the quadratic compute and memory requirements of vanilla attention.
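To make the "active" versus "lazy" distinction concrete, here is a minimal pure-Python sketch of query selection. It assumes the max-minus-mean sparsity score described in the Informer paper and assumes that lazy queries fall back to the mean of the values. The function names and the toy inputs are hypothetical, and unlike the real method this toy scores every query against every key, so it illustrates the selection rule rather than the compute savings.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def probsparse_attention(Q, K, V, u):
    """Toy ProbSparse-style attention: only the top-u queries attend.

    Assumption: sparsity score = max(scores) - mean(scores) per query.
    Assumption: lazy queries output the mean of V.
    """
    scale = 1.0 / math.sqrt(len(Q[0]))
    scores = [[dot(q, k) * scale for k in K] for q in Q]

    # A peaked score row (high max, low mean) marks an "active" query.
    sparsity = [max(row) - sum(row) / len(row) for row in scores]
    ranked = sorted(range(len(Q)), key=lambda i: sparsity[i], reverse=True)
    active = set(ranked[:u])

    mean_v = [sum(col) / len(V) for col in zip(*V)]
    out = []
    for i, row in enumerate(scores):
        if i in active:
            weights = softmax(row)
            out.append([sum(w * v[c] for w, v in zip(weights, V))
                        for c in range(len(V[0]))])
        else:
            out.append(list(mean_v))
    return out, sorted(active)


# Hypothetical toy inputs: query 1 is all zeros, so it scores every key equally.
Q = [[4.0, 0.0], [0.0, 0.0], [0.0, 3.0]]
K = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

output, active_queries = probsparse_attention(Q, K, V, u=2)
print(active_queries)  # indices of the queries that received full attention
print(output[1])       # the lazy query's output (mean of V)
```

With these inputs the all-zero query has a flat score row, so it is the one treated as lazy.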
🤗 Hugging Face:
https://huggingface.co/blog/informer
⏩ Docs:
https://huggingface.co/docs/transformers/main/en/model_doc/informer
⭐️ Colab:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multivariate_informer.ipynb
💨 Dataset:
https://huggingface.co/docs/datasets/v2.7.0/en/package_reference/main_classes#datasets.Dataset.set_transform
👉@computer_science_and_programming
👍180👎8❤3
ViperGPT: Visual Inference via Python Execution for Reasoning
ViperGPT is a framework that leverages code-generation models to compose vision-and-language models into subroutines, producing a result for any query.
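The idea of composing models through generated code can be sketched in a few lines. This is a toy illustration, not the ViperGPT API: the `find` helper, the `execute_command` entry point, the scene representation, and the query are all hypothetical, and the "generated" program is hard-coded where a real system would ask a code-generation model to write it.

```python
# Hypothetical "image": a list of already-detected objects.
# A real system would run a vision model on pixels instead.
SCENE = [{"name": "muffin"}] * 4 + [{"name": "kid"}] * 2


def find(image, name):
    """Hypothetical detector subroutine: return objects matching a name."""
    return [obj for obj in image if obj["name"] == name]


# Stand-in for model output. For a query such as
# "How many muffins can each kid have for it to be fair?",
# a code-generation model would be asked to write a program like this.
GENERATED_PROGRAM = '''
def execute_command(image):
    muffins = find(image, "muffin")
    kids = find(image, "kid")
    return len(muffins) // len(kids)
'''


def answer(image, program_text):
    """Run a generated program with the vision subroutines in scope."""
    namespace = {"find": find}
    exec(program_text, namespace)  # defines execute_command in namespace
    return namespace["execute_command"](image)


result = answer(SCENE, GENERATED_PROGRAM)
print(result)  # 2
```

Executing generated code with `exec` is only reasonable in a toy like this one; anything exposed to untrusted input would need sandboxing.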
Github:
https://github.com/cvlab-columbia/viper
Paper:
https://arxiv.org/pdf/2303.08128.pdf
Project:
https://paperswithcode.com/dataset/beat
👉@computer_science_and_programming
👍225👎7❤1
Test of Time: Instilling Video-Language Models with a Sense of Time
GPT-5 will likely have video abilities, but will it have a sense of time? This #CVPR2023 paper by a student at the University of Amsterdam takes on that question and shows how to instil a sense of time into video-language foundation models.
Paper:
https://arxiv.org/abs/2301.02074
Code:
https://github.com/bpiyush/TestOfTime
Project Page:
https://bpiyush.github.io/testoftime-website/
👉 @computer_science_and_programming
👍180👎7
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Paper:
https://arxiv.org/abs/2305.10973
Github:
https://github.com/XingangPan/DragGAN
Project page:
https://vcai.mpi-inf.mpg.de/projects/DragGAN/
👉 @computer_science_and_programming
👍182👎10
🔭 GRES: Generalized Referring Expression Segmentation
A new benchmark, GRES, extends classic referring expression segmentation (RES) to allow expressions to refer to an arbitrary number of target objects.
🖥 Github: https://github.com/henghuiding/ReLA
⏩ Paper: https://arxiv.org/abs/2306.00968
🔎 Project: https://henghuiding.github.io/GRES/
📌 New dataset: https://github.com/henghuiding/gRefCOCO
👉 @computer_science_and_programming
👍131❤1👎1
80+ Jupyter Notebook tutorials on image classification, object detection and image segmentation in various domains
📌 Agriculture and Food
📌 Medical and Healthcare
📌 Satellite
📌 Security and Surveillance
📌 ADAS and Self Driving Cars
📌 Retail and E-Commerce
📌 Wildlife
Classification library
https://github.com/Tessellate-Imaging/monk_v1
Notebooks - https://github.com/Tessellate-Imaging/monk_v1/tree/master/study_roadmaps/4_image_classification_zoo
Detection and Segmentation Library
https://github.com/Tessellate-Imaging/Monk_Object_Detection
Notebooks: https://github.com/Tessellate-Imaging/Monk_Object_Detection/tree/master/application_model_zoo
👉 @computer_science_and_programming
👍305👎16
How to test your APIs directly from Visual Studio Code?
You can do this right inside Visual Studio Code: Postman has released a VS Code extension that integrates API building and testing into your code editor.
What you can do with the extension:
🔹 Send (multiprotocol) requests
🔹 Send requests from your history
🔹 Use collections
🔹 Use different environments
🔹 View and edit cookies
➡️ Check it here
👍250❤5👎3
Wondering how C++, Java, and Python work?
🔵 C++
C++ is like the superhero of programming languages. It's a compiled language, meaning your code is transformed into machine code that your computer can understand before it runs. This compilation process is crucial for efficiency and performance. C++ gives you precise control over memory and hardware, making it a top choice for systems programming and game development. It's like wielding a finely-tuned instrument in the world of code! 🎸💻
🔴 Java
Java, on the other hand, is the coffee of programming languages. It's a compiled language too but with a twist. Java code is compiled into bytecode, which runs on the Java Virtual Machine (JVM). This bytecode can run on any platform with a compatible JVM, making Java highly portable and platform-independent. It's a bit like sending your code to a virtual coffee machine that serves it up just the way you like it on any OS! ☕️💼
🐍 Python
Python is the friendly neighborhood programming language. It's usually described as an interpreted language: there's no separate compilation step for you to run. Behind the scenes, the standard interpreter (CPython) compiles your source to bytecode and then executes that bytecode in its virtual machine. This simplicity makes it great for beginners and rapid development. Python's extensive library ecosystem and easy syntax make it feel like you're scripting magic spells in a magical world! 🪄🐍
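One nuance worth seeing first-hand: CPython, the reference interpreter, does compile source code, just not to machine code and not as a step you run yourself. A small sketch using only the standard library (`compile`, `exec`, and the `dis` module) makes that hidden step visible. The function `add` and the snippet being compiled are made up for illustration, and the exact instruction names printed vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# compile() turns source text into a code object holding bytecode.
code_obj = compile("total = 1 + 2", "<example>", "exec")
print(type(code_obj).__name__)  # code
print(len(code_obj.co_code))    # size of the raw bytecode in bytes

# dis lists the bytecode instructions the interpreter will execute.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# exec() runs the compiled code object inside the interpreter's VM.
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])  # 3
```

This is also why `__pycache__` folders with `.pyc` files appear next to imported modules: they cache the compiled bytecode.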
In the end, the choice of programming language depends on your project's needs and your personal preferences. Each language has its strengths and weaknesses, but they all share the goal of bringing your ideas to life through code. 🚀💡
So, whether you're crafting the perfect C++ masterpiece, brewing up Java applications, or scripting Python magic, remember that programming languages are the tools that empower us to create amazing things in the digital realm. Embrace the language that speaks to you and keep coding! 💻🌟
👍522👎6