Photo Tinder
Hi, I got sick of trawling through images manually and using destructive processes to figure out which images to keep, which to throw away and which were best - so I vibe coded Photo Tinder with Claude (tested on OSX and Linux with no issues - a Windows build is available but untested).
Basically you have two modes:
- triage - which sorts rejected images into one folder and accepted images into another
- ranking - which uses the Glicko algorithm to compare two photos: you pick the winner, the scores get updated, and you repeat until the results are certain.
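For anyone curious how the ranking mode's score update could work, here is a minimal sketch. It assumes the algorithm meant is the Glicko-1 rating system and uses its published formulas; the names `Rating`, `update` and `compare` are made up for illustration and are not taken from the Photo Tinder code, which may do this differently.

```python
import math
from dataclasses import dataclass

Q = math.log(10) / 400  # Glicko-1 scaling constant


@dataclass
class Rating:
    rating: float = 1500.0  # conventional Glicko-1 starting rating
    rd: float = 350.0       # rating deviation: how uncertain the rating is


def _g(rd: float) -> float:
    """Dampening factor: a very uncertain opponent counts for less."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def update(player: Rating, opponent: Rating, score: float) -> Rating:
    """Return the player's new rating after one comparison.

    score is 1.0 if the player's photo won, 0.0 if it lost.
    """
    g = _g(opponent.rd)
    # Expected score of the player against this opponent.
    expected = 1.0 / (1.0 + 10 ** (-g * (player.rating - opponent.rating) / 400.0))
    # 1 / d^2, the information gained from this single comparison.
    d2_inv = Q**2 * g**2 * expected * (1.0 - expected)
    denom = 1.0 / player.rd**2 + d2_inv
    new_rating = player.rating + (Q / denom) * g * (score - expected)
    new_rd = math.sqrt(1.0 / denom)  # uncertainty shrinks with every comparison
    return Rating(new_rating, new_rd)


def compare(winner: Rating, loser: Rating) -> tuple[Rating, Rating]:
    """Update both photos from their pre-comparison ratings."""
    return update(winner, loser, 1.0), update(loser, winner, 0.0)


if __name__ == "__main__":
    photo_a, photo_b = compare(Rating(), Rating())
    print(photo_a, photo_b)
```

The shrinking `rd` value is what "repeat until the results are certain" would correspond to in this sketch: you could stop comparing once every photo's deviation drops below a threshold of your choosing.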
There is also a browser that lets you look through the rejected and accepted folders and filter them by ranking, recency, etc.
Hope this is useful. Preparing datasets is hard - this tool makes it that much easier.
https://github.com/relaxis/photo-tinder-desktop
https://redd.it/1ppwx68
@rStableDiffusion
GitHub
GitHub - relaxis/photo-tinder-desktop: Photo Tinder - Desktop app for image triage and ranking (Tauri)
Photo Tinder - Desktop app for image triage and ranking (Tauri) - relaxis/photo-tinder-desktop
Kling released a video model, MemFlow, a few days ago. Long 60s video generation (realtime, 18 fps on an H100 GPU), lots of examples on the project page
https://redd.it/1pq2uxb
@rStableDiffusion
New incredibly fast realistic TTS: MiraTTS
Current TTS models are great, but unfortunately they lack either emotion/realism or speed. So I heavily optimized a finetuned LLM-based TTS model: MiraTTS. It's extremely fast and high quality, thanks to lmdeploy and FlashSR respectively.
The main benefits of this repo and model are:
1. Extremely fast: Can reach speeds of up to 100x realtime through lmdeploy and batching!
2. High quality: Generates clear 48 kHz audio using FlashSR (most other models generate 16-24 kHz audio, which is lower quality).
3. Very low latency: Latency as low as 150 ms in initial tests.
4. Very low VRAM usage: can be as low as 6 GB of VRAM, so great for local users.
I am planning multilingual versions, a native 48 kHz bicodec, and possibly multi-speaker models.
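To make the "100x realtime" figure concrete: a speedup of N x realtime means N seconds of audio come out per second of wall-clock time. The snippet below is only a worked calculation; the function name and the input numbers are made up for illustration and are not measurements from MiraTTS.

```python
def realtime_speedup(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of wall-clock generation time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


if __name__ == "__main__":
    # Hypothetical batch: 10 clips of 10 s each, generated together in 1 s total.
    print(realtime_speedup(audio_seconds=10 * 10.0, wall_seconds=1.0))
```

Note that a batched figure like this describes total throughput, so a single short request would not necessarily see the same speedup.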
Github link: https://github.com/ysharma3501/MiraTTS
Model and non-cherrypicked examples link: https://huggingface.co/YatharthS/MiraTTS
Blog explaining llm tts models: https://huggingface.co/blog/YatharthS/llm-tts-models
I would very much appreciate stars or likes, thank you.
https://redd.it/1pq5t35
@rStableDiffusion
GitHub
GitHub - ysharma3501/MiraTTS: A high quality and fast TTS repository
A high quality and fast TTS repository. Contribute to ysharma3501/MiraTTS development by creating an account on GitHub.
Z-Image-Turbo - Smartphone Snapshot Photo Reality - LoRa - Release
https://redd.it/1pqgjxy
@rStableDiffusion
Reddit
From the StableDiffusion community on Reddit: Z-Image-Turbo - Smartphone Snapshot Photo Reality - LoRa - Release
Explore this post and more from the StableDiffusion community
Wan Time to Move
https://youtu.be/s3Fch5zLzdM?si=YAYJnOZ29Kgw7XGO
https://redd.it/1pqgeon
@rStableDiffusion
YouTube
Wan Time To Move
Wan Time-To-Move is a very interesting workflow, built on the Wan 2.2 model. It has the ability to take rough slap comps and refine them into more polished results, giving us greater control over the effects timing. In this test, I experimented with ignition…