r/StableDiffusion – Telegram
Not very satisfied with Qwen Edit 2511

I've been testing it all day, but I'm not really happy with the results. I'm using the Comfy workflow without the Lightning LoRA, with the FP8 model on a 5090, and the results are usually sub-par (lots of changed details, blurred images, and so forth). Are your results perfect? Is there anything you'd suggest? Thanks in advance.

https://redd.it/1puynjh
@rStableDiffusion
You subscribed to Gemini Pro, so naturally Google decided it's time for the model's daily lobotomy.

Let’s give a round of applause to the absolute geniuses in Google’s finance department. 👏

https://preview.redd.it/9wkvb492y99g1.png?width=2816&format=png&auto=webp&s=9029d735caa2e24a33929529b45cfc93b38d40d3

The strategy is brilliant, really. It’s the classic “Bait and Switch” masterclass.

Phase 1: Release a "Full Power" model. It’s smart, it follows instructions, it actually codes. It feels like magic. We all rush in, credit cards in hand, thinking, "Finally, a worthy competitor."

Phase 2 (We are here): Once the subscription revenue is locked in, start the "Dynamic Compute Rationing" (or whatever corporate euphemism they use for "throttling the hell out of it").

Has anyone else noticed that Gemini Advanced feels like it’s undergoing a progressive cognitive decline? It’s not just “hallucinating”; it’s straight-up refusing to think. It feels like they are actively A/B testing our pain threshold: "How much can we lower the parameter count and reduce the inference compute before this user cancels?"

It’s insulting. We are paying a premium for a “Pro” model, yet the output quality varies wildly depending on traffic, often degrading to the level of a free-tier chatbot that’s had a few too many drinks.

It’s corporate gaslighting at its finest. They get us hooked on the high-IQ version, then quietly swap it out for a cheaper, dumber, quantized version to save on server costs, banking on the fact that we’re too lazy to switch ecosystems.

So, here is the million-dollar question:

At what point does this become actual consumer fraud or false advertising? We are paying for a specific tier of service (Advanced/Ultra), but the backend delivery is opaque and manipulated.

For those of you with legal backgrounds or experience in consumer protection:

Is there any precedent for class-action pressure against SaaS companies that dynamically degrade product quality after payment?

How do we actually verify the compute/model version we are being served?

Aside from voting with our wallets and cancelling (which I’m about to do), is there any regulatory body that actually cares about this kind of "digital shrinkflation"?

Disappointed but not surprised. Do better, Google. Or actually, just do what you advertised.

https://redd.it/1pv4v3i
@rStableDiffusion
PhotoMapAI - A tool to optimise your dataset for LoRA training

One of the most common questions I see is: "How many images do I need for a good LoRA?"

The raw number matters much less than the diversity and value of each image. Even if all your images are high quality, if 40 of your 50 photos of a person are from the same angle in the same lighting, you aren't training the LoRA on a concept; you're training it to overfit on a single moment.

For example: say you’re training an Arcane LoRA. If your dataset has 100 images of Vi and only 10 images of other characters, you won't get a generalized style. Your LoRA will be heavily biased toward Vi (overfit) and won't know how to handle other characters (underfit).

I struggled with this in my own datasets, so I built a tool for my personal workflow based on PhotoMapAI (an awesome project by lstein on GitHub). It’s been invaluable for identifying low-quality images and refining my datasets to include only semantically different images. I thought it would be useful for others too, so I created a PR.

lstein’s original tool uses CLIP embeddings, generated 100% locally, to "map" your images based on their relationship to one another: the closer two images are on the map, the more similar they are. Building on this functionality, I've added a feature called the Dataset Curator, which has now been merged into the official 1.0 release. It uses the CLIP embeddings to pick the most "valuable" (i.e. most mutually different) images so you don't have to do it manually.
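To make the core idea concrete, here's a minimal sketch of what a CLIP similarity map rests on, using the sentence-transformers CLIP wrapper as a stand-in; the model name and the "dataset" folder are placeholders, and PhotoMapAI's actual pipeline may differ:

```python
from pathlib import Path

import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

# "clip-ViT-B-32" is a placeholder model choice; everything runs locally.
model = SentenceTransformer("clip-ViT-B-32")

paths = sorted(Path("dataset").glob("*.png"))
emb = model.encode([Image.open(p).convert("RGB") for p in paths],
                   normalize_embeddings=True)  # shape (N, 512)

# With normalized vectors, cosine similarity is a plain dot product;
# high similarity = the two images sit close together on the map.
sim = emb @ emb.T
np.fill_diagonal(sim, -1.0)  # ignore self-matches
i, j = np.unravel_index(np.argmax(sim), sim.shape)
print(f"Most similar pair: {paths[i].name} / {paths[j].name} ({sim[i, j]:.3f})")
```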


Have a read here first to understand how it works:
Image Dataset Curation - PhotoMapAI

Here's a quick summary:

How it works:

Diversity (Farthest Point Sampling): This algorithm finds "outliers." It’s great for finding rare angles or unique lighting. Warning: It also finds the "garbage" (blurry or broken images), which is actually helpful because it shows you exactly what you need to exclude first! Use this to balance out your dataset to optimise for variability.

Balance (K-Means): This groups your photos into clusters and picks representatives from each. If you have 100 full-body shots and 10 close-ups, it ensures your final selection preserves those ratios so the model doesn't "forget" the rare concepts. Use this to thin out your dataset while maintaining ratios (a rough sketch of both algorithms follows below).
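For intuition, here's a minimal sketch of the two strategies, operating on the normalized CLIP embeddings from the earlier snippet. The function names and the proportional-quota detail are my own reading; PhotoMapAI's actual implementation may differ:

```python
import numpy as np
from sklearn.cluster import KMeans

def farthest_point_sampling(emb, k, seed=0):
    """Diversity pick: start somewhere, then repeatedly add the image
    farthest (in cosine distance) from everything already chosen.
    This is also why it surfaces garbage: blurry/broken images are outliers."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(emb)))]
    dist = 1.0 - emb @ emb[chosen[0]]  # distance to nearest chosen point
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, 1.0 - emb @ emb[nxt])
    return chosen

def kmeans_balanced(emb, n_clusters, k, seed=0):
    """Balance pick: cluster the embeddings, give each cluster a quota
    proportional to its size, and keep the images nearest each centroid,
    so rare concepts stay represented in roughly their original ratios."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init="auto").fit(emb)
    picks = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        quota = max(1, round(k * len(members) / len(emb)))
        d = np.linalg.norm(emb[members] - km.cluster_centers_[c], axis=1)
        picks.extend(int(m) for m in members[np.argsort(d)[:quota]])
    return picks
```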

The workflow I use:

1. Run the Curator with 20 iterations in FPS mode: This uses a Monte Carlo simulation to find "consensus" selections. Since these algorithms can be sensitive to the starting point, running multiple passes helps identify the images that are statistically the most important regardless of where the algorithm starts (see the first sketch after this list).
2. Check the Magenta (Core Outliers): These are the images that showed up in >90% of the Monte Carlo runs. If any of these are blurry or "junk," I just hit "Exclude." If they aren't junk, that's good: it means the analysis found these images to have the most distinct CLIP embeddings, and for good reason.
3. Run it again if you excluded images. The algorithm will now ignore the junk and find the next best unique (but clean) images to fill the gap.
4. Export: It automatically copies your images and your .txt captions to a new folder, handling any filename collisions for you (see the second sketch below). You can even export an analysis to see how many times each image was selected in the process.
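For step 1, a hedged sketch of what such a consensus pass could look like, reusing farthest_point_sampling from the sketch above; consensus_fps is my name, not PhotoMapAI's, and the 0.9 threshold mirrors the >90% Magenta rule from step 2:

```python
from collections import Counter

def consensus_fps(emb, k, runs=20, threshold=0.9):
    """Run FPS from `runs` different random starting points and count how
    often each image gets picked; near-unanimous picks are the core outliers."""
    counts = Counter()
    for seed in range(runs):
        counts.update(farthest_point_sampling(emb, k, seed=seed))
    core = sorted(i for i, c in counts.items() if c / runs > threshold)
    return counts, core
```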
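And for step 4, a minimal sketch of the export behaviour as described (captions copied alongside images, renaming on collisions instead of overwriting); export_selection is a hypothetical name, not PhotoMapAI's API:

```python
import shutil
from pathlib import Path

def export_selection(images, dest):
    """Copy each selected image plus its matching .txt caption to `dest`,
    renaming on filename collisions instead of overwriting."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for img in map(Path, images):
        target, n = dest / img.name, 1
        while target.exists():
            target = dest / f"{img.stem}_{n}{img.suffix}"
            n += 1
        shutil.copy2(img, target)
        caption = img.with_suffix(".txt")
        if caption.exists():
            shutil.copy2(caption, target.with_suffix(".txt"))
```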

The goal isn't to have the most images; it’s to have a dataset where every single image teaches the model something new.

Huge thanks to lstein for creating the original tool, which is incredible for its original purpose too.

Here are the release notes for 1.0.0 by lstein and the install files:
Release v1.0.0 · lstein/PhotoMapAI

https://redd.it/1pv6aok
@rStableDiffusion
I've created a series of tutorial videos on LoRA training (with English subtitles) and a Chinese-localized version of AITOOLKIT

I've created a series of tutorial videos on LoRA training (with English subtitles) and a companion Chinese-localized version of AITOOLKIT. These resources provide detailed explanations of each parameter's settings and their functions in the most accessible way possible, helping you embark on your model-training journey. If you find the content helpful, please show your support by liking, following, and subscribing. ✧٩(ˊωˋ)و✧

https://youtube.com/playlist?list=PLFJyQMhHMt0lC4X7LQACHSSeymynkS7KE&si=JvFOzt2mf54E7n27

https://redd.it/1pvb4x2
@rStableDiffusion