Reddit Sysadmin – Telegram
The Website is Down #1: Sales Guy vs. Web Dude (Classic Cloudflare)

I am SURE it has been posted here COUNTLESS TIMES, but today, with Cloudflare on fire, we should all sit back, relax, and laugh our asses off with this historical nugget of internet gold.

https://youtu.be/uRGljemfwUE?si=TJhlwE5obrQbGyYJ

I'm always amazed by how many of the "new generation" of SysAdmins have never even heard of it. Sigh, kids these days. Maybe NSFW, but just a little.

https://redd.it/1p0cppu
@r_systemadmin
Is it just me, or is institutional knowledge no longer valued?

I've been at the same place for close to 22 years now, and I've survived a LOT of layoffs. But I know plenty of old-timers that did not, and when they left, there was a massive amount of institutional knowledge that got lost. And management doesn't give a crap. They just tell you to figure it out when you need to reach out to someone that is no longer there.

When I started here 22 years ago, loyalty was rewarded. I met plenty of people that had been here 20+ years and managed to retire from this place.

Since the pandemic ended, I've noticed that this place no longer rewards loyalty, and even having intimate knowledge of how something works, or being the company's subject matter expert on something, doesn't guarantee any kind of job security.

https://redd.it/1p0dxev
@r_systemadmin
Cloudflare is Down! Here's what you can do.

We have monitoring on all our systems, and we got bombarded with alerts back to back.

Instead of panicking, we changed the DNS proxy and generated new SSL certs for all the proxied domains.

All of our customers were back online within 30 minutes of the outage starting.

If you're unable to log in to Cloudflare, their API access is still working, so you can use your API keys to update the DNS records!
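For anyone who hasn't touched the API before, here is a minimal sketch of what "update the DNS records" could look like for one record. It assumes a scoped API token sent as a Bearer header (legacy API keys use different headers) and assumes you already know the zone and record IDs. `build_unproxy_request` is a hypothetical helper name. The function only builds the request and does not send it, so you can inspect it first.

```python
import json
import urllib.request

# Base URL of the Cloudflare v4 API.
API_BASE = "https://api.cloudflare.com/client/v4"


def build_unproxy_request(zone_id: str, record_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH that turns off proxying for one DNS record.

    zone_id, record_id and api_token are placeholders you supply yourself.
    """
    url = f"{API_BASE}/zones/{zone_id}/dns_records/{record_id}"
    # Partial update: only the "proxied" flag is changed.
    body = json.dumps({"proxied": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

To actually send it, pass the result to `urllib.request.urlopen(req, timeout=10)`. Keep in mind that an unproxied record points clients straight at your origin, so the origin needs its own valid certificate.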

If you're unable to access Cloudflare, you can change your DNS from Cloudflare to your domain provider, or transfer it to Fastly, Bunny, or Akamai and use one of those alternative providers.

If you purchased the domain from Cloudflare, or your registrar itself uses Cloudflare (Namecheap 😒), sadly you will have to wait.

You can try emailing your domain provider to change the nameservers; they will help you out. Try ClouDNS or similar options.

https://redd.it/1p0d69g
@r_systemadmin
As CTO, I’m pleased to announce our platform outperformed Cloudflare during the incident,....

....maintaining flawless availability across our primary production environment at http://localhost:3000, a testament to the robustness of our enterprise architecture.

https://redd.it/1p0li15
@r_systemadmin
Hot take: The outage isn't the problem; everyone going down at once is

It’s happening again. Cloudflare is down, and with it, a massive chunk of the internet has simply vanished. We see the usual panic: 500 errors on major platforms, broken APIs, and businesses bleeding revenue by the second.

But if we just treat this as "another technical glitch," we are missing the point.

This isn't a reliability issue; it’s a topology issue. We have allowed the internet (designed to be the ultimate decentralized network btw) to atrophy into a fragile oligopoly. When "the cloud" is effectively just three or four giant computers in Northern Virginia and Frankfurt, outages aren't accidents; they are statistical certainties.

https://redd.it/1p0g2c7
@r_systemadmin
So the Cloudflare outage was basically the Windows .LOG size bug on steroids?

https://www.axios.com/2025/11/18/cloudflare-outage-cause-systems-down

>What they're saying: Cloudflare spokesperson Jackie Dutton said the outage was caused by a "configuration file that is automatically generated to manage threat traffic."

>"The file grew beyond an expected size of entries and triggered a crash in the software system that handles traffic for a number of Cloudflare's services," Dutton said.

Seeing the larger explanation for this in the near future (assuming they actually give one) is probably going to make both eyes and heads roll. I'm going to guess that it will take a while for people to trust this again after they claim it's fixed.
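As a rough illustration of the failure mode in the quote (a generated file growing past what the consumer expects), here is a minimal sketch of a size guard that keeps the last known good config instead of crashing. This is not Cloudflare's code. `MAX_ENTRIES`, `select_config`, and the limit of 200 are all made up for the example.

```python
from typing import List

# Hypothetical upper bound on entries the consumer is prepared to handle.
MAX_ENTRIES = 200


def select_config(
    candidate: List[str],
    last_known_good: List[str],
    max_entries: int = MAX_ENTRIES,
) -> List[str]:
    """Return the new config if it fits the limit, otherwise keep the old one."""
    if len(candidate) > max_entries:
        # Oversized file: refuse it and keep serving with the previous config.
        return last_known_good
    return candidate
```

The design choice is simply to fail closed on the update rather than on the traffic: an oversized file gets rejected and the previous config keeps running.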

https://redd.it/1p0rb4i
@r_systemadmin
I built a DownDetector for DownDetector

After DownDetector went down with the Cloudflare outage today, I decided to build a robust, independent tool which can act as a DownDetector for DownDetector.

Hosted on AWS plus a static mirror on Netlify and also Vercel for triple redundancy !!!

https://redd.it/1p0svc7
@r_systemadmin
I can't take it anymore guys

"Oops, something went wrong!"

Buttons greyed out for no discernible reason with no explanation why.

Extra buttons loading so slow that your mouse is already there, and then you click the new button that just suddenly appeared on accident.

Email alerts that send you a link, make you log in, and then don't redirect you to the link.

Micropenissoft shitwindows changing your settings automatically for no reason.

Licenses to use features that already exist on hardware you already spent thousands of dollars on.

AI features I didn't ask for.

Updates that give you a "new and improved interface" that requires you to search for things to find them and click through more menus than before.

Popups that interrupt me in the middle of typing to tell me about some new feature I don't fucking care about.

I'm losing my mind, guys. Was it always this bad?



https://redd.it/1p0s7x3
@r_systemadmin
Spent 5 hours debugging AWS Elastic Beanstalk… turns out my client just hadn’t paid the bills.

So today I learned a very important lesson about AWS:
It won’t tell you *why* it’s ruining your life.

I’m working for a client, right?
Simple task: **“Can you deploy this updated Node backend on EB?”**
Cool, no problem. I’ve done this a hundred times.

Except today EB woke up and chose violence.

* Stuck at “Updating environment”
* Stuck at “No Data”
* Rebuild fails
* Auto Scaling group refuses to exist
* Logs won’t download
* Node 22 acting like it hates me
* Even a brand new environment wouldn’t launch
* EC2 keeps screaming “vCPU limit exceeded”
* Support rejects quota increase in 30 seconds flat

At this point I’m sweating thinking I corrupted their entire environment.
I’m googling every possible error under the sun.
I'm blaming my ZIP file, my code, my past life sins, everything.

**FOUR HOURS later…**

I open the billing section and see:

>

BRO.
AWS basically put the entire account into **timeout mode**, silently.
Didn’t tell me upfront.
Didn’t show a warning in EB.
Didn’t say “Hey genius, your client didn’t pay the bills.”
Just let me fight ghosts for half a day.

The whole infrastructure was literally **blocked** because the client hadn’t paid MONTHS of invoices.

And here I was debugging like I broke production.

*Me:* Why won’t EC2 launch??
*AWS:* 😐
*Me:* Why is my quota suddenly 1 vCPU??
*AWS:* 😐
*Me:* Why did you reject my quota request in 0.2 seconds??
*AWS:* 😐
*Billing page:* “Past due: ₹23,659.”
*Me:* OH.

Anyway, client is like “ohhh yeah, we forgot to pay that.”

So yeah, shoutout to AWS for letting me believe I destroyed the entire system, when the real root cause was basically, “We don’t run servers for broke people.”

Day ruined, self-esteem shattered, but at least I earned Reddit content.

https://redd.it/1p10mv9
@r_systemadmin
Microsoft Ignite 2025 updates

Sharing a quick summary of today's Ignite updates that are actually useful for admins:

* **Security Copilot for All M365 E5** \- Now included at no extra cost. Integrated directly into Defender, Entra, Intune, and Purview with ready-to-use agents.
* **Organization-Wide Security Baseline** \- An easy way to apply baseline security settings across the tenant. It reduces the need to navigate multiple portals and lets you apply settings in fewer clicks.
* **AI Security Dashboard** \- A consolidated dashboard showing real-time signals from Defender, Entra, and Purview. Helps monitor AI-related risks in one place.
* **Microsoft Agent 365** \- A control plane to manage AI agents across the organization, whether built on Microsoft tools or external frameworks. Centralized deployment and governance.
* **Purview Enhancements for M365 Copilot** \- New additions include:
    * Detailed data oversharing reports inside the M365 admin center
    * Automated bulk cleanup of overshared links
    * DLP controls for M365 Copilot and chat prompt interactions
* **Predictive Shielding in Microsoft Defender** \- Uses threat intelligence and graph data to predict likely attacker movement and automatically harden vulnerable paths before they’re exploited.

https://redd.it/1p11a5l
@r_systemadmin
Does this annoy anyone else?

Someone asked why certain emails were being caught in a spam filter. I explained why in the least technical way I could, and all I got was a sigh and a "cool story bro" or, more usually, that look of "I really didn't want to know."

If you don't want to know, don't ask in the first place FFS.

https://redd.it/1p18e86
@r_systemadmin
What's the most ridiculous request you've received?

We got a request today in our servicedesk saying they ordered and received a new kettle and wanted IT to check it out and make sure it was OK. Umm...don't think kettles are our problem. IT does get some silly requests sometimes (this was the silliest I've seen for some time) so was wondering what kind of strange or silly requests have you received?

https://redd.it/1p18k0a
@r_systemadmin
The spreadsheet from hell

We’ve got 220 employees, and our entire device management system is one Excel file called IT Inventory Final v19 USE THIS ONE.xlsx.

Half the data’s wrong. Laptops marked as in use by people who quit months ago. Others say unknown. No one knows what unknown even means anymore.

I automate everything: deployments, patches, backups, monitoring. But tracking physical equipment? Still 100% manual chaos.

Every quarter I tell myself I’ll fix it. Then I open the same damn spreadsheet, scroll through 400 rows, and die a little inside.

There has to be a better way.
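As a first step short of buying a tool, the two problems described above (laptops assigned to people who left, and rows marked unknown) can be flagged automatically. This is a minimal sketch that assumes the sheet is exported to CSV and assumes hypothetical column names `asset_tag`, `assigned_to`, and `status`. `find_stale_rows` is a made-up helper, and the active staff list would come from wherever you keep it (HR export, directory dump).

```python
import csv
import io
from typing import Dict, List, Set


def find_stale_rows(inventory_csv: str, active_employees: Set[str]) -> List[Dict[str, str]]:
    """Return inventory rows that need a human to look at them.

    A row is flagged when its status is "unknown", or when it is marked
    "in use" by someone who is not in the active staff list.
    """
    # Normalise names so "Alice" and "alice " compare equal.
    active = {name.strip().lower() for name in active_employees}
    flagged = []
    for row in csv.DictReader(io.StringIO(inventory_csv)):
        assignee = (row.get("assigned_to") or "").strip().lower()
        status = (row.get("status") or "").strip().lower()
        if status == "unknown":
            flagged.append({**row, "reason": "status unknown"})
        elif status == "in use" and assignee not in active:
            flagged.append({**row, "reason": "assignee not in active staff list"})
    return flagged
```

Running this against each quarterly export gives a short list to chase instead of 400 rows to scroll through.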



https://redd.it/1p1gml8
@r_systemadmin
OK which one of you was bored today?

Looks like someone created a 4X downdetector...

https://downdetectorsdowndetectorsdowndetectorsdowndetector.com/

It's turtles all the way down.

Edit:
https://downdetectorsdowndetectorsdowndetectorsdowndetector.com/ is currently reporting everything down even though https://downdetectorsdowndetectorsdowndetector.com/ is still online. This is crazy, I feel another mass internet calamity incoming.

https://redd.it/1p1jbnu
@r_systemadmin
Can we recover access to this server?

We have a fully patched Windows Server 2022 machine that has lost its trust relationship with the domain. Attempting to log in with a domain account gives a bad username/password error. No one knows a good local username/password pair for the server. If it matters, the server is a VMware VM.

We had something similar happen to another server recently and we tried replacing utilman.exe with cmd.exe. We could get cmd.exe to initially execute but Windows Defender kept shutting it down.

Any suggestions for how we can regain access?


EDIT: Huge thank you to those who suggested disconnecting the NIC and trying to use cached creds! Worked like a charm.

https://redd.it/1p1ewoc
@r_systemadmin
Disgruntled IT employee causes Houston company $862K cyber chaos

Per the Houston Chronicle:

Waste Management found itself in a tech nightmare after a former contractor, upset about being fired, broke back into the Houston company's network and reset roughly 2,500 passwords, knocking employees offline across the country.



Maxwell Schultz, 35, of Ohio, admitted he hacked into his old employer's network after being fired in May 2021.



While it's unclear why he was let go, prosecutors with the U.S. Attorney's Office for the Southern District of Texas said Schultz posed as another contractor to snag login credentials, giving him access to the company's network. 



Once he logged in, Schultz ran what court documents described as a "PowerShell script," which is a command to automate tasks and manage systems. In doing so, prosecutors said he reset "approximately 2,500 passwords, locking thousands of employees and contractors out of their computers nationwide." 



The cyberattack caused more than $862,000 in company losses, including customer service disruptions and labor needed to restore the network. Investigators said Schultz also looked into ways to delete logs and cleared several system logs. 



During a plea agreement, Schultz admitted to causing the cyberattack because he was "upset about being fired," the U.S. Attorney's Office noted. He is now facing 10 years in federal prison and a possible fine of up to $250,000. 



Cybersecurity experts say this type of retaliation hack, a form of "insider threat," is growing, especially among disgruntled former employees or contractors with insider access, and especially in Houston's energy and tech sectors, where contractors often have elevated system privileges, according to the Cybersecurity & Infrastructure Security Agency (CISA).



Source: (non paywall version) https://www.msn.com/en-us/technology/cybersecurity/disgruntled-it-employee-causes-houston-company-862k-cyber-chaos/ar-AA1QLcW3


======

edit: formatting

https://redd.it/1p1moyt
@r_systemadmin
Pro tip for interviews

Be honest with your answers. Short and sweet. If your cert lapsed or you don't have specific experience, be up front. It's not that big of a deal. Many places will help you get back into compliance/train you.

Interviewed someone today and they gave very long answers instead of just saying "I do not have experience with that" or "no, my cert has lapsed, but I am willing to put the work in and retest".

https://redd.it/1p1iodm
@r_systemadmin
In MY day… (sysadmin edition)

In my day we didn’t have no…“cloudflare” outages. When the websites were down we put on our jackets and got on the elevator down to the basement, walked through the snow to get to the server room, and rebooted the web server! We didn’t just tell the helpdesk to send an email letting the clients know we had a vendor outage and were waiting for them to fix it, we took care of it ourselves! *shakes fist 🤛

https://redd.it/1p1pmch
@r_systemadmin
Why do healthcare orgs buy automation tools then keep doing everything manually??

Worked with three different healthcare places this past year and I swear it's always the same story. They drop $50k on some enterprise platform, do the whole implementation thing, train everyone... then 6 months later they're still using Excel and emailing PDFs to each other.

The excuses change but the result doesn't. "Doesn't handle our edge cases" (okay but 90% of your work isn't edge cases). "Staff doesn't trust it" (you hired me because they were drowning). "Waiting for the next version" (the current one would save you 15 hours a week TODAY).

The automation actually works when you test it. Intake forms populate the EHR, reminders go out, insurance stuff happens automatically… It does what it's supposed to, but then someone's assistant likes the old way or one doctor refuses to use it and the whole thing falls apart.

Don't know if this is healthcare specific or everywhere. The compliance stuff is real (HIPAA, audit trails, whatever) but those are solvable. What's not solvable is "this is how we've always done it" even when that way is burning out your entire staff.

Has anyone actually gotten a healthcare org to fully adopt automation? What was different? Starting to think this isn't a tech problem at all, it's purely people refusing change. Which sucks because I got into this to build systems not be a therapist for resistant employees.

Maybe I need to focus on smaller stuff people don't notice instead of trying to overhaul everything? Idk. Would love other perspectives especially from regulated industries.

https://redd.it/1p1vlu5
@r_systemadmin
My boss doesn't think anyone wants to be a Jr Messaging Engineer/Sysadmin

Is this like a corporate thing now that Junior Engineers are a worthless expense?

https://redd.it/1p1w93p
@r_systemadmin
Thickheaded Thursday - November 20, 2025

Howdy, /r/sysadmin!

It's that time of the week, Thickheaded Thursday! This is a safe (mostly) judgement-free environment for all of your questions and stories, no matter how silly you think they are. Anybody can answer questions! My name is AutoModerator and I've taken over responsibility for posting these weekly threads so you don't have to worry about anything except your comments!

https://redd.it/1p1zbzf
@r_systemadmin