How do you track what would break if your main cloud region goes down?
We had a chat after the last AWS/Azure outage and honestly realized… none of us really knows what would die if our primary region disappeared for a few hours.
We’ve got “multi-AZ everything”, backups, health checks, all the standard playbook stuff. But that’s still all inside one provider. Once you start asking “what if IAM or S3 or DNS in that region stops working?” it gets ugly fast.
Turns out half our “redundant” systems depend on the same control plane or managed service anyway. Even our monitoring stack isn’t as isolated as we thought.
Curious how other teams handle this:
• Do you actually simulate provider/region outages, or just hope it never happens?
• How do you figure out what’s truly single-point vs redundant?
• Anyone built good visibility around this without going full multi-cloud?
• Is your multi-cloud setup really fail-proof?
• And when something does go down, what’s the hardest part — detection, failover, or explaining it upstairs?
Not trying to start a multi-cloud debate — just wondering how others think about dependency risk in real life.
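For the "truly single-point vs redundant" question, one low-tech starting point is to write the dependencies down and invert the map. The following is only a minimal sketch under that assumption: the `DEPENDS_ON` map and every service name in it are hypothetical, and it only looks at direct dependencies, not transitive ones.

```python
from collections import defaultdict

# Hypothetical dependency map: system -> things it needs in order to work.
# Every name here is illustrative, not taken from a real environment.
DEPENDS_ON = {
    "web-app": {"load-balancer", "iam", "dns"},
    "batch-jobs": {"object-storage", "iam"},
    "monitoring": {"object-storage", "iam", "dns"},
    "status-page": {"external-host"},
}


def shared_dependencies(depends_on, min_dependents=2):
    """Return {dependency: sorted dependents} for dependencies that
    at least `min_dependents` systems rely on."""
    dependents = defaultdict(set)
    for system, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(system)
    return {
        dep: sorted(systems)
        for dep, systems in dependents.items()
        if len(systems) >= min_dependents
    }


def impacted_by(depends_on, failed):
    """Return the systems that directly depend on the failed component."""
    return sorted(s for s, deps in depends_on.items() if failed in deps)


if __name__ == "__main__":
    for dep, systems in sorted(shared_dependencies(DEPENDS_ON).items()):
        print(f"{dep}: shared by {', '.join(systems)}")
```

Calling `impacted_by(DEPENDS_ON, "iam")` on this made-up map lists everything that would be touched by an IAM outage, which is the kind of answer the questions above are after.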
https://redd.it/1olu2rc
@r_systemadmin
VDI with VoIP: would you recommend it?
Heya,
Company wants to go in the direction of VDI, but we have about 400 users who use the Five9 softphone daily, and use it heavily.
Five9 has been a nightmare: every day there is a new issue or ticket created in our help desk to help a user with Five9 (browser refresh errors, or the softphone app not being recognized). In order to save money on buying laptops, my company is thinking of introducing VDI in the upcoming year.
I have concerns about reliability and call quality.
Anyone have experience with VDI and VoIP? Would you recommend it?
These will be loaded on thin clients.
https://redd.it/1olst2o
@r_systemadmin
SSH with pubkey accidentally left open. Any issue?
I normally check server security carefully, but I finally made a mistake.
When I create servers in the cloud, the firewall is enabled and only 443 is allowed, which I usually also manually remove. No allow rules, no incoming traffic. This is the default behavior in my provider.
I changed the cloud provider, and didn’t notice that the default behavior is different: if there are no rules in the dashboard, it means everything is allowed by default. The UI is different. Somehow I didn’t catch it in my test.
On the VM, the ufw default is to block all incoming traffic except SSH. SSHD is configured correctly with a custom sshd_config to allow only public key authentication and nothing else.
I noticed the issue, and found tens of thousands of failed connection attempts. Logs on the same server show nothing was accepted other than with my public key and IP.
Is there any concern?
Should the server be deleted? It takes a lot of work.
**Update**
I also worry whether some non-SSH services could bypass ufw. I know Docker could do it (not in my case). But I wonder if any other services could bypass ufw via iptables rules in a default installation of Ubuntu Server (kept up to date)?
Obviously iptables and the logs could be checked. But if someone got in, they could have erased their traces. The server doesn’t have anything super important on it and is isolated, but malware could still potentially spread through the HTTPS pages it serves (malicious JavaScript pushed to the viewers).
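The log check described above ("nothing was accepted other than with my public key and IP") can be scripted. This is a hedged sketch, not a forensic tool: it assumes the usual OpenSSH "Accepted <method> for <user> from <ip> port <n>" log line shape, the function name and the sample lines are made up for illustration, and it shares the limitation already mentioned in the post, namely that logs on a compromised host can be edited.

```python
import re

# Matches OpenSSH "Accepted <method> for <user> from <ip> port <n>" lines.
ACCEPTED = re.compile(
    r"Accepted (?P<method>\S+) for (?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def unexpected_logins(log_lines, allowed_ips, allowed_methods=("publickey",)):
    """Return (user, ip, method) for accepted logins outside the allow-lists."""
    findings = []
    for line in log_lines:
        match = ACCEPTED.search(line)
        if not match:
            continue  # failed attempts and unrelated lines are ignored
        ip, method = match["ip"], match["method"]
        if ip not in allowed_ips or method not in allowed_methods:
            findings.append((match["user"], ip, method))
    return findings


# Illustrative sample lines using documentation-only IP ranges.
SAMPLE = [
    "sshd[101]: Accepted publickey for admin from 203.0.113.10 port 50022 ssh2",
    "sshd[102]: Invalid user test from 198.51.100.7 port 40000",
    "sshd[103]: Accepted publickey for admin from 198.51.100.7 port 40100 ssh2",
]

if __name__ == "__main__":
    for user, ip, method in unexpected_logins(SAMPLE, {"203.0.113.10"}):
        print(f"unexpected login: {user} from {ip} via {method}")
```

On a real server the lines would come from wherever sshd logs (on Ubuntu that is typically /var/log/auth.log or the journal), and an empty result only says the surviving logs look clean.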
https://redd.it/1olwrty
@r_systemadmin
Why has system administrator pay gone down in Canada?
Until about a year ago, I was regularly seeing pay around 90k. Now all I see is 68k-75k, and that's with 5 years of experience.
Is the market down or is this the new normal?
I'm in the Windows sysadmin environment (Citrix, VMware, SolarWinds, Windows).
https://redd.it/1olzebc
@r_systemadmin
Sandboxie Plus error
I used to use Sandboxie Plus here and there and never had an issue with it; it would open a web browser just fine. Lately though, when I go to open a web browser through it by right-clicking the default box, then Run -> Standard applications -> default web browser (which for me is Firefox), it gives me the following error:
procedure entry point pk11sdr_encryptwithmechanism could not be located in the DLL c:\ProgramFiles\Mozilla firefox\xul.dll
I don't know why it would give me this error. Firefox opens up just fine outside of the sandbox.
https://redd.it/1om4jwm
@r_systemadmin
Hyper-V instead of VMware
Hi,
We have a standalone cluster with 8 hosts.
They don't have shared storage: each host has its own local storage, so of course there is no migration between the hosts.
Today we are running VMware ESXi, and our license will expire next year.
I'm considering Hyper-V as a replacement, since all the servers on this cluster run Windows Server.
I'm also considering Proxmox as a candidate.
https://redd.it/1olx3hn
@r_systemadmin
8TB spinners have been hovering around $150 for the last 7 years and I need someone to blame
Any researched takes on why I can't reasonably upgrade my array?
https://redd.it/1om9ei8
@r_systemadmin
Mac M2: I can't open my Chrome profile on a debugger port anymore (since 4 months ago)
4 months ago, when I opened an existing Google Chrome profile on a debugging port (9222), it worked (lsof -i :9222 in the terminal returned the session JSON). Now when I run it, lsof -i :9222 returns nothing. Likewise, going to http://localhost:9222/json/version returns nothing. This means it's not working.
However, when I open a new profile on this debugging port, it works. But I need it to latch on to the correct profile. Did something change in the last 4 months? Please help; I hope to find a way to reconnect my profile to a debugging port, not just any random profile. Example of opening a Chrome profile on a debugger port (in the Mac terminal):
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --profile-directory="Profile 1" \
  --no-default-browser-check
https://redd.it/1omcksb
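The "does http://localhost:9222/json/version answer?" check from the post can be repeated from a script instead of a browser tab. A minimal sketch, assuming only the port and endpoint mentioned above; the helper names `fetch_text` and `devtools_version` are made up here, and the script only tells you whether something answers on the port, not why a given profile refuses to expose it.

```python
import json
from urllib.request import urlopen


def fetch_text(url, timeout=2.0):
    """Fetch a URL and return the body as text (default fetcher)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def devtools_version(port=9222, fetch=fetch_text):
    """Return the parsed /json/version payload, or None if nothing usable answers."""
    url = f"http://localhost:{port}/json/version"
    try:
        return json.loads(fetch(url))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a response that is not valid JSON.
        return None


if __name__ == "__main__":
    info = devtools_version()
    if info is None:
        print("Nothing is answering on the debugging port")
    else:
        print("Debugging endpoint is up:", info.get("Browser", "unknown browser"))
```

The `fetch` parameter exists only so the check can be exercised without a running browser.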
@r_systemadmin
Keep failing/Upsetting Manager
I’m so sick of this. I keep messing up and feel like I’m being written up one week and then the next week commended for all the work I’m doing.
For example, this last week I got a notification that I needed to renew a few client secrets. So I went to notify the users who own the apps but then I got pulled away from the ticket and never followed up with them.
Come Saturday night/Sunday morning (extremely unfortunate timing…), the secret expires, and it’s for the reporting platform. So engineering flags me down and asks me to update the secret. I jump on it immediately and it’s resolved within 15 minutes.
I get a notification from my manager that he’s asked me several times to resolve this problem of secrets not being updated. I need it fixed by EOD Monday. With the slightly cryptic “We’ll discuss in our 1:1.”
Now I’ve been up all night stressed bc ugh, I messed up. I know it was my fault, and it was an issue and I am the single point of failure here but I can’t wrap my head around how to fix this/what I’m going to tell my manager on Monday.
Mind you, I have tried to take care of this with our existing support system (which is implemented terribly for internal use): there’s a recurring ticket that comes up once a month for audits. But again, I just can’t keep up with the tickets, onboardings, and device management, all while trying to implement full-on projects like a VPN, asset tracking solutions, and third-party patching, and, well, cleaning up this god-awful support system. Meanwhile I get 10-15 messages every morning in Slack that are not put in as tickets. And I’m wary of even having the users use the ticketing platform because I know that it’s shitty and I can’t keep up with them.
I just feel overwhelmed and don’t know how to show it because I’m stuck using the crappy system. And it’s probably not even the platform but just the implementation. Anytime I try to change something I get a notification from our service team saying I broke something because they are using it too. I know, I know, I need to test first before pushing out, but I don’t have the time to fix the system in the first place. I’ve always had at least enough time to get my stuff documented; I just don’t feel like I can here due to my tooling.
Anyways, I know I need to fix the system, but I also need to fix my process. I have a feeling it’s definitely a culture fix and no tool will help with this but I can’t help but feel horrible when I make these mistakes.
I know I’m doing good work and am probably just tired, because I was recently brought up by the leadership team for helping with multiple projects and moving things along. But omg, why do I feel so helpless with the menial tasks that should be easy but take so much dang time?
Thanks for letting me get this out, it’s been a long fricken week.
https://redd.it/1omd495
@r_systemadmin
Endpoint Protection for Small Business with old machines
Hello,
We have 13 machines: some on Windows 7, one on Windows 8, a few on Windows 10, and a few on Windows 11, plus a Server 2016 box for AD.
Our IT company no longer does IT stuff, so they won’t sell me a new Symantec license. I’m winging it at the moment. Unintentional sysadmin. Getting approval to spend money on anything tech is difficult.
We currently have Symantec Endpoint Security Enterprise, but it expires in a week. It’s been busy, and I haven’t been able to shop around. I got a quote for CrowdStrike, which I was able to get approved, but now the company I got the quote from is ghosting me, so I can’t actually buy it. Their quote was cheaper than the price on CrowdStrike’s own site, and I’m confused about the Falcon Sensor for Legacy Systems thing for our one Windows 8 machine. I need something that just works for older machines (if that exists).
What endpoint protection would you guys suggest for our out-of-date setup? I was authorized to spend about $700, so I need to come in under that.
https://redd.it/1ome6ie
@r_systemadmin
Storage Maintenance - Best Practices
Dear Friends,
I have a storage maintenance activity coming up. We need to power the storage off, dismount it, and then power it back on.
I need to know the proper steps for this activity, as we have SAN switches and servers (all Hyper-V).
My plan/steps are as follows:
First - Host Side:
1. Shut down all VMs in Hyper-V.
2. Shut down cluster in Hyper-V.
3. Take the storage disks offline in Hyper-V.
4. Shut down physical servers.
Second - SAN Switches:
Shut down the SAN switches one by one.
Kindly share your thoughts.
https://redd.it/1omfg5b
@r_systemadmin
Is there a modern equivalent to the old relaxing Windows defrag?
Saw a post about the Windows defrag emulator and it got me thinking about how much I used to enjoy watching the damn thing while it actually did something worthwhile. Is there a modern equivalent where you’re actually getting work done but also enjoying just watching it?
https://redd.it/1omfzjf
@r_systemadmin
No-IP down
The website www.noip.com is returning a 502 error, and domains are not being resolved.
From what I've seen, it seems to have been down for 3-4 hours.
https://redd.it/1omhafz
@r_systemadmin
Dealing with Boss
For over 20 years, I’ve managed a company through all changes, all systems, upgrades, migrations, improvements that need to be made in the IT category. You could say I’m the system administrator, the network administrator, and the support desk.
Every time I discuss with my boss the need for a “fill in the blank” (it could be new fiber, new hardware, a new IP phone system), his response is always “we should do the research first”. Then he acts like I don’t know what I’m talking about at all. The other day I almost had to explain to him why having the Internet was necessary.
Now mind you before any change or upgrade, I’ve already talked to two or three vendors for each system. I’ve already done my research reviewing products and protocols and I still get no respect. I have discussed with others in the business as well. On top of that, all of our systems are running great.
Boss is a misogynist who constantly gaslights me and sometimes makes “jokes” and thinks he’s funny. Oh yeah, I’m a woman in a male-dominated role. My response to him is, “well I am the expert in this area and this is what needs to be done”. Have any of you experienced this type of non-support? What advice do you have for dealing with this type of narcissist?
https://redd.it/1omjtu7
@r_systemadmin
Storage expandability and noise concerns
Howdy!
My client has data in 3 locations:
1. on-prem NAS with 150 TB of storage (inherited setup that has been rock solid).
2. offsite backup (Veeam), expandable over a PB, currently 250 TB used.
3. offsite backup (automated copy job to a remote server across the globe). Currently around 250 TB, also easily expandable.
They are projected to grow 50% storage-wise in the next 6-8 months. While the backup locations (2 and 3) are very expandable, the on-prem storage is becoming a problem.
The NAS is full of hard drives with no room to add more (they have about 20% of free space left). While I could replace the drives with bigger models and get them to roughly 400-500 TB depending on the RAID config I go with, management has requested that I provide a more long-term solution.
Easy-peasy you say, just get a nice Dell or something similar and call it a day...
The client is adamant that the on-prem box must be whisper quiet just like the current one, not to "disturb the office workers". It's in the IT closet, far from them, so I don't see how that would be the case.
Another request was that the storage be easily expandable and scalable for the next three years minimum, even if their growth continues at this rate. That would put them over 1 PB, which means I would have to plan for 2-3 PB minimum. Although that is unlikely, I have to honor the request, or at the very least find something with at least 1 PB for now.
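As a rough sanity check on the "over 1 PB in three years" figure, the growth can be compounded directly. This is only back-of-the-envelope arithmetic under stated assumptions: it starts from the full 150 TB NAS capacity (actual usage is a bit lower) and treats "50% in 6-8 months" as a rate that repeats every 6 or every 8 months. The function name is made up for this sketch.

```python
def projected_tb(current_tb, growth_per_period, months_per_period, horizon_months):
    """Compound `current_tb` by `growth_per_period` every `months_per_period` months."""
    periods = horizon_months / months_per_period
    return current_tb * (1 + growth_per_period) ** periods


if __name__ == "__main__":
    # Assumed starting point: 150 TB; horizon: 36 months.
    for months_per_period in (6, 8):
        size = projected_tb(150, 0.50, months_per_period, 36)
        print(f"50% every {months_per_period} months -> {size:,.0f} TB after 3 years")
```

Under those assumptions the three-year figure lands somewhere between roughly 0.9 PB (50% every 8 months) and 1.7 PB (50% every 6 months), so "at least 1 PB" is in the right range while 2-3 PB leaves generous headroom.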
So far, my best idea is to simply build 2-3 almost identical systems to the NAS one and just create shares/configure permissions and organize data in several logical units that would make sense for the client.
For example:
Drive F: - Projects 2016-2018. NAS1
Drive G: - Projects 2019-2022. NAS2
Drive H: - Projects 2023-2025. NAS3
This is not something I would normally do, so I'm looking for some advice. My usual approach would be an HA multi-node Dell (or similar) system to ensure high availability and redundancy.
https://redd.it/1omjn46
@r_systemadmin
Unusual behavior with TCP port 53 (TCP DNS)
Hi! I’m trying to track down an unusual behavior in my environment that I think might be a misconfiguration or poorly documented behavior. For starters, I am not a Windows system admin. I’m more on the network and firewall side of the house. We have rolled out a network performance monitoring product after it tested well with multiple teams in my department. The product basically watches traffic that comes off of in-line taps and port mirrors and alerts us to potential performance problems in our environment.
Our dashboard is lit up bright red with an alert “many failed connections to dns servers.”
Well, we don’t have any tickets or user complaints related to DNS resolution, but we paid good money for the monitoring product, so I was highly interested in tracking down what the tool is reporting on and resolving the issue if possible. What I found is weird!
Basically PC workstations all over our network are opening a connection on TCP port 53 to our primary internal dns servers, and not completing the 3-way handshake.
I see a TCP SYN from the PC to the DNS server.
The DNS server replies SYN+ACK to the PC.
The PC never replies with an ACK back to the DNS server.
The DNS server sends SYN+ACK 2-3 times, never gets a reply, and eventually sends a RST to the PC as it gives up.
I did a direct packet capture on a remote PC and found the SYN+ACK is getting all the way to the PC, the PC is just ignoring it and not replying.
Actual dns queries to the same servers on UDP 53 are always promptly answered and working fine.
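To make the pattern concrete, here is a small sketch that labels one connection attempt from the order of flags seen on the wire. It is a simplified model rather than a packet parser: the `(sender, flags)` tuples, the function name, and the label strings are all made up for illustration, and real captures would need sequence numbers and timing as well.

```python
def classify_handshake(flags_seen):
    """Classify one TCP connection attempt from ordered (sender, flags) pairs.

    `sender` is "client" or "server"; `flags` is a string such as "SYN",
    "SYN+ACK", "ACK" or "RST".
    """
    syn = synack = False
    for sender, flags in flags_seen:
        if sender == "client" and flags == "SYN":
            syn = True
        elif sender == "server" and flags == "SYN+ACK" and syn:
            synack = True
        elif sender == "client" and flags == "ACK" and synack:
            return "established"
        elif flags == "RST" and syn:
            return "reset after SYN+ACK" if synack else "reset before SYN+ACK"
    if synack:
        return "half-open"
    return "unanswered SYN" if syn else "no SYN seen"


# The sequence described above: SYN, repeated SYN+ACK, then RST from the server.
OBSERVED = [
    ("client", "SYN"),
    ("server", "SYN+ACK"),
    ("server", "SYN+ACK"),
    ("server", "SYN+ACK"),
    ("server", "RST"),
]

if __name__ == "__main__":
    print(classify_handshake(OBSERVED))
```

In this model the observed traffic is a handshake the client started and then abandoned, which matches what the monitoring tool counts as a failed connection.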
So I have no idea what’s going on. Is this some kind of keep alive probe? The PCs are just checking to see if the dns servers are still out there?
The “failed” connections are happening very often like every 30 seconds, from hundreds of endpoints. It’s making our dashboard look bright red.
I’ve opened tickets with our Windows system guys and provided screenshots, pcaps, and detailed explanations of what’s going on. They just keep replying that nothing seems to be wrong. I’m kind of at a loss. This is so far outside of my wheelhouse.
What is going on?
https://redd.it/1omolhg
@r_systemadmin
Help me source some software
I might be better off posting this in onthetipofmytongue; however, it's old software and I know there are some older sysadmins in here from the DOS (and before) days. It's older software, for sure.
Donkey's years ago I used to have a music player, I'm sure it was back in the DOS days, and when you played a CD it created fractals or sound waves in various forms. It was epically hypnotic to watch.
Any idea what it was?
Edit: It looks like ProjectM does what I'm looking for, which is grand. Also, it was before Winamp.
Found it, I wasn't searching for the right terms: Cthugha!
https://en.wikipedia.org/wiki/Cthugha_(software)
https://redd.it/1omo531
@r_systemadmin
I need a good iPXE netboot solution to be installed in ARM64 Linux
Hello, I need a simple iPXE server with DHCP and ISO boot capabilities that works without an internet connection, so I can boot ISO files on both BIOS and UEFI devices using a local DHCP server (I have an Ethernet interface to bind DHCP to, so I will boot from there). I tried some general recommendations, but none of them worked the way I wanted. I will list the ones I've tried so far. Any software recommendations or ways to fix the things I've tried are welcome.
What I've tried:
* FOG Project - Can't boot ISO files on UEFI devices.
* Netboot.xyz - Their Docker container can't even download the menus.tar.gz file, and their self-host guide with Ansible can't even finish without throwing errors.
* iVentoy - Doesn't have an ARM version.
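For context on the BIOS plus UEFI requirement: whichever server ends up being used, the DHCP side has to hand out a different boot file depending on the client's firmware, usually keyed on DHCP option 93 (client system architecture), and it has to recognise a client that is already running iPXE (user class `iPXE`) so it does not chainload forever. A rough sketch of that decision follows; the file names `undionly.kpxe`, `ipxe.efi`, `ipxe-arm64.efi` and `boot.ipxe` are assumptions about how the files would be named on the TFTP/HTTP root, not a tested setup.

```python
from typing import Optional

# DHCP option 93 values mapped to an iPXE binary (file names are assumed).
# 7 and 9 are both treated as x86-64 UEFI because firmware in the field
# has been seen to send either value.
BOOT_FILE_BY_ARCH = {
    0: "undionly.kpxe",    # legacy BIOS
    7: "ipxe.efi",         # x86-64 UEFI
    9: "ipxe.efi",         # x86-64 UEFI
    11: "ipxe-arm64.efi",  # ARM64 UEFI
}


def choose_boot_file(client_arch: int, is_ipxe: bool) -> Optional[str]:
    """Pick the boot file for one DHCP request.

    client_arch is the value of DHCP option 93; is_ipxe is True when the
    request carries the user class "iPXE", meaning iPXE is already loaded.
    """
    if is_ipxe:
        # Second stage: hand over the menu script instead of the binary.
        return "boot.ipxe"
    return BOOT_FILE_BY_ARCH.get(client_arch)


if __name__ == "__main__":
    for arch, ipxe in [(0, False), (7, False), (7, True), (6, False)]:
        print(arch, ipxe, choose_boot_file(arch, ipxe))
```

Returning `None` for an unknown architecture leaves it to the caller to decide whether to ignore the request or fall back to a default.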
https://redd.it/1omnzdy
@r_systemadmin
Proxmox
Okay, so, a bit of a brain fart. My boss's boss was doing a bit of a ride-along thing, just asking questions and getting to know IT (I know, odd, but good; the leadership has always had these rules about spending time with staff). I was showing him Proxmox and how we can set up VMs and bla bla bla... I didn't mean to oversell it or anything, but it's great.
Anyway, he asked why we don't set up every computer with Proxmox first and then add a Windows VM. It would be the ultimate way to recover a computer quickly, with longer-term backups on another server (whatever your backup plan is). I did address the loss of power, as some CPU and other resources would be needed just for Proxmox. He then asked about building a super computer with Proxmox and having everyone access VMs. I congratulated him for inventing thin clients, but I also thought it would permit a lot of flexibility for staff and maybe it wouldn't be a bad idea.
All I did was pause for a few moments to consider my answer, and now he wants me to write up some pros and cons: when it might be appropriate to use thin clients, whether there would ever be a time when it would make sense to have a single PC with Proxmox running just one VM for the end user, and (this came up right at the end of the convo) eliminating Windows users in favor of VMs (which I basically said no to right away). But now I'm thinking about redoing my homelab computer with Proxmox first:
1. Proxmox as main OS with NinjaOne installed with image level backup enabled.
2. Windows 11 Pro for me
3. Linux for fileserver
4. Grandstream UCM Multi Tenant Software PBX (Just something I'm playing with these days).
What would you tell my boss, pro or con, about single computer / super computer with thin client?
Yes, this is probably an easy thing to answer, but my mind is distracted with planning the PC that will be powerful enough to design the PC that will eventually be my home lab PC (very loose nod to Douglas Adams).
https://redd.it/1omtfes
@r_systemadmin
WMIC and 25H2
Anyone know the real story about WMIC in Windows 11 25H2? Microsoft said that WMIC would be removed as part of the upgrade, but that doesn't seem to be true - we've checked several machines upgraded to 25H2 and they all still have WMIC.
A newly installed Windows 11 25H2 doesn't have WMIC but it can be installed from Optional Features, exactly the same as 24H2. (And just like 24H2, WMIC is present during the install process - it is only removed when the first user logs in.)
As far as I can see, 25H2 doesn't change anything about WMIC at all! What am I missing?
https://redd.it/1omv5ci
@r_systemadmin
Need help with getting HPE SAS drives usable in non-HP enclosures
So yeah, I bought some of these - HPE 3PAR SMBP6000S5xeF7.2 (HP version of Seagate ST6000NM0285).
They are unsupported in my non-HP arrays. They refuse to accept a PSID revert (sedutil-cli) and they refuse to accept the Seagate OEM equivalent firmware (hdparm and SeaTools both fail). They show up as SCSI devices (e.g. /dev/sg3) but not as block devices. Pretty much at the end of my rope with these things.
Any suggestions about how this might be made to work? Available to run commands and report results for troubleshooting at your convenience. Really would like to be able to use these / not have to junk them.
https://redd.it/1omv3in
@r_systemadmin