Python for Automating stuff on Azure and Kafka
Hi,
I need some suggestions from the community here. I have been working with bash for scripting in CI and CD pipeline jobs, with minimal exposure to Python in the automation pipelines.
I am looking to start focusing on developing my Python skills and get some hands-on experience with the Azure Python SDK and Kafka libraries, so I can start using Python at my workplace.
I need some suggestions on online learning platforms and books to get started. I am looking to invest about 10-12 hours each week in learning.
https://redd.it/1oyctub
@r_devops
Manage Vault in GitOps way
Hi all,
In my home cluster I'm introducing Vault and Vault operator to handle secrets within the cluster.
How do you guys manage Vault in an automated way? For example, I would like to create kv and policies in a declarative way, maybe managed with Argo CD.
Any suggestions?
https://redd.it/1oygbil
@r_devops
Productizing LangGraph Agents
Hey,
I'm trying to understand which option is better based on your experience. I want to deploy enterprise-ready agentic applications; my current agent framework is LangGraph.
To be production-ready, I need horizontal scaling and durable state so that if a failure occurs, the system can resume from the last successful step.
I’ve been reading a lot about Temporal and the Langsmith Agent Server, both seem to offer similar capabilities and promise durable execution for agents, tools, and MCPs.
I'm not sure which one is more recommended.
I did notice one major difference: in Langgraph I need to explicitly define retry policies in my code, while Temporal handles retries more transparently.
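To illustrate what "explicitly defining retry policies in code" can look like, here is a minimal, framework-agnostic sketch in plain Python. It is not LangGraph's or Temporal's API: `RetryPolicy`, `run_with_retry`, and the injectable `sleep` parameter are hypothetical names used only for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical retry settings, spelled out explicitly by the developer."""
    max_attempts: int = 3
    initial_delay: float = 0.5   # seconds to wait before the first retry
    backoff_factor: float = 2.0  # delay multiplier after each failed attempt
    retry_on: Tuple[Type[BaseException], ...] = (Exception,)


def run_with_retry(step: Callable[[], T], policy: RetryPolicy,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    """Run one step, retrying on the configured exceptions with backoff."""
    if policy.max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = policy.initial_delay
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return step()
        except policy.retry_on:
            if attempt == policy.max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay *= policy.backoff_factor
    raise AssertionError("unreachable: the loop either returns or raises")
```

A durable-execution engine would additionally persist the result of each completed step so a restarted process can skip it; this sketch only covers the in-process retry part.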
I’d love to get your feedback on this.
https://redd.it/1oyh93l
@r_devops
Trouble sharing a Windows Server 2022 AMI between AWS accounts (no RDP password, no SSM connection)
Hello everyone,
I've been trying for the last two days to share a custom Windows Server 2022 AMI from Account A to Account B, but without success.
The source AMI is based on the official Windows_Server-2022-English-Full-Base image, and I installed a few internal programs and agents on it.
After creating and sharing the AMI, I can successfully launch instances from it in the target account (Account B), but:
I cannot retrieve the Windows password via “Get Windows password” (it says “This instance was launched from a custom AMI...”);
The SSM Agent doesn’t start or connect to Systems Manager;
The instance shows 3/3 health checks OK, but remains inaccessible over RDP or SSM.
---
🔹 What I have tried so far
1. Standard AMI creation:
Created the image via EC2 console → Create image.
Shared both the AMI and its snapshot with the target AWS account (including Allow EBS volume creation).
2. First attempt (no sysprep):
The image worked but AWS couldn’t decrypt the Windows password.
Expected behavior, since Windows wasn’t generalized.
3. Second attempt (sysprep with /oobe /generalize /shutdown):
Ran from SSM:
Start-Process "C:\Windows\System32\Sysprep\sysprep.exe" -ArgumentList "/oobe /generalize /shutdown" -Wait
Result: instance stopped correctly, but when launching from this AMI the system got stuck on the “Hi there” screen (OOBE GUI), so no EC2Launch automation, no RDP, no SSM.
4. Third attempt (sysprep with /generalize /shutdown only):
Based on the AWS official documentation, /oobe should not be used — EC2LaunchV2 handles first boot automatically.
However, the AMI was based on an older image that had EC2Launch v1, not EC2LaunchV2, so I verified this via:
Get-Service | Where-Object { $_.Name -like "EC2Launch*" }
and confirmed it was the legacy EC2Launch service.
Started the service:
Set-Service EC2Launch -StartupType Automatic
Start-Service EC2Launch
Re-ran:
Start-Process "C:\Windows\System32\Sysprep\sysprep.exe" -ArgumentList "/generalize /shutdown" -Wait
The process completed and the instance shut down, but in the new account I still couldn’t decrypt the Windows password (AWS said custom AMI).
5. Tried reinstalling EC2LaunchV2 manually:
Using:
Invoke-WebRequest "https://ec2-launch-v2.s3.amazonaws.com/latest/EC2LaunchV2.msi" -OutFile "$env:TEMP\EC2LaunchV2.msi"
Start-Process msiexec.exe -ArgumentList "/i $env:TEMP\EC2LaunchV2.msi /quiet" -Wait
However, the service didn’t register, likely because the image is built on a base that doesn’t support EC2LaunchV2 natively (Windows Server 2022 + legacy AMI lineage).
https://redd.it/1oyh932
@r_devops
Is there a standard list of all potential metrics that one can / should extract from technologies like HTTP / gRPC / GraphQL server & clients? Or for Request Response systems in general?
We all deal with developing / maintaining servers and clients. With observability playing its part, I am trying to figure out whether there are standardized metrics that one can use by default for such servers.
If so, is there actually a project / foundation / tool that is working on it?
E.g. for a server there can be Prometheus metrics for requests and responses;
for a client there could be something similar. I mean, developers can choose the metrics they deem useful, but having a list of the potentially available metrics would be a much better strategy IMHO.
I don't know if OpenTelemetry solves this issue. From what I understand, it provides tools to obtain metrics, traces, and logs, but doesn't define a definitive set of what most of these standard models can provide.
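As a rough illustration of the kind of default metric set being asked about, the sketch below records three signals commonly tracked for request/response systems (request count, error count, and duration) in plain Python. The class and label names (`RequestMetrics`, method, route, status) are hypothetical and are not taken from any standard or library.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Labels = Tuple[str, str, int]  # (method, route, status code)


class RequestMetrics:
    """In-memory store of request durations, keyed by (method, route, status)."""

    def __init__(self) -> None:
        self._durations: DefaultDict[Labels, List[float]] = defaultdict(list)

    def observe(self, method: str, route: str, status: int, seconds: float) -> None:
        """Record one finished request."""
        self._durations[(method, route, status)].append(seconds)

    def _samples(self, route: str, errors_only: bool = False) -> List[float]:
        # Flatten all duration samples for a route; optionally keep only 5xx.
        return [
            seconds
            for (_, r, status), values in self._durations.items()
            if r == route and (status >= 500 or not errors_only)
            for seconds in values
        ]

    def request_count(self, route: str) -> int:
        return len(self._samples(route))

    def error_count(self, route: str) -> int:
        return len(self._samples(route, errors_only=True))

    def mean_duration(self, route: str) -> float:
        samples = self._samples(route)
        return sum(samples) / len(samples) if samples else 0.0
```

The same three signals apply to a client by recording outgoing calls instead of incoming ones.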
https://redd.it/1oylwuc
@r_devops
How do you handle infrastructure audits across multiple monitoring tools?
Our team just went through an annual audit of our internal tools.
Some of the audits we do are the following:
1. Alerts - We have alerts spanning CloudWatch, Splunk, Chronosphere, Grafana, and custom cron jobs. We audit for things like whether we still need the alert, whether it is still accurate, etc.
2. ASGs - We went through all the AWS ASGs that we own and checked that they have appropriate resources (not too much or too little), whether our team still owns them, etc.
That’s just a small portion of our audit.
Often these audits require the auditor to go to different systems and pull some data to get an idea of the current status of the infrastructure/tool in question.
All of this data is put into a spreadsheet and different audits are assigned to different team members.
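The manual step described above (collecting items from several systems into one spreadsheet and assigning them to team members) can be sketched as follows. This is a minimal sketch under stated assumptions, not any team's actual tooling: the `AuditItem` fields, the helper names, and the sample records in the usage note are made up for illustration, and real collectors for each monitoring system would have to supply the items.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields, replace
from typing import List, Sequence


@dataclass(frozen=True)
class AuditItem:
    """One thing to audit; the fields are hypothetical."""
    source: str   # e.g. the monitoring tool the item came from
    kind: str     # e.g. "alert" or "asg"
    name: str
    owner: str
    assignee: str = ""


def assign_round_robin(items: Sequence[AuditItem],
                       auditors: Sequence[str]) -> List[AuditItem]:
    """Spread audit items evenly across team members."""
    if not auditors:
        raise ValueError("at least one auditor is required")
    return [
        replace(item, assignee=auditors[i % len(auditors)])
        for i, item in enumerate(items)
    ]


def to_csv(items: Sequence[AuditItem]) -> str:
    """Render the items as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AuditItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()
```

For example, `to_csv(assign_round_robin(items, ["alice", "bob"]))` produces one row per item with alternating assignees.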
Curious on a few things:
- Are you auditing your infra/tools regularly?
- Do you have tooling for this? Something beyond simple spreadsheets.
- How long does it take you to audit?
Looking to hear what works well for others!
https://redd.it/1oyomjm
@r_devops
FREE Security audit for your code in exchange for 10 min feedback
Hey everyone,
I'm building a security analyzer called CodeSlick.dev that detects OWASP Top 10 vulnerabilities in JavaScript, Python, Java, and TypeScript.
To improve it, I'm offering free security audits in exchange for honest feedback.
What you get:
- Instant security analysis (<3 seconds)
- AI-powered fix suggestions with one-click apply
- CVSS severity scoring
- Downloadable HTML report
What I need:
- 10-minute feedback survey after you see results
- Your honest thoughts on what worked/what didn't
Zero friction:
- No signup required
- No installation
- Just paste code → Get report → Share feedback
Interested? Please feel free to comment below or DM me.
https://redd.it/1oyq3gi
@r_devops
Offline Scalable CICD Platform Recommendations
Hello all,
I was wondering if anyone could recommend any scalable platforms for running CICD in an offline environment. At present we have a bunch of VMs with GitLab runners on them, but due to mixed use of the VMs (like users logging in to do other stuff) it’s quite hard to manage security and keep config consistent.
Unfortunately a lot of the VMs need to be Windows based because that’s the target environment. Most small jobs are Python, the larger jobs are Java, C++ etc. The Java stuff is super simple, but the other languages tend to be trickier. This network has about 40 proper devs and 60 python bandits.
We’re looking for a solution that can be purchased to run on an air gapped network that can do load balancing, re-base-lining etc without much manual maintenance.
I’d suggested doing it with Kubernetes ourselves but we are time restricted and have some budget to buy something. One of my colleagues saw a VMware Tanzu demo that looked good, but input from anyone with hands-on experience would be more useful than a conference sales pitch.
Any suggestions would be appreciated, and I can provide more info if needed. We have about £200k budget for both the compute and the management platform.
Just in case anyone tries to sell me something directly, I won’t be the one making the decision or purchase.
Thanks in advance
https://redd.it/1oytnsx
@r_devops
what’s the one type of alert that ruins your sleep the most?
just trying to understand how bad on-call life really is outside my bubble.
Last night a friend got woken up at 3AM… for an alert that turned out to be nothing.
Curious:
• What alert always turns out to be noise?
• What’s the dumbest 3AM wake-up you’ve had?
• If you could delete one alert type forever, which one would it be?
https://redd.it/1oyv1lx
@r_devops
System Design interview for DevOps roles
For the past year, system design interviews have become part of the interview process for DevOps roles. At least, that is what I have been seeing for a year.
In each interview, I was asked to design different systems (API design and database design) to meet different requirements. These interviews always seem to focus on the software itself, rather than infrastructure, operating systems, or cloud. Personally I feel they’re judging a fish by whether it can fly.
Have you seen the same? What’s your opinion?
https://redd.it/1oywe81
@r_devops
want to build a microservice containing a mixture of open source IAM and RBAC
I'm trying to build a microservice to handle my auth and RBAC for a project I'm starting, though I don't want to waste my time on it and would rather use some open source solutions to handle the requirements:
Authentication:
- JWT + OAuth2 Password Flow
- Access tokens + Refresh tokens
- Token revocation, password reset, user invitations
- bcrypt password hashing....
Multitenancy:
- Database-per-tenant architecture
- Shared schema (super_admins, entities) + Tenant schemas
- Complete data isolation between entities
RBAC:
- 3 fixed roles: Super Admin, Admin, User
- Profile-based permissions for Users
- Granular permissions: resource.action format (e.g., example.create, billing.*)
- Admin creates custom profiles with specific permissions
- Entity-level feature toggles
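The resource.action permission format with wildcards listed above can be sketched in a few lines of Python. This is a minimal sketch, assuming a profile is just a set of permission patterns; the `is_allowed` helper is hypothetical, and it relies on the standard library's `fnmatchcase`, which also treats `?` and `[...]` as wildcards, so patterns should be restricted to `*` if that is not wanted.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_allowed(granted: Iterable[str], required: str) -> bool:
    """Return True if any granted pattern covers the required permission.

    Patterns use the resource.action format; `billing.*` covers every
    action on the billing resource.
    """
    return any(fnmatchcase(required, pattern) for pattern in granted)


# Hypothetical custom profile created by an Admin.
support_profile = {"example.create", "billing.*"}

print(is_allowed(support_profile, "billing.refund"))   # covered by billing.*
print(is_allowed(support_profile, "example.delete"))   # not granted
```

Whichever IAM stack ends up in place, a check like this stays in the application or in the authorization service, keyed by the profile attached to the user.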
Initially I set up Hanko (a great solution), but it doesn't align with my system requirements and would need a lot of customization. Then I thought about using Keycloak or Ory Kratos ... with OpenFGA for RBAC,
but I wonder what the best combination for such requirements could be, or am I on a completely wrong track?
https://redd.it/1oyw32n
@r_devops
CI/CD milestone reached for arkA (open video protocol)
We now have:
• Schema validation
• Automated builds
• Static deployments
• Zero-backend hosting model
Would love CI/CD feedback or contributors!
Repo: https://github.com/baconpantsuppercut/arkA
https://redd.it/1oyvmjn
@r_devops
I need help, I'm trying to build my first web application
Hello everyone.
I'm trying to deploy my first real app and I'm honestly already losing perspective after 2 days stuck with this. I would like to ask you two things:
1. Recommendations on how I should deploy my stack correctly (Docker Compose).
2. Help to understand why Dokploy/Coolify/DigitalOcean are returning me such weird errors.
---
My stack (everything runs locally with docker compose up without problems):
Backend: Django + Django REST Framework
Tasks: Celery + Celery Beat
Messaging: Redis
Database: PostgreSQL (on DigitalOcean)
Frontend: React with Vite
Everything runs dockerized.
Locally it works perfectly, including Celery, Beat and Redis.
1) DigitalOcean App Platform
I tried it first.
My backend worked, connected fine to the external DO database, but App Platform doesn't support Celery, Celery Beat or Redis in separate services (at least not in a simple way without costing an arm and a leg).
For my project they are essential, so I discarded it.
---
2) Coolify
I tried…but I honestly felt like I was going in circles and not moving forward.
I couldn't get my complete compose up.
I got lost between pipelines, resources, static sites and failing builds.
I gave up.
---
3) Dokploy
Now I am here because in theory it is the clearest option and with the best feedback.
I like that it lets me see logs, connections, containers, etc.
But I have several problems that I don't even know where to attack:
---
❌ Problem 1: Backend goes up, but Django admin gives 404 or Bad Gateway
Dokploy builds my container without errors.
It connects perfectly to my DigitalOcean database.
Buuut... when I open /admin/ or any route I get:
404
or Bad Gateway
Random. I don't understand.
---
❌ Problem 2: I bought a domain, associated it with Dokploy... and now Chrome says that “the connection is not private”
The DNS is correctly configured according to Dokploy (it shows everything green).
But when entering the URL:
> “An attacker may be trying to steal information…”
And below it shows that my site uses "HSTS" (I don't even know what that is 💀).
I don't know if it is a failure of certificates, of the proxy, of misconfigured HTTPS or if something else must happen before it works. Maybe an Our Father.
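For context, HSTS (HTTP Strict Transport Security) is a response header, `Strict-Transport-Security`, that tells the browser to reach the site only over HTTPS for a given period. Once a browser has stored it for a host, it no longer lets the user click through certificate warnings for that host. Below is a minimal sketch of reading such a header value in Python; the sample value in the usage line and the `parse_hsts` helper are illustrative assumptions, not output from Dokploy.

```python
from typing import Dict, Union


def parse_hsts(value: str) -> Dict[str, Union[int, bool]]:
    """Parse a Strict-Transport-Security header value into its directives."""
    result: Dict[str, Union[int, bool]] = {
        "max_age": 0,
        "include_subdomains": False,
        "preload": False,
    }
    for part in value.split(";"):
        directive = part.strip().lower()
        if directive.startswith("max-age="):
            # max-age is the number of seconds the browser remembers the policy.
            result["max_age"] = int(directive.split("=", 1)[1].strip('"'))
        elif directive == "includesubdomains":
            result["include_subdomains"] = True
        elif directive == "preload":
            result["preload"] = True
    return result


# Assumed example value, one year with subdomains included.
print(parse_hsts("max-age=31536000; includeSubDomains"))
```

The practical consequence is that the certificate served for the domain has to be valid before the browser will load the site at all.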
---
What exactly am I looking for?
1. Realistic and direct advice:
What is the most practical and stable way to deploy a stack like this using Docker Compose?
Backend + React + Redis + Celery + Celery Beat.
2. If someone uses Dokploy:
How do you set up domains and certificates without Chrome saying a hacker wants to steal from me?
Why can a Django that compiles well throw 404 or Bad Gateway only in /admin/?
3. Alternative options:
Should I go back to DigitalOcean Droplets and do a classic deploy with manual docker-compose?
Or was Coolify the right route and I was the problem?
---
I close with this:
I've been stuck between logs for two days
If anyone can give me a clear direction, I would greatly appreciate it 🙏
https://redd.it/1oz1pe4
@r_devops
Cloud Infrastructure Engineer
Are there any cloud infrastructure engineers in here who can share their interview experience?
https://redd.it/1oz2y6e
@r_devops
How I'm using Infisical to secure my secrets in my pyATS/NetBox agent.
Hey everyone, just wanted to share a use case I'm really happy with. I'm building a multi-container AI agent for network automation (pyATS, NetBox, Streamlit) and was dreading how to manage all the device passwords, database strings, and API keys. Infisical was the perfect solution.
My docker_startup.sh script just fetches the Machine Identities, and then each container's `entrypoint.sh` uses infisical run to wrap the app (like a secure bubble). This injects all 35+ secrets as environment variables. The best part is my Python code is totally clean: it just uses os.getenv() and has no idea Infisical even exists. It's a fantastic way to keep credentials out of my Docker files. This is the link for the video I made: https://youtu.be/JBJOj8EE-JE
https://redd.it/1oz3gr6
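The pattern described in this post, where the application only reads environment variables and something outside it injects them, can be sketched as follows. This is a minimal sketch, not the code from the project: the variable names `NETBOX_TOKEN` and `DEVICE_PASSWORD` and the `load_secrets` helper are hypothetical.

```python
import os
from typing import Dict, Iterable


def load_secrets(names: Iterable[str]) -> Dict[str, str]:
    """Read required secrets from the environment and fail fast if any is unset.

    The code does not know or care who set the variables; a secrets
    manager wrapping the process is one possible source.
    """
    values = {name: os.getenv(name) for name in names}
    missing = sorted(name for name, value in values.items() if not value)
    if missing:
        raise RuntimeError("missing required secrets: " + ", ".join(missing))
    return {name: value for name, value in values.items() if value}


if __name__ == "__main__":
    # Hypothetical variable names; only the names are printed, never the values.
    secrets = load_secrets(["NETBOX_TOKEN", "DEVICE_PASSWORD"])
    print("loaded:", ", ".join(sorted(secrets)))
```

Failing at startup with the list of missing names makes a misconfigured container obvious without ever logging a secret value.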
@r_devops
Decoding DevOps
I'm a software specialist with a DevOps background and I'm thinking of taking this course: Decoding DevOps – From Basics to Advanced Projects with AI by Imran Teli, to strengthen my portfolio and CV and land a mid-to-senior DevOps position ASAP. Would it help, or are there better options?
https://redd.it/1oz17ab
@r_devops
Regex Denial of Service (ReDoS): The Pattern That Freezes Your Server 🌀
https://instatunnel.my/blog/regex-denial-of-service-redos-the-pattern-that-freezes-your-server
https://redd.it/1oz6b5s
@r_devops
https://instatunnel.my/blog/regex-denial-of-service-redos-the-pattern-that-freezes-your-server
https://redd.it/1oz6b5s
@r_devops
InstaTunnel
Regex Denial of Service (ReDoS): How Catastrophic Patterns
Learn how ReDoS attacks exploit inefficient regular expressions to cause CPU exhaustion and downtime. Discover how small inputs trigger catastrophic backtrackin
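For anyone who has not seen catastrophic backtracking in action, the sketch below shows the general idea in Python. The pattern `^(a+)+$` is a textbook example chosen for illustration and is not taken from the linked article. With Python's backtracking `re` engine, a run of `a` characters followed by one non-matching character forces it to try every way of splitting the run, so the time roughly doubles with each extra character.

```python
import re
import time

# Nested quantifier: many ambiguous ways to split a run of 'a' characters.
EVIL = re.compile(r"^(a+)+$")
# Accepts exactly the same strings, but with no nested repetition.
SAFE = re.compile(r"^a+$")


def time_match(pattern, text):
    """Return (matched, seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: each extra 'a' roughly doubles the work for EVIL.
    for n in (10, 14, 18, 20):
        text = "a" * n + "!"  # trailing '!' guarantees the match fails
        _, evil_s = time_match(EVIL, text)
        _, safe_s = time_match(SAFE, text)
        print(f"n={n:2d}  evil={evil_s:.4f}s  safe={safe_s:.6f}s")
```

Rewriting the pattern to remove the nested quantifier is only one mitigation. Capping input length before matching is another common option.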
Our production crashed for 48 hours because of a version mismatch
ClickHouse migration went wrong. Old region: v22.8. New region: v23.3. Nobody noticed.
Two days of debugging with premium support. Zero results.
Finally caught it ourselves after 48 hours.
Building a tool now to prevent these config nightmares. Lesson learned: always verify versions across environments.
https://redd.it/1oz7rcs
@r_devops
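The "verify versions across environments" lesson can be turned into a small pre-migration check. This is only a sketch under assumptions: the function names are hypothetical, and it expects the version strings to have been collected already (for ClickHouse, `SELECT version()` returns the server version). It compares major.minor only, which would have flagged v22.8 against v23.3.

```python
import re
from typing import Dict, List, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like 'v22.8' or '23.3.2.37'."""
    m = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if m is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(m.group(1)), int(m.group(2))


def find_drift(versions: Dict[str, str]) -> Dict[Tuple[int, int], List[str]]:
    """Group environments by major.minor; return {} when they all agree."""
    groups: Dict[Tuple[int, int], List[str]] = {}
    for env, version in versions.items():
        groups.setdefault(major_minor(version), []).append(env)
    return groups if len(groups) > 1 else {}


if __name__ == "__main__":
    # Environment names are placeholders; the versions are the ones from the post.
    drift = find_drift({"old-region": "v22.8", "new-region": "v23.3"})
    for (major, minor), envs in sorted(drift.items()):
        print(f"{major}.{minor}: {', '.join(envs)}")
    if drift:
        raise SystemExit("Version drift detected; aborting migration.")
```

Run as a CI or pre-cutover step, a non-zero exit would stop the migration before traffic moves.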
How to send Supabase Postgres logs to New Relic on Pro (cloud, not self-hosted)?
Hey everyone,
I’m trying to figure out a clean way to get Supabase Postgres logs into New Relic without changing my whole setup or upgrading plans.
My situation:
- I’m using Supabase Cloud, not self-hosted
- I’m currently on the Pro plan
- I don’t want to upgrade to Team just to get log drains
- I’ve already successfully integrated New Relic with my Supabase Edge Functions (Node/TypeScript), and that part is working fine
- What I’m missing is Postgres/DB logs (slow queries, errors, etc.) inside New Relic
From what I’ve seen, the “proper” / official way seems to be using log drains, which are only available on the higher tiers. Since I’m on Pro, I’m looking for any of the following:
- Has anyone found a workaround to get Postgres logs or query data from Supabase Cloud → New Relic while staying on Pro?
- Is there any way to forward logs via webhooks, or some pattern like:
  - Supabase → Function / Trigger → HTTP → New Relic ingest endpoint?
- Or maybe using database triggers / audit tables + a job that pushes data into New Relic in some structured way?
If anyone has:
- A working setup
- Even a partial solution (e.g. just errors or slow queries)
- Or can confirm that it’s basically impossible without Team / Enterprise
…I’d really appreciate the details.
Thanks in advance.
https://redd.it/1oza164
@r_devops
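One shape the "job that pushes data into New Relic" idea could take is sketched below. It is not a confirmed workaround for the Pro plan and it does not forward the raw Postgres log stream. It assumes a scheduled job has already fetched slow-query rows with any Postgres client (for example from `pg_stat_statements`, with the hypothetical row keys `query`, `calls`, `mean_exec_time`) and posts them to the New Relic Log API. The function names and the `service` attribute are illustrative.

```python
import json
import time
import urllib.request
from typing import Iterable, Mapping, Optional

# New Relic Log API endpoint (US region); EU accounts use a different host.
LOG_API_URL = "https://log-api.newrelic.com/log/v1"


def build_payload(rows: Iterable[Mapping], service: str = "supabase-postgres",
                  now_ms: Optional[int] = None) -> list:
    """Convert query-stat rows into the Log API's batch format."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    logs = [
        {
            "timestamp": ts,
            "message": row["query"],
            "attributes": {
                "calls": row["calls"],
                "mean_exec_time_ms": row["mean_exec_time"],
            },
        }
        for row in rows
    ]
    # "common" attributes are applied to every log entry in the batch.
    return [{"common": {"attributes": {"service": service}}, "logs": logs}]


def send(payload: list, license_key: str) -> int:
    """POST the batch to New Relic and return the HTTP status code."""
    req = urllib.request.Request(
        LOG_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Api-Key": license_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

This only covers aggregated query statistics, so it is closer to the "partial solution (just slow queries)" case than to real log drains. Errors would need a separate source, such as an audit table.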
How can I start learning AWS or Azure without a credit/debit card?
I'm trying to get into cloud computing, but I'm stuck at the very first step. I don't have a credit or debit card, and my college ID isn’t eligible for the Azure for Students offer. Because of that, I can’t sign up for the free tiers on AWS or Azure.
For anyone who’s been in a similar situation — how did you start learning? Are there any alternatives, free resources, sandbox environments, or training platforms I can use without needing a card? I really want to get hands-on practice instead of only watching videos.
Any suggestions would be really appreciated!
https://redd.it/1oz9wrh
@r_devops
Anyone else tired of juggling SonarQube, Snyk, and manual reviews just to keep code clean?
Our setup has become ridiculous. SonarQube runs nightly, Snyk yells about vulnerabilities once a week, and reviewers manually check for style and logic. It's all disconnected - different dashboards, overlapping issues, and zero visibility on whether we're actually improving. I've been wondering if there's a sane way to bring code quality, review automation, and security scanning into a single workflow. Ideally something that plugs into GitHub so we stop context-switching between five tabs every PR.
https://redd.it/1ozc6lj
@r_devops