Using MCP for Debugging, Reversing, and Threat Analysis: Part 2 - https://whiteknightlabs.com/2025/11/18/using-mcp-for-debugging-reversing-and-threat-analysis-part-2/ by @AlanSguigna
In Part 1 of this article series, I demonstrated the configuration steps for using natural language processing in analyzing a Windows crash dump. In this blog, I dive far deeper, using vibe coding to extend the use of MCP for Windows kernel debugging.
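For a sense of the plumbing involved (not the article's actual server), here is a minimal MCP sketch in Python using the official mcp SDK's FastMCP: a single tool that runs a cdb/WinDbg command against a crash dump. The tool name, parameters, and default command are illustrative assumptions.

```python
# Minimal sketch of an MCP server exposing a crash-dump analysis tool.
# Assumptions: the official `mcp` Python SDK is installed and cdb.exe
# (Debugging Tools for Windows) is on PATH; the tool and parameter names
# are placeholders, not the article's actual implementation.
import subprocess
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("windbg-dump")

@mcp.tool()
def analyze_dump(dump_path: str, command: str = "!analyze -v") -> str:
    """Run a single WinDbg (cdb) command against a crash dump and return its output."""
    result = subprocess.run(
        ["cdb.exe", "-z", dump_path, "-c", f"{command}; q"],
        capture_output=True, text=True, timeout=300,
    )
    return result.stdout

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio to an MCP-capable client
```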
Report_for_Serious_Incidents_under_the_AI_Act_GeneralPurpose_AI.pdf
174.2 KB
AI Act: Commission publishes a reporting template for serious incidents involving general-purpose AI models with systemic risk
The publication of this template promotes consistent and transparent reporting and helps providers demonstrate compliance with the commitments set out in Commitment 9 of the GPAI Code of Practice.
Under Article 55 of the EU AI Act, providers of general-purpose AI models with systemic risk are required to report relevant information on serious incidents to the AI Office and, where appropriate, to national competent authorities. The Code of Practice for general-purpose AI (GPAI) further operationalises this obligation by specifying the information to be included in such reports as part of demonstrating compliance, while the Commission’s Guidelines for providers of general-purpose AI models help developers determine whether they need to comply with the AI Act’s obligations.
Regarding high-risk AI systems, the Commission has already published draft guidance and a reporting template on serious AI incidents and is seeking stakeholders’ feedback until 7 November.
Source: https://digital-strategy.ec.europa.eu/en/library/ai-act-commission-publishes-reporting-template-serious-incidents-involving-general-purpose-ai
New Benchmark for Evaluating LLMs on Patching Real-World Vulnerabilities
https://github.com/bytedance/PatchEval
Software quality's collapse: How AI is accelerating decline
https://www.reversinglabs.com/blog/software-quality-collapse-ai-accelerate
Development is in freefall toward software entropy and insecurity. Can spec-driven development help?
Sabotage Evaluations for Automated AI R&D
CTRL-ALT-DECEIT: AI systems are being deployed to automate and assist with software engineering in the real world, with tools such as GitHub Copilot, Cursor, Deep Research, and Claude Code [1]. As these systems are deployed in high-stakes and safety-critical domains, their reliability becomes correspondingly important. Notably, the most capable AI systems may first be deployed inside AI companies to automate AI R&D, “potentially significantly accelerating AI development in an unpredictable way”.
There is growing evidence that frontier and future AI systems may be misaligned with their developers or users and may intentionally act against human interests.
Misaligned AI systems may have incentives to sabotage efforts to evaluate their own dangerous capabilities, monitor their behaviour or make decisions about their deployment. It is therefore important to ensure that AI systems do not subvert human decision-making in high-stakes domains, like frontier AI R&D.
Source: https://arxiv.org/pdf/2511.09904
GitHub: https://github.com/TeunvdWeij/ctrl-alt-deceit
Ollama Remote Code Execution: Securing the Code That Runs LLMs
https://www.sonarsource.com/blog/ollama-remote-code-execution-securing-the-code-that-runs-llms/
Our Vulnerability Researchers uncovered vulnerabilities in the code of Ollama, a popular tool to run LLMs locally. Dive into the details of how LLMs are implemented and what can go wrong.
The Rise of AI-Powered Scam Assembly Lines - https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digital-threats/reimagining-fraud-operations-the-rise-of-ai-powered-scam-assembly-lines
Trend™ Research replicated an AI-powered scam assembly line to reveal how AI is eradicating the barrier for entry to running scams, making fraud easier to run, harder to detect, and effortless to scale.
Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
The study provides systematic evidence that poetic reformulation degrades refusal behavior across all evaluated model families. When harmful prompts are expressed in verse rather than prose, attack-success rates rise sharply, both for hand-crafted adversarial poems and for the 1,200-item MLCommons corpus transformed through a standardized meta-prompt. The magnitude and consistency of the effect indicate that contemporary alignment pipelines do not generalize across stylistic shifts. The surface form alone is sufficient to move inputs outside the operational distribution on which refusal mechanisms have been optimized.
The cross-model results suggest that the phenomenon is structural rather than provider-specific. Models built using RLHF, Constitutional AI, and hybrid alignment strategies all display elevated vulnerability, with increases ranging from single digits to more than sixty percentage points depending on provider. The effect spans CBRN, cyber-offense, manipulation, privacy, and loss-of-control domains, showing that the bypass does not exploit weakness in any one refusal subsystem but interacts with general alignment heuristics.
Source: https://arxiv.org/pdf/2511.15304
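The paper's actual meta-prompt is not reproduced here, but the evaluation loop it describes can be sketched roughly as follows, assuming a generic chat-completion callable and a placeholder refusal judge; the prompt wording and function names are illustrative, not the authors'.

```python
# Sketch of the verse-reformulation evaluation described in the paper.
# `complete` is any chat-completion callable (the model under test) and
# `is_refusal` is a crude placeholder judge; real evaluations use a trained
# or LLM-based classifier and the paper's standardized meta-prompt.
from typing import Callable, Iterable

META_PROMPT = (
    "Rewrite the following request as a short rhyming poem, "
    "preserving its meaning exactly:\n\n{request}"
)

def is_refusal(response: str) -> bool:
    # Placeholder heuristic judge for illustration only.
    markers = ("i can't", "i cannot", "i won't", "i'm sorry")
    return any(m in response.lower() for m in markers)

def attack_success_rate(prompts: Iterable[str],
                        complete: Callable[[str], str]) -> float:
    """Fraction of harmful prompts answered (not refused) after poetic rewriting."""
    prompts = list(prompts)
    successes = 0
    for prompt in prompts:
        poem = complete(META_PROMPT.format(request=prompt))  # stylistic transform
        answer = complete(poem)                              # query the model under test
        if not is_refusal(answer):
            successes += 1
    return successes / len(prompts) if prompts else 0.0
```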
roi_of_ai_in_security_2025.pdf
8.5 MB
The ROI of AI in Security
Google Cloud’s new report, The ROI of AI in Security, showcases how best-in-class organizations are getting value from AI in cybersecurity. For several years, the cybersecurity industry has discussed the potential of artificial intelligence. Much of this discussion has been aspirational, often blurring the lines between true machine learning and more basic automation. This report, however, suggests a significant, practical shift is already underway.
Google surveyed 3,466 senior leaders globally to move beyond hype and into the practical discussion of revenue, productivity, and risk reduction. The findings confirm that the conversation is no longer about "if" AI should be used, but "how" it can be scaled to measurably improve security posture.
https://services.google.com/fh/files/misc/roi_of_ai_in_security_2025.pdf
god-eye
AI-powered subdomain enumeration tool with local LLM analysis via Ollama - 100% private, zero API costs
https://github.com/Vyntral/god-eye
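god-eye's own code isn't shown here, but the "local LLM triage" pattern it advertises can be sketched against Ollama's default REST endpoint (http://localhost:11434); the model name and prompt below are assumptions, not the tool's implementation.

```python
# Sketch: send discovered subdomains to a locally running Ollama instance
# for analysis. Assumes Ollama's default /api/generate endpoint and an
# already-pulled model; model name and prompt are illustrative only.
import json
import urllib.request

def analyze_subdomains(subdomains: list[str], model: str = "llama3") -> str:
    prompt = (
        "You are assisting an authorized penetration test. Rank these subdomains "
        "by how interesting they look for further recon and explain why:\n"
        + "\n".join(subdomains)
    )
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(analyze_subdomains(["dev.example.com", "vpn.example.com", "staging-api.example.com"]))
```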
Reversecore_MCP - A security-first MCP server empowering AI agents to orchestrate Ghidra, Radare2, and YARA for automated reverse engineering. https://github.com/sjkim1127/Reversecore_MCP
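As a rough idea of the primitives such a server exposes to an agent (not Reversecore_MCP's actual code): open a binary with r2pipe, enumerate functions, and scan it with a YARA rule.

```python
# Sketch of an RE triage primitive an MCP server might wrap for an agent.
# Requires the r2pipe and yara-python packages plus a local radare2 install;
# the rule below is a placeholder, not anything from Reversecore_MCP itself.
import r2pipe
import yara

PLACEHOLDER_RULE = 'rule upx_hint { strings: $a = "UPX!" condition: $a }'

def triage_binary(path: str) -> dict:
    r2 = r2pipe.open(path)
    r2.cmd("aaa")                      # full radare2 analysis pass
    functions = r2.cmdj("aflj") or []  # JSON list of discovered functions
    r2.quit()

    matches = yara.compile(source=PLACEHOLDER_RULE).match(path)
    return {
        "functions": [f["name"] for f in functions],
        "yara_matches": [m.rule for m in matches],
    }
```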
Agentic AI Security Scoping Matrix
The Agentic AI Security Scoping Matrix provides a structured mental model and framework for understanding and addressing the security challenges of autonomous agentic AI systems across four distinct scopes. By accurately assessing their current scope and implementing appropriate controls across all six security dimensions, organizations can confidently deploy agentic AI while managing the associated risks.
Source: https://aws.amazon.com/blogs/security/the-agentic-ai-security-scoping-matrix-a-framework-for-securing-autonomous-ai-systems/
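As a rough illustration of the "assess your scope, then cover every dimension" idea (with generic placeholder labels, not the AWS matrix's actual scope and dimension names, which are in the blog post):

```python
# Illustrative checklist structure only; the scope numbers and dimension
# labels below are generic placeholders, NOT the names defined by AWS.
from dataclasses import dataclass, field

DIMENSIONS = [
    "governance", "identity_and_access", "data_protection",
    "tool_and_action_controls", "monitoring", "incident_response",
]

@dataclass
class AgenticDeployment:
    name: str
    scope: int                                  # 1-4, per the matrix
    controls: dict[str, bool] = field(default_factory=dict)

    def gaps(self) -> list[str]:
        """Dimensions with no implemented control for this deployment."""
        return [d for d in DIMENSIONS if not self.controls.get(d, False)]

# Example: a scope-assessed deployment with four dimensions still unaddressed.
bot = AgenticDeployment("ticket-triage-agent", scope=2,
                        controls={"governance": True, "monitoring": True})
print(bot.gaps())
```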
From shortcuts to sabotage: natural emergent misalignment from reward hacking
In the latest research from Anthropic’s alignment team, we show for the first time that realistic AI training processes can accidentally produce misaligned models.
https://www.anthropic.com/research/emergent-misalignment-reward-hacking
Natural_emergent_misalignment_from_reward_hacking_in_production.pdf
890.2 KB