{"id":14354,"date":"2026-05-10T12:16:58","date_gmt":"2026-05-10T12:16:58","guid":{"rendered":"https:\/\/www.securitytoday.de\/2026\/05\/10\/ai-phishing-llm-email-filter-detection-2026\/"},"modified":"2026-06-10T13:57:51","modified_gmt":"2026-06-10T13:57:51","slug":"ai-phishing-llm-email-filter-detection-2026","status":"publish","type":"post","link":"https:\/\/www.securitytoday.de\/en\/2026\/05\/10\/ai-phishing-llm-email-filter-detection-2026\/","title":{"rendered":"AI Phishing: Email Filters Left in the Dark"},"content":{"rendered":"<p style=\"color:#69d8ed;font-size:0.9em;margin:0 0 16px;padding:0;\">5 Min. Reading Time<\/p>\n<p style=\"line-height:1.8;margin-bottom:20px;\"><strong>The spear phishing email that passed through a mid-sized DACH company&#8217;s mail gateway last week was grammatically flawless, contextually precise, and contained not a single heuristic marker that Proofpoint, SpamAssassin, or Microsoft Defender would recognize as suspicious. It was written by an LLM, instructed by a threat actor with thirty minutes of research on the recipient&#8217;s role. 
The detection layer, built on pattern matching and URL reputation for twenty years, no longer catches such emails, and CISOs must rethink their architecture in 2026 rather than simply stepping up filter updates.<\/strong><\/p>\n<p style=\"margin:8px 0 18px;color:#888;font-size:0.85em;\"><em>May 10, 2026<\/em><\/p>\n<div style=\"background:#f0f9fa;border:1px solid rgba(105,216,237,0.4);border-radius:8px;padding:24px 28px;margin:32px 0;box-shadow:0 1px 0 rgba(105,216,237,0.15), 0 2px 8px rgba(0,0,0,0.03);\">\n<p style=\"margin:0 0 14px 0;font-size:11px;font-weight:600;color:#004a59;text-transform:uppercase;letter-spacing:0.12em;\">Key Takeaways<\/p>\n<ul style=\"margin:0;padding-left:20px;color:#1a1a1a;line-height:1.7;\">\n<li style=\"margin-bottom:10px;\"><strong style=\"color:#004a59;\">Heuristics Break Down:<\/strong> LLM-rewritten phishing emails contain no typos, no template fingerprints, and no recurring phrases. Three of the most common detection layers (Gmail, SpamAssassin, Proofpoint standard profiles) lose 60 to 80 percent of their hit rate against AI phishing in independent tests.<\/li>\n<li style=\"margin-bottom:10px;\"><strong style=\"color:#004a59;\">URL Reputation No Longer Suffices:<\/strong> Attackers use fresh domains with valid certificates that are not yet in threat feeds at the time of the click. Those relying on URL reputation as a second layer have also lost that layer.<\/li>\n<li style=\"margin-bottom:10px;\"><strong style=\"color:#004a59;\">Behavioral Analytics Is the New Mandatory Layer:<\/strong> Sender DNA, recipient behavior anomalies, and LLM-based classification of email content close the gap. 
Proofpoint, Mimecast, and Abnormal have built dedicated agents for this, with a response time of two seconds per email.<\/li>\n<\/ul>\n<\/div>\n<p style=\"font-size:0.88em;color:#666;margin:20px 0 32px 0;border-top:1px solid #e5e5e5;border-bottom:1px solid #e5e5e5;padding:10px 0;\"><span style=\"color:#004a59;font-weight:700;text-transform:uppercase;font-size:0.72em;letter-spacing:0.14em;margin-right:14px;\">Related:<\/span><a href=\"https:\/\/www.securitytoday.de\/en\/2026\/05\/05\/ai-agent-uncovers-linux-kernel-zero\/\" style=\"color:#333;text-decoration:underline;\">AI Agent Finds Linux Zero-Day in One Hour<\/a>&nbsp;&nbsp;<span style=\"color:#ccc;\">\/<\/span>&nbsp;&nbsp;<a href=\"https:\/\/www.securitytoday.de\/en\/2026\/04\/26\/itdr-joins-siem-and-edr-detection-architecture-2026\/\" style=\"color:#333;text-decoration:underline;\">ITDR alongside SIEM and EDR: Detection Architecture 2026<\/a><\/p>\n<h2 style=\"margin-top:48px;margin-bottom:16px;padding-top:0;\">Where the Classic Mail Filter Fails Today<\/h2>\n<p style=\"line-height:1.8;margin-bottom:20px;\"><strong>What is AI Phishing?<\/strong> AI phishing is a class of phishing attacks in which the content (emails, pretext, links, and attachments) is generated or rewritten by a large language model such as GPT-5, Claude 4.7, or a fine-tuned open-source model. The goal is to evade pattern-based detection, which has been trained for years on typos, template fingerprints, and suspicious phrases.<\/p>\n<p style=\"line-height:1.8;margin-bottom:20px;\">The first independent tests from threat reports by Mimecast, Proofpoint, and Group-IB show a clear pattern. A handwritten phishing email is stopped by standard profiles with a 70 to 85 percent probability, while an LLM-rewritten variant of the same email is stopped in only 15 to 35 percent of cases. 
This is not a tuning problem but an architecture problem: the filters simply no longer see any suspicious markers.<\/p>\n<p style=\"line-height:1.8;margin-bottom:20px;\">Additionally, there is an asymmetry on the attacker&#8217;s side. A threat actor can generate fifty variants of the same pretext with thirty minutes of research on the recipient&#8217;s role, each worded slightly differently. Anyone who tries to detect this at the pattern level rather than the behavioral level loses on the numbers alone.<\/p>\n<div style=\"background:#f0f9fa;border:1px solid rgba(105,216,237,0.35);border-radius:8px;padding:28px 24px;margin:32px 0;text-align:center;\">\n<div style=\"font-size:11px;font-weight:600;color:#666;text-transform:uppercase;letter-spacing:0.12em;margin-bottom:8px;\">Threat Indicator<\/div>\n<div style=\"font-size:46px;font-weight:700;color:#004a59;letter-spacing:-0.03em;line-height:1;\">80 %<\/div>\n<div style=\"font-size:14px;color:#444;margin-top:10px;max-width:480px;margin-left:auto;margin-right:auto;line-height:1.5;\">of social engineering emails in Q1 2026 were AI-supported, with the proportion currently doubling every quarter. 
The detection response time must drop from minutes to seconds.<\/div>\n<div style=\"font-size:12px;color:#888;margin-top:10px;\">Source: Proofpoint State of the Phish 2026 + Mimecast Threat Intelligence Q1<\/div>\n<\/div>\n<h2 style=\"margin-top:48px;margin-bottom:16px;padding-top:0;\">What the New Detection Layer Really Needs<\/h2>\n<p style=\"line-height:1.8;margin-bottom:20px;\">The detection architecture shifts to three parallel layers, each of which is necessary but not sufficient on its own.<\/p>\n<div style=\"display:grid;grid-template-columns:repeat(auto-fit,minmax(260px,1fr));gap:14px;margin:28px 0;\">\n<div style=\"background:#f0f9fa;border:1px solid rgba(105,216,237,0.3);border-radius:6px;padding:18px 20px;\">\n<p style=\"margin:0 0 8px 0;font-size:11px;font-weight:700;text-transform:uppercase;letter-spacing:0.1em;color:#004a59;\">Layer 1: Sender DNA<\/p>\n<p style=\"margin:0;color:#333;line-height:1.55;font-size:0.95em;\">SPF, DKIM, and DMARC remain mandatory. Fresh domains, reputation fluctuations, and sender behavior changes are the early indicators that must now trigger immediately.<\/p>\n<\/div>\n<div style=\"background:#f0f9fa;border:1px solid rgba(105,216,237,0.3);border-radius:6px;padding:18px 20px;\">\n<p style=\"margin:0 0 8px 0;font-size:11px;font-weight:700;text-transform:uppercase;letter-spacing:0.1em;color:#004a59;\">Layer 2: Behavioral Baseline<\/p>\n<p style=\"margin:0;color:#333;line-height:1.55;font-size:0.95em;\">What does the sender usually write to whom, in what tone, with which attachments? 
Deviations from the individual recipient baseline are the most important new detection lever.<\/p>\n<\/div>\n<div style=\"background:#f0f9fa;border:1px solid rgba(105,216,237,0.3);border-radius:6px;padding:18px 20px;\">\n<p style=\"margin:0 0 8px 0;font-size:11px;font-weight:700;text-transform:uppercase;letter-spacing:0.1em;color:#004a59;\">Layer 3: LLM Classification<\/p>\n<p style=\"margin:0;color:#333;line-height:1.55;font-size:0.95em;\">Specialized classification models (Proofpoint, Abnormal, Microsoft Defender for Office 365) read the email content itself and evaluate intent against the recipient&#8217;s behavioral baseline model.<\/p>\n<\/div>\n<\/div>\n<p style=\"line-height:1.8;margin-bottom:20px;\">The open question in most mid-market setups is not the choice of tool but integration. Anyone measuring sender DNA in the mail platform, behavioral analytics in the SIEM, and LLM classification in the EDR has three data silos that don&#8217;t talk to each other. This gap is precisely what the <a href=\"https:\/\/www.securitytoday.de\/en\/2026\/04\/26\/itdr-joins-siem-and-edr-detection-architecture-2026\/\" style=\"color:#69d8ed;text-decoration:underline;\">ITDR architecture shift<\/a> describes: identity detection as a central layer that looks across mail, endpoint, and cloud.<\/p>\n<h2 style=\"margin-top:48px;margin-bottom:16px;padding-top:0;\">Who Will Upgrade First in 2026, and Who Will Wait<\/h2>\n<p style=\"line-height:1.8;margin-bottom:20px;\">In the pilot setups of the past few months, three profiles have been notably quick to move. Insurers and financial service providers that already have to implement BaFin-driven MaRisk requirements and see the AI-phishing wave as a logical extension. Healthcare systems under external audit scrutiny following the data leaks of 2024 and 2025. 
And IT service providers with clients in the public sector, whose contracts demand concrete response times for detecting spear phishing.<\/p>\n<p style=\"line-height:1.8;margin-bottom:20px;\">Three more profiles are moving more slowly than they should. Traditional medium-sized industrial companies without BSI-relevant supply chains that consider the mail filter update a cosmetic issue. Family-owned and owner-managed businesses that leave the matter to their external IT service provider. And IT operations teams that pursue a Microsoft-only strategy in Outlook-centric setups, reducing their defense to a single layer that has isolated gaps.<\/p>\n<p style=\"line-height:1.8;margin-bottom:20px;\">Both movements will converge in 2026 on cyber insurance. Cyber insurers are now scrutinizing the detection stack in detail, and an unanswered questionnaire on mail-phishing defense will cost an additional 8 to 15 percent of the policy premium in 2026. Parallel pressure from the Linux kernel side, for example following the <a href=\"https:\/\/www.securitytoday.de\/en\/2026\/05\/05\/ai-agent-uncovers-linux-kernel-zero\/\" style=\"color:#69d8ed;text-decoration:underline;\">AI Agent Zero-Day discovery in May<\/a>, is only accelerating this trend.<\/p>\n<h2 style=\"margin-top:48px;margin-bottom:16px;padding-top:0;\">90-Day Plan for CISOs<\/h2>\n<p style=\"line-height:1.8;margin-bottom:20px;\">Those who don&#8217;t want to wait can have a measurably better layer in one quarter.<\/p>\n<div style=\"margin:28px 0;border:1px solid rgba(105,216,237,0.4);border-radius:6px;overflow:hidden;\">\n<div style=\"background:#004a59;color:#fff;padding:12px 18px;font-size:0.78em;font-weight:700;text-transform:uppercase;letter-spacing:0.14em;\">90-Day Plan: Detection Against AI-Phishing<\/div>\n<div style=\"padding:8px 0;\">\n<div style=\"display:flex;gap:18px;padding:12px 20px;border-bottom:1px solid rgba(105,216,237,0.2);\">\n<div style=\"min-width:130px;font-weight:700;color:#004a59;\">Week 
1 to 2<\/div>\n<div style=\"color:#333;line-height:1.55;\">Measure the status quo. Check SPF\/DKIM\/DMARC coverage, analyze the quarantine rate of the last 90 days, and evaluate the employee click rate from the last phishing simulation. Baseline figures for the executive presentation.<\/div>\n<\/div>\n<div style=\"display:flex;gap:18px;padding:12px 20px;border-bottom:1px solid rgba(105,216,237,0.2);\">\n<div style=\"min-width:130px;font-weight:700;color:#004a59;\">Week 3 to 5<\/div>\n<div style=\"color:#333;line-height:1.55;\">Introduce a behavioral layer. Pilot Abnormal, Proofpoint Nexus AI, or Mimecast CyberGraph on 200 mailboxes, with a 14-day learning phase, followed by a comparison against the classic filter.<\/div>\n<\/div>\n<div style=\"display:flex;gap:18px;padding:12px 20px;border-bottom:1px solid rgba(105,216,237,0.2);\">\n<div style=\"min-width:130px;font-weight:700;color:#004a59;\">Week 6 to 8<\/div>\n<div style=\"color:#333;line-height:1.55;\">Integrate with SIEM and EDR. Bring sender DNA anomalies, behavioral triggers, and EDR process telemetry onto a detection platform, and test correlation.<\/div>\n<\/div>\n<div style=\"display:flex;gap:18px;padding:12px 20px;\">\n<div style=\"min-width:130px;font-weight:700;color:#004a59;\">Week 9 to 12<\/div>\n<div style=\"color:#333;line-height:1.55;\">Roll out to the entire organization, brief employees, update the cyber insurance questionnaire. Establish a quarterly review.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2 style=\"margin-top:48px;margin-bottom:16px;padding-top:0;\">Frequently Asked Questions<\/h2>\n<h3 style=\"margin-top:32px;margin-bottom:8px;color:#004a59;font-size:1.15em;\">Is it enough to update the existing mail gateway?<\/h3>\n<p style=\"line-height:1.7;margin-bottom:18px;color:#333;\">In most cases, no. Pattern-based filters need an architectural update to behavioral analytics, which is more than just a patch. 
Those relying on Proofpoint, Mimecast, or Microsoft Defender should actively enable their AI modules and take the learning phase seriously; otherwise, the second layer will remain idle.<\/p>\n<h3 style=\"margin-top:32px;margin-bottom:8px;color:#004a59;font-size:1.15em;\">What role does recipient training still play?<\/h3>\n<p style=\"line-height:1.7;margin-bottom:18px;color:#333;\">It remains important, but expectations must be adjusted. If the email is grammatically clean and contextually appropriate, employees will not recognize it as phishing. Training focus in 2026 should be on behavioral anomalies (unusual requests, urgency, unusual channels), not on detecting typos.<\/p>\n<h3 style=\"margin-top:32px;margin-bottom:8px;color:#004a59;font-size:1.15em;\">How does AI phishing relate to NIS2 reporting requirements?<\/h3>\n<p style=\"line-height:1.7;margin-bottom:18px;color:#333;\">NIS2 requires an early warning within 24 hours of becoming aware of a significant incident. If a successful spear phishing attack is detected only after days because behavioral analytics is missing, the report is already days behind the actual compromise, and the 24-hour window leaves little time to scope it. This is an operational trigger, not just a compliance point.<\/p>\n<h3 style=\"margin-top:32px;margin-bottom:8px;color:#004a59;font-size:1.15em;\">What does a behavioral layer cost in the mid-market?<\/h3>\n<p style=\"line-height:1.7;margin-bottom:18px;color:#333;\">Market prices in 2026 for Abnormal, Proofpoint Nexus AI, and Mimecast CyberGraph range between 4 and 9 euros per mailbox per month in mid-market setups (200 to 2,000 mailboxes). That&#8217;s 9,600 euros per year at the low end (200 mailboxes at 4 euros) and 216,000 euros per year at the high end (2,000 mailboxes at 9 euros). 
Cyber insurers factor this into policy negotiations.<\/p>\n<div style=\"background:#f5f5f7;padding:24px 28px;margin:48px 0 24px 0;border-radius:8px;\">\n<p style=\"margin:0 0 8px 0;font-size:0.78em;font-weight:800;color:#004a59;text-transform:uppercase;letter-spacing:0.14em;\">About the Author<\/p>\n<p style=\"margin:0;font-size:0.98em;line-height:1.7;color:#333;\"><strong>Tobias Massow<\/strong> is CEO of Evernine Media GmbH and publisher of MBF Media Magazines. He observes detection reality through email, SOC, and CISO conversations that the magazine conducts daily and writes from this observation, not from tool marketing.<\/p>\n<\/div>\n<p style=\"text-align:right;color:#868e96;font-size:0.85em;margin-top:48px;font-style:italic;\"><em>Image source: AI-generated (May 2026), C2PA certificate embedded in image<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"AI phishing bypasses classic email filters like Gmail, SpamAssassin, and Proofpoint. What CISOs need to change in detection architecture in 2026.","protected":false},"author":55,"featured_media":14327,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_yoast_wpseo_focuskw":"AI phishing email filter","_yoast_wpseo_title":"AI Phishing: Email Filters Left in the Dark","_yoast_wpseo_metadesc":"AI-driven phishing bypasses Gmail, SpamAssassin and Proofpoint filters. 
Three detection patterns that still work in 2026 - and where LLM email filters fall short.","_yoast_wpseo_meta-robots-noindex":"","_yoast_wpseo_meta-robots-nofollow":"","_yoast_wpseo_meta-robots-adv":"","_yoast_wpseo_canonical":"","_yoast_wpseo_opengraph-title":"","_yoast_wpseo_opengraph-description":"","_yoast_wpseo_opengraph-image":"https:\/\/www.securitytoday.de\/wp-content\/uploads\/2026\/05\/ai-phishing-llm-mail-filter-detection-2026-cover-hero-1.png","_yoast_wpseo_opengraph-image-id":0,"_yoast_wpseo_twitter-title":"","_yoast_wpseo_twitter-description":"","_yoast_wpseo_twitter-image":"https:\/\/www.securitytoday.de\/wp-content\/uploads\/2026\/05\/ai-phishing-llm-mail-filter-detection-2026-cover-hero-1.png","_yoast_wpseo_twitter-image-id":0,"_evm_translation_lang":"","featured_post":0,"featured_post_sortierung":0,"_wp_old_slug":[],"footnotes":""},"categories":[217],"tags":[],"class_list":["post-14354","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-innovation"],"evm_reading_time_minutes":8,"wpml_language":"en","wpml_translation_of":14313,"_links":{"self":[{"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/posts\/14354","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/users\/55"}],"replies":[{"embeddable":true,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/comments?post=14354"}],"version-history":[{"count":5,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/posts\/14354\/revisions"}],"predecessor-version":[{"id":17588,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/posts\/14354\/revisions\/17588"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/media\/14327"}],"wp:attachment":
[{"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/media?parent=14354"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/categories?post=14354"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.securitytoday.de\/en\/wp-json\/wp\/v2\/tags?post=14354"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}