Semgrep benchmark finds Zhipu AI’s GLM 5.2 outperforms Claude Code on prompt-only IDOR vulnerability detection

Jun 28, 2026, 05:50 PM·semgrep.dev

EDITOR BRIEF

Semgrep says Zhipu AI’s open-weight GLM 5.2 scored 39% F1 on its IDOR (insecure direct object reference) detection benchmark at about $0.17 per vulnerability found, ahead of Claude Code at 32% F1. Semgrep’s own purpose-built multimodal pipeline still led with 53–61% F1, suggesting that much of the performance comes from the surrounding security-analysis harness rather than from the model alone.

INSIGHTS

The result points to rising competitiveness for open-weight models in specialized security tasks, especially when cost per finding matters. But the stronger showing from Semgrep’s guided pipeline reinforces that tooling, context selection, and workflow design may be as important as raw model capability for production AI security agents.

