
On the Potential of LLMs for Offensive Security: Benchmarks vs. Operational Reality

  • Ruben Missotten
  • Vera Rimmer
  • Wim Mees
  • Lieven Desmet
  • KU Leuven
  • Royal Military Academy of Belgium

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Large Language Models (LLMs), through their strong capabilities in code generation, reasoning, and tool use, have demonstrated promising results in security tasks involving vulnerability discovery and exploitation. However, evaluating their offensive potential in automating penetration testing - a more complex and multi-stage process - remains a critical research challenge. While existing evaluation frameworks effectively demonstrate LLM capabilities in isolated or simplified scenarios, they often do not extend toward the complexity of interconnected attack chains characteristic of real-world adversarial operations. In this analytical study, we examine the challenge of assessing the feasibility of LLM-powered automation across the full adversarial pipeline within realistic environments. We contribute an analysis of current benchmarks and associated environments, and highlight opportunities for methodological enhancements that would strengthen alignment between academic evaluations and operational realities.

Original language: English
Title of host publication: Proceedings - 2025 Annual Computer Security Applications Conference Workshops, ACSACW 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 420-427
Number of pages: 8
ISBN (electronic): 9798331545369
DOIs
Status: Published - 2025
Event: 2025 Annual Computer Security Applications Conference Workshops, ACSACW 2025 - Honolulu, United States
Duration: 8 Dec 2025 to 12 Dec 2025

Publication series

Name: Proceedings - 2025 Annual Computer Security Applications Conference Workshops, ACSACW 2025

Conference

Conference: 2025 Annual Computer Security Applications Conference Workshops, ACSACW 2025
Country/Region: United States
City: Honolulu
Period: 8 Dec 2025 to 12 Dec 2025
