Trust but Verify: Benchmarks for Hallucination, Vulnerability, and Style Drift in AI-Generated Code Reviews

Authors

  • Syed Khundmir Azmi, Independent Researcher, USA

Keywords

AI code reviews, hallucination, vulnerability, style drift, AI verification, benchmarks, software development, coding standards, AI reliability, system security

Abstract

The growing adoption of AI-based code review in software development necessitates a thorough understanding of its shortcomings and risks. This paper addresses three key problems that can jeopardize the quality and security of AI-generated code reviews: hallucination, vulnerability, and style drift. Hallucination refers to instances where the AI offers incorrect or irrelevant recommendations; vulnerability refers to the risk that the AI system itself is misused or attacked; and style drift refers to gradual shifts in the coding standards the AI applies. The primary objective of this study is to establish clear benchmarks for detecting and verifying these issues, thereby improving the accuracy and reliability of AI-mediated code assessments. The central finding is that, without proper verification mechanisms, AI-produced code reviews can introduce significant inconsistencies in review quality. The paper also offers recommendations for improving the reliability of AI systems so that they meet industry requirements.
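
To illustrate the kind of verification the study calls for, the sketch below shows one minimal, automatable check against hallucination: confirming that every identifier an AI review comment mentions actually exists in the code under review. This is an illustrative example only, not the paper's benchmark implementation; the function names (defined_names, flag_hallucinated_references) and the backtick-mention heuristic are assumptions introduced here.

    import ast
    import re

    def defined_names(source: str) -> set[str]:
        """Collect function, class, argument, and variable names defined in a snippet."""
        names = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                names.add(node.name)
            elif isinstance(node, ast.arg):
                names.add(node.arg)
            elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                names.add(node.id)
        return names

    def flag_hallucinated_references(review_comment: str, source: str) -> list[str]:
        """Return identifiers the review mentions (in backticks) that the code never defines.

        A recommendation that refers to a nonexistent symbol is one cheap,
        checkable signal of a hallucinated review comment.
        """
        mentioned = set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", review_comment))
        return sorted(mentioned - defined_names(source))

    if __name__ == "__main__":
        code = "def parse_config(path):\n    data = open(path).read()\n    return data\n"
        review = "Cache the result of `load_settings` before calling it in `parse_config`."
        print(flag_hallucinated_references(review, code))  # -> ['load_settings']

In a full benchmark, many such checks (symbol existence, type consistency, API validity) would be aggregated into a hallucination rate per review, which is the kind of quantitative standard the abstract describes.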

Published

06-02-2023

How to Cite

Syed Khundmir Azmi. (2023). Trust but Verify: Benchmarks for Hallucination, Vulnerability, and Style Drift in AI-Generated Code Reviews. Well Testing Journal, 32(1), 76–90. Retrieved from https://welltestingjournal.com/index.php/WT/article/view/229

Issue

Vol. 32 No. 1 (2023)

Section

Original Research Articles
