AI Accuracy: Security Tool or Security Issue?
Part of the Trust in Digital Life Webinar Series 2025
The new autumn season of TDL webinars began with a look at the questions surrounding AI accuracy, and the extent to which it is a help or a hindrance in managing cybersecurity.
Numerous examples of AI hallucinations are posted every day, ranging from the humorous to the genuinely concerning. With most cybersecurity players now using LLM-based AI and AI agents, the question of accuracy becomes increasingly important. Yet, for all its importance, the issue itself has not yet been clearly defined.
Several aspects of AI accuracy were identified prior to the webinar:
· Is AI accuracy clearly measurable? What metrics, beyond the obvious ones such as precision, need to be included? (See the sketch after this list.)
· Do these metrics need to be standardised?
· In which ways is AI accuracy relevant to security and privacy threats?
· Can perceived accuracy (by users) lead to additional subtle security and privacy attacks?
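To make the first of these questions concrete, the minimal Python sketch below (all counts are invented for illustration, not drawn from the webinar) shows why headline accuracy can mislead on the imbalanced datasets typical of security work, and why precision, recall and the false-positive rate belong alongside it:

```python
import numpy as np

# Hypothetical alert triage: 1 = malicious, 0 = benign. Attacks are rare,
# so the dataset is heavily imbalanced (illustrative numbers only).
y_true = np.array([1] * 10 + [0] * 990)
# A detector that misses 4 of 10 attacks and raises 20 false alarms.
y_pred = np.concatenate([np.ones(6), np.zeros(4), np.ones(20), np.zeros(970)])

tp = np.sum((y_true == 1) & (y_pred == 1))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))
tn = np.sum((y_true == 0) & (y_pred == 0))

accuracy  = (tp + tn) / len(y_true)  # dominated by the benign majority
precision = tp / (tp + fp)           # how many alerts are real
recall    = tp / (tp + fn)           # how many attacks are caught
fpr       = fp / (fp + tn)           # a driver of analyst alert fatigue

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} fpr={fpr:.4f}")
# Accuracy of ~0.976 looks excellent, yet 40% of attacks slip through.
```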
Background
The webinar participants were asked how AI accuracy plays out for security, privacy and trust applications and agents, and whether AI accuracy is a serious or a marginal issue for LLM/SLM-based AI, both in general and in security-, privacy- and trust-related applications. In a security, privacy and trust context, the panellists identified specific concerns and explained how the issue is connected with human expertise. Although some applications do measure accuracy, the relevance of references to the text, the relevance of answers and so on, it is not clear whether the techniques used to pinpoint accuracy are sufficient, or whether the nature of the security field requires additional parameters and approaches.
While the emerging standards for accuracy are connected to AI governance or to specific metrics, there is as yet no clear understanding of how to incorporate accuracy and precision in the security space. Since AI is commonly used in mission-critical applications, it is essential to position and measure accuracy correctly, and to ensure that these metrics adequately serve the security, privacy and trust fields of application.
Since LLMs (and SLMs) draw on unreliable and incomplete data, and in some areas, including security, on synthetic data, requirements for that data need to be formulated and observed. These apply predominantly not to human-centric data, such as HR databases, but to data relating to logs, APIs, computations, network analysis and the like.
In addition to improved metrics and data, it is important to design more resilient models that can recognise their inability to answer a question accurately and either withdraw or flag the potential accuracy and precision issues.
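As a rough illustration of this "withdraw rather than guess" behaviour, the Python sketch below applies a confidence threshold to a model's answers; the threshold value and the confidence scores are invented assumptions, and in practice raw model confidences would themselves need calibration:

```python
from dataclasses import dataclass

ABSTAIN_THRESHOLD = 0.75  # assumed policy value, to be tuned per application

@dataclass
class Answer:
    text: str
    confidence: float  # model-reported confidence in [0, 1]

def respond(answer: Answer) -> str:
    """Return the answer only when confidence clears the bar; otherwise
    withdraw and defer to a human analyst."""
    if answer.confidence >= ABSTAIN_THRESHOLD:
        return answer.text
    return ("I am not confident enough to answer reliably "
            f"(confidence {answer.confidence:.2f}); escalating to a human.")

print(respond(Answer("Port scan from 10.0.0.5 detected.", 0.91)))
print(respond(Answer("This binary is benign.", 0.42)))
```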
Looking ahead, the session considered which areas of technology and process innovation might be especially promising for improving AI accuracy while being fundamentally good for security, privacy and trust, and what constitutes a realistic short- or long-term ambition.
Speakers
The group of expert speakers assembled to discuss the issues arising were:
· John Baras, Distinguished University Professor and endowed Lockheed Martin Chair in Systems Engineering, University of Maryland
· Lorenzo Cavallaro, Professor, Computer Science, University College London
· Melvin Greer, SVP & Chief Data Scientist, TechElevate Innovation Labs
· Cory Robinson, Private consultant in data governance, privacy, data ethics & AI ethics
The event was moderated by TDL strategic advisor, Claire Vishik.
AI Security Standards and Accuracy
The panel discussed the challenges and importance of accuracy in artificial intelligence, particularly in security applications. The need for domain-specific LLMs and knowledge graphs to improve security analysis was emphasised, while the risks of AI hallucinations and the importance of data integrity were highlighted. The need for standards in AI, especially in security, was also noted, as was the importance of controlling input data to prevent "garbage in, garbage out" scenarios. The panellists agreed on the necessity of standardisation and of user acceptance of guidelines in AI security applications.
Measuring Accuracy in AI Systems
The panel went on to discuss the challenges and nuances of measuring accuracy in AI systems, particularly in security and other domains. The importance of understanding the problem context in order to choose appropriate metrics was underlined, and the need to define accuracy in terms of the specific task at hand was also recorded. The importance of distinguishing between accuracy and performance was stressed, as was the need for human oversight and ethical considerations in AI systems. The discussion touched on the potential erosion of human expertise due to AI, and the need for context-bound reliability in AI measurements.
AI Security Testing and Standards
The group discussed the challenges of AI accuracy and security, pointing out the need for testing, validation and verification in network security, and drawing attention to the importance of human oversight given AI's limitations in decision-making. The issue of adversarial attacks on AI systems was addressed, as was the need for realistic threat models. One line of inquiry concerned the possibility of standardising metrics for AI accuracy in security; it was suggested that relevant principles already exist in the EU AI Act, while the complexity of human behaviour and the unique threat landscape for AI systems were also noted. The discussion concluded with an acknowledgment of progress in composability and of the need for further development of standards for AI security.
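The adversarial-attack concern can be illustrated with a toy evasion example. The sketch below uses a deliberately simplified linear "detector" with made-up weights, perturbed in the spirit of the fast gradient sign method; it stands in for no panellist's actual system:

```python
import numpy as np

# Toy linear detector with fixed, made-up weights: score > 0.5 => "malicious".
w = np.array([1.0, -0.5, 0.8, -1.2, 0.6, 0.9, -0.7, 1.1])

def score(x):
    return 1 / (1 + np.exp(-(x @ w)))  # logistic output in (0, 1)

x = 0.4 * w                                 # a feature vector the detector flags
print(f"original score:  {score(x):.3f}")   # ~0.92, flagged as malicious

# FGSM-style evasion: step every feature against the gradient of the score.
# For a linear model the input gradient direction is simply sign(w).
eps = 0.6
x_adv = x - eps * np.sign(w)
print(f"perturbed score: {score(x_adv):.3f}")  # ~0.17, evades detection
```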
Enhancing Data Accuracy for AI
The panel discussed the challenges and opportunities in improving data accuracy for AI systems. The importance of data governance, transparency and purpose limitation was pointed out, as was the tension between transparency and data protection, particularly regarding IP and regulatory constraints. The need for better integration of model knowledge with selective data was discussed, especially in engineering systems and networks, while concerns were raised about data sovereignty and the need for trust in data ownership. The panel agreed that although complete and accurate data is useful, an abundance of data does not in itself improve accuracy; improvements in data quality, privacy-preserving methods, calibration tools and ethical benchmarks could, however, deliver incremental gains in accuracy over the next one to two years.
Enhancing AI Trust and Adoption
The panellists discussed the limitations and challenges of AI systems, emphasising that technical fixes alone are insufficient. The importance of non-technical aspects was mentioned, such as inviting non-AI professionals, including ethicists and sociologists, into the conversation to improve AI adoption and trust. Another suggested path for the future was a focus on semantics and robust reasoning in AI models, together with calibration and uncertainty quantification to enhance model trustworthiness. One conclusion was that future discussions should focus on specific metrics for security situations.
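On calibration and uncertainty quantification, one commonly used measurement is expected calibration error (ECE): predictions are bucketed by confidence, and each bucket's average confidence is compared with its observed accuracy. The sketch below uses invented model outputs purely to show the mechanics:

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=5):
    """ECE: the sample-weighted gap between mean confidence and
    observed accuracy within each confidence bucket."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(conf[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by the bucket's share of samples
    return ece

# Invented model outputs: confidences and whether each answer was correct.
conf = np.array([0.95, 0.90, 0.85, 0.80, 0.70, 0.65, 0.60, 0.55])
correct = np.array([1, 1, 0, 1, 0, 1, 0, 0], dtype=float)

print(f"ECE = {expected_calibration_error(conf, correct):.3f}")
# A well-calibrated model scores near 0; large values signal overconfidence.
```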
While there may not have been definitive answers to the questions raised in the 60-minute discussion, the panellists undoubtedly advanced the understanding of the issues and offered valuable new insights. This discussion will be continued.
Watch the full recording of the webinar on our YouTube channel here!