AI might be the answer for better phishing resilience
Phishing is still a go-to tactic for attackers, which is why even small gains in user training are worth noticing. A recent research project from the University of Bari looked at whether LLMs can produce training that helps people identify suspicious emails more accurately.

The research team ran two controlled studies with a total of 480 participants. Both studies used content generated by an LLM to deliver phishing awareness lessons.
AI content helped people spot more attacks
The first study involved 80 participants who received training generated through four prompting methods. The goal was to see whether different ways of instructing the LLM would change how helpful the training became. These methods ranged from simple profile inserts, where brief profile data gathered from short questionnaires was added directly to the prompt, to more structured styles based on guidelines or tables.
Despite their differences, each method asked the model to explain a phishing scenario, walk through defense steps, and guide people through short exercises. According to the researchers, every method improved user performance when classifying phishing emails.
Recall improved the most. In this study, recall measures the share of phishing emails that participants correctly caught during the tests. Precision also improved, meaning that a smaller share of the emails participants flagged as phishing were actually legitimate. F1 went up as well. This score combines recall and precision (it is their harmonic mean), so it summarizes overall detection skill in one number. It helped the researchers see how well users spotted phishing emails while avoiding false alarms.
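To make the three scores concrete, here is a minimal sketch of how they can be computed from one participant's answers. The function name, the label strings, and the sample answers are illustrative assumptions, not data or code from the study.

```python
def detection_metrics(true_labels, predicted_labels):
    """Return (precision, recall, f1), treating "phishing" as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    # Phishing emails the participant correctly flagged
    tp = sum(1 for t, p in pairs if t == "phishing" and p == "phishing")
    # Legitimate emails wrongly flagged as phishing (false alarms)
    fp = sum(1 for t, p in pairs if t == "legitimate" and p == "phishing")
    # Phishing emails the participant missed
    fn = sum(1 for t, p in pairs if t == "phishing" and p == "legitimate")

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up answers for eight emails (four phishing, four legitimate)
truth = ["phishing"] * 4 + ["legitimate"] * 4
guesses = ["phishing", "phishing", "phishing", "legitimate",
           "phishing", "legitimate", "legitimate", "legitimate"]

precision, recall, f1 = detection_metrics(truth, guesses)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

In this made-up case the participant catches three of four phishing emails (recall 0.75) and raises one false alarm among four flagged emails (precision 0.75), so F1 is also 0.75.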
In this early test, a simple prompting method performed at least as well as the more elaborate formats. It relied on placing each participant’s short questionnaire scores into the prompt. The model adjusted tone and examples based on that profile data. The overall structure of the lesson stayed the same across conditions.
Although no large statistical differences appeared between the formats, the direct profile method came out slightly ahead after training. The team viewed this as a sign that simple prompts may be enough for practical use, since added complexity in the other formats did not lead to further improvements.
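The direct profile method can be pictured as simple string assembly: questionnaire scores are pasted into an otherwise fixed instruction. The sketch below is only an illustration of that idea. The function name, the score names, the 1-to-5 scale, and the prompt wording are all assumptions, not the prompts the researchers used.

```python
# Lesson elements that every prompting method asked the model to cover
LESSON_ELEMENTS = [
    "an explanation of a phishing scenario",
    "a walk-through of defense steps",
    "short exercises",
]


def build_training_prompt(profile_scores):
    """Insert a participant's questionnaire scores directly into the prompt.

    profile_scores maps a questionnaire scale name to a score. The scale
    names and the 1-5 range used here are hypothetical.
    """
    profile_lines = "\n".join(
        f"- {name}: {score}" for name, score in sorted(profile_scores.items())
    )
    elements = "; ".join(LESSON_ELEMENTS)
    return (
        "Write a phishing awareness lesson for the learner described below.\n"
        "Learner profile (questionnaire scores, 1-5):\n"
        f"{profile_lines}\n"
        "Adapt the tone and examples to this profile, but keep the lesson "
        f"structure fixed: {elements}."
    )


# Hypothetical participant profile
example_prompt = build_training_prompt({"risk_taking": 4, "security_awareness": 2})
print(example_prompt)
```

Only the profile lines change between participants. The fixed structure mirrors the article's point that the lesson layout stayed the same across conditions while tone and examples varied.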
Personalization did not boost outcomes
The second study expanded to 400 participants and assigned each person to one of four groups. Two groups received generic content and two received personalized content.
Generic training used one shared version for all participants, while personalized training changed tone and examples based on user profiles. All versions followed the same structure with an introduction, a phishing scenario, defense guidance, hands-on exercises, and a recap.
Participants in all groups improved. They became better at telling genuine messages from fake ones, and both recall and F1 rose. However, personalized content did not outperform generic content. In a few cases the generic groups showed slightly larger improvement, although these differences were not large enough to change the overall result.
The researchers confirmed that the model did adapt tone and examples for each profile. The absence of a measurable effect suggests that the personalization tested here shapes style more than behavior. For security teams that avoid collecting sensitive staff information, this outcome matters. It suggests that generic training can work as well as tailored versions for phishing detection.
Longer training helped, but only slightly
Training length showed a small effect. The longer sessions lasted about 18 minutes, while the shorter ones lasted about 9 minutes. Participants in the longer sessions reached somewhat higher performance, although the difference was modest. The added time offered space for extra examples and explanations, which may have helped.
User impressions did not match learning outcomes
The researchers also looked at how participants felt about the training and how those feelings related to improvement. User satisfaction varied with participants' personality profiles, but these reactions did not track changes in performance. The researchers caution that training designers should rely on measured results, not on sentiment.
Taken together, the findings suggest that organizations can strengthen phishing awareness without complex personalization strategies, as long as training is delivered often and measured with objective data.