Haider Mustafa

Applied AI Build

Fine-tuning a compact model for phishing triage.

A fine-tuning workflow for adapting a compact language model to classify phishing-style messages and return analyst-ready summaries.

Use case Phishing review, analyst summaries, and quicker first-pass triage.
Workflow Dataset prep, LoRA fine-tuning, validation split, and evaluation.
Stack Python, Transformers, PEFT, prompt formatting, and metric review.
Output Training pipeline, sample dataset, evaluation script, and project notes.

Project view

I built a workflow for fine-tuning a compact language model to separate phishing-style messages from benign ones and return a short, analyst-ready summary instead of a generic response.

Project access

Start with the project notes for the overview. Open the training script for the fine-tuning pipeline, the evaluation script for the review stage, and the sample dataset if you want to inspect the data format.

Build Shape

How the fine-tuning workflow is structured.

01 Dataset prep

labelled examples and prompt structure
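The dataset step pairs each labelled message with a target summary and a consistent prompt shape. A minimal sketch of what one JSONL record and its prompt formatting could look like is below; the field names (`text`, `label`, `summary`) and the prompt wording are illustrative assumptions, not the project's actual schema:

```python
import json

# Hypothetical record shape: the keys "text", "label", and "summary"
# are assumptions for illustration, not the project's real field names.
record = {
    "text": "Your account is locked. Verify now: http://example.bad/login",
    "label": "phishing",
    "summary": "Credential-harvesting lure with urgency cue and suspicious link.",
}

def to_prompt(rec):
    """Format one labelled example into an instruction-style training pair."""
    prompt = (
        "Classify the message as phishing or benign and give a one-line "
        f"triage summary.\n\nMessage:\n{rec['text']}\n\nAnswer:"
    )
    completion = f" {rec['label']}: {rec['summary']}"
    return {"prompt": prompt, "completion": completion}

# One line of the training JSONL file.
line = json.dumps(to_prompt(record))
```

Keeping the prompt template identical across training and evaluation is what lets the fine-tuned model return the label-plus-summary shape reliably.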

02 Base model

compact instruction model selected

03 LoRA fine-tuning

parameter-efficient adaptation
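The point of LoRA is that each adapted weight matrix is frozen and only a low-rank pair (B, A) is trained, so the trainable-parameter count drops sharply. A stdlib sketch of that arithmetic, using illustrative layer sizes rather than the project's actual model dimensions:

```python
def full_params(d_out, d_in):
    """Weights in one full projection matrix (what full fine-tuning updates)."""
    return d_out * d_in

def lora_trainable_params(d_out, d_in, r):
    """Weights in one LoRA pair: B is d_out x r, A is r x d_in."""
    return r * (d_out + d_in)

# Illustrative sizes for one square attention projection in a compact model;
# the numbers are assumptions, not taken from the project.
d = 2048
full = full_params(d, d)                 # 4,194,304 weights
lora = lora_trainable_params(d, d, r=8)  # 32,768 weights
ratio = lora / full                      # ~0.78% of the full matrix
```

Summed over every adapted layer, this is why LoRA fine-tuning fits on modest hardware where full-model retraining would not.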

04 Evaluation

reviewing output quality and failure cases
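For a binary triage label, the metric review reduces to precision, recall, and F1 on the phishing class, with the false positives and false negatives doubling as the failure cases to inspect. A minimal stdlib sketch, not the project's actual evaluation script:

```python
from collections import Counter

def triage_metrics(golds, preds, positive="phishing"):
    """Precision/recall/F1 for the positive class, plus raw error counts."""
    c = Counter()
    for g, p in zip(golds, preds):
        if p == positive and g == positive:
            c["tp"] += 1
        elif p == positive:
            c["fp"] += 1  # benign message flagged: wasted analyst time
        elif g == positive:
            c["fn"] += 1  # phishing message missed: the costly failure mode
    precision = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
    recall = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1, **c}
```

Running this over the validation split after each training run gives a quick read on whether the model errs toward missed phish or over-flagging.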

Basic overview

This project fine-tunes a compact model on labelled phishing-style messages so it can separate suspicious content from benign content and return a short triage summary.

The aim was to build a full applied AI workflow rather than just use a model as-is: prepare the dataset, adapt the base model efficiently, evaluate the output, and produce something usable in an analyst-facing workflow.

Use case

The model is aimed at a practical workflow: separating phishing-like content from benign messages and returning a short summary that an analyst or reviewer can scan quickly.

What the workflow includes
  • Structured JSONL data with labels and target summaries.
  • Parameter-efficient fine-tuning with LoRA instead of full-model retraining.
  • Validation split and evaluation output to review performance and failure cases.
  • Training and evaluation scripts that can be extended to stronger datasets.
What this demonstrates

This project demonstrates applied AI capability concretely: model adaptation, data formatting, training workflow design, and evaluation, rather than generic "AI interest" claims.