🧠 Mercy-62/bart-base-pii-masker-lora

A LoRA fine-tuned BART model for PII (Personally Identifiable Information) masking that transforms sensitive text into a privacy-safe form while keeping the sentence meaning intact.


✨ Overview

Mercy-62/bart-base-pii-masker-lora is built on top of facebook/bart-base and fine-tuned using the PEFT LoRA approach for lightweight adaptation.
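
The card does not publish the training hyperparameters, but the sketch below shows how such an adapter is typically attached with PEFT. The rank, alpha, dropout, and target modules here are illustrative assumptions, not the values actually used for this model.

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

base = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Hypothetical adapter configuration -- r, lora_alpha, lora_dropout,
# and target_modules are assumptions for illustration only.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                                 # low-rank dimension (assumed)
    lora_alpha=32,                        # LoRA scaling factor (assumed)
    lora_dropout=0.05,                    # adapter dropout (assumed)
    target_modules=["q_proj", "v_proj"],  # BART attention projections
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small LoRA weights are trainable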

It is designed to automatically detect and mask personally identifiable information such as:

  • Names
  • Emails
  • Phone numbers
  • CNICs / National IDs
  • Bank account numbers
  • Dates
  • Salary or monetary values

💡 Example

Input:

"Customer John Smith applied for a credit card on May 10, 2023."

Output:

"Customer [NAME] applied for a credit card on [DATE]."
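
Other entity types are replaced with analogous placeholder tags; only [NAME] and [DATE] appear in the card, so tags such as [EMAIL] or [PHONE] are assumptions. A quick post-hoc check like the sketch below can confirm that no raw email- or phone-like patterns survive in the masked output:

import re

# Simple sanity check on model output: flag any email- or phone-like
# substrings that survived masking.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def looks_masked(text: str) -> bool:
    return not (EMAIL.search(text) or PHONE.search(text))

print(looks_masked("Customer [NAME] applied for a credit card on [DATE]."))  # True
print(looks_masked("Email: john.smith@example.com"))                         # False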

🧩 Model Details

| Component    | Value                       |
|--------------|-----------------------------|
| Base Model   | facebook/bart-base          |
| Adapter Type | LoRA (Low-Rank Adaptation)  |
| Frameworks   | PyTorch, Transformers, PEFT |
| Task         | Text-to-Text Generation     |
| Language     | English                     |

🧪 Example Inference

from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the base model and attach the LoRA adapter
base_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
model = PeftModel.from_pretrained(base_model, "Mercy-62/bart-base-pii-masker-lora")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model.eval()

def test_pii_masking(model, tokenizer, samples):
    device = model.device
    for text in samples:
        # Tokenize, generate the masked text, and decode it back to a string
        inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
        outputs = model.generate(**inputs, max_length=80)
        print(f"\nInput: {text}")
        print("Output:", tokenizer.decode(outputs[0], skip_special_tokens=True))

pii_samples = [
    "Customer John Smith applied for a credit card on May 10, 2023.",
    "Contact number: +92-300-1234567, Email: [email protected]",
]

test_pii_masking(model, tokenizer, pii_samples)
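
For deployment, the LoRA weights can optionally be folded into the base model so inference runs without the PEFT wrapper. A minimal sketch, assuming the model loaded above; the output directory name is arbitrary:

# Fold the adapter into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("bart-base-pii-masker-merged")
tokenizer.save_pretrained("bart-base-pii-masker-merged")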