flan-t5-base-parkinson-abstain-curriculum-v1

A faithfulness-first abstractive summarization model for Parkinson’s disease–related PubMed abstracts, fine-tuned from google/flan-t5-base.

Model description

This model generates a short summary (typically 2–3 sentences) from a biomedical abstract with a strong bias toward:

  • Results/Conclusions-focused content (when present),
  • avoiding hallucination and speculation as a primary objective,
  • minimizing incomplete outputs via the recommended “SAFE” decoding wrapper (see below).

Architecture: FLAN-T5 Base (T5-base family; ~220M parameters, encoder-decoder seq2seq).

Intended use

Good for

  • literature triage (rapid scanning),
  • building search UX (abstract → key results summary),
  • internal research tooling.

Not for

  • clinical decision making,
  • medical advice,
  • replacing full-text reading.

Training data

  • Source: PubMed abstracts related to Parkinson’s disease (last ~5 years).
  • Rows: 9446
  • Columns: pmid, title, abstract, journal, year, doi, authors, mesh_terms
  • Split: train/val/test = 8502 / 472 / 472
  • ABSTAIN rate (train): 0.0012938 (≈0.13%, i.e., 11 of 8502 training targets)

Difficulty distribution (teacher pipeline)

  • difficulty=0: 2583
  • difficulty=1: 4463
  • difficulty=2: 1428
  • difficulty=3: 28

Note: This repository does not include raw PubMed abstracts by default. Ensure you have rights to redistribute any text you upload.

Teacherization (label generation)

Targets were generated using a heuristic teacher pipeline to “teach” the model what an ideal faithful summary looks like:

  • Prefer sentences resembling RESULTS / CONCLUSIONS content when possible
  • Optional ABSTAIN behavior for low-information cases, where the target is set to the literal string:

INSUFFICIENT_RESULT_INFORMATION
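
For illustration, a minimal sketch of this kind of heuristic teacher (hypothetical helper names; the actual pipeline, including sentence ranking and difficulty scoring, is more involved):

import re

ABSTAIN = "INSUFFICIENT_RESULT_INFORMATION"

# Section labels commonly found in structured PubMed abstracts.
SECTION = re.compile(r"\b(RESULTS?|CONCLUSIONS?)\s*:", re.IGNORECASE)

def teacher_summary(abstract: str, max_sentences: int = 3) -> str:
    """Pick sentences following RESULTS/CONCLUSIONS headers; abstain otherwise."""
    parts = SECTION.split(abstract)  # [pre, header, text, header, text, ...]
    if len(parts) < 3:
        return ABSTAIN               # no results/conclusions section found
    tail = " ".join(seg.strip() for seg in parts[2::2])
    sentences = re.split(r"(?<=[.!?])\s+", tail)
    summary = " ".join(sentences[:max_sentences]).strip()
    return summary or ABSTAIN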

Curriculum training

A staged curriculum was used:

  • STAGE1_EASY: epochs=1, bs=96, lr=5e-5
  • STAGE2_FULL: epochs=3, bs=96, lr=3e-5
  • STAGE3_HARD: epochs=1, bs=96, lr=2e-5

Best checkpoint selected by validation loss.
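
A minimal sketch of how such a staged schedule could be wired up with the Hugging Face Trainer. The `difficulty` column matches the table above, but the stage→difficulty mapping, the dataset objects `train_ds`/`val_ds` (assumed pre-tokenized), and the filtering logic are illustrative assumptions, not the exact recipe:

from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Illustrative stage schedule: (name, max difficulty included, epochs, lr).
STAGES = [
    ("STAGE1_EASY", 0, 1, 5e-5),
    ("STAGE2_FULL", 3, 3, 3e-5),
    ("STAGE3_HARD", 3, 1, 2e-5),
]

tok = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
collator = DataCollatorForSeq2Seq(tok, model=model)

# `train_ds` / `val_ds`: pre-tokenized datasets with a `difficulty` column (assumed).
for name, max_diff, epochs, lr in STAGES:
    subset = train_ds.filter(lambda ex: ex["difficulty"] <= max_diff)
    args = Seq2SeqTrainingArguments(
        output_dir=f"out/{name}",
        num_train_epochs=epochs,
        per_device_train_batch_size=96,
        learning_rate=lr,
        eval_strategy="epoch",        # `evaluation_strategy` in older transformers
        save_strategy="epoch",
        load_best_model_at_end=True,  # best checkpoint by validation loss
        metric_for_best_model="eval_loss",
    )
    trainer = Seq2SeqTrainer(model=model, args=args, data_collator=collator,
                             train_dataset=subset, eval_dataset=val_ds)
    trainer.train()
    model = trainer.model  # carry the best weights into the next stage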

Validation loss (best)

  • STAGE1_EASY: val=0.1821
  • STAGE2_FULL: val=0.1686 → 0.1635 → 0.1626 (one value per epoch)
  • STAGE3_HARD: val=0.1614 (best)

Safety & faithfulness evaluation (quick-check)

We run a “hallucination quick-check” based on:

  1. Clean novelty ratio: fraction of output tokens not present in the abstract (after light stopword filtering),
  2. Truncation ratio: fraction of outputs that end without sentence-final punctuation.
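
For illustration, a minimal sketch of these two checks (assuming whitespace-level tokenization and a tiny stopword list; the actual evaluation code may differ):

import re

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "was", "were"}

def tokens(text: str) -> list[str]:
    # Lowercase, keep alphanumeric tokens, drop stopwords (light filtering).
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]

def clean_novelty_ratio(summary: str, abstract: str) -> float:
    """Fraction of summary tokens that never appear in the abstract."""
    src = set(tokens(abstract))
    out = tokens(summary)
    if not out:
        return 0.0
    return sum(1 for w in out if w not in src) / len(out)

def is_truncated(summary: str) -> bool:
    """True if the output ends without sentence-final punctuation."""
    return not summary.rstrip().endswith((".", "!", "?"))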

Baseline vs SAFE (fixed) on 200 samples:

  • BASE clean novelty avg=0.0050 p95=0.0417 max=0.1333
  • SAFE clean novelty avg=0.0029 p95=0.0244 max=0.1333
  • BASE truncation: 14/200 = 7.00%
  • SAFE truncation: 0/200 = 0.00%
  • SAFE ABSTAIN: 0/200 = 0.00%

Interpretation:

  • On this sample, SAFE decoding eliminated truncated outputs (14/200 → 0/200) and lowered average clean novelty (0.0050 → 0.0029), without triggering any ABSTAIN outputs.

Recommended usage

Prompt template

Summarize the following Parkinson's disease (PD) abstract in 2-3 sentences.
Use ONLY information that appears in the abstract.
Do NOT add recommendations or speculation.
If results/conclusions are not present, output exactly: INSUFFICIENT_RESULT_INFORMATION

<ABSTRACT>

Transformers example (inference)

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "furkanyagiz/flan-t5-base-parkinson-abstain-curriculum-v1"  # Hub repo id
ABSTAIN = "INSUFFICIENT_RESULT_INFORMATION"

tok = AutoTokenizer.from_pretrained(MODEL_ID)
mdl = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

abstract = "..."  # the PubMed abstract text to summarize

prompt = (
    "Summarize the following Parkinson's disease (PD) abstract in 2-3 sentences.\n"
    "Use ONLY information that appears in the abstract.\n"
    "Do NOT add recommendations or speculation.\n"
    f"If results/conclusions are not present, output exactly: {ABSTAIN}\n\n"
    + abstract
)

inputs = tok(prompt, return_tensors="pt", truncation=True, max_length=512)
out = mdl.generate(
    **inputs,
    max_new_tokens=256,
    num_beams=4,
    do_sample=False,           # deterministic beam search
    no_repeat_ngram_size=4,
    length_penalty=0.8,
    repetition_penalty=1.05,
)
summary = tok.decode(out[0], skip_special_tokens=True).strip()
print(summary)

SAFE decoding wrapper (recommended)

To reduce truncation and formatting artifacts:

  • clean whitespace/punctuation
  • keep up to 3 sentences
  • ensure ending punctuation
  • optionally re-generate with shorter settings if novelty spikes
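
A minimal sketch of such a wrapper, reusing `tok`, `mdl`, and `ABSTAIN` from the inference example and `clean_novelty_ratio` from the quick-check sketch above. Helper names and the novelty threshold are hypothetical; the repo's reference implementation may differ:

import re

def generate(abstract: str, max_new_tokens: int = 256) -> str:
    # Same prompt and decoding settings as the inference example above.
    prompt = (
        "Summarize the following Parkinson's disease (PD) abstract in 2-3 sentences.\n"
        "Use ONLY information that appears in the abstract.\n"
        "Do NOT add recommendations or speculation.\n"
        f"If results/conclusions are not present, output exactly: {ABSTAIN}\n\n"
        + abstract
    )
    inputs = tok(prompt, return_tensors="pt", truncation=True, max_length=512)
    out = mdl.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4,
                       do_sample=False, no_repeat_ngram_size=4,
                       length_penalty=0.8, repetition_penalty=1.05)
    return tok.decode(out[0], skip_special_tokens=True).strip()

def postprocess(text: str, max_sentences: int = 3) -> str:
    text = re.sub(r"\s+", " ", text).strip()            # clean whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text)
    text = " ".join(sentences[:max_sentences]).strip()  # keep up to 3 sentences
    if text and not text.endswith((".", "!", "?")):     # ensure ending punctuation
        last = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
        # Drop a trailing fragment if a complete sentence precedes it.
        text = text[: last + 1] if last > 0 else text + "."
    return text

def safe_generate(abstract: str, novelty_threshold: float = 0.10) -> str:
    raw = generate(abstract)
    if raw == ABSTAIN:
        return ABSTAIN  # pass the abstain token through untouched
    summary = postprocess(raw)
    # Optionally re-generate with shorter settings if novelty spikes.
    if clean_novelty_ratio(summary, abstract) > novelty_threshold:
        summary = postprocess(generate(abstract, max_new_tokens=128))
    return summary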

(Full reference implementation is in the GitHub repo: https://github.com/ffurkandemir/parkinson-abstract-summarizer)

Limitations

  • Abstract-only: cannot recover details not in the abstract.
  • Review/opinion articles may contain fewer “results”; summary style may vary.
  • Domain shift: non-PD or non-biomedical inputs degrade quality.
  • Minor text artifacts can still occur (typos, missing symbols like “<”).

Ethical considerations

  • Outputs may look authoritative. Always verify with the source paper.
  • Not medical advice; do not use for clinical decisions.
  • Consider adding UI warnings and “open original abstract” links in downstream apps.

License

  • Base model: google/flan-t5-base (Apache-2.0)
  • Fine-tuned model: released under Apache-2.0 (unless otherwise stated)

Citation

If you use this model, please cite the Hugging Face model page: https://huggingface.co/furkanyagiz/flan-t5-base-parkinson-abstain-curriculum-v1
