whisper-medium.en_timestamped

This is the ONNX version of openai/whisper-medium.en with word-level timestamp support for use with Transformers.js.

Features

  • ✅ Word-level timestamps via cross-attention (alignment_heads configured)
  • ✅ Multiple quantization variants (fp32, int8, uint8)
  • ✅ Compatible with Transformers.js for browser-based inference
  • ✅ Merged decoder model for efficient inference

Usage with Transformers.js

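import { pipeline } from '@huggingface/transformers';

// Load the speech-recognition pipeline with this model's ONNX weights.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'neonwatty/whisper-medium.en_timestamped'
);

// `audioUrl` is not defined in this snippet; it is assumed to hold the URL of the audio to transcribe.
const result = await transcriber(audioUrl, {
  return_timestamps: 'word', // one chunk per word instead of per segment
  chunk_length_s: 30,        // process long audio in 30-second windows
  stride_length_s: 5,        // overlap between adjacent windows, in seconds
});

console.log(result);
// { text: "Hello world", chunks: [{ text: "Hello", timestamp: [0.0, 0.5] }, ...] }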
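
The word-level chunks shown above are convenient for building captions. The following is a minimal sketch, not part of this model or of Transformers.js, that groups chunks into WebVTT cues. It assumes each chunk has the shape { text, timestamp: [start, end] } from the example output; the helper names toVttTime and chunksToVtt and the default of 7 words per cue are hypothetical choices made for illustration.

```javascript
// Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function toVttTime(seconds) {
  const ms = Math.round(seconds * 1000);
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  const frac = ms % 1000;
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(frac, 3)}`;
}

// Group word chunks into cues of at most `maxWords` words each.
// Assumption: chunks look like { text: "Hello", timestamp: [0.0, 0.5] }.
function chunksToVtt(chunks, maxWords = 7) {
  const lines = ['WEBVTT', ''];
  for (let i = 0; i < chunks.length; i += maxWords) {
    const group = chunks.slice(i, i + maxWords);
    const start = group[0].timestamp[0];
    // Defensive fallback: if the last word has no end time, reuse the start time.
    const end = group[group.length - 1].timestamp[1] ?? start;
    const text = group.map((c) => c.text.trim()).join(' ');
    lines.push(`${toVttTime(start)} --> ${toVttTime(end)}`, text, '');
  }
  return lines.join('\n');
}
```

Calling chunksToVtt(result.chunks) on the pipeline output would yield a string that can be saved as a .vtt file or attached to a video track element.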

Model Files

The model includes the following ONNX files in the onnx/ directory:

File                            Description
encoder_model.onnx              Audio encoder (fp32)
decoder_model.onnx              Text decoder (fp32)
decoder_with_past_model.onnx    Decoder with KV cache
decoder_model_merged.onnx       Merged decoder for efficient inference
*_int8.onnx                     INT8 quantized versions
*_uint8.onnx                    UINT8 quantized versions
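
The naming pattern in the list above can be sketched as a small lookup helper. This is an illustration only: it assumes the * in *_int8.onnx and *_uint8.onnx stands for each of the four base names, which the list suggests but does not state outright, and the function name onnxFilePath is hypothetical.

```javascript
// Base file names (without extension) taken from the file list above.
const COMPONENTS = [
  'encoder_model',
  'decoder_model',
  'decoder_with_past_model',
  'decoder_model_merged',
];

// Variants named in the file list; fp32 files carry no suffix.
const VARIANTS = ['fp32', 'int8', 'uint8'];

// Build the expected path of a model file inside the onnx/ directory.
// Assumption: quantized files are named `<component>_<variant>.onnx`.
function onnxFilePath(component, variant = 'fp32') {
  if (!COMPONENTS.includes(component)) {
    throw new Error(`Unknown component: ${component}`);
  }
  if (!VARIANTS.includes(variant)) {
    throw new Error(`Unknown variant: ${variant}`);
  }
  const suffix = variant === 'fp32' ? '' : `_${variant}`;
  return `onnx/${component}${suffix}.onnx`;
}
```

For example, onnxFilePath('decoder_model_merged', 'int8') gives the path where an INT8 merged decoder would be expected under this naming assumption; check the repository listing before relying on it.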

Acknowledgments
