SentenceTransformer based on Snowflake/snowflake-arctic-embed-l-v2.0
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l-v2.0. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-l-v2.0
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
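As a rough illustration of the similarity function listed above, the following plain-Python sketch computes cosine similarity between two vectors. The function name `cosine_similarity` is chosen here for illustration only; in practice the library's `model.similarity` call handles this on tensors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Scores range from -1 (opposite direction) to 1 (same direction); vector length does not affect the result.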
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
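The Pooling module above has `pooling_mode_mean_tokens` set to True, so the sentence embedding is the average of the token embeddings. A minimal sketch of that idea in plain Python follows, assuming padding positions are flagged with 0 in an attention mask. The name `mean_pool` is hypothetical, and the library itself does this on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    # Average each dimension over the unmasked tokens only.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens and one padding token (mask 0) that is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
print(mean_pool(tokens, [1, 1, 0]))  # [2.0, 3.0]
```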
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'[MASK] multiplied, in any case in which the service that was required has been performed in part, by the percentage which the length of service that was not performed is to the length of the service that was required to be performed. The amount will be determined under the following formula:',
'the period of appointment begins, multiplied, in any case in which the service that was required has been performed in part, by the percentage which the length of service that was not performed is to the length of the service that was required to be performed. The amount will be determined under the following formula:',
'[SUBSECTION c] Special enrollment periods.. A Part D eligible individual may enroll in a PDP or disenroll from a PDP and enroll in another PDP or MA-PD plan (as provided at § 422.62(b) of this chapter), as applicable, under any of the following circumstances: [CLAUSE 1] The individual involuntarily loses creditable prescription drug coverage or such coverage is involuntarily reduced so that it is no longer creditable coverage as defined under § 423.56(a). Loss of credible prescription drug coverage due to failure to pay any required premium is not considered involuntary loss of the coverage. [CLAUSE 2] The individual was not adequately informed, as required by standards established by CMS under § 423.56, that he or she has lost his or her creditable prescription drug coverage, that he or she never had credible prescription drug coverage, or the coverage is involuntarily reduced so that it is no longer creditable prescription drug coverage.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9334, 0.2975],
#         [0.9334, 1.0000, 0.2973],
#         [0.2975, 0.2973, 1.0000]])
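To read a similarity matrix like the one printed above, a small helper can pick out the closest off-diagonal pair. This is a sketch in plain Python using the values shown in the example output; `most_similar_pair` is a hypothetical helper, not part of the library.

```python
def most_similar_pair(similarities):
    """Return (i, j, score) for the highest-scoring pair with i < j."""
    best = None
    n = len(similarities)
    for i in range(n):
        for j in range(i + 1, n):  # skip the diagonal and mirrored entries
            score = similarities[i][j]
            if best is None or score > best[2]:
                best = (i, j, score)
    return best


# Values copied from the example output above.
scores = [
    [1.0000, 0.9334, 0.2975],
    [0.9334, 1.0000, 0.2973],
    [0.2975, 0.2973, 1.0000],
]
print(most_similar_pair(scores))  # (0, 1, 0.9334)
```

Here the masked sentence (index 0) and its unmasked original (index 1) score far higher with each other than either does with the unrelated third passage.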
Training Details
Training Dataset
Unnamed Dataset
- Size: 24,880 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 43 tokens, mean: 123.38 tokens, max: 340 tokens | min: 43 tokens, mean: 126.43 tokens, max: 366 tokens |
- Samples:

| sentence_0 | sentence_1 |
|---|---|
| [SECTION HEADING] § [REF] Waiver of right of recovery. In determining whether there is good cause for waiver of any right of recovery which he may have against any applicant by reason of any payments made pursuant to a loan guarantee under this subpart, the Secretary shall take into consideration the extent to which: | [SECTION HEADING] § 57.1517 Waiver of right of recovery. In determining whether there is good cause for waiver of any right of recovery which he may have against any applicant by reason of any payments made pursuant to a loan guarantee under this subpart, the Secretary shall take into consideration the extent to which: |
| [SUBSECTION A] Any unmet deductible applied to the charges related to the reasonable costs that the facility incurs in providing the covered services; [CLAUSE 7] Rural health clinic services that meet the requirements set forth in part 491 of this chapter. [CITATIONS] | [SUBSECTION A] Any unmet deductible applied to the charges related to the reasonable costs that the facility incurs in providing the covered services; [CLAUSE 7] Rural health clinic services that meet the requirements set forth in part 491 of this chapter. [CITATIONS] |
| [SUBSECTION b] Waiver of the right to appear.. (1) An enrollee may submit to OMHA a written statement indicating that he or she does not wish to appear at the hearing. [ITEM i] For expedited hearings, an enrollee may indicate in writing or orally [MASK] ii] The OMHA hearing office must document all oral waivers in writing and maintain the documentation in the case files. [CLAUSE 2] The enrollee may subsequently withdraw his or her waiver in writing at any time before the notice of the hearing decision is issued; however, by withdrawing the waiver the enrollee agrees to an extension of the adjudication period as specified in § 423.2016, that may be necessary to schedule and hold the hearing. | [SUBSECTION b] Waiver of the right to appear.. (1) An enrollee may submit to OMHA a written statement indicating that he or she does not wish to appear at the hearing. [ITEM i] For expedited hearings, an enrollee may indicate in writing or orally that he or she does not wish to appear at the hearing. [ITEM ii] The OMHA hearing office must document all oral waivers in writing and maintain the documentation in the case files. [CLAUSE 2] The enrollee may subsequently withdraw his or her waiver in writing at any time before the notice of the hearing decision is issued; however, by withdrawing the waiver the enrollee agrees to an extension of the adjudication period as specified in § 423.2016, that may be necessary to schedule and hold the hearing. |

- Loss: DenoisingAutoEncoderLoss
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.3215 | 500 | 6.6865 |
| 0.6431 | 1000 | 5.7788 |
| 0.9646 | 1500 | 5.5835 |
| 1.2862 | 2000 | 5.3876 |
| 1.6077 | 2500 | 5.2766 |
| 1.9293 | 3000 | 5.2108 |
| 2.2508 | 3500 | 5.1317 |
| 2.5723 | 4000 | 5.0701 |
| 2.8939 | 4500 | 5.0288 |
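The Epoch column in the log above can be cross-checked against the dataset size and batch size reported earlier. The short sketch below does the arithmetic, assuming training ran on a single device (the card does not state the device count) with gradient_accumulation_steps of 1 and no dropped last batch, as listed in the hyperparameters.

```python
import math

train_samples = 24_880   # from "Size: 24,880 training samples"
batch_size = 16          # per_device_train_batch_size
epochs = 3               # num_train_epochs

# Assumes one device and gradient_accumulation_steps = 1.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)  # 1555
print(total_steps)      # 4665
print(epoch_at(500))    # 0.3215
print(epoch_at(4500))   # 2.8939
```

The computed values match the logged epochs at steps 500 and 4500, which is consistent with the single-device assumption.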
Framework Versions
- Python: 3.12.6
- Sentence Transformers: 5.2.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu129
- Accelerate: 1.10.1
- Datasets: 4.4.1
- Tokenizers: 0.22.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
DenoisingAutoEncoderLoss
@inproceedings{wang-2021-TSDAE,
title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
pages = "671--688",
url = "https://arxiv.org/abs/2104.06979",
}
Model tree for atx-labs/snowflake-custom-noise-tsdae-cfr-finetuned
Base model
Snowflake/snowflake-arctic-embed-l-v2.0