Alibaba-NLP/gte-en-mlm-base · Hugging Face

gte-en-mlm-base

We introduce the GTE-v1.5 series, new generalized text encoder, embedding, and reranking models that support a context length of up to 8192 tokens. The models are built upon the transformer++ encoder backbone (BERT + RoPE + GLU; for the code, refer to Alibaba-NLP/new-impl) and use the vocabulary of bert-base-uncased.
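As a rough illustration of the RoPE component of the backbone, the sketch below rotates consecutive pairs of vector dimensions by position-dependent angles and checks that the resulting dot product depends only on the offset between the two positions. It is a minimal pure-Python sketch, not the code in Alibaba-NLP/new-impl: the function names, the toy dimension of 8, the interleaved pairing of dimensions, and the default base of 10000 are assumptions made for illustration.

```python
import math

def rope_angles(position, dim, base=10000.0):
    """Rotation angle for each pair of dimensions at a given position."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by their angles."""
    dim = len(vec)
    out = []
    for i, theta in enumerate(rope_angles(position, dim, base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (hypothetical values).
q = [0.1, -0.4, 0.7, 0.2, -0.3, 0.5, 0.9, -0.6]
k = [0.3, 0.8, -0.2, 0.4, 0.6, -0.1, 0.2, 0.7]

# Both pairs of positions are 2 apart, so the scores should match
# up to floating-point error.
score_near = dot(apply_rope(q, 3), apply_rope(k, 1))
score_far = dot(apply_rope(q, 103), apply_rope(k, 101))
print(abs(score_near - score_far) < 1e-9)
```

Because the score depends only on relative position, the base controls how quickly the rotation angles grow with distance, which is why it appears as a training hyperparameter (rope_base) in the procedure described later.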

This text encoder is the GTEv1.5-en-MLM-base-8192 model in Table 13 of our paper.

Developed by: Institute for Intelligent Computing, Alibaba Group
Model type: Text Encoder
Paper: mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval.

Model list

Training Details

Training Data

Masked language modeling (MLM): c4-en

Training Procedure

To enable the backbone model to support a context length of 8192, we adopted a multi-stage training strategy. The model first undergoes preliminary MLM pre-training on shorter lengths. We then resample the data, reducing the proportion of short texts, and continue the MLM pre-training.
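The text above does not say how the data is resampled. One hypothetical way to reduce the share of short texts is to keep every long document and only a fraction of the short ones, as sketched below. The length threshold, the keep probability, the toy corpus, and the function names are all illustrative assumptions, not values or methods taken from the paper.

```python
import random

def downsample_short(lengths, threshold=2048, keep_prob=0.25, seed=0):
    """Keep all texts with length >= threshold; keep each shorter text
    with probability keep_prob (hypothetical resampling rule)."""
    rng = random.Random(seed)
    return [n for n in lengths if n >= threshold or rng.random() < keep_prob]

def short_fraction(lengths, threshold=2048):
    """Fraction of texts whose length is below the threshold."""
    return sum(n < threshold for n in lengths) / len(lengths)

# Toy corpus of token counts: 900 short texts and 100 long ones.
corpus = [256] * 900 + [6000] * 100
resampled = downsample_short(corpus)

before = short_fraction(corpus)
after = short_fraction(resampled)
print(f"short-text share: {before:.2f} -> {after:.2f}")
```

A fixed seed keeps the sketch reproducible; a real pipeline would more likely stream the corpus and might weight documents by length instead of using a hard threshold.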

The entire training process is as follows:

MLM-2048: lr 5e-4, mlm_probability 0.3, batch_size 4096, num_steps 70000, rope_base 10000
MLM-8192: lr 5e-5, mlm_probability 0.3, batch_size 1024, num_steps 20000, rope_base 500000
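The two stages above can be written down as a small configuration, which makes it easy to derive how many sequences each stage processes. This is a sketch under stated assumptions: the `MLMStage` class and its field layout are hypothetical, `max_length` is inferred from the stage names, `batch_size` is assumed to count sequences, and the token figure is only an upper bound that would hold if every sequence filled the full length.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MLMStage:
    name: str
    max_length: int         # assumed from the stage name
    lr: float
    mlm_probability: float
    batch_size: int         # assumed to be counted in sequences
    num_steps: int
    rope_base: int

    @property
    def sequences_seen(self) -> int:
        """Total sequences processed over the stage."""
        return self.batch_size * self.num_steps

    @property
    def max_tokens_seen(self) -> int:
        """Upper bound on tokens, if every sequence were full length."""
        return self.sequences_seen * self.max_length

STAGES = [
    MLMStage("MLM-2048", 2048, 5e-4, 0.3, 4096, 70000, 10000),
    MLMStage("MLM-8192", 8192, 5e-5, 0.3, 1024, 20000, 500000),
]

for stage in STAGES:
    print(stage.name, stage.sequences_seen, stage.max_tokens_seen)
```

Under these assumptions the second stage sees far fewer sequences than the first, consistent with it being a shorter continuation stage at a lower learning rate and a larger rope_base.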

Evaluation

Citation

If you find our paper or models helpful, please consider citing them as follows:

@misc{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval}, 
  author={Xin Zhang and Yanzhao Zhang and Dingkun Long and Wen Xie and Ziqi Dai and Jialong Tang and Huan Lin and Baosong Yang and Pengjun Xie and Fei Huang and Meishan Zhang and Wenjie Li and Min Zhang},
  year={2024},
  eprint={2407.19669},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2407.19669}, 
}
