Helium

Overview

Helium was proposed in Announcing Helium-1 Preview by the Kyutai Team.

Helium-1 preview is a lightweight language model with 2B parameters, targeting edge and mobile devices. It supports the following languages: English, French, German, Italian, Portuguese, Spanish.

Evaluation

Testing Data

The model was evaluated on MMLU, TriviaQA, NaturalQuestions, ARC Easy & Challenge, Open Book QA, Common Sense QA, Physical Interaction QA, Social Interaction QA, HellaSwag, WinoGrande, Multilingual Knowledge QA, FLORES 200.

Metrics

We report accuracy on MMLU, ARC, OBQA, CSQA, PIQA, SIQA, HellaSwag, WinoGrande. We report exact match on TriviaQA, NQ and MKQA. We report BLEU on FLORES.
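The exact evaluation harness is not specified on this page. Purely as an illustration, metrics of the same types can be computed with the Hugging Face evaluate library (a minimal sketch, not the pipeline that produced the tables below):

>>> # Illustrative sketch only: accuracy, exact match and BLEU via `evaluate`.
>>> # The scores in the tables below come from Kyutai's own evaluation.
>>> import evaluate

>>> accuracy = evaluate.load("accuracy")
>>> accuracy.compute(predictions=[0, 1, 1], references=[0, 1, 0])
{'accuracy': 0.6666666666666666}

>>> exact_match = evaluate.load("exact_match")
>>> exact_match.compute(predictions=["Paris"], references=["Paris"])
{'exact_match': 1.0}

>>> bleu = evaluate.load("bleu")
>>> bleu.compute(predictions=["the cat sat on the mat"], references=[["the cat sat on the mat"]])["bleu"]
1.0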

English Results

| Benchmark | Helium-1 Preview | HF SmolLM2 (1.7B) | Gemma-2 (2.6B) | Llama-3.2 (3B) | Qwen2.5 (1.5B) |
|---|---|---|---|---|---|
| MMLU | 51.2 | 50.4 | 53.1 | 56.6 | 61.0 |
| NQ | 17.3 | 15.1 | 17.7 | 22.0 | 13.1 |
| TQA | 47.9 | 45.4 | 49.9 | 53.6 | 35.9 |
| ARC E | 80.9 | 81.8 | 81.1 | 84.6 | 89.7 |
| ARC C | 62.7 | 64.7 | 66.0 | 69.0 | 77.2 |
| OBQA | 63.8 | 61.4 | 64.6 | 68.4 | 73.8 |
| CSQA | 65.6 | 59.0 | 64.4 | 65.4 | 72.4 |
| PIQA | 77.4 | 77.7 | 79.8 | 78.9 | 76.0 |
| SIQA | 64.4 | 57.5 | 61.9 | 63.8 | 68.7 |
| HS | 69.7 | 73.2 | 74.7 | 76.9 | 67.5 |
| WG | 66.5 | 65.6 | 71.2 | 72.0 | 64.8 |
| Average | 60.7 | 59.3 | 62.2 | 64.7 | 63.6 |

Multilingual Results

| Language | Benchmark | Helium-1 Preview | HF SmolLM2 (1.7B) | Gemma-2 (2.6B) | Llama-3.2 (3B) | Qwen2.5 (1.5B) |
|---|---|---|---|---|---|---|
| German | MMLU | 45.6 | 35.3 | 45.0 | 47.5 | 49.5 |
| German | ARC C | 56.7 | 38.4 | 54.7 | 58.3 | 60.2 |
| German | HS | 53.5 | 33.9 | 53.4 | 53.7 | 42.8 |
| German | MKQA | 16.1 | 7.1 | 18.9 | 20.2 | 10.4 |
| Spanish | MMLU | 46.5 | 38.9 | 46.2 | 49.6 | 52.8 |
| Spanish | ARC C | 58.3 | 43.2 | 58.8 | 60.0 | 68.1 |
| Spanish | HS | 58.6 | 40.8 | 60.5 | 61.1 | 51.4 |
| Spanish | MKQA | 16.0 | 7.9 | 18.5 | 20.6 | 10.6 |

Technical Specifications

Model Architecture and Objective

| Hyperparameter | Value |
|---|---|
| Layers | 24 |
| Heads | 20 |
| Model dimension | 2560 |
| MLP dimension | 7040 |
| Context size | 4096 |
| RoPE theta | 100,000 |
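These architecture hyperparameters line up with the HeliumConfig defaults documented further below; a quick sketch confirms the mapping (this only builds a config object, no download required):

>>> from transformers import HeliumConfig

>>> config = HeliumConfig()  # defaults mirror the table above
>>> config.num_hidden_layers, config.num_attention_heads
(24, 20)
>>> config.hidden_size, config.intermediate_size
(2560, 7040)
>>> config.max_position_embeddings, config.rope_theta
(4096, 100000.0)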

Usage tips

Helium can be found on the Hugging Face Hub.

In the following, we demonstrate how to use helium-1-preview for inference.

>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> device = "cuda" 

>>> model = AutoModelForCausalLM.from_pretrained("kyutai/helium-1-preview-2b", device_map="auto")
>>> tokenizer = AutoTokenizer.from_pretrained("kyutai/helium-1-preview-2b")

>>> prompt = "Give me a short introduction to large language model."

>>> model_inputs = tokenizer(prompt, return_tensors="pt").to(device)

>>> generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512, do_sample=True)

>>> generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]

>>> response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
HeliumConfig

class transformers.HeliumConfig

( vocab_size = 48000, hidden_size = 2560, intermediate_size = 7040, num_hidden_layers = 24, num_attention_heads = 20, num_key_value_heads = 20, head_dim = 128, hidden_act = 'silu', attention_dropout = 0.0, max_position_embeddings = 4096, initializer_range = 0.02, rms_norm_eps = 1e-08, use_cache = True, tie_word_embeddings = False, rope_theta = 100000.0, pad_token_id = 3, eos_token_id = 2, bos_token_id = 1, attention_bias = False, mlp_bias = False, **kwargs )


This is the configuration class to store the configuration of a HeliumModel. It is used to instantiate a Helium model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the Helium 2B model, e.g. kyutai/helium-2b. Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.

>>> from transformers import HeliumModel, HeliumConfig

>>> # Initializing a Helium-2b style configuration
>>> configuration = HeliumConfig()

>>> # Initializing a model from the Helium-2b style configuration
>>> model = HeliumModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
HeliumModel

class transformers.HeliumModel

( config: HeliumConfig )


The bare Helium Model outputting raw hidden-states without any specific head on top. This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

Transformer decoder consisting of config.num_hidden_layers layers. Each layer is a HeliumDecoderLayer.

forward

( input_ids: typing.Optional[torch.LongTensor] = None, attention_mask: typing.Optional[torch.Tensor] = None, position_ids: typing.Optional[torch.LongTensor] = None, past_key_values: typing.Optional[transformers.cache_utils.Cache] = None, inputs_embeds: typing.Optional[torch.FloatTensor] = None, use_cache: typing.Optional[bool] = None, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Optional[bool] = None, cache_position: typing.Optional[torch.LongTensor] = None, **flash_attn_kwargs: typing_extensions.Unpack[transformers.modeling_flash_attention_utils.FlashAttentionKwargs] )


The HeliumModel forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.
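For example, to extract the raw hidden states, call the model instance directly rather than forward() (a minimal sketch, assuming the kyutai/helium-1-preview-2b checkpoint; loading it into the bare HeliumModel drops the language-modeling head):

>>> import torch
>>> from transformers import AutoTokenizer, HeliumModel

>>> tokenizer = AutoTokenizer.from_pretrained("kyutai/helium-1-preview-2b")
>>> model = HeliumModel.from_pretrained("kyutai/helium-1-preview-2b")

>>> inputs = tokenizer("Hello, world!", return_tensors="pt")
>>> with torch.no_grad():
...     outputs = model(**inputs)  # call the instance, not model.forward(...)
>>> outputs.last_hidden_state.shape  # (batch_size, sequence_length, hidden_size=2560)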

HeliumForCausalLM

class transformers.HeliumForCausalLM

( config: HeliumConfig )

forward

( input_ids: typing.Optional[torch.LongTensor] = None, attention_mask: typing.Optional[torch.Tensor] = None, position_ids: typing.Optional[torch.LongTensor] = None, past_key_values: typing.Optional[transformers.cache_utils.Cache] = None, inputs_embeds: typing.Optional[torch.FloatTensor] = None, labels: typing.Optional[torch.LongTensor] = None, use_cache: typing.Optional[bool] = None, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Optional[bool] = None, cache_position: typing.Optional[torch.LongTensor] = None, logits_to_keep: typing.Union[int, torch.Tensor] = 0, **kwargs: typing_extensions.Unpack[transformers.models.helium.modeling_helium.KwargsForCausalLM] ) → transformers.modeling_outputs.CausalLMOutputWithPast or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.CausalLMOutputWithPast or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (HeliumConfig) and inputs.

The HeliumForCausalLM forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

>>> from transformers import AutoTokenizer, HeliumForCausalLM

>>> model = HeliumForCausalLM.from_pretrained("kyutai/helium-1-preview-2b")
>>> tokenizer = AutoTokenizer.from_pretrained("kyutai/helium-1-preview-2b")

>>> prompt = "What is your favorite condiment?"
>>> inputs = tokenizer(prompt, return_tensors="pt")

>>> # Generate
>>> generate_ids = model.generate(inputs.input_ids, max_length=30)
>>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
"What is your favorite condiment?"
HeliumForSequenceClassification

class transformers.HeliumForSequenceClassification

( config: HeliumConfig )


The Helium Model transformer with a sequence classification head on top (linear layer).

HeliumForSequenceClassification uses the last token in order to do the classification, as other causal models (e.g. GPT-2) do.

Since it does classification on the last token, it needs to know the position of the last token. If a pad_token_id is defined in the configuration, it finds the last token that is not a padding token in each row. If no pad_token_id is defined, it simply takes the last value in each row of the batch. Since it cannot guess the padding tokens when inputs_embeds are passed instead of input_ids, it does the same (takes the last value in each row of the batch).

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
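As a minimal sketch of the last-token behavior (the kyutai/helium-1-preview-2b checkpoint ships no classification head, so the head below is newly initialized, and num_labels=2 is an illustrative assumption):

>>> import torch
>>> from transformers import AutoTokenizer, HeliumForSequenceClassification

>>> tokenizer = AutoTokenizer.from_pretrained("kyutai/helium-1-preview-2b")
>>> model = HeliumForSequenceClassification.from_pretrained(
...     "kyutai/helium-1-preview-2b", num_labels=2
... )  # classification head is randomly initialized, not fine-tuned

>>> # pad_token_id is set in the config (3), so for each padded row the logits
>>> # of the last non-padding token are used for classification; this assumes
>>> # the tokenizer defines a pad token
>>> inputs = tokenizer(["short", "a noticeably longer input"], padding=True, return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits
>>> logits.shape  # (batch_size, num_labels) == (2, 2)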

forward

( input_ids: typing.Optional[torch.LongTensor] = None, attention_mask: typing.Optional[torch.Tensor] = None, position_ids: typing.Optional[torch.LongTensor] = None, past_key_values: typing.Optional[transformers.cache_utils.Cache] = None, inputs_embeds: typing.Optional[torch.FloatTensor] = None, labels: typing.Optional[torch.LongTensor] = None, use_cache: typing.Optional[bool] = None, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Optional[bool] = None )


The HeliumForSequenceClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

HeliumForTokenClassification

class transformers.HeliumForTokenClassification

( config: HeliumConfig )


The Helium Model transformer with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None, attention_mask: typing.Optional[torch.Tensor] = None, position_ids: typing.Optional[torch.LongTensor] = None, past_key_values: typing.Optional[transformers.cache_utils.Cache] = None, inputs_embeds: typing.Optional[torch.FloatTensor] = None, labels: typing.Optional[torch.LongTensor] = None, use_cache: typing.Optional[bool] = None, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Optional[bool] = None ) → transformers.modeling_outputs.TokenClassifierOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.TokenClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (HeliumConfig) and inputs.

The HeliumForTokenClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

>>> from transformers import AutoTokenizer, HeliumForTokenClassification
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("kyutai/helium-1-preview-2b")
>>> model = HeliumForTokenClassification.from_pretrained("kyutai/helium-1-preview-2b")

>>> inputs = tokenizer(
...     "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="pt"
... )

>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> predicted_token_class_ids = logits.argmax(-1)

>>> # Note that tokens are classified rather than input words, which means that
>>> # there might be more predicted token classes than words.
>>> # Multiple token classes might account for the same word.
>>> predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]

>>> labels = predicted_token_class_ids
>>> loss = model(**inputs, labels=labels).loss
