In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses do this job for you: they automatically retrieve the relevant model class given the name/path of the pretrained weights/config/vocabulary.
Instantiating one of AutoConfig, AutoModel, or AutoTokenizer will directly create a class of the relevant architecture. For instance
model = AutoModel.from_pretrained("google-bert/bert-base-cased")
will create a model that is an instance of BertModel.
There is one class of AutoModel for each task, and for each backend (PyTorch, TensorFlow, or Flax).
Each of the auto classes has a register() method that lets you extend it with your own custom classes. For instance, if you have defined a custom model class NewModel, make sure you have a matching NewModelConfig; then you can add them to the auto classes like this:

from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
You will then be able to use the auto classes like you would usually do!
If your NewModelConfig is a subclass of PretrainedConfig, make sure its model_type attribute is set to the same key you use when registering the config (here "new-model"). Likewise, if your NewModel is a subclass of PreTrainedModel, make sure its config_class attribute is set to the same class you use when registering the model (here NewModelConfig).
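Putting these pieces together, a minimal end-to-end sketch might look like the following. Note that NewModelConfig, NewModel, and the hidden_size parameter are illustrative names for this sketch, not part of the library:

```python
import torch
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel

class NewModelConfig(PretrainedConfig):
    # model_type must match the key used with AutoConfig.register below.
    model_type = "new-model"

    def __init__(self, hidden_size=64, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

class NewModel(PreTrainedModel):
    # config_class must match the class registered with AutoModel.register below.
    config_class = NewModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.linear = torch.nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, x):
        return self.linear(x)

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)

# The auto classes now resolve "new-model" to the custom classes.
config = AutoConfig.for_model("new-model", hidden_size=32)
model = AutoModel.from_config(config)
```

After registration, from_pretrained() on a directory saved with save_pretrained() would resolve to the custom classes in the same way.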
AutoConfig
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
( pretrained_model_name_or_path **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co.
  - A path to a directory containing a configuration file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final configuration object. If True, it returns a tuple (config, unused_kwargs), where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that is loaded, or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:

- MaskFormerSwinConfig (MaskFormerSwin model)
- VitPoseBackboneConfig (ViTPoseBackbone model)

Examples:
>>> from transformers import AutoConfig
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained(
...     "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}

register
( model_type config exist_ok = False )
Parameters

- model_type (str) — The model type like "bert" or "gpt".
- config (PretrainedConfig) — The config to register.

Register a new configuration for this class.
AutoTokenizer
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
( pretrained_model_name_or_path *inputs **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a directory containing vocabulary files required by the tokenizer, e.g., ./my_model_directory/.
  - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file, e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
- inputs (additional positional arguments, optional) — Will be passed along to the Tokenizer __init__() method.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- subfolder (str, optional) — In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
- use_fast (bool, optional, defaults to True) — Use a fast Rust-based tokenizer if it is supported for a given model. If a fast tokenizer is not available for a given model, a normal Python-based tokenizer is returned instead.
- tokenizer_type (str, optional) — Tokenizer type to be loaded.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Will be passed along to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)

register
( config_class slow_tokenizer_class = None fast_tokenizer_class = None exist_ok = False )
Parameters

- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- slow_tokenizer_class (PretrainedTokenizer, optional) — The slow tokenizer to register.
- fast_tokenizer_class (PretrainedTokenizerFast, optional) — The fast tokenizer to register.

Register a new tokenizer in this mapping.
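As a minimal sketch of this registration (NewModelConfig and NewModelTokenizer are illustrative names; a real slow tokenizer would also need to implement the vocabulary methods of PreTrainedTokenizer):

```python
from transformers import AutoTokenizer, PretrainedConfig, PreTrainedTokenizer

class NewModelConfig(PretrainedConfig):
    model_type = "new-model"

class NewModelTokenizer(PreTrainedTokenizer):
    # A real implementation would define vocabulary handling
    # (get_vocab, _tokenize, _convert_token_to_id, ...).
    pass

# Map the config class to the tokenizer class in AutoTokenizer's registry.
AutoTokenizer.register(NewModelConfig, slow_tokenizer_class=NewModelTokenizer)
```

Once registered, AutoTokenizer.from_pretrained() on a checkpoint whose config has model_type "new-model" will resolve to NewModelTokenizer.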
AutoFeatureExtractor
This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
( pretrained_model_name_or_path **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - A string, the model id of a pretrained feature extractor hosted inside a model repo on huggingface.co.
  - A path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the feature extractor files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final feature extractor object. If True, it returns a tuple (feature_extractor, unused_kwargs), where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the feature extractor classes of the library from a pretrained model.
The feature extractor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:

- OwlViTFeatureExtractor (OWL-ViT model)

Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoFeatureExtractor
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

register
( config_class feature_extractor_class exist_ok = False )
Parameters

- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- feature_extractor_class (FeatureExtractorMixin) — The feature extractor to register.

Register a new feature extractor for this class.
AutoImageProcessor
class transformers.AutoImageProcessor( )
This is a generic image processor class that will be instantiated as one of the image processor classes of the library when created with the AutoImageProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
( pretrained_model_name_or_path *inputs **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - A string, the model id of a pretrained image processor hosted inside a model repo on huggingface.co.
  - A path to a directory containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a saved image processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the image processor files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- use_fast (bool, optional, defaults to False) — Use a fast torchvision-based image processor if it is supported for a given model. If a fast image processor is not available for a given model, a normal numpy-based image processor is returned instead.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final image processor object. If True, it returns a tuple (image_processor, unused_kwargs), where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of kwargs which has not been used to update image_processor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- image_processor_filename (str, optional, defaults to "config.json") — The name of the file in the model directory to use for the image processor config.
- kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are image processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not image processor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the image processor classes of the library from a pretrained model.
The image processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:
- Llama4ImageProcessor or Llama4ImageProcessorFast (Llama4 model)
- Phi4MultimodalImageProcessor or Phi4MultimodalImageProcessorFast (Phi4Multimodal model)

Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoImageProcessor
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

register
( config_class image_processor_class = None slow_image_processor_class = None fast_image_processor_class = None exist_ok = False )
Parameters

- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- image_processor_class (optional) — The image processor to register.
- slow_image_processor_class (optional) — The slow image processor to register.
- fast_image_processor_class (optional) — The fast image processor to register.

Register a new image processor for this class.
AutoProcessor
This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
( pretrained_model_name_or_path **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - A string, the model id of a pretrained processor hosted inside a model repo on huggingface.co.
  - A path to a directory containing processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which the downloaded pretrained processor files should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the processor files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final processor object. If True, it returns a tuple (processor, unused_kwargs), where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not processor attributes: i.e., the part of kwargs which has not been used to update the processor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not processor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible):

Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

register
( config_class processor_class exist_ok = False )
Parameters

- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- processor_class (ProcessorMixin) — The processor to register.

Register a new processor for this class.
Generic model classes
The following auto classes are available for instantiating a base model class without a specific head.
AutoModel
class transformers.AutoModel
( *args **kwargs )
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a tensorflow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so it can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:

- MaskFormerSwinModel (MaskFormerSwin model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
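The mode toggle can be checked on the module's training flag. A minimal sketch, built with from_config on a small BERT-style config so no weights are downloaded (the config sizes here are arbitrary illustrative values):

```python
from transformers import AutoConfig, AutoModel

# Build a tiny BERT-style model from a fresh config (no weight download).
config = AutoConfig.for_model(
    "bert", hidden_size=32, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=64,
)
model = AutoModel.from_config(config)

model.eval()    # deactivate dropout, as from_pretrained() does for you
model.train()   # switch back to training mode before fine-tuning
```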
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModel
class transformers.TFAutoModel
( *args **kwargs )
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so it can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it's missing, by falling back to pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModel
class transformers.FlaxAutoModel
( *args **kwargs )
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModel
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModel.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so it can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
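This model_type-based selection with a name-matching fallback can be sketched as a small registry. This is a simplified illustration under assumed toy classes (BertModel and GPT2Model here are stand-ins), not the library's actual implementation:

```python
# Simplified sketch of AutoModel-style dispatch: a registry maps a config's
# model_type to a concrete class; when no config/model_type is available,
# fall back to pattern matching on the checkpoint name. Toy classes only.

class BertModel:
    pass

class GPT2Model:
    pass

# model_type -> model class, analogous to the library's internal mapping
MODEL_REGISTRY = {
    "bert": BertModel,
    "gpt2": GPT2Model,
}

def resolve_model_class(name_or_path, model_type=None):
    # Prefer the explicit model_type taken from a loaded config.
    if model_type is not None:
        return MODEL_REGISTRY[model_type]
    # Fallback: pattern matching on the pretrained name/path.
    for key, cls in MODEL_REGISTRY.items():
        if key in name_or_path.lower():
            return cls
    raise ValueError(f"Could not infer model type from {name_or_path!r}")

print(resolve_model_class("google-bert/bert-base-cased").__name__)  # BertModel
print(resolve_model_class("anything", model_type="gpt2").__name__)  # GPT2Model
```

Registering a custom class, as described earlier with AutoConfig.register and AutoModel.register, amounts to adding a new entry to such a mapping.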
Examples:
>>> from transformers import AutoConfig, FlaxAutoModel

>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Generic pretraining classes
The following auto classes are available for instantiating a model with a pretraining head.
AutoModelForPreTraining

class transformers.AutoModelForPreTraining < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of one loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
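The kwargs dispatch described in the parameters above, where keys matching configuration attributes override the config and the remainder is forwarded to the model's __init__, can be sketched as follows. This is a minimal illustration with toy Config and Model classes, not the library's code:

```python
# Sketch of the kwargs split performed when no explicit config is passed:
# keys that match configuration attributes override the config; leftover
# keys are forwarded to the model's __init__. Toy classes only.

class Config:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

class Model:
    def __init__(self, config, **init_kwargs):
        self.config = config
        self.init_kwargs = init_kwargs

def from_pretrained_sketch(**kwargs):
    config = Config()
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides a configuration attribute
        else:
            remaining[key] = value       # passed on to the model's __init__
    return Model(config, **remaining)

model = from_pretrained_sketch(output_attentions=True, some_model_arg=1)
print(model.config.output_attentions)  # True
print(model.init_kwargs)               # {'some_model_arg': 1}
```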
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForPreTraining

class transformers.TFAutoModelForPreTraining < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForPreTraining

class transformers.FlaxAutoModelForPreTraining < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Natural Language Processing
The following auto classes are available for the following natural language processing tasks.
AutoModelForCausalLM

class transformers.AutoModelForCausalLM < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of one loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
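This train/eval toggling follows PyTorch nn.Module semantics. The effect can be sketched with a minimal torch-free stand-in (ToyModel here is a hypothetical class, not part of transformers):

```python
# Minimal stand-in for the train/eval toggle: from_pretrained hands back the
# model in evaluation mode, so stochastic layers such as dropout stay inactive
# until model.train() is called. Toy class mimicking nn.Module's flag.

class ToyModel:
    def __init__(self):
        self.training = True  # torch modules start in training mode

    def eval(self):
        self.training = False
        return self

    def train(self, mode=True):
        self.training = mode
        return self

def from_pretrained_sketch():
    # AutoModel*.from_pretrained calls model.eval() before returning.
    return ToyModel().eval()

model = from_pretrained_sketch()
print(model.training)  # False: dropout and similar layers are disabled
model.train()
print(model.training)  # True: back in training mode for fine-tuning
```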
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForCausalLM

class transformers.TFAutoModelForCausalLM < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForCausalLM

class transformers.FlaxAutoModelForCausalLM < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForCausalLM.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskedLM

class transformers.AutoModelForMaskedLM < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)

from_pretrained < source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

wav2vec2 — Wav2Vec2ForMaskedLM (Wav2Vec2 model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedLM

class transformers.TFAutoModelForMaskedLM

< source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM >>> >>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased") >>> model = TFAutoModelForMaskedLM.from_config(config)from_pretrained < source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMaskedLM

class transformers.FlaxAutoModelForMaskedLM

< source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMaskedLM.from_config(config)

from_pretrained

< source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskGeneration

class transformers.AutoModelForMaskGeneration

< source >
( *args **kwargs )
TFAutoModelForMaskGeneration

class transformers.TFAutoModelForMaskGeneration

< source >

( *args **kwargs )
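The mask-generation auto classes above appear here without a description or example. As a minimal sketch (assuming the SAM family, whose SamModel is registered for mask generation), a model can be built from a bare SamConfig via from_config, which creates randomly initialized weights and downloads nothing:

```python
from transformers import AutoModelForMaskGeneration, SamConfig

# from_config instantiates the architecture mapped to the config's model_type
# ("sam" -> SamModel) with random weights; no checkpoint is fetched.
config = SamConfig()
model = AutoModelForMaskGeneration.from_config(config)
print(type(model).__name__)  # SamModel
```

To load pretrained weights instead, you would call AutoModelForMaskGeneration.from_pretrained() with a mask-generation checkpoint, following the same pattern as the other auto classes documented here.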
AutoModelForSeq2SeqLM

class transformers.AutoModelForSeq2SeqLM

< source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)

from_pretrained

< source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSeq2SeqLM

class transformers.TFAutoModelForSeq2SeqLM

< source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

< source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSeq2SeqLM

class transformers.FlaxAutoModelForSeq2SeqLM

< source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = FlaxAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

< source >
( *model_args **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification

< source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)

from_pretrained

< source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights files. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
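The kwargs routing just described (config attributes are overridden, everything else flows to the model constructor) can be sketched in plain Python; the attribute names below are an illustrative subset, not the library's internals:

```python
# Hypothetical sketch of how from_pretrained splits **kwargs when no `config`
# object is passed: keys matching config attributes update the config, the
# rest go to the model's __init__.
def split_kwargs(config_attributes, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

config_attrs = {"output_attentions", "num_labels"}  # illustrative subset
updates, rest = split_kwargs(config_attrs, {"output_attentions": True, "foo": 1})
print(updates)  # {'output_attentions': True}
print(rest)     # {'foo': 1}
```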
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
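The selection logic above can be pictured as a registry lookup on model_type with a name-pattern fallback; a plain-Python sketch (the registry contents and ordering rule here are illustrative, not the library's actual mapping):

```python
# Hypothetical sketch of how an auto class picks a concrete model class.
# The real mapping in transformers is far larger; these two entries are
# shown purely as an illustration.
MODEL_MAPPING = {
    "bert": "BertForSequenceClassification",
    "roberta": "RobertaForSequenceClassification",
}

def select_model_class(config_model_type, pretrained_model_name_or_path):
    # 1. Prefer the `model_type` attribute of the (passed or loaded) config.
    if config_model_type in MODEL_MAPPING:
        return MODEL_MAPPING[config_model_type]
    # 2. Fall back to pattern matching on the name/path; try longer, more
    #    specific patterns first so "roberta" wins over "bert".
    for pattern in sorted(MODEL_MAPPING, key=len, reverse=True):
        if pattern in pretrained_model_name_or_path:
            return MODEL_MAPPING[pattern]
    raise ValueError("Unrecognized model")

print(select_model_class("bert", "google-bert/bert-base-cased"))
print(select_model_class(None, "FacebookAI/roberta-base"))
```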
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSequenceClassification

class transformers.TFAutoModelForSequenceClassification
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSequenceClassification.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSequenceClassification

class transformers.FlaxAutoModelForSequenceClassification
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSequenceClassification.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMultipleChoice

class transformers.AutoModelForMultipleChoice
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights files. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
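Multiple-choice models score each question paired with every candidate answer, so inputs have an extra num_choices dimension. A minimal plain-Python sketch of that pairing step, with no tokenizer involved and hypothetical helper names:

```python
# Hypothetical sketch: build the question/choice pairs that a multiple-choice
# head scores. A real pipeline would tokenize each pair into tensors of shape
# (batch, num_choices, seq_len); here we just pair the raw strings.
def build_choice_pairs(question, choices):
    return [(question, choice) for choice in choices]

pairs = build_choice_pairs(
    "The capital of France is", ["Paris", "Berlin", "Madrid"]
)
print(len(pairs))   # one entry per candidate answer
print(pairs[0])
```

The model then produces one score per pair, and the highest-scoring choice is the prediction.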
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMultipleChoice

class transformers.TFAutoModelForMultipleChoice
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMultipleChoice.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMultipleChoice

class transformers.FlaxAutoModelForMultipleChoice
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMultipleChoice.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForNextSentencePrediction

class transformers.AutoModelForNextSentencePrediction
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)

from_pretrained
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:

- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.

config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights files. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of the pretrained_model_name_or_path argument).

force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.

kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
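The selection logic described above can be sketched, purely for illustration, as a registry lookup with a name-pattern fallback. The registry contents and function name below are hypothetical, not the library's actual mapping:

```python
# Illustrative sketch of auto-class dispatch: the library keeps a mapping
# from config.model_type to a concrete model class, and falls back to
# substring matching on the checkpoint name when model_type is missing.
_MODEL_REGISTRY = {  # hypothetical contents
    "bert": "BertForNextSentencePrediction",
    "megatron-bert": "MegatronBertForNextSentencePrediction",
}

def resolve_model_class(model_type=None, name_or_path=""):
    # Prefer the explicit model_type from the config object.
    if model_type in _MODEL_REGISTRY:
        return _MODEL_REGISTRY[model_type]
    # Fall back to pattern matching on the pretrained name/path.
    # Longer keys are checked first so "megatron-bert" wins over "bert".
    for key in sorted(_MODEL_REGISTRY, key=len, reverse=True):
        if key in name_or_path:
            return _MODEL_REGISTRY[key]
    raise ValueError(f"Unrecognized model: {name_or_path!r}")

print(resolve_model_class("bert"))
print(resolve_model_class(None, "google-bert/bert-base-cased"))
```

Because the explicit model_type is checked first, a config object always takes precedence over whatever the checkpoint happens to be named.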
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForNextSentencePrediction
class transformers.TFAutoModelForNextSentencePrediction < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
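The kwargs handling described above (when no explicit config is passed) can be sketched as splitting keyword arguments into configuration overrides and model __init__ arguments. This is a purely illustrative sketch with a hypothetical config class, not the library's internals:

```python
# Illustrative sketch: when from_pretrained() is not given an explicit
# config, each kwarg matching a configuration attribute overrides it;
# the remaining kwargs are forwarded to the model's __init__.
class ToyConfig:  # hypothetical stand-in for a PretrainedConfig subclass
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # configuration attribute override
        else:
            model_kwargs[key] = value     # passed on to the model __init__
    return config, model_kwargs

cfg, rest = split_kwargs(ToyConfig(), {"output_attentions": True, "foo": 1})
print(cfg.output_attentions, rest)
```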
Examples:
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForNextSentencePrediction
class transformers.FlaxAutoModelForNextSentencePrediction < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
( **kwargs )
Parameters
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA will be used if available for torch>=2.1.1; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
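The default-selection behavior described for attn_implementation can be sketched as a simple fallback chain. This is illustrative only; the function name and availability check are hypothetical, not the library's internals:

```python
# Illustrative sketch of attention-implementation selection:
# honor an explicitly requested backend, otherwise prefer SDPA when
# available, and fall back to the manual "eager" implementation.
SUPPORTED = ("eager", "sdpa", "flash_attention_2")

def pick_attn_implementation(requested=None, sdpa_available=True):
    if requested is not None:
        if requested not in SUPPORTED:
            raise ValueError(f"Unknown attention implementation: {requested!r}")
        return requested
    return "sdpa" if sdpa_available else "eager"

print(pick_attn_implementation())                      # prefers SDPA
print(pick_attn_implementation(sdpa_available=False))  # falls back to eager
print(pick_attn_implementation("flash_attention_2"))   # explicit request wins
```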
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForTokenClassification
class transformers.AutoModelForTokenClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForTokenClassification
class transformers.TFAutoModelForTokenClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForTokenClassification
class transformers.FlaxAutoModelForTokenClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForQuestionAnswering
class transformers.AutoModelForQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForQuestionAnswering

class transformers.TFAutoModelForQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
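The distinction between from_config (architecture only) and from_pretrained (architecture plus weights) can be sketched as follows. ToyModel and its methods are hypothetical illustrations, not the transformers API.

```python
# Illustrative sketch: from_config builds the architecture with freshly
# initialized weights; only from_pretrained actually loads trained weights.
class ToyModel:
    def __init__(self, config):
        self.config = config
        self.weights = {"w": 0.0}         # freshly initialized, NOT pretrained

    @classmethod
    def from_config(cls, config):
        return cls(config)                # architecture only, default weights

    @classmethod
    def from_pretrained(cls, config, state_dict):
        model = cls(config)
        model.weights.update(state_dict)  # pretrained weights are loaded here
        return model

fresh = ToyModel.from_config({"model_type": "toy"})
loaded = ToyModel.from_pretrained({"model_type": "toy"}, {"w": 1.5})
print(fresh.weights["w"], loaded.weights["w"])  # 0.0 1.5
```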
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForQuestionAnswering.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ../pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForQuestionAnswering

class transformers.FlaxAutoModelForQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForQuestionAnswering.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ../pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
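The kwargs handling described above can be sketched in plain Python. This is an illustrative sketch, not transformers code: ToyConfig and split_kwargs are hypothetical names.

```python
# Illustrative sketch: kwargs that match configuration attributes override the
# config; the remaining kwargs are forwarded to the model __init__.
class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 16

def split_kwargs(config, kwargs):
    """Apply config-attribute kwargs to config; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # override a configuration attribute
        else:
            model_kwargs[key] = value     # pass through to the model __init__
    return model_kwargs

config = ToyConfig()
rest = split_kwargs(config, {"output_attentions": True, "custom_flag": 1})
print(config.output_attentions, rest)  # True {'custom_flag': 1}
```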
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForTextEncoding

class transformers.AutoModelForTextEncoding < source >
( *args **kwargs )
TFAutoModelForTextEncoding

class transformers.TFAutoModelForTextEncoding < source >

( *args **kwargs )
Computer vision

The following auto classes are available for the following computer vision tasks.
AutoModelForDepthEstimation

class transformers.AutoModelForDepthEstimation < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForDepthEstimation.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ../tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> model = AutoModelForDepthEstimation.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForDepthEstimation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForDepthEstimation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageClassification

class transformers.AutoModelForImageClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageClassification.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ../tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> model = AutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForImageClassification

class transformers.TFAutoModelForImageClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForImageClassification.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ../pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> model = TFAutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForImageClassification

class transformers.FlaxAutoModelForImageClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForImageClassification.from_config(config)

from_pretrained < source >
( pretrained_model_name_or_path *model_args **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a PyTorch state_dict save file (e.g., ../pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded, for instance, when the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is being loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> model = FlaxAutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForVideoClassification

class transformers.AutoModelForVideoClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a video classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForVideoClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
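The effect of the evaluation/training switch on dropout can be sketched without any framework. `ToyDropout` below is a hypothetical toy class mimicking the `training` flag of a PyTorch module, not the real `nn.Module`:

```python
import random

# Toy module mimicking PyTorch's train()/eval() switch (illustration only).
class ToyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # modules start in training mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def forward(self, values):
        if not self.training:
            # In evaluation mode dropout is a no-op: inputs pass through.
            return list(values)
        # In training mode each value is zeroed with probability p.
        return [0.0 if random.random() < self.p else v for v in values]

layer = ToyDropout().eval()
print(layer.forward([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0] — deterministic in eval mode
```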
Examples:
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> model = AutoModelForVideoClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForVideoClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVideoClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForKeypointDetection

class transformers.AutoModelForKeypointDetection < source >
( *args **kwargs )
AutoModelForMaskedImageModeling

class transformers.AutoModelForMaskedImageModeling < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedImageModeling.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedImageModeling

class transformers.TFAutoModelForMaskedImageModeling < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedImageModeling.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForObjectDetection

class transformers.AutoModelForObjectDetection < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForObjectDetection.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageSegmentation

class transformers.AutoModelForImageSegmentation < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
( **kwargs )
Parameters
str
, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager"
(manual implementation of the attention), "sdpa"
(using F.scaled_dot_product_attention
), or "flash_attention_2"
(using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager"
implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.
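The default-selection rule for the attention implementation described above ("eager", "sdpa", or "flash_attention_2", with SDPA preferred when torch>=2.1.1 makes it available) can be sketched as follows. `pick_attn_implementation` is a hypothetical helper for illustration; the real dispatch lives inside transformers:

```python
# Sketch of the attn_implementation selection rule (hypothetical helper).
def pick_attn_implementation(requested=None, torch_version=(2, 1, 1)):
    valid = {"eager", "sdpa", "flash_attention_2"}
    if requested is not None:
        if requested not in valid:
            raise ValueError(f"unknown implementation: {requested!r}")
        return requested
    # Default: SDPA when torch >= 2.1.1 makes it available, else eager.
    return "sdpa" if torch_version >= (2, 1, 1) else "eager"

print(pick_attn_implementation())                         # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 0)))  # eager
print(pick_attn_implementation("flash_attention_2"))      # flash_attention_2
```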
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageSegmentation.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageToImage

class transformers.AutoModelForImageToImage < source >
( *args **kwargs )
AutoModelForSemanticSegmentation

class transformers.AutoModelForSemanticSegmentation < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSemanticSegmentation.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> model = AutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSemanticSegmentation

class transformers.TFAutoModelForSemanticSegmentation < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSemanticSegmentation.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
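The kwargs handling described above (keys that match config attributes override the config; the remainder go to the model) can be sketched as follows (a simplified illustration, not the transformers implementation):

```python
def split_kwargs(config_attrs, kwargs):
    """Partition from_pretrained kwargs into config overrides and model __init__ kwargs."""
    overrides = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return overrides, model_kwargs

overrides, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},           # attributes the config defines
    {"output_attentions": True, "custom_flag": 1},  # kwargs given to from_pretrained
)
print(overrides)     # {'output_attentions': True}
print(model_kwargs)  # {'custom_flag': 1}
```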
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForInstanceSegmentation

class transformers.AutoModelForInstanceSegmentation < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForInstanceSegmentation.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForUniversalSegmentation

class transformers.AutoModelForUniversalSegmentation < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForUniversalSegmentation.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForUniversalSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForZeroShotImageClassification

class transformers.AutoModelForZeroShotImageClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> model = AutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForZeroShotImageClassification

class transformers.TFAutoModelForZeroShotImageClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
( **kwargs )
Parameters
str
, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager"
(manual implementation of the attention), "sdpa"
(using F.scaled_dot_product_attention
), or "flash_attention_2"
(using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager"
implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.
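The default backend fallback described above can be sketched as follows (an illustrative simplification; the real selection also depends on installed packages and hardware support):

```python
def pick_attn_implementation(requested=None, torch_version=(2, 2, 0)):
    """Toy sketch of the default attention-backend choice (not transformers internals)."""
    if requested is not None:
        return requested          # an explicit attn_implementation always wins
    if torch_version >= (2, 1, 1):
        return "sdpa"             # F.scaled_dot_product_attention is available
    return "eager"                # manual implementation otherwise

print(pick_attn_implementation())                               # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 1)))        # eager
print(pick_attn_implementation(requested="flash_attention_2"))  # flash_attention_2
```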
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForZeroShotImageClassification.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForZeroShotObjectDetection

class transformers.AutoModelForZeroShotObjectDetection < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

Audio
The following auto classes are available for the following audio tasks.
AutoModelForAudioClassification

class transformers.AutoModelForAudioClassification < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an audio classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioClassification.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
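The kwargs routing described above can be sketched as a minimal toy; ToyConfig and route_kwargs are hypothetical names, not transformers internals:

```python
# Toy sketch of from_pretrained's kwargs routing when no config is passed:
# keys matching configuration attributes override the config, leftover keys
# are forwarded to the model's __init__. All names here are hypothetical.

class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def route_kwargs(config, **kwargs):
    """Apply config-attribute overrides in place; return the leftovers."""
    leftovers = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the configuration attribute
        else:
            leftovers[key] = value       # destined for the model __init__
    return leftovers

config = ToyConfig()
rest = route_kwargs(config, output_attentions=True, custom_model_arg=3)
print(config.output_attentions)  # True
print(rest)                      # {'custom_model_arg': 3}
```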
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioClassification
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
TFAutoModelForAudioClassification class transformers.TFAutoModelForAudioClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an audio classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForAudioClassification.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForAudioClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
AutoModelForAudioFrameClassification class transformers.AutoModelForAudioFrameClassification < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioFrameClassification.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
AutoModelForCTC class transformers.AutoModelForCTC < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForCTC
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCTC.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
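The eval/train behavior noted above can be illustrated with a minimal toy module (not a real torch.nn.Module); it only mimics the training flag that gates stochastic layers like dropout:

```python
# Toy illustration of why from_pretrained returns models in eval mode:
# modules carry a `training` flag, and dropout only fires when it is True.
import random

class ToyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # modules default to train mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def forward(self, x):
        if self.training:
            # Randomly zero elements, rescaling the survivors.
            return [0.0 if random.random() < self.p else v / (1 - self.p) for v in x]
        return list(x)  # identity at inference time

m = ToyDropout().eval()       # from_pretrained does the equivalent of .eval()
print(m.forward([1.0, 2.0]))  # deterministic: [1.0, 2.0]
```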
Examples:
>>> from transformers import AutoConfig, AutoModelForCTC
>>> model = AutoModelForCTC.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCTC.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
AutoModelForSpeechSeq2Seq class transformers.AutoModelForSpeechSeq2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
TFAutoModelForSpeechSeq2Seq class transformers.TFAutoModelForSpeechSeq2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSpeechSeq2Seq.from_config(config)
from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config
is provided or automatically loaded:
If a config is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a config is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
FlaxAutoModelForSpeechSeq2Seq class transformers.FlaxAutoModelForSpeechSeq2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForAudioXVector

class transformers.AutoModelForAudioXVector < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an x-vector audio retrieval head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an x-vector audio retrieval head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioXVector
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioXVector.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an x-vector audio retrieval head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioXVector
>>> model = AutoModelForAudioXVector.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioXVector.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForTextToSpectrogram

class transformers.AutoModelForTextToSpectrogram < source >
( *args **kwargs )
AutoModelForTextToWaveform

class transformers.AutoModelForTextToWaveform < source >

( *args **kwargs )

Multimodal

The following auto classes are available for the following multimodal tasks.

AutoModelForTableQuestionAnswering

class transformers.AutoModelForTableQuestionAnswering < source >

( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
( **kwargs )
Parameters
str
, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager"
(manual implementation of the attention), "sdpa"
(using F.scaled_dot_product_attention
), or "flash_attention_2"
(using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager"
implementation. Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
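The kwargs-override behavior described above can be sketched offline (a hypothetical example, not from the official docs): a tiny randomly initialized BERT is saved to a temporary directory and reloaded with a configuration attribute overridden through a keyword argument. Assumes transformers and torch are installed; the config sizes are arbitrary:

```python
import tempfile

from transformers import AutoModel, BertConfig

# Build and save a tiny model so from_pretrained can run offline.
config = BertConfig(hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)
model = AutoModel.from_config(config)

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    # output_attentions matches a configuration attribute, so the kwarg
    # overrides the saved value instead of reaching the model's __init__.
    reloaded = AutoModel.from_pretrained(tmp, output_attentions=True)
```

A kwarg that matches no configuration attribute would instead be forwarded to the underlying model’s __init__.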
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
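Since the PyTorch auto models are torch.nn.Module subclasses, the eval()/train() toggle above is plain PyTorch behavior. A minimal sketch with an ordinary module as an illustrative stand-in (assumes only torch is installed):

```python
import torch
import torch.nn as nn

# Any PreTrainedModel behaves like this tiny stand-in, since both
# subclass nn.Module.
net = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

net.eval()                     # dropout deactivated, outputs deterministic
assert not net.training
out_a = net(torch.ones(1, 4))
out_b = net(torch.ones(1, 4))
assert torch.equal(out_a, out_b)

net.train()                    # re-enable dropout before fine-tuning
assert net.training
```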
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
...     "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForTableQuestionAnswering

class transformers.TFAutoModelForTableQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
( **kwargs )
Parameters
str
, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager"
(manual implementation of the attention), "sdpa"
(using F.scaled_dot_product_attention
), or "flash_attention_2"
(using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager"
implementation. Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForDocumentQuestionAnswering

class transformers.AutoModelForDocumentQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a document question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./tf_model/layoutlm_tf_model_config.json")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./tf_model/layoutlm_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForDocumentQuestionAnswering

class transformers.TFAutoModelForDocumentQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a document question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True
>>> config = AutoConfig.from_pretrained("./pt_model/layoutlm_pt_model_config.json")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./pt_model/layoutlm_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForVisualQuestionAnswering

class transformers.AutoModelForVisualQuestionAnswering < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
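The effect of the evaluation/training mode switch can be sketched with a toy dropout layer. This is a pure-Python stand-in for illustration only; TinyDropout is a hypothetical class, not the real torch or transformers API:

```python
import random

# Illustrative sketch of why eval mode matters: dropout randomly zeroes
# inputs in training mode but is a deterministic identity in eval mode.
class TinyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, x):
        if not self.training:
            return x  # eval mode: pass inputs through unchanged
        # training mode: drop each value with probability p, rescale the rest
        return [0.0 if random.random() < self.p else v / (1 - self.p) for v in x]

layer = TinyDropout().eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0] -- unchanged in eval mode
```

Real models behave analogously: call `model.train()` before fine-tuning so stochastic layers such as dropout are active again.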
Examples:
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
...     "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForVision2Seq

class transformers.AutoModelForVision2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForVision2Seq

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForVision2Seq.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForVision2Seq

>>> model = AutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVision2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForVision2Seq

class transformers.TFAutoModelForVision2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForVision2Seq.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
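The selection rule above (the config's model_type first, then pattern matching on the name or path) can be sketched roughly as follows. The registry and helper below are hypothetical illustrations, not the actual transformers internals:

```python
# Made-up registry mapping model_type keys to model class names.
MODEL_MAPPING = {"bert": "BertModel", "vit": "ViTModel"}

def resolve_model_class(name_or_path, model_type=None):
    # 1) Prefer the model_type from the config when it is available.
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # 2) Otherwise fall back to pattern matching on the name/path.
    for key, cls in MODEL_MAPPING.items():
        if key in name_or_path.lower():
            return cls
    raise ValueError(f"Could not infer model type from {name_or_path!r}")

print(resolve_model_class("google-bert/bert-base-cased"))        # BertModel
print(resolve_model_class("some/checkpoint", model_type="vit"))  # ViTModel
```

This is why passing an explicit config (or a checkpoint whose config declares model_type) is more reliable than depending on the checkpoint name alone.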
Examples:
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForVision2Seq

class transformers.FlaxAutoModelForVision2Seq < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForVision2Seq.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
Examples:
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForImageTextToText

class transformers.AutoModelForImageTextToText < source >
( *args **kwargs )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageTextToText.from_config(config)

from_pretrained < source >
( *model_args **kwargs )
Parameters
str
or os.PathLike
) — Can be either:
./my_model_directory/
../tf_model/model.ckpt.index
). In this case, from_tf
should be set to True
and a configuration object should be provided as config
argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.__init__()
method. pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
str
or os.PathLike
, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. bool
, optional, defaults to False
) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path
argument). bool
, optional, defaults to False
) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Dict[str, str]
, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request. bool
, optional, defaults to False
) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages. bool
, optional, defaults to False
) — Whether or not to only look at local files (e.g., not try downloading the model). str
, optional, defaults to "main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. bool
, optional, defaults to False
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. str
, optional, defaults to "main"
) — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision
can be any identifier allowed by git. output_attentions=True
). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Instantiate one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path
:
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> model = AutoModelForImageTextToText.from_pretrained("google-bert/bert-base-cased")

>>> model = AutoModelForImageTextToText.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageTextToText.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )