Commit 0024e86
Bump transformers from 4.43.3 to 4.48.0 in /.docker (#2741)
Bumps [transformers](https://github.com/huggingface/transformers) from
4.43.3 to 4.48.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/transformers/releases">transformers's
releases</a>.</em></p>
<blockquote>
<h2>v4.48.0: ModernBERT, Aria, TimmWrapper, ColPali, Falcon3, Bamba,
VitPose, DinoV2 w/ Registers, Emu3, Cohere v2, TextNet, DiffLlama,
PixtralLarge, Moonshine</h2>
<h2>New models</h2>
<h3>ModernBERT</h3>
<p>The ModernBert model was proposed in <a
href="https://arxiv.org/abs/2412.13663">Smarter, Better, Faster, Longer:
A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long
Context Finetuning and Inference</a> by Benjamin Warner, Antoine
Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said
Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen,
Nathan Cooper, Griffin Adams, Jeremy Howard and Iacopo Poli.</p>
<p>It is a refresh of the traditional encoder architecture, as used in
previous models such as <a
href="https://huggingface.co/docs/transformers/en/model_doc/bert">BERT</a>
and <a
href="https://huggingface.co/docs/transformers/en/model_doc/roberta">RoBERTa</a>.</p>
<p>It builds on BERT and implements many modern architectural
improvements which have been developed since its original release, such
as:</p>
<ul>
<li><a
href="https://huggingface.co/blog/designing-positional-encoding">Rotary
Positional Embeddings</a> to support sequences of up to 8192
tokens.</li>
<li><a href="https://arxiv.org/abs/2208.08124">Unpadding</a> to ensure
no compute is wasted on padding tokens, speeding up processing time for
batches with mixed-length sequences.</li>
<li><a href="https://arxiv.org/abs/2002.05202">GeGLU</a> Replacing the
original MLP layers with GeGLU layers, shown to improve
performance.</li>
<li><a href="https://arxiv.org/abs/2004.05150v2">Alternating
Attention</a> where most attention layers employ a sliding window of 128
tokens, with Global Attention only used every 3 layers.</li>
<li><a href="https://github.com/Dao-AILab/flash-attention">Flash
Attention</a> to speed up processing.</li>
<li>A model designed following the recommendations of <a
href="https://arxiv.org/abs/2401.14489">The Case for Co-Designing Model
Architectures with Hardware</a>, ensuring maximum efficiency across
inference GPUs.</li>
<li>Modern training data scales (2 trillion tokens) and mixtures
(including code and math data)</li>
</ul>
<p><img
src="https://github.com/user-attachments/assets/4256c0b1-9b40-4d71-ac42-fc94827d5e9d"
alt="image" /></p>
<ul>
<li>Add ModernBERT to Transformers by <a
href="https://github.com/warner-benjamin"><code>@warner-benjamin</code></a>
in <a
href="https://github.com/huggingface/transformers/issues/35158">#35158</a></li>
</ul>
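<p><em>Note added in this PR description, not part of the quoted release notes:</em> a minimal, hedged fill-mask sketch showing how the new model would typically be used from transformers 4.48. The <code>answerdotai/ModernBERT-base</code> checkpoint name and the example sentence are assumptions for illustration only; nothing below comes from this repository's diff.</p>
<pre lang="py"><code># Hedged sketch: assumes transformers 4.48 or newer and the publicly released
# answerdotai/ModernBERT-base checkpoint (an assumption, not part of this PR).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

checkpoint = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Fill the masked token in a short example sentence.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and decode the highest-scoring prediction.
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
</code></pre>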
<h3>Aria</h3>
<p>The Aria model was proposed in <a
href="https://huggingface.co/papers/2410.05993">Aria: An Open Multimodal
Native Mixture-of-Experts Model</a> by Li et al. from the Rhymes.AI
team.</p>
<p>Aria is an open multimodal-native model with best-in-class
performance across a wide range of multimodal, language, and coding
tasks. It has a Mixture-of-Experts architecture, with respectively 3.9B
and 3.5B activated parameters per visual token and text token.</p>
<ul>
<li>Add Aria by <a
href="https://github.com/aymeric-roucher"><code>@aymeric-roucher</code></a>
in <a
href="https://github.com/huggingface/transformers/issues/34157">#34157</a>
<img
src="https://github.com/user-attachments/assets/ef41fcc9-2c5f-4a75-ab1a-438f73d3d7e2"
alt="image" /></li>
</ul>
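<p><em>Note added in this PR description, not part of the quoted release notes:</em> a heavily hedged loading sketch. The <code>rhymes-ai/Aria</code> checkpoint and the <code>AriaForConditionalGeneration</code> and <code>AutoProcessor</code> class names are assumptions based on the announced integration and are not verified against this PR's diff.</p>
<pre lang="py"><code># Hedged sketch only: the checkpoint and class names below are assumptions,
# not taken from this PR. Aria is large; loading it needs substantial GPU memory.
import torch
from transformers import AutoProcessor, AriaForConditionalGeneration

checkpoint = "rhymes-ai/Aria"
processor = AutoProcessor.from_pretrained(checkpoint)
model = AriaForConditionalGeneration.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

# From here the usual multimodal pattern applies: build a chat-style prompt with
# the processor, pass pixel values plus input ids to model.generate, and decode.
</code></pre>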
<h3>TimmWrapper</h3>
<p>We add a <code>TimmWrapper</code> set of classes so that timm
models can be loaded into the library as transformers models.</p>
<p>Here's a general usage example:</p>
<pre lang="py"><code>import torch
from urllib.request import urlopen
from PIL import Image
from transformers import AutoConfig, AutoModelForImageClassification,
AutoImageProcessor
<p>checkpoint = "timm/resnet50.a1_in1k"
img = Image.open(urlopen(
'<a
href="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png">https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png</a>'
))</p>
<p>image_processor = AutoImageProcessor.from_pretrained(checkpoint)
</tr></table>
</code></pre></p>
</blockquote>
<p>... (truncated)</p>
</details>
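<p>The quoted TimmWrapper example above is cut off at the release-notes truncation point. The following self-contained continuation is a hedged sketch added here for context; it relies only on the standard image-classification classes already named in the example's import line and is not text from the release notes.</p>
<pre lang="py"><code># Hedged, self-contained continuation of the truncated TimmWrapper example above.
import torch
from urllib.request import urlopen
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

checkpoint = "timm/resnet50.a1_in1k"
img = Image.open(urlopen(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"
))

# Load the timm checkpoint through the wrapper-aware auto classes.
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(checkpoint)

# Preprocess the image and run a forward pass without tracking gradients.
inputs = image_processor(img, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Report the five highest-probability class indices and their scores.
top5_probs, top5_ids = torch.topk(logits.softmax(dim=-1), k=5)
print(top5_ids[0].tolist(), top5_probs[0].tolist())
</code></pre>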
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/transformers/commit/6bc0fbcfa7acb6ac4937e7456a76c2f7975fefec"><code>6bc0fbc</code></a>
[WIP] Emu3: add model (<a
href="https://github.com/huggingface/transformers/issues/33770">#33770</a>)</li>
<li><a
href="https://github.com/huggingface/transformers/commit/59e28c30fa3a91213f569bccef73f082afa8c656"><code>59e28c3</code></a>
Fix flex_attention in training mode (<a
href="https://github.com/huggingface/transformers/issues/35605">#35605</a>)</li>
<li><a
href="https://github.com/huggingface/transformers/commit/7cf6230e25078742b21907ae49d1542747606457"><code>7cf6230</code></a>
push a fix for now</li>
<li><a
href="https://github.com/huggingface/transformers/commit/d6f446ffa79811d35484d445bc5c7932e8a536d6"><code>d6f446f</code></a>
when filtering we can't use the convert script as we removed them</li>
<li><a
href="https://github.com/huggingface/transformers/commit/8ce1e9578af6151e4192d59c345e2ad86ee789d4"><code>8ce1e95</code></a>
[test-all]</li>
<li><a
href="https://github.com/huggingface/transformers/commit/af2d7caff393cf8881396b73d92d0595b6a3b2ae"><code>af2d7ca</code></a>
Add Moonshine (<a
href="https://github.com/huggingface/transformers/issues/34784">#34784</a>)</li>
<li><a
href="https://github.com/huggingface/transformers/commit/42b8e7916b6b6dff5cb77252286db1aa07b7b41e"><code>42b8e79</code></a>
ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests
(<a
href="https://github.com/huggingface/transformers/issues/35459">#35459</a>)</li>
<li><a
href="https://github.com/huggingface/transformers/commit/e39c9f7a78fa2960a7045e8fc5a2d96b5d7eebf1"><code>e39c9f7</code></a>
v4.48-release</li>
<li><a
href="https://github.com/huggingface/transformers/commit/8de7b1ba8d126a6fc9f9bcc3173a71b46f0c3601"><code>8de7b1b</code></a>
Add flex_attn to diffllama (<a
href="https://github.com/huggingface/transformers/issues/35601">#35601</a>)</li>
<li><a
href="https://github.com/huggingface/transformers/commit/1e3ddcb2d0380d0d909a44edc217dff68956ec5e"><code>1e3ddcb</code></a>
ModernBERT bug fixes (<a
href="https://github.com/huggingface/transformers/issues/35404">#35404</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/transformers/compare/v4.43.3...v4.48.0">compare
view</a></li>
</ul>
</details>
<br />
[Dependabot compatibility score](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/openvinotoolkit/openvino_notebooks/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
1 file changed: +291, -435 lines