AI & ML interests

The AI community building the future.

Recent Activity

Articles

evalstateΒ 
updated a bucket about 20 hours ago
tomaarsenΒ 
posted an update 4 days ago
view post
Post
213
πŸ€– I've just published Sentence Transformers v5.5.0, headlined by a new train-sentence-transformers Agent Skill that lets your AI coding agent (Claude Code, Codex, Cursor, Gemini CLI, ...) train and finetune embedding, reranker, and sparse encoder models for you. Plus training losses & fixes. Details:

The skill bundles curated guidance for the whole training workflow across all three model types: base model selection, loss and evaluator choice, hard-negative mining, distillation, LoRA, Matryoshka, multilingual training, static embeddings, etc. It also ships production-ready training template scripts the agent can adapt. Install it with hf skills add train-sentence-transformers, then just describe what you want, e.g. "finetune a reranker on my (question, answer) pairs, mine hard negatives, and push it to the Hub".

On the loss side: EmbedDistillLoss is a new embedding-level distillation loss for SentenceTransformer. Instead of distilling teacher scores like MarginMSELoss, it aligns the student's embeddings directly with pre-computed teacher embeddings, wtih an optional learnable projection for when the student and teacher dimensions differ. Second, ADRMSELoss is a new listwise learning-to-rank loss for CrossEncoder from the Rank-DistilLLM paper, aimed at the LLM-distillation reranking setting.

encode() and predict() also gained a per-call processing_kwargs override, so you can change processor settings like max_length, a vision-language model's image resolution, or a video's fps, for a single call without rebuilding the model.

The Agent Skill is the part of this release I'm most keen for people to try. Curious to hear how it works for you. I've been using it myself a lot to quickly set up some training runs that immediately use a bunch of best practices.

> pip install sentence-transformers==5.5.0
> hf skills add train-sentence-transformers

The full release notes: https://github.com/huggingface/sentence-transformers/releases/tag/v5.5.0
  • 3 replies
Β·
sergiopaniegoΒ 
posted an update 8 days ago
view post
Post
1701
OpenEnv is growing fast in tutorials. If you're looking to get started with RL environments, check them out

> evaluate your agents using OpenEnv
> learn how rewards work via rubrics
> connect agents via MCP
> many moreeeee!

anything you think it's missing?

https://meta-pytorch.org/OpenEnv/tutorials/index.html
sergiopaniegoΒ 
posted an update 9 days ago
view post
Post
773
OpenEnv already ships 🚒 with a ready-to-deploy RLM environment on free HF Spaces

Drop "Attention Is All You Need", write code that spawns parallel LLM calls β†’ βœ… correct answer, reward 1.0, in 4.2s

Run GRPO (TRL) β†’ model learns to write that search strategy itself

test it yourself β†’ sergiopaniego/repl-env
check out OpenEnv β†’ https://github.com/meta-pytorch/OpenEnv
evalstateΒ 
posted an update 10 days ago
view post
Post
2147
Hugging Face MCP Server v0.3.12
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The hub_repo_details tool now enables Dataset inspection (view splits, sample rows).
evalstateΒ 
posted an update 17 days ago
view post
Post
259
Hugging Face MCP Server v0.3.10
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reverted mcp bucket in favour of upcoming MCP App integration.
evalstateΒ 
posted an update 19 days ago
view post
Post
941
Hugging Face MCP Server v0.3.9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Users with a bucket named mcp will get an additional list_files tool that returns the public URL of contained files. This is primarily intended for use with Gradio Spaces that need URLs as inputs.
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
1363
Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy

And… it's already supported in TRL, built by Kashif Rasul. you can really feel the pace of development in the team 🐎

Paper by Ruixiang ZHANG, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang at Apple 🍎

How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. no labels or verifier needed

You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder):
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py
or benchmark a checkpoint with the eval script:
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py

One neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train Γ— T_eval, so a broad band of configs works well. even very noisy samples still help

Want to dig deeper?

Paper: Embarrassingly Simple Self-Distillation Improves Code Generation (2604.01193)
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer
evalstateΒ 
posted an update about 1 month ago
view post
Post
988
The experimental MCP hub_query tool now supports Paper Searching and Details as well as Daily Papers.
sergiopaniegoΒ 
posted an update about 1 month ago
tomaarsenΒ 
posted an update about 1 month ago
view post
Post
979
🌐 I've just published Sentence Transformers v5.4 to make the project fully multimodal for embeddings and reranking. The release also includes a modular CrossEncoder, and automatic Flash Attention 2 input flattening. Details:

You can now use SentenceTransformer and CrossEncoder with text, images, audio, and video, with the same familiar API. That means you can compute embeddings for an image and a text query using model.encode(), compare them with model.similarity(), and it just works. Models like Qwen3-VL-Embedding-2B and jinaai/jina-reranker-m0 are supported out of the box.

Beyond multimodal, I also fully modularized the CrossEncoder class. It's now a torch.nn.Sequential of composable modules, just like SentenceTransformer has been. This unlocked support for generative rerankers (CausalLM-based models like mxbai-rerank-v2 and the Qwen3 rerankers) via a new LogitScore module, which wasn't possible before without custom code.

Also, Flash Attention 2 now automatically skips padding for text-only inputs. If your batch has a mix of short and long texts, this gives you a nice speedup and lower VRAM usage for free.

I wrote a blog post walking through the multimodal features with practical examples. Check it out if you want to get started, or just point your Agent to the URL: https://huggingface.co/blog/multimodal-sentence-transformers

This release has set up the groundwork for more easily introducing late-interaction models (both text-only and multimodal) into Sentence Transformers in the next major release. I'm looking forward to it!
sergiopaniegoΒ 
posted an update about 1 month ago
sergiopaniegoΒ 
posted an update about 2 months ago
view post
Post
2081
TRL is officially an adult πŸ₯³

excited to announce TRL v1.0❗️

head to the blog to see how we got here and what’s next for this post-training library, designed to keep pace with the field

https://huggingface.co/blog/trl-v1
  • 2 replies
Β·