50M active parameters is impressively light. Is there a similarly light model on the prompt injection side? Most of the mainstream ones seem heavier.
I'm surprised nobody else has commented on this. This is a very straightforward and useful thing for a small locally runnable model to do.
> The model is available today under the Apache 2.0 license on Hugging Face and Github.
Bringing the Open back to OpenAI...
There are some interesting technical details in this release:
> Privacy Filter is a bidirectional token-classification model with span decoding. It begins from an autoregressive pretrained checkpoint and is then adapted into a token classifier over a fixed taxonomy of privacy labels. Instead of generating text token by token, it labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.
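The release doesn't spell out the decoding step, but "constrained Viterbi" over a BIO-style label set usually means: take per-token label scores from the classifier, forbid invalid transitions (an `I-X` tag can only follow `B-X` or `I-X`), and pick the highest-scoring valid path, then collapse it into spans. A minimal sketch, with a made-up two-type taxonomy rather than the model's actual privacy labels:

```python
import math

# Illustrative label set; the real Privacy Filter taxonomy is larger.
LABELS = ["O", "B-EMAIL", "I-EMAIL", "B-PHONE", "I-PHONE"]

def allowed(prev, cur):
    """BIO constraint: I-X may only follow B-X or I-X of the same type."""
    if cur.startswith("I-"):
        return prev is not None and prev[0] in "BI" and prev[2:] == cur[2:]
    return True

def viterbi(emissions):
    """emissions: one dict per token mapping label -> log-score.
    Returns the highest-scoring label path that satisfies the BIO constraints."""
    best = {l: (emissions[0][l], [l]) for l in LABELS if allowed(None, l)}
    for em in emissions[1:]:
        nxt = {}
        for cur in LABELS:
            cands = [(s + em[cur], path + [cur])
                     for prev, (s, path) in best.items() if allowed(prev, cur)]
            if cands:
                nxt[cur] = max(cands)
        best = nxt
    return max(best.values())[1]

def spans(labels):
    """Collapse a BIO label sequence into (start, end, type) spans."""
    out, start = [], None
    for i, l in enumerate(labels):
        if l.startswith("B-") or l == "O":
            if start is not None:
                out.append((start, i, labels[start][2:]))
                start = None
        if l.startswith("B-"):
            start = i
    if start is not None:
        out.append((start, len(labels), labels[start][2:]))
    return out
```

The constraint is what makes the output spans coherent: even if the raw argmax at some position is an orphan `I-` tag, the decoder is forced onto a valid path.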
> The released model has 1.5B total parameters with 50M active parameters.
> [To build it] we converted a pretrained language model into a bidirectional token classifier by replacing the language modeling head with a token-classification head and post-training it with a supervised classification objective.
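The conversion they describe is mechanically simple: keep the pretrained backbone, swap the vocabulary-sized LM head for a small classification head, and run without the causal mask so every token attends to full context. A toy PyTorch sketch of that surgery (the model, sizes, and label count here are all invented stand-ins, not the released checkpoint):

```python
import torch
import torch.nn as nn

VOCAB, HIDDEN, NUM_PRIVACY_LABELS = 1000, 64, 5

class TinyLM(nn.Module):
    """Stand-in for a pretrained autoregressive checkpoint."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        layer = nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(HIDDEN, VOCAB)  # next-token prediction head

    def forward(self, ids, causal=True):
        h = self.embed(ids)
        mask = None
        if causal:  # autoregressive pretraining uses a causal mask
            mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        return self.lm_head(self.backbone(h, mask=mask))

model = TinyLM()  # imagine loading pretrained weights here
# Replace the LM head with a token-classification head over the privacy taxonomy.
model.lm_head = nn.Linear(HIDDEN, NUM_PRIVACY_LABELS)

ids = torch.randint(0, VOCAB, (2, 16))
# Bidirectional: drop the causal mask so every token sees full context;
# post-training would then apply per-token cross-entropy (not shown).
logits = model(ids, causal=False)  # shape (batch, seq, NUM_PRIVACY_LABELS)
```

Labeling the whole sequence in one forward pass like this, instead of generating token by token, is where the speed comes from: cost is one encoder pass regardless of how many spans get flagged.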