# UGLD
Uncertainty-Gated Lexical Decoding — logits processors for HuggingFace Transformers.
## Install

UGLD is available on PyPI:

```shell
pip install ugld
```

## How it works
UGLD modulates decoding-time lexical interventions according to the model's own uncertainty, so control is applied strongly when the model is unsure and backs off when it is confident. This keeps generated text fluent even at high intervention strengths.
### Entropy gate
At each decoding step the model produces a probability distribution $\mathbf{p}$ over its vocabulary. The Shannon entropy of that distribution,

$$H(\mathbf{p}) = -\sum_i p_i \log p_i,$$

is used to drive a smooth sigmoid gate $\phi(\mathbf{p}) \in [0, 1]$:

$$\phi(\mathbf{p}) = \frac{1}{1 + e^{-s\,(H(\mathbf{p}) - \tau)}}$$
$\tau$ is an entropy threshold — the gate is nearly closed when $H(\mathbf{p}) \ll \tau$ (confident prediction) and fully open when $H(\mathbf{p}) \gg \tau$ (uncertain prediction). $s > 0$ controls how fast the transition happens.
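The gate can be sketched in a few lines of plain Python (a sketch of the formula above, not the library's internals; the steepness `s` is exaggerated here so the transition is easy to see):

```python
import math

def entropy(p):
    # Shannon entropy H(p) = -sum_i p_i log p_i (natural log)
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gate(p, tau=1.0, s=5.0):
    # Sigmoid gate: near 0 when H(p) << tau, near 1 when H(p) >> tau
    return 1.0 / (1.0 + math.exp(-s * (entropy(p) - tau)))

confident = [0.97, 0.01, 0.01, 0.01]  # peaked distribution, low entropy
uncertain = [0.25, 0.25, 0.25, 0.25]  # flat distribution, high entropy
print(gate(confident), gate(uncertain))  # gate(confident) << gate(uncertain)
```

A peaked distribution leaves the gate nearly closed, so the intervention barely touches the output; a flat one opens it and lets the intervention act.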
### Conditioning towards a vocabulary (UGLD-t)
Given a set of green token ids $\mathcal{V}_\text{green}$, UGLD-t blends the model's distribution $\mathbf{p}$ with a conditioning prior $\mathbf{q}$ concentrated on those tokens:

$$\tilde{\mathbf{p}} = (1 - \alpha)\,\mathbf{p} + \alpha\,\mathbf{q}, \qquad \alpha = \alpha_{\max}\,\phi(\mathbf{p})$$
Because $\alpha \in [0,1]$ this is always a valid convex combination. Three built-in priors are available for $\mathbf{q}$:
- **uniform** spreads mass equally over all green tokens.
- **top-k** restricts to the top-$K$ green tokens by current probability.
- **renorm** re-normalises the model's own distribution over the green set (generally the strongest and most adaptive choice).
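The `renorm` prior and the gated blend can be sketched in plain Python over a toy five-token vocabulary (the real processors operate on batched tensors; `alpha` here stands for the already-gated mixing weight $\alpha_{\max}\,\phi(\mathbf{p})$):

```python
def renorm_prior(p, green_ids):
    # Re-normalise the model's own probabilities over the green set
    mass = sum(p[i] for i in green_ids)
    return [p[i] / mass if i in green_ids else 0.0 for i in range(len(p))]

def blend(p, green_ids, alpha):
    # Convex combination: (1 - alpha) * p + alpha * q
    q = renorm_prior(p, green_ids)
    return [(1 - alpha) * pi + alpha * qi for pi, qi in zip(p, q)]

p = [0.40, 0.30, 0.15, 0.10, 0.05]
green = {1, 3}                        # tokens we steer towards
mixed = blend(p, green, alpha=0.5)
print(mixed)  # still sums to 1; mass on tokens 1 and 3 has grown
```

Because the result is a convex combination of two distributions, it is itself a valid distribution; no re-normalisation step is needed afterwards.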
### Conditioning against a vocabulary (UGLD-a)
Given a set of red token ids $\mathcal{V}_\text{red}$, UGLD-a subtracts a scaled penalty vector $\lambda \mathbf{r}$ from the raw logits $\mathbf{z}$ before softmax:

$$\tilde{\mathbf{z}} = \mathbf{z} - \lambda\,\phi(\mathbf{p})\,\mathbf{r}, \qquad \tilde{\mathbf{p}} = \mathrm{softmax}(\tilde{\mathbf{z}})$$

where $\mathbf{r}$ is a weight vector supported on $\mathcal{V}_\text{red}$ and $\lambda > 0$ sets the maximum penalty strength.
Working in logit space avoids negative probabilities. Two weight-vector strategies are available:
- **fixed** applies a uniform penalty of 1 to every red token.
- **dynamic** scales each red token's penalty by its current probability (via min-max normalisation to $[1, 2]$), concentrating pressure where it matters most.
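A sketch of the `dynamic` strategy on toy logits (`phi` stands for the entropy-gate value at this step, and the softmax helper is only for illustration):

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits
    m = max(z)
    e = [math.exp(zi - m) for zi in z]
    t = sum(e)
    return [ei / t for ei in e]

def dynamic_weights(p, red_ids):
    # Min-max normalise the red tokens' probabilities to [1, 2]:
    # the likelier a red token currently is, the harder it is pushed down
    probs = [p[i] for i in red_ids]
    lo, hi = min(probs), max(probs)
    span = (hi - lo) or 1.0
    return {i: 1.0 + (p[i] - lo) / span for i in red_ids}

z = [2.0, 1.0, 0.5, 0.0]              # raw logits
red = {0, 2}                          # tokens we steer away from
p = softmax(z)
w = dynamic_weights(p, red)
phi = 0.8                             # entropy-gate value (assumed)
lam = 2.0                             # penalty strength
z_new = [zi - lam * phi * w.get(i, 0.0) for i, zi in enumerate(z)]
p_new = softmax(z_new)
print(p_new)  # probability of red tokens 0 and 2 drops
```

Subtracting in logit space and re-applying softmax keeps the result a valid distribution, which is why no probability can go negative here.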
## Quickstart
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessorList

from ugld import UGLD_Towards, UGLDTowardsConfig

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

simple_words = [
    " simple", " easy", " basic", " clear",
    " small", " big", " light", " heavy",
    " fast", " slow", " old", " new",
    " good", " bad", " near", " far",
    " start", " end", " help", " use",
]

green_ids = []
for w in simple_words:
    green_ids.extend(tok.encode(w, add_special_tokens=False))
green_ids = list(set(green_ids))

proc = LogitsProcessorList([
    UGLD_Towards(UGLDTowardsConfig(
        green_token_ids=green_ids,
        alpha_max=0.5,
        tau=1.0,
        s=0.3,
        prior="renorm",
    ))
])

inputs = tok("Explain gravity in", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=50,
    logits_processor=proc,
)
print(tok.decode(out[0], skip_special_tokens=True))
```
For a more complete walkthrough, see the interactive notebook:
## API Reference

You can find the full API reference here.

## Citation
If you use UGLD in your research, please cite:
```bibtex
@inproceedings{papucci-etal-2026-ugld,
    title = {Lexical Conditioning of Model{'}s Distribution through
             Uncertainty-gated Soft-Mixing of Probabilities},
    author = {Papucci, Michele and Venturi, Giulia and Dell{'}Orletta, Felice},
    booktitle = {Proceedings of the Workshop READIxTSAR @ LREC 2026},
    year = {2026},
    address = {Palma, Spain},
}
```