Answer to: Decoder only model AI making repetitive responses

Score: 0

Answered: Oct 30, 2025

User Rep: 1

I think I know what is causing the problem in your code. There are two issues.

Cross-attention to itself: you build an nn.TransformerDecoder and pass memory=X, which makes every layer run cross-attention over X in addition to its self-attention. Because there's no encoder, the model is just attending to its own input sequence a second time, and unless you also pass a causal memory_mask, that cross-attention is not masked, so each position can see later tokens. The model learns to echo tokens it can already see, which leads to repetition at generation time. For a GPT-style model you want masked self-attention only, with no cross-attention.

Padding in the loss: during training you pad sequences, but your loss doesn't ignore the pad tokens. Your training step:

    loss = criterion(logits.view(-1, vocab_size), yb.view(-1))

How you build your batches:

    if len(ids) < block_size + 1:
        pad_id = tokenizer.token_to_id('<pad>') or 0
        ids += [pad_id] * (block_size + 1 - len(ids))
    x = ids[:block_size]
    y = ids[1:block_size + 1]

This means that for a short text chunk, the remaining tokens in y are all <pad>, so your model learns: "When I see <pad>, the next token should be <pad>." To fix this, tell the loss to ignore padding:

    pad_id = tokenizer.token_to_id('<pad>') or 0
    criterion = nn.CrossEntropyLoss(ignore_index=pad_id)

Finally, try top-k/top-p sampling and a repetition penalty during generation. This will probably reduce looping when the model is uncertain.
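To make the "masked self-attention only" suggestion concrete, here is a minimal sketch of a decoder-only model combined with the pad-ignoring loss. It is not your original code: the class name TinyGPT, the layer sizes, pad_id = 0, and the toy batch are all assumptions for illustration. Despite its name, nn.TransformerEncoder is the usual building block for this in PyTorch, because its layers contain self-attention only, and a causal mask turns that into GPT-style attention.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyGPT(nn.Module):
    """Hypothetical decoder-only model: masked self-attention, no cross-attention."""

    def __init__(self, vocab_size, d_model=64, n_heads=4, n_layers=2, block_size=32):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        # Encoder layers have self-attention only, so there is no `memory` argument.
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            dropout=0.1, batch_first=True,
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        # idx: (batch, seq_len) token ids
        seq_len = idx.size(1)
        pos = torch.arange(seq_len, device=idx.device)
        h = self.tok_emb(idx) + self.pos_emb(pos)
        # Additive causal mask: -inf above the diagonal, 0 elsewhere,
        # so position i can attend only to positions <= i.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=idx.device),
            diagonal=1,
        )
        h = self.blocks(h, mask=causal)
        return self.head(h)  # (batch, seq_len, vocab_size)


# Assumption: pad_id stands in for tokenizer.token_to_id('<pad>').
pad_id = 0
vocab_size = 50
model = TinyGPT(vocab_size)
criterion = nn.CrossEntropyLoss(ignore_index=pad_id)

# Toy padded batch: a short chunk of 5 tokens padded up to block_size + 1 = 9.
xb = torch.tensor([[5, 6, 7, 8, 9, 0, 0, 0]])
yb = torch.tensor([[6, 7, 8, 9, 0, 0, 0, 0]])

logits = model(xb)
# Only the four non-pad targets contribute to the loss.
loss = criterion(logits.view(-1, vocab_size), yb.view(-1))
```

With ignore_index set, the mean is taken over the non-pad targets only, so the model never gets a gradient that rewards predicting <pad> after <pad>.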
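For the generation-side suggestion, the following is a rough sketch of top-k/top-p sampling with a repetition penalty. It is written in plain Python on a list of logits so the steps are easy to follow. The function name sample_next and all default values are assumptions, not tuned settings. The penalty follows one common convention: divide positive scores and multiply negative ones for tokens that were already generated.

```python
import math
import random


def sample_next(logits, generated, top_k=50, top_p=0.9,
                repetition_penalty=1.2, temperature=1.0, rng=random):
    """Pick the next token id from raw logits (one float per vocab entry).

    `generated` holds the token ids produced so far. All defaults are
    illustrative starting points, not tuned values.
    """
    scores = list(logits)

    # Repetition penalty: make already-generated tokens less likely.
    for tok in set(generated):
        if scores[tok] > 0:
            scores[tok] /= repetition_penalty
        else:
            scores[tok] *= repetition_penalty

    # Top-k: keep only the k highest-scoring token ids, best first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = ranked[:top_k]

    # Softmax over the survivors (subtract the max for numerical stability).
    top = scores[ranked[0]] / temperature
    weights = [math.exp(scores[i] / temperature - top) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for tok, p in zip(ranked, probs):
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break

    tokens, kept_probs = zip(*kept)
    return rng.choices(tokens, weights=kept_probs, k=1)[0]
```

In a PyTorch generation loop you could call it with the last position's logits, for example sample_next(logits[0, -1].tolist(), generated_ids), and append the result to generated_ids before the next step.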

python deep-learning pytorch

Question

Parent Entity

Decoder only model AI making repetitive responses

Score: 2 • Views: 99

Site: stackoverflow
