naive_speculate.speculate.scorer

Defines `Scorer`, which implements token scoring (logit computation).

`ScoreOut`

Bases: NamedTuple

Output of Scorer.score method.

Contains the logits at each position of the query tokens. The logits at position i are used to predict the token at position i+1.

Attributes:

Name	Type	Description
`token_logits`	`Tensor`	Logits at the query token positions. Shape `[batch_size, num_query_tokens, vocab_size]`.
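The position convention can be illustrated with a minimal sketch. It assumes a hypothetical `ToyScoreOut` that stores nested Python lists instead of a `Tensor`, and a hypothetical helper `greedy_next_tokens`; neither is part of this module.

```python
from typing import NamedTuple

# Hypothetical stand-in: the real ScoreOut holds a Tensor.
# Nested lists laid out as [batch_size, num_query_tokens, vocab_size].
Logits = list[list[list[float]]]


class ToyScoreOut(NamedTuple):
    token_logits: Logits


def greedy_next_tokens(out: ToyScoreOut) -> list[list[int]]:
    """For every position i, return the argmax token id.

    Because the logits at position i predict the token at position i+1,
    entry i of each result row is the greedy guess for position i+1.
    """
    return [
        [max(range(len(row)), key=row.__getitem__) for row in sequence]
        for sequence in out.token_logits
    ]


# One sequence, two query tokens, vocabulary of three tokens.
out = ToyScoreOut(token_logits=[[[0.1, 2.0, -1.0], [3.0, 0.0, 0.5]]])
predictions = greedy_next_tokens(out)
```

Here `predictions[0][0]` is the greedy guess for the token that follows the first query token, and `predictions[0][1]` is the guess for the token that follows the second.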

`Scorer`

Scorer processes the given tokens and produces the corresponding token logits (scores).

Scorer delegates scoring to a LanguageModel instance.

Scorer only scores the given query tokens, and does not generate any new tokens even though the scores can be used to do so.

In the context of speculative decoding, scoring is part of the verification process: the speculative decoder performs speculative sampling based on the token logits produced by the scorer.

Attributes:

Name	Type	Description
`language_model`	`LanguageModel`	The language model used for scoring.
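How scorer logits could feed verification can be sketched with the acceptance rule commonly used in speculative sampling: a draft token `x` is accepted with probability `min(1, p(x) / q(x))`, where `p` is the target distribution and `q` the draft distribution. This is a simplified illustration, not necessarily how this package's decoder is written. The names `softmax` and `count_accepted` are hypothetical, it handles a single sequence with plain lists, and it omits the resampling step that the full algorithm performs after a rejection.

```python
import math
import random


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over one logits row."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def count_accepted(
    target_logits: list[list[float]],
    draft_probs: list[list[float]],
    draft_tokens: list[int],
    rng: random.Random,
) -> int:
    """Return how many leading draft tokens are accepted.

    Assumption: target_logits[i] is the scorer's logits row that predicts
    draft_tokens[i], and draft_probs[i] is the distribution the draft
    model sampled draft_tokens[i] from (so draft_probs[i][token] > 0).
    """
    accepted = 0
    for logits, q, token in zip(target_logits, draft_probs, draft_tokens):
        p = softmax(logits)
        # Accept with probability min(1, p(token) / q(token)).
        if rng.random() < min(1.0, p[token] / q[token]):
            accepted += 1
        else:
            break  # the first rejection ends verification
    return accepted


rng = random.Random(0)
uniform = [[0.5, 0.5], [0.5, 0.5]]
# The target puts all its mass on token 1, then on token 0.
logits = [[0.0, 1000.0], [1000.0, 0.0]]
all_accepted = count_accepted(logits, uniform, [1, 0], rng)
one_accepted = count_accepted(logits, uniform, [1, 1], rng)
```

In the first call both draft tokens match what the target strongly prefers, so both are accepted. In the second call the second draft token has zero target probability, so verification stops after the first token.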

`score(query_token_ids, kv_cache)`

Score the given query tokens using the provided key-value cache.

`kv_cache` is updated as a side effect of this method.

The returned output includes logits for all query token positions, where position i gives the logits for predicting token i+1.

Parameters:

Name	Type	Description	Default
`query_token_ids`	`Tensor`	Query tokens to be scored. Shape `[batch_size, num_query_tokens]`.	required
`kv_cache`	`KVCache`	Key and value tensors for the past context tokens. Updated as a side effect.	required

Returns:

Name	Type	Description
`ScoreOut`	`ScoreOut`	The scoring result. Its `token_logits` has shape `[batch_size, num_query_tokens, vocab_size]`.
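The call pattern and the cache side effect can be sketched with toy stand-ins. Everything below is hypothetical: `ToyKVCache`, `ToyLanguageModel` (including its `forward` method), `ToyScorer`, and `ToyScoreOut` only mimic the shapes and behavior described above with plain lists, and the real `LanguageModel` and `KVCache` interfaces may differ.

```python
from typing import NamedTuple


class ToyKVCache:
    """Hypothetical cache: records the token ids seen so far per sequence."""

    def __init__(self) -> None:
        self.cached_token_ids: list[list[int]] = []


class ToyScoreOut(NamedTuple):
    token_logits: list[list[list[float]]]  # [batch, num_query_tokens, vocab]


class ToyLanguageModel:
    """Hypothetical model whose logits favor (token_id + 1) % vocab_size."""

    def __init__(self, vocab_size: int) -> None:
        self.vocab_size = vocab_size

    def forward(self, query_token_ids, kv_cache):
        if not kv_cache.cached_token_ids:
            kv_cache.cached_token_ids = [[] for _ in query_token_ids]
        logits = []
        for cached, sequence in zip(kv_cache.cached_token_ids, query_token_ids):
            cached.extend(sequence)  # side effect: the cache grows
            logits.append([
                [1.0 if v == (t + 1) % self.vocab_size else 0.0
                 for v in range(self.vocab_size)]
                for t in sequence
            ])
        return logits


class ToyScorer:
    """Delegates scoring to the language model and generates no tokens."""

    def __init__(self, language_model: ToyLanguageModel) -> None:
        self.language_model = language_model

    def score(self, query_token_ids, kv_cache) -> ToyScoreOut:
        logits = self.language_model.forward(query_token_ids, kv_cache)
        return ToyScoreOut(token_logits=logits)


scorer = ToyScorer(ToyLanguageModel(vocab_size=4))
cache = ToyKVCache()
first = scorer.score([[0, 1, 2]], cache)  # batch of 1, three query tokens
second = scorer.score([[3]], cache)       # reuses the updated cache
```

After the two calls, `cache` holds all four tokens even though `score` returned only logits, which is the side effect to keep in mind when reusing a cache across verification rounds.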
