naive_speculate.infer.inferencer.basic

Define BasicInferencer, implementing Inferencer.

BasicInferencer

Bases: Inferencer

BasicInferencer implements the Inferencer protocol.

BasicInferencer delegates the forward computation to LanguageModel, and utilizes it to provide simple implementations for the prefill and decode methods.

Attributes:

Name            Type           Description
language_model  LanguageModel  The language model used for forwarding.
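The delegation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the library's actual code: the `forward` signature, the list-based `kv_cache`, `ToyLanguageModel`, and `SketchInferencer` are hypothetical stand-ins, and the greedy pick replaces whatever sampling the real class uses.

```python
from typing import Protocol


class LanguageModel(Protocol):
    """Hypothetical stand-in for the real LanguageModel interface."""

    def forward(self, token_ids: list[int], kv_cache: list[int]) -> list[float]:
        """Return next-token scores after consuming token_ids."""
        ...


class ToyLanguageModel:
    """Toy model: always favors (last token + 1) modulo the vocabulary size."""

    def __init__(self, vocab_size: int = 8) -> None:
        self.vocab_size = vocab_size

    def forward(self, token_ids: list[int], kv_cache: list[int]) -> list[float]:
        kv_cache.extend(token_ids)  # stand-in for storing keys and values
        favored = (kv_cache[-1] + 1) % self.vocab_size
        return [1.0 if i == favored else 0.0 for i in range(self.vocab_size)]


class SketchInferencer:
    """Holds a language model and delegates every forward pass to it."""

    def __init__(self, language_model: LanguageModel) -> None:
        self.language_model = language_model

    def prefill(self, query_token_ids: list[int], kv_cache: list[int]) -> int:
        # One forward pass over the whole prompt, then a greedy pick (assumed).
        scores = self.language_model.forward(query_token_ids, kv_cache)
        return max(range(len(scores)), key=scores.__getitem__)


inferencer = SketchInferencer(ToyLanguageModel())
cache: list[int] = []
first_token = inferencer.prefill([1, 2, 3], cache)
```

The inferencer itself contains no model math; swapping in a different object that satisfies the same `forward` contract changes the behavior without touching `prefill`.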

decode(query_token_ids, kv_cache, max_new_tokens, sample_strategy)

Process query_token_ids and autoregressively generate new tokens.

Checks for the EOS token after every generation iteration, so a device synchronization occurs at each iteration.

Refer to the interface Inferencer.decode for more details.
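The per-iteration EOS check can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not the actual implementation: `EOS_TOKEN_ID`, `toy_forward`, `greedy`, the extra `forward` parameter, and passing `sample_strategy` as a callable are all hypothetical, and whether the real method includes the EOS token in its output is not stated in this page.

```python
EOS_TOKEN_ID = 0  # hypothetical; the real id would come from the model or tokenizer


def toy_forward(token_ids: list[int], kv_cache: list[int]) -> list[float]:
    """Toy forward pass: favors (last token - 1), so ids count down to EOS."""
    kv_cache.extend(token_ids)  # stand-in for storing keys and values
    favored = max(kv_cache[-1] - 1, 0)
    return [1.0 if i == favored else 0.0 for i in range(8)]


def greedy(scores: list[float]) -> int:
    """Hypothetical sample strategy: pick the highest-scoring token id."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode_sketch(query_token_ids, kv_cache, max_new_tokens, sample_strategy,
                  forward=toy_forward):
    """Generate at most max_new_tokens tokens, stopping early at EOS."""
    new_tokens: list[int] = []
    pending = list(query_token_ids)
    for _ in range(max_new_tokens):
        scores = forward(pending, kv_cache)
        next_token = sample_strategy(scores)
        new_tokens.append(next_token)
        # Comparing against EOS needs the token value on the host. With
        # tensors on an accelerator, this read is what forces a sync.
        if next_token == EOS_TOKEN_ID:
            break
        pending = [next_token]  # later steps feed only the newest token
    return new_tokens


stopped_at_eos = decode_sketch([3], [], 10, greedy)
hit_token_limit = decode_sketch([3], [], 2, greedy)
```

Checking every iteration stops generation as soon as EOS appears, at the cost of one host read per token. Checking less often would reduce synchronization but could generate tokens past EOS that then need trimming.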