naive_speculate.infer.kvcache.dynamic_no_update
Define DynamicNoUpdateCache, implementing KVCache with no-op update behavior.
DynamicNoUpdateCache
Bases: DynamicCache
DynamicNoUpdateCache does nothing on update.
Hugging Face's model implementation updates the passed cache during forward as a side effect, so this wrapper provides a no-op update method to avoid writing the same kv states a second time.
Refer to the base class DynamicCache for more details.
__getitem__(index)
Retrieve kv states from the given layer index or indices.
If kv states for a certain layer do not exist, corresponding keys and values will be empty tensors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `index` | `int \| slice` | The layer index or slice to retrieve. | required |
Returns:
| Type | Description |
|---|---|
| `KVState \| tuple[KVState, ...]` | The kv state(s) of the specified layer(s). |
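The indexing behavior described above can be sketched roughly as follows. This is a toy illustration, not the library's implementation: `ToyKVState`, `ToyLayerCache`, and `num_layers` are hypothetical names, and plain Python lists stand in for tensors.

```python
from typing import NamedTuple, Union


class ToyKVState(NamedTuple):
    """Hypothetical stand-in for KVState; lists stand in for tensors."""
    keys: list
    values: list


class ToyLayerCache:
    """Hypothetical cache that supports int and slice indexing over layers."""

    def __init__(self, num_layers: int, states: dict):
        self.num_layers = num_layers
        self._states = states  # maps layer index -> ToyKVState

    def _get(self, layer: int) -> ToyKVState:
        # A layer with no stored states yields empty keys and values.
        return self._states.get(layer, ToyKVState(keys=[], values=[]))

    def __getitem__(self, index: Union[int, slice]):
        if isinstance(index, slice):
            # Resolve the slice against the number of layers, then
            # return one state per selected layer.
            layers = range(*index.indices(self.num_layers))
            return tuple(self._get(i) for i in layers)
        return self._get(index)


toy_cache = ToyLayerCache(3, {0: ToyKVState(keys=[1.0], values=[2.0])})
first_layer = toy_cache[0]      # a single ToyKVState
missing_layer = toy_cache[2]    # empty keys and values
all_layers = toy_cache[0:3]     # tuple of three ToyKVState objects
```

Returning empty states for a missing layer, rather than raising, lets callers treat populated and unpopulated layers uniformly.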
update(kv_states)
Intentionally a no-op: the underlying model updates self.cache in place during forward.
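A minimal sketch of why a no-op update is useful, assuming a caller that always invokes update after each forward pass. `ToyNoUpdateCache` and `toy_forward` are hypothetical stand-ins written for illustration; they are not part of naive_speculate or Hugging Face, and integers stand in for kv tensors.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNoUpdateCache:
    """Hypothetical cache whose storage is mutated by the model itself."""
    layers: list = field(default_factory=list)

    def update(self, kv_states: list) -> None:
        # Intentionally a no-op: toy_forward() has already written
        # the new states into self.layers.
        return None


def toy_forward(token: int, cache: ToyNoUpdateCache, num_layers: int = 2) -> list:
    """Hypothetical forward pass that appends to the cache as a side effect."""
    while len(cache.layers) < num_layers:
        cache.layers.append([])
    new_states = []
    for layer in cache.layers:
        layer.append(token)          # in-place write, as a side effect
        new_states.append([token])   # the same states are also returned
    return new_states


no_update_cache = ToyNoUpdateCache()
for tok in (10, 11, 12):
    states = toy_forward(tok, no_update_cache)
    no_update_cache.update(states)   # safe: does not append a second time

layer_lengths = [len(layer) for layer in no_update_cache.layers]
```

If update appended the returned states here, every layer would hold each token twice; the no-op keeps one entry per forward step.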