API reference

The complete, auto-generated reference for the public API. New here? Start with the Home page and its examples first.

The main entry point

compactprompt.CompactPrompt

Configurable facade over all CompactPrompt strategies.

Use the one-shot classmethod CompactPrompt.compact for the common case, or construct an instance to reuse an expensive scorer/embedder across calls:

cp = CompactPrompt(scorer=my_llm_scorer)
cp.run(prompt_a)
cp.run(prompt_b)

Parameters:

Name Type Description Default
scorer Optional[DynamicScorer]

Pluggable dynamic self-information scorer (text -> surprisals) for hard pruning. If None, only static scoring is used. See compactprompt.scoring.LocalLMScorer for an offline option.

None
static Optional[StaticSelfInformation]

Static self-information scorer. Defaults to best-available.

None
delta_threshold float

Static/dynamic fusion threshold (paper default 0.1).

0.1
use_phrases bool

Group words into grammatical phrases (needs spaCy).

True
spacy_model str

spaCy model name.

'en_core_web_sm'
ngram int

N-gram length for abbreviation (paper best: 2).

2
top_k int

Number of frequent n-grams to abbreviate.

100
pruner Optional[object]

Custom pruning engine exposing compress(text, ratio=, budget=) -> HardPromptResult. Defaults to the built-in compactprompt.hard_prompt.HardPromptCompressor. Pass a compactprompt.llmlingua.LLMLinguaCompressor to prune with LLMLingua instead. When supplied, the scorer/static/phrase options above are ignored (they configure the built-in engine).

None
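The pruner contract described above is duck-typed: any object with a matching compress method should qualify. A minimal sketch of such an engine follows. PrunedStandIn, Pruner and HeadPruner are hypothetical names introduced here for illustration; a real engine would return the library's HardPromptResult rather than this stand-in, and the "keep the leading words" strategy is deliberately naive.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class PrunedStandIn:
    """Hypothetical stand-in for HardPromptResult (not the library class)."""
    original: str
    text: str


class Pruner(Protocol):
    """Shape of the duck-typed engine described in the pruner row."""

    def compress(self, text: str, ratio: float = 0.5,
                 budget: Optional[int] = None) -> PrunedStandIn: ...


class HeadPruner:
    """Toy engine: keeps the leading words of the prompt and drops the rest."""

    def compress(self, text: str, ratio: float = 0.5,
                 budget: Optional[int] = None) -> PrunedStandIn:
        words = text.split()  # assumption: one word counts as one token
        # budget takes precedence over ratio, mirroring the documented behaviour.
        keep = budget if budget is not None else round(len(words) * (1 - ratio))
        keep = max(0, min(keep, len(words)))
        return PrunedStandIn(original=text, text=" ".join(words[:keep]))


engine: Pruner = HeadPruner()
result = engine.compress("one two three four", ratio=0.5)
print(result.text)  # one two
```

Because the contract is structural, the engine does not need to inherit from any library class.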

run

run(prompt: str, ratio: float = 0.5, budget: Optional[int] = None, prune: bool = True, abbreviate: bool = False, ngram: Optional[int] = None, top_k: Optional[int] = None) -> CompactResult

Compress prompt with the configured strategies.

See CompactPrompt.compact for argument documentation; this is the instance method it delegates to.

compact classmethod

compact(prompt: str, *, ratio: float = 0.5, budget: Optional[int] = None, prune: bool = True, abbreviate: bool = False, ngram: int = 2, top_k: int = 100, scorer: Optional[DynamicScorer] = None, static: Optional[StaticSelfInformation] = None, delta_threshold: float = 0.1, use_phrases: bool = True, spacy_model: str = 'en_core_web_sm', engine: str = 'builtin', pruner: Optional[object] = None) -> CompactResult

Compress a prompt. The main entry point.

Parameters:

Name Type Description Default
prompt str

The prompt text to compress.

required
ratio float

Target fraction of tokens to remove via hard pruning (0-1). 0.5 aims to halve the prompt. Ignored if budget is set.

0.5
budget Optional[int]

Optional absolute target token count for hard pruning.

None
prune bool

Apply lossy hard-prompt pruning (removes low-information phrases). On by default; the result is usable as-is.

True
abbreviate bool

Also apply lossless, reversible n-gram abbreviation. Off by default: abbreviated text needs its dictionary as a legend to be interpretable downstream, so enable it when you control both ends (e.g. compressing attached documents).

False
ngram int

N-gram length for abbreviation (paper's best: 2).

2
top_k int

Number of frequent n-grams to abbreviate.

100
scorer Optional[DynamicScorer]

Pluggable dynamic self-information scorer for context-aware pruning. None falls back to static-only scoring. Pass a compactprompt.scoring.LocalLMScorer for offline context-aware scoring, or any text -> surprisals callable.

None
static Optional[StaticSelfInformation]

Static self-information scorer (defaults to best-available).

None
delta_threshold float

Static/dynamic fusion threshold (paper default 0.1).

0.1
use_phrases bool

Preserve grammar by pruning whole phrases (needs spaCy).

True
spacy_model str

spaCy model name for phrase parsing.

'en_core_web_sm'
engine str

Pruning engine. "builtin" (default) uses this library's self-information pruner; "llmlingua" uses LLMLingua with default settings (needs the llmlingua extra); "caveman" uses LLM-based caveman-style compression (needs an LLM; see compactprompt.caveman.CavemanCompressor).

'builtin'
pruner Optional[object]

An explicit pruning engine instance (overrides engine). See compactprompt.llmlingua.LLMLinguaCompressor.

None

Returns:

Type Description
CompactResult

A CompactResult. result.text is the compressed prompt; result.restore() reverses the abbreviation step.

Example

from compactprompt import CompactPrompt
r = CompactPrompt.compact("Please kindly summarize ...")
r.text # doctest: +SKIP
r.ratio # doctest: +SKIP

compactprompt.CompactResult dataclass

The output of CompactPrompt.compact.

Attributes:

Name Type Description
text str

The final compressed prompt.

original str

The input prompt.

tokens_before / tokens_after

Token counts.

dictionary Dict[str, str]

Reversible n-gram mapping (empty if abbreviation was off).

steps List[str]

Names of the strategies applied, in order.

stats Dict[str, object]

Per-step diagnostic numbers.

ratio property

ratio: float

Compression ratio tokens_before / tokens_after (e.g. 2.3x).

savings property

savings: float

Fraction of tokens saved, in [0, 1].
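The two properties above are simple functions of the token counts. A small sketch of the arithmetic, assuming savings is defined as 1 - tokens_after / tokens_before and clamped at zero (the clamp and the function names are assumptions made here to match the documented [0, 1] range, not taken from the library source):

```python
def ratio_sketch(tokens_before: int, tokens_after: int) -> float:
    """Compression ratio, e.g. 230 -> 100 tokens is 2.3x."""
    return tokens_before / tokens_after  # assumes tokens_after > 0


def savings_sketch(tokens_before: int, tokens_after: int) -> float:
    """Fraction of tokens saved; clamped so a longer output reports 0."""
    return max(0.0, 1.0 - tokens_after / tokens_before)  # assumes tokens_before > 0


print(ratio_sketch(230, 100))    # 2.3
print(savings_sketch(200, 50))   # 0.75
```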

restore

restore() -> str

Undo the lossless n-gram abbreviation step.

Note: hard-prompt pruning is lossy, so this recovers the pruned prompt with abbreviations expanded, not the verbatim original.

Shortcuts

compactprompt.compact

compact(prompt: str, **kwargs) -> CompactResult

Shorthand for CompactPrompt.compact.

compactprompt.abbreviate

abbreviate(text: str, n: int = 2, top_k: int = 100) -> Abbreviation

Shorthand for reversible n-gram abbreviation.

compactprompt.restore

restore(text: str, dictionary) -> str

Reverse n-gram abbreviation given its dictionary.

Strategies

Hard prompt pruning

compactprompt.HardPromptCompressor

Prune low-information phrases from prompts.

Parameters:

Name Type Description Default
scorer Optional[DynamicScorer]

Pluggable dynamic self-information scorer (text -> surprisals). Dynamic scoring is opt-in: pass a scorer such as compactprompt.scoring.LocalLMScorer to enable it. If None, static-only scoring is used.

None
static Optional[StaticSelfInformation]

Static self-information scorer. Defaults to the best available (wordfreq if installed, else bootstrapped from the input text).

None
delta_threshold float

Fusion threshold for Eq. 2-3 (paper default 0.1).

0.1
use_phrases bool

Group words into phrases with spaCy when available.

True
spacy_model str

spaCy model name for dependency parsing.

'en_core_web_sm'
protect_entities bool

Never prune named entities / numbers (recommended).

True

compress

compress(text: str, ratio: float = 0.5, budget: Optional[int] = None) -> HardPromptResult

Prune text down to a token budget.

Parameters:

Name Type Description Default
text str

The prompt to compress.

required
ratio float

Target fraction of tokens to remove (0-1). 0.5 aims to halve the prompt. Ignored when budget is given.

0.5
budget Optional[int]

Optional absolute target token count. Pruning stops once the prompt is at or below this many tokens.

None

Returns:

Type Description
HardPromptResult

A HardPromptResult. Note: hard pruning is lossy (removed words are not recoverable); use n-gram abbreviation for reversible compression.
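To make the ratio/budget semantics concrete, here is a rough sketch of budget-driven pruning. It is not the library's algorithm: it scores single words by their in-text frequency instead of fused static/dynamic self-information over phrases, counts words rather than tokens, and treats "contains a digit" as a stand-in for entity protection. prune_sketch is a hypothetical name.

```python
from collections import Counter
from math import log2
from typing import Optional


def prune_sketch(text: str, ratio: float = 0.5, budget: Optional[int] = None) -> str:
    words = text.split()
    # budget (absolute target) takes precedence over ratio (fraction to remove).
    target = budget if budget is not None else round(len(words) * (1 - ratio))

    counts = Counter(w.lower() for w in words)
    total = len(words)

    def info(word: str) -> float:
        # Self-information in bits: frequent words carry less information.
        return -log2(counts[word.lower()] / total)

    def protected(word: str) -> bool:
        # Crude stand-in for "never prune named entities / numbers".
        return any(ch.isdigit() for ch in word)

    # Candidate positions, least informative first (ties keep text order).
    order = sorted((i for i, w in enumerate(words) if not protected(w)),
                   key=lambda i: info(words[i]))

    dropped = set()
    for i in order:
        if len(words) - len(dropped) <= target:
            break  # stop once the prompt is at or below the budget
        dropped.add(i)
    return " ".join(w for i, w in enumerate(words) if i not in dropped)


print(prune_sketch("the cat and the dog and the bird saw 3 owls", budget=6))
# cat dog bird saw 3 owls
```

The repeated words ("the", "and") go first because they have the lowest self-information, and the number survives even with a budget of zero.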

compactprompt.HardPromptResult dataclass

Result of HardPromptCompressor.compress.

ratio property

ratio: float

Compression ratio tokens_before / tokens_after (e.g. 2.3x).

savings property

savings: float

Fraction of tokens removed, in [0, 1].

from_texts classmethod

from_texts(original: str, compressed: str, removed_units: Optional[List[str]] = None) -> 'HardPromptResult'

Build a result, counting tokens of both strings with count_tokens.

Pruning engines should use this so token counts always come from this library's tokenizer (not a backend's), keeping ratios comparable.

N-gram abbreviation

compactprompt.NgramAbbreviator

Reversibly abbreviate frequent n-grams.

Parameters:

Name Type Description Default
n int

N-gram length in words. The paper finds n=2 best for QA accuracy.

2
top_k int

Number of top-frequency patterns to abbreviate. The paper's extraction step selects ~100-150; its best-accuracy ablation uses 3.

100
min_count int

Only abbreviate patterns occurring at least this many times.

2
marker str

Prefix for placeholders (default "@"). Placeholders are kept token-cheap (e.g. @0) so abbreviation actually reduces tokens.

'@'
placeholder Optional[Callable[[int], str]]

Optional function index -> placeholder string to fully override the placeholder scheme.

None
require_savings bool

When True (default), only abbreviate a pattern if its placeholder costs strictly fewer tokens than the phrase, so the output is never longer than the input.

True

compress

compress(text: str) -> Abbreviation

Abbreviate text and return a reversible Abbreviation.

decompress staticmethod

decompress(text: str, dictionary: Dict[str, str]) -> str

Reverse abbreviation. Placeholders are restored longest-first.
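The sketch below illustrates the idea behind compress and decompress, and why placeholders are restored longest-first: with a plain string replace, "@1" is a prefix of "@10", so the longer placeholder has to be expanded before the shorter one. This is an illustration under simplifying assumptions (whitespace tokenisation, input text containing no "@" characters, no require_savings check), and the function names are hypothetical, not the library's.

```python
from collections import Counter
from typing import Dict, Tuple


def abbreviate_sketch(text: str, n: int = 2, top_k: int = 100,
                      min_count: int = 2, marker: str = "@") -> Tuple[str, Dict[str, str]]:
    words = text.split()
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    dictionary: Dict[str, str] = {}
    for gram, count in counts.most_common(top_k):
        if count < min_count:
            break  # remaining n-grams are even rarer
        placeholder = f"{marker}{len(dictionary)}"
        out, i, hits = [], 0, 0
        while i < len(words):
            if tuple(words[i:i + n]) == gram:
                out.append(placeholder)
                i += n
                hits += 1
            else:
                out.append(words[i])
                i += 1
        # Earlier replacements may have consumed this n-gram; skip it then.
        if hits >= min_count:
            words = out
            dictionary[placeholder] = " ".join(gram)
    return " ".join(words), dictionary


def restore_sketch(text: str, dictionary: Dict[str, str]) -> str:
    # Longest placeholders first, so "@10" is expanded before "@1".
    for placeholder in sorted(dictionary, key=len, reverse=True):
        text = text.replace(placeholder, dictionary[placeholder])
    return text


original = "the cat sat on the mat and the cat sat on the rug"
short, mapping = abbreviate_sketch(original)
print(short, mapping)
print(restore_sketch(short, mapping) == original)  # True
```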

compactprompt.Abbreviation dataclass

Result of NgramAbbreviator.compress.

Attributes:

Name Type Description
text str

The abbreviated text.

dictionary Dict[str, str]

Reversible mapping placeholder -> original phrase.

original_tokens int

Word count before abbreviation (for reporting).

restore

restore() -> str

Reconstruct the original text exactly (lossless).

Numeric quantization

compactprompt.quantize

Numerical Quantization (CompactPrompt Sec. 3.3).

Shrink the token footprint of numeric data by lowering precision within a bounded error.

  • Uniform integer quantization (Eq. 4-5): map values to L = 2**b integer levels with a guaranteed max absolute error (max-min)/(L-1).
  • K-means quantization: map values to k learned centroids, minimizing average squared reconstruction error.

Uniform quantization is pure Python (works on lists or numpy arrays). K-means quantization uses scikit-learn (install the ml extra).
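As a rough illustration of the uniform scheme described above, the sketch below maps values onto L = 2**b evenly spaced levels between the column minimum and maximum. It is not the library's code and the function names are hypothetical; it assumes bits >= 1 and round-to-nearest, which keeps the error at or below half a step and therefore within the (max-min)/(L-1) bound quoted above.

```python
from typing import List, Sequence, Tuple


def quantize_uniform_sketch(values: Sequence[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Return (codes, minimum, step) for L = 2**bits evenly spaced levels."""
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    step = (hi - lo) / (levels - 1)
    if step == 0:
        # Constant column: every value maps to level 0.
        return [0] * len(values), lo, step
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def reconstruct_sketch(codes: Sequence[int], lo: float, step: float) -> List[float]:
    """Approximate the original values from integer codes."""
    return [lo + c * step for c in codes]


values = [0.0, 0.1, 0.7, 1.0]
codes, lo, step = quantize_uniform_sketch(values, bits=2)
approx = reconstruct_sketch(codes, lo, step)
print(codes)  # [0, 0, 2, 3]
print(max(abs(a - v) for a, v in zip(approx, values)) <= step)  # True
```

Only the codes plus two floats (minimum and step) need to be kept, which is where the token saving comes from.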

QuantizedColumn dataclass

A quantized numeric column plus the metadata needed to reconstruct it.

Attributes:

Name Type Description
codes List[int]

Integer codes (uniform) or centroid indices (k-means).

method str

"uniform" or "kmeans".

metadata dict

Reconstruction metadata (e.g. min, max, bits or centroids).

max_error property

max_error: float

The bound epsilon_max on absolute reconstruction error.

reconstruct

reconstruct() -> List[float]

Reconstruct approximate original values from codes + metadata.

quantize_uniform

quantize_uniform(values: Sequence[float], bits: int = 8) -> QuantizedColumn

Uniform integer quantization (CompactPrompt Eq. 4-5).

Parameters:

Name Type Description Default
values Sequence[float]

The numeric column.

required
bits int

Bit-width b; yields L = 2**b levels.

8

Returns:

Type Description
QuantizedColumn

A QuantizedColumn with method="uniform".

quantize_kmeans

quantize_kmeans(values: Sequence[float], k: int = 16, random_state: int = 42) -> QuantizedColumn

K-means quantization: map values to k centroids.

Requires scikit-learn (pip install 'compactprompt[ml]').

Parameters:

Name Type Description Default
values Sequence[float]

The numeric column.

required
k int

Number of centroids. Clamped to the number of distinct values.

16

Returns:

Type Description
QuantizedColumn

A QuantizedColumn with method="kmeans".
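The library delegates clustering to scikit-learn. Purely to show what "map values to k centroids" and the clamping of k mean, here is a dependency-free one-dimensional Lloyd's iteration. It is a stand-in under stated assumptions (deterministic initialisation at evenly spaced distinct values, k >= 1), not the library's implementation, and kmeans_1d_sketch is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def kmeans_1d_sketch(values: Sequence[float], k: int = 16,
                     max_iter: int = 100) -> Tuple[List[int], List[float]]:
    distinct = sorted(set(values))
    k = min(k, len(distinct))  # clamp k to the number of distinct values
    # Deterministic start: centroids spread evenly over the distinct values.
    centroids = [distinct[round(i * (len(distinct) - 1) / max(k - 1, 1))]
                 for i in range(k)]
    codes: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every value.
        new_codes = [min(range(k), key=lambda j: abs(v - centroids[j]))
                     for v in values]
        if new_codes == codes:
            break  # assignments are stable, so the centroids are too
        codes = new_codes
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, c in zip(values, codes) if c == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return codes, centroids


codes_km, centroids_km = kmeans_1d_sketch([1.0, 1.1, 0.9, 10.0, 10.2, 9.8], k=2)
print(codes_km)  # [0, 0, 0, 1, 1, 1]
```

Reconstruction is then a lookup: each code is replaced by its centroid.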

quantize

quantize(values: Sequence[float], method: str = 'uniform', bits: int = 8, k: int = 16) -> QuantizedColumn

Quantize a numeric column by method ("uniform" or "kmeans").

quantize_dataframe

quantize_dataframe(df: 'object', columns: Optional[Iterable[str]] = None, method: str = 'uniform', bits: int = 8, k: int = 16) -> tuple

Quantize numeric columns of a pandas DataFrame, returning a new DataFrame rather than modifying the input.

Parameters:

Name Type Description Default
df 'object'

A pandas DataFrame.

required
columns Optional[Iterable[str]]

Columns to quantize; defaults to all numeric columns.

None
method str

Quantization method, passed to quantize.

'uniform'
bits int

Bit-width for uniform quantization.

8
k int

Number of clusters for k-means quantization.

16

Returns:

Type Description
tuple

A (new_df, results) tuple where new_df has reconstructed (quantized) values and results maps column name to QuantizedColumn.

compactprompt.quantize_uniform

quantize_uniform(values: Sequence[float], bits: int = 8) -> QuantizedColumn

Uniform integer quantization (CompactPrompt Eq. 4-5).

Parameters:

Name Type Description Default
values Sequence[float]

The numeric column.

required
bits int

Bit-width b; yields L = 2**b levels.

8

Returns:

Type Description
QuantizedColumn

A QuantizedColumn with method="uniform".

compactprompt.quantize_kmeans

quantize_kmeans(values: Sequence[float], k: int = 16, random_state: int = 42) -> QuantizedColumn

K-means quantization: map values to k centroids.

Requires scikit-learn (pip install 'compactprompt[ml]').

Parameters:

Name Type Description Default
values Sequence[float]

The numeric column.

required
k int

Number of centroids. Clamped to the number of distinct values.

16

Returns:

Type Description
QuantizedColumn

A QuantizedColumn with method="kmeans".

compactprompt.quantize_dataframe

quantize_dataframe(df: 'object', columns: Optional[Iterable[str]] = None, method: str = 'uniform', bits: int = 8, k: int = 16) -> tuple

Quantize numeric columns of a pandas DataFrame, returning a new DataFrame rather than modifying the input.

Parameters:

Name Type Description Default
df 'object'

A pandas DataFrame.

required
columns Optional[Iterable[str]]

Columns to quantize; defaults to all numeric columns.

None
method str

Quantization method, passed to quantize.

'uniform'
bits int

Bit-width for uniform quantization.

8
k int

Number of clusters for k-means quantization.

16

Returns:

Type Description
tuple

A (new_df, results) tuple where new_df has reconstructed (quantized) values and results maps column name to QuantizedColumn.

compactprompt.QuantizedColumn dataclass

A quantized numeric column plus the metadata needed to reconstruct it.

Attributes:

Name Type Description
codes List[int]

Integer codes (uniform) or centroid indices (k-means).

method str

"uniform" or "kmeans".

metadata dict

Reconstruction metadata (e.g. min, max, bits or centroids).

max_error property

max_error: float

The bound epsilon_max on absolute reconstruction error.

reconstruct

reconstruct() -> List[float]

Reconstruct approximate original values from codes + metadata.

Few-shot example selection

compactprompt.select_examples

select_examples(texts: Sequence[str], k_range: Tuple[int, int] = (5, 50), numeric_features: Optional[Sequence[Sequence[float]]] = None, embedder: Optional[Callable[[List[str]], 'object']] = None, random_state: int = 42) -> SelectionResult

Select representative few-shot exemplars via clustering.

Parameters:

Name Type Description Default
texts Sequence[str]

Candidate exemplar texts.

required
k_range Tuple[int, int]

Inclusive (min_k, max_k) search range for cluster count.

(5, 50)
numeric_features Optional[Sequence[Sequence[float]]]

Optional per-text numeric features; standardized and concatenated to the embeddings (as in the paper).

None
embedder Optional[Callable[[List[str]], 'object']]

Callable List[str] -> array of embeddings. Defaults to all-mpnet-base-v2 via sentence-transformers.

None
random_state int

Seed for k-means.

42

Returns:

Type Description
SelectionResult

A SelectionResult.
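The numeric_features option is described as "standardized and concatenated to the embeddings". A minimal sketch of that preprocessing step, assuming z-scoring with the population standard deviation (the exact scaler is an assumption) and plain Python lists in place of arrays; concat_features_sketch is a hypothetical helper, not part of the library:

```python
from statistics import mean, pstdev
from typing import List, Sequence


def concat_features_sketch(embeddings: Sequence[Sequence[float]],
                           numeric_features: Sequence[Sequence[float]]) -> List[List[float]]:
    # Per-column mean and standard deviation; constant columns use 1.0
    # as the divisor so they become all zeros instead of dividing by zero.
    columns = list(zip(*numeric_features))
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    rows = []
    for emb, feats in zip(embeddings, numeric_features):
        z = [(x - m) / s for x, (m, s) in zip(feats, stats)]
        rows.append(list(emb) + z)  # embedding followed by z-scored features
    return rows


rows = concat_features_sketch([[0.1, 0.2], [0.3, 0.4]], [[10.0], [20.0]])
print(rows)  # [[0.1, 0.2, -1.0], [0.3, 0.4, 1.0]]
```

Standardizing first keeps a large-valued numeric column from dominating the distance computations used for clustering.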

compactprompt.SelectionResult dataclass

Result of select_examples.

Attributes:

Name Type Description
indices List[int]

Indices (into the input list) of the chosen exemplars.

examples List[str]

The chosen exemplar texts.

k_star int

The selected number of clusters.

silhouette float

Silhouette score at k_star.

silhouette_by_k dict

{k: score} for every evaluated k.

Measuring fidelity

compactprompt.cosine_fidelity

cosine_fidelity(original: Union[str, Sequence[str]], compressed: Union[str, Sequence[str]], embedder: Optional[Callable[[List[str]], 'object']] = None) -> FidelityResult

Compute cosine-similarity fidelity between original and compressed text.

Accepts either single strings or equal-length sequences of strings.

Returns:

Type Description
FidelityResult

A FidelityResult with per-pair scores plus their mean and 5th percentile.

compactprompt.FidelityResult dataclass

Cosine-similarity fidelity statistics.

Attributes:

Name Type Description
scores List[float]

Per-pair cosine similarities.

mean float

Mean cosine similarity.

p5 float

5th-percentile cosine similarity (worst-case fidelity).
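For readers who want to see what these statistics amount to, the sketch below computes cosine similarity between two embedding vectors and then the mean and 5th percentile of a list of scores. The embedding step is omitted, the helper names are hypothetical, and the percentile uses linear interpolation between order statistics, which is an assumption about the method rather than a statement of what the library does.

```python
from math import sqrt
from typing import Sequence, Tuple


def cosine_sketch(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0  # zero vectors score 0 by convention here


def fidelity_stats_sketch(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, 5th percentile) of per-pair similarities."""
    ordered = sorted(scores)
    rank = 0.05 * (len(ordered) - 1)          # fractional index of the percentile
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    p5 = ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)
    return sum(scores) / len(scores), p5


print(cosine_sketch([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_sketch([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

The 5th percentile is the useful number when a few badly compressed prompts matter more than the average.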

Alternative pruning engines

compactprompt.LLMLinguaCompressor

Adapter exposing LLMLingua as a compactprompt pruning engine.

Parameters:

Name Type Description Default
model_name str

Hugging Face model for LLMLingua. Defaults to a compact LLMLingua-2 model.

DEFAULT_MODEL
use_llmlingua2 bool

Use the LLMLingua-2 token-classification compressor (recommended; faster and matches DEFAULT_MODEL).

True
device_map str

Torch device ("cpu", "cuda", ...). Defaults to CPU.

'cpu'
compressor_kwargs object

Forwarded to llmlingua.PromptCompressor.

{}

The underlying model is loaded lazily on the first compress call (or eagerly via load), so constructing this object is cheap and import-safe.

load

load()

Eagerly load the LLMLingua model (otherwise loaded on first use).

compress

compress(text: str, ratio: float = 0.5, budget: Optional[int] = None, instruction: str = '', question: str = '', **compress_kwargs: object) -> HardPromptResult

Compress text with LLMLingua, returning a HardPromptResult.

Parameters:

Name Type Description Default
text str

The prompt to compress.

required
ratio float

Target fraction of tokens to remove (0-1). Mapped to LLMLingua's rate (keep-fraction) as 1 - ratio. Ignored when budget is set.

0.5
budget Optional[int]

Absolute target token count. Mapped to LLMLingua's target_token.

None
instruction str

Optional LLMLingua query-aware field; LLMLingua keeps tokens relevant to it.

''
question str

Optional LLMLingua query-aware field; LLMLingua keeps tokens relevant to it.

''
compress_kwargs object

Forwarded to PromptCompressor.compress_prompt.

{}

Returns:

Type Description
HardPromptResult

A HardPromptResult. Token counts use this library's compactprompt.tokens.count_tokens for consistency with the rest of the pipeline.
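The parameter mapping described above is easy to get backwards, because this library's ratio is the fraction to remove while LLMLingua's rate is the fraction to keep. A small sketch of the translation follows; llmlingua_target_sketch is a hypothetical helper, and only the rate and target_token keyword names come from the parameter descriptions above.

```python
from typing import Dict, Optional, Union


def llmlingua_target_sketch(ratio: float = 0.5,
                            budget: Optional[int] = None) -> Dict[str, Union[int, float]]:
    """Translate (ratio, budget) into the keyword LLMLingua would receive."""
    if budget is not None:
        # An absolute token target overrides the ratio.
        return {"target_token": budget}
    # ratio = fraction removed, rate = fraction kept.
    return {"rate": 1.0 - ratio}


print(llmlingua_target_sketch(ratio=0.25))              # {'rate': 0.75}
print(llmlingua_target_sketch(ratio=0.25, budget=300))  # {'target_token': 300}
```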

compactprompt.CavemanCompressor

Caveman LLM-based compression as a compactprompt pruning engine.

Parameters:

Name Type Description Default
llm Optional[LLM]

Pluggable callable prompt -> completion. Defaults to default_anthropic_llm (Anthropic SDK or claude CLI).

None
model Optional[str]

Model name for the default LLM caller (ignored if llm given).

None
max_retries int

Validation fix-retry attempts (caveman default 2).

2

compress

compress(text: str, ratio: float = 0.5, budget: Optional[int] = None) -> HardPromptResult

Compress text into caveman style, preserving structure.

ratio and budget are ignored: caveman rewrites prose to its own degree rather than to a token target.

Returns:

Type Description
HardPromptResult

A HardPromptResult. Raises ValueError if the LLM cannot produce a structure-valid compression within max_retries.

Scoring internals

compactprompt.StaticSelfInformation

Estimate I_stat(t) = -log2 p(t) from token probabilities.

Build it one of three ways:

  • from_corpus: count unigrams in a corpus you supply (the faithful reproduction of the paper's offline corpus statistics).
  • from_wordfreq: use the wordfreq package's global frequency table (install the freq extra). A good zero-setup default.
  • from_text: bootstrap from the text being compressed itself. Needs nothing installed; rare words in the prompt score as informative.

score

score(token: str) -> float

Return the static self-information of token in bits.

from_corpus classmethod

from_corpus(texts: Sequence[str], smoothing: float = 1.0) -> 'StaticSelfInformation'

Build unigram counts from texts, with additive (Laplace) smoothing applied to the probabilities.

from_text classmethod

from_text(text: str, smoothing: float = 1.0) -> 'StaticSelfInformation'

Bootstrap static statistics from a single document (no deps).

from_wordfreq classmethod

from_wordfreq(lang: str = 'en') -> 'StaticSelfInformation'

Use the global wordfreq frequency table (needs the freq extra).
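A minimal sketch of the from_corpus idea, I_stat(t) = -log2 p(t) with additive smoothing, is shown below. The class name, whitespace tokenisation, lower-casing, and the extra vocabulary slot reserved for unseen tokens are all assumptions made for illustration; the library's estimator may differ in each of these.

```python
from collections import Counter
from math import log2
from typing import Sequence


class StaticInfoSketch:
    """Unigram self-information with add-s smoothing (illustrative only)."""

    def __init__(self, texts: Sequence[str], smoothing: float = 1.0) -> None:
        self.counts = Counter(w.lower() for t in texts for w in t.split())
        self.smoothing = smoothing  # assumed > 0 so unseen tokens stay finite
        # One extra vocabulary slot keeps probability mass for unseen tokens.
        vocab = len(self.counts) + 1
        self.denominator = sum(self.counts.values()) + smoothing * vocab

    def score(self, token: str) -> float:
        p = (self.counts[token.lower()] + self.smoothing) / self.denominator
        return -log2(p)  # bits: rarer tokens score higher


scorer = StaticInfoSketch(["the cat saw the dog", "the end"])
print(scorer.score("the") < scorer.score("cat") < scorer.score("zebra"))  # True
```

Frequent words such as "the" come out cheapest, which is what makes them the first candidates for pruning.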

compactprompt.LocalLMScorer

Offline dynamic self-information via a local Hugging Face causal LM.

Computes per-token surprisal -log2 P_model(t | context) by running the model once over the text and reading the log-probability assigned to each actual next token. Subword pieces are merged back to whole words by summing their surprisals (so the score aligns with word/phrase pruning).

Requires the dynamic extra (torch + transformers). Defaults to GPT-2, which is small and downloads quickly; pass any causal LM name.
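The subword-merging step described above can be sketched without loading a model. The example assumes GPT-2 style byte-level BPE pieces, where a leading "Ġ" marks the start of a new word; the surprisal numbers are made up for illustration, and merge_subwords_sketch is a hypothetical helper rather than the library's code.

```python
from typing import List, Tuple


def merge_subwords_sketch(pieces: List[Tuple[str, float]],
                          word_start: str = "Ġ") -> List[Tuple[str, float]]:
    """Sum per-piece surprisals (bits) into per-word surprisals."""
    words: List[List] = []
    for piece, surprisal in pieces:
        if piece.startswith(word_start) or not words:
            # A new word begins: strip the marker and start a fresh total.
            words.append([piece.lstrip(word_start), surprisal])
        else:
            # Continuation piece: extend the word and add its surprisal.
            words[-1][0] += piece
            words[-1][1] += surprisal
    return [(w, s) for w, s in words]


merged = merge_subwords_sketch([("Com", 4.0), ("press", 2.5), ("Ġthis", 1.0)])
print(merged)  # [('Compress', 6.5), ('this', 1.0)]
```

Summing is the natural choice because surprisals are negative log-probabilities, so the sum over pieces is the surprisal of the whole word under the chain rule.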

compactprompt.count_tokens

count_tokens(text: str, encoding: str = 'cl100k_base') -> int

Count the number of tokens in text.

Uses tiktoken when installed (encoding cl100k_base by default, the tokenizer used by GPT-4 and GPT-3.5-turbo); otherwise falls back to a regex word/punctuation tokenizer.

Parameters:

Name Type Description Default
text str

The string to measure.

required
encoding str

Name of the tiktoken encoding to use when available.

'cl100k_base'

Returns:

Type Description
int

The number of tokens.
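When tiktoken is not installed, the documented fallback is a regex word/punctuation tokenizer. The exact pattern is not given in this reference, so the one below is an assumption: runs of word characters count as one token each, and every other non-space character counts as its own token.

```python
import re

# Assumed pattern: a run of word characters, or a single punctuation mark.
_WORD_OR_PUNCT = re.compile(r"\w+|[^\w\s]")


def count_tokens_fallback_sketch(text: str) -> int:
    """Approximate token count without any tokenizer dependency."""
    return len(_WORD_OR_PUNCT.findall(text))


print(count_tokens_fallback_sketch("Hello, world!"))  # 4
```

Counts from a fallback like this will not match a BPE tokenizer exactly, so ratios are only comparable when both sides are measured the same way.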