Hacker News

pbhjpbhj, last Saturday at 6:00 PM

It feels like one could produce a digest of the context that works very similarly but fits in the available context window: not just by getting the LLM to use succinct language, but also mathematically, like reducing a sparse matrix.

There might be an input that would produce that sort of effect. Perhaps it looks like nonsense (like reading zipped data), but when the LLM attempts to run inference on it, the outcome is close to consuming the full context?
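To make the two analogies in this comment concrete, here is a minimal Python sketch of what lossless reduction looks like in the sparse-matrix and zip cases. The helper names `to_sparse` and `to_dense` are made up for illustration, and none of this involves an LLM. It only shows that both analogies rely on an exact inverse step, and the question is whether a model has anything like one.

```python
import zlib


def to_sparse(dense):
    """Keep only the non-zero entries, keyed by (row, col)."""
    return {
        (r, c): value
        for r, row in enumerate(dense)
        for c, value in enumerate(row)
        if value != 0
    }


def to_dense(sparse, shape):
    """Exact inverse of to_sparse: rebuild the full matrix from its entries."""
    rows, cols = shape
    return [[sparse.get((r, c), 0) for c in range(cols)] for r in range(rows)]


# Sparse-matrix analogy: 12 cells reduce to 2 stored entries, with no loss.
dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 0],
]
sparse = to_sparse(dense)
assert to_dense(sparse, (3, 4)) == dense

# Zip analogy: the packed bytes look like nonsense, but decompress() is an
# exact inverse of compress(), so the original text comes back unchanged.
text = ("the quick brown fox jumps over the lazy dog. " * 20).encode("utf-8")
packed = zlib.compress(text)
assert zlib.decompress(packed) == text
assert len(packed) < len(text)
```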


Replies

docjay, last Saturday at 8:43 PM

```
§CONV_DIGEST§
T1:usr_query@llm-ctx-compression→math-analog(sparse-matrix|zip)?token-seq→nonsense-input→semantic-equiv-output?
T2:rsp@asymmetry_problem:compress≠decompress|llm=predict¬decode→no-bijective-map|soft-prompts∈embedding-space¬token-space+require-training|gisting(ICAE)=aux-model-compress→memory-tokens|token-compress-fails:nonlinear-distributed-mapping+syntax-semantic-entanglement|works≈lossy-semantic-distill@task-specific+finetune=collapse-instruction→weights
§T3:usr→design-full-python-impl§
T4:arch_blueprint→
DIR:src/context_compressor/{core/(base|result|pipeline)|compressors/(extractive|abstractive|semantic|entity_graph|soft_prompt|gisting|hybrid)|embeddings/(providers|clustering)|evaluation/(metrics|task_performance|benchmark)|models/(base|openai|anthropic|local)|utils/(tokenization|text_processing|config)}
CLASSES:CompressionMethod=Enum(EXTRACTIVE|ABSTRACTIVE|SEMANTIC_CLUSTERING|ENTITY_GRAPH|SOFT_PROMPT|GISTING|HYBRID)|CompressionResult@(original_text+compressed_text+original_tokens+compressed_tokens+method+compression_ratio+metadata+soft_vectors?)|TokenCounter=Protocol(count|truncate_to_limit)|EmbeddingProvider=Protocol(embed|embed_single)|LLMBackend=Protocol(generate|get_token_limit)|ContextCompressor=ABC(token_counter+target_ratio=0.25+min_tokens=50+max_tokens?→compress:abstract)|TrainableCompressor(ContextCompressor)+(train+save+load)
COMPRESSORS:extractive→(TextRank|MMR|LeadSentence)|abstractive→(LLMSummary|ChainOfDensity|HierarchicalSummary)|semantic→(ClusterCentroid|SemanticChunk|DiversityMaximizer)|entity→(EntityRelation|FactList)|soft→(SoftPrompt|PromptTuning)|gist→(GistToken|Autoencoder)|hybrid→(Cascade|Ensemble|Adaptive)
EVAL:EvaluationResult@(compression_ratio+token_reduction+embedding_similarity+entailment_score+entity_recall+fact_recall+keyword_overlap+qa_accuracy?+reconstruction_bleu?)→composite_score(weights)|CompressionEvaluator(embedding_provider+llm?+nli?)→evaluate|compare_methods
PIPELINE:CompressionPipeline(steps:list[Compressor])→sequential-apply|AdaptiveRouter(compressors:dict+classifier?)→content-based-routing
DEPS:numpy|torch|transformers|sentence-transformers|tiktoken|networkx|sklearn|spacy|openai|anthropic|pandas|pydantic+optional(accelerate|peft|datasets|sacrebleu|rouge-score)
```
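For anyone trying to unpack the T4 blueprint by hand, here is one possible reading of the `CLASSES` entries as plain Python. It is a sketch, not the implementation the digest was made from. The class and field names, `target_ratio=0.25` and `min_tokens=50` come from the digest. Everything else is an assumption: the whitespace tokenizer, the `budget` helper, the body of the lead-sentence compressor, and the definition of `compression_ratio` as compressed tokens divided by original tokens.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional, Protocol


class CompressionMethod(Enum):
    EXTRACTIVE = "extractive"
    ABSTRACTIVE = "abstractive"
    SEMANTIC_CLUSTERING = "semantic_clustering"
    ENTITY_GRAPH = "entity_graph"
    SOFT_PROMPT = "soft_prompt"
    GISTING = "gisting"
    HYBRID = "hybrid"


class TokenCounter(Protocol):
    def count(self, text: str) -> int: ...
    def truncate_to_limit(self, text: str, limit: int) -> str: ...


@dataclass
class CompressionResult:
    original_text: str
    compressed_text: str
    original_tokens: int
    compressed_tokens: int
    method: CompressionMethod
    compression_ratio: float  # assumed: compressed_tokens / original_tokens
    metadata: dict[str, Any] = field(default_factory=dict)
    soft_vectors: Optional[Any] = None  # only meaningful for soft-prompt methods


class ContextCompressor(ABC):
    def __init__(self, token_counter: TokenCounter, target_ratio: float = 0.25,
                 min_tokens: int = 50, max_tokens: Optional[int] = None):
        self.token_counter = token_counter
        self.target_ratio = target_ratio
        self.min_tokens = min_tokens
        self.max_tokens = max_tokens

    def budget(self, original_tokens: int) -> int:
        """Hypothetical helper: token budget implied by the constructor settings."""
        wanted = max(self.min_tokens, int(original_tokens * self.target_ratio))
        return wanted if self.max_tokens is None else min(wanted, self.max_tokens)

    @abstractmethod
    def compress(self, text: str) -> CompressionResult: ...


class WhitespaceTokenCounter:
    """Stand-in tokenizer (an assumption): one token per whitespace-separated word."""

    def count(self, text: str) -> int:
        return len(text.split())

    def truncate_to_limit(self, text: str, limit: int) -> str:
        return " ".join(text.split()[:limit])


class LeadSentenceCompressor(ContextCompressor):
    """Guess at the 'LeadSentence' entry: keep leading sentences within the budget."""

    def compress(self, text: str) -> CompressionResult:
        original = self.token_counter.count(text)
        limit = self.budget(original)
        kept, used = [], 0
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            n = self.token_counter.count(sentence)
            if kept and used + n > limit:
                break
            kept.append(sentence)
            used += n
        # Truncate in case the first sentence alone exceeds the budget.
        compressed = self.token_counter.truncate_to_limit(" ".join(kept), limit)
        compressed_tokens = self.token_counter.count(compressed)
        ratio = compressed_tokens / original if original else 1.0
        return CompressionResult(text, compressed, original, compressed_tokens,
                                 CompressionMethod.EXTRACTIVE, ratio)


# 40 five-word sentences (200 tokens) reduce to the first 10 sentences.
document = " ".join(f"Item {i} is here now." for i in range(40))
result = LeadSentenceCompressor(WhitespaceTokenCounter()).compress(document)
print(result.compressed_tokens, result.compression_ratio)
```

Even this toy version is lossy, since the last 30 sentences are simply dropped. That is consistent with the digest's own T2 point that what works in practice is closer to lossy semantic distillation than to a reversible encoding.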