libs/langchain_v1/langchain/agents/middleware/summarization.py · langchain-ai/langchain

"""Summarization middleware."""

import uuid
import warnings
from collections.abc import Callable, Iterable, Mapping
from functools import partial
from typing import Any, Literal, TypedDict, cast

from langchain_core.messages import (
    AIMessage,
    AnyMessage,
    MessageLikeRepresentation,
    RemoveMessage,
    ToolMessage,
)
from langchain_core.messages.human import HumanMessage
from langchain_core.messages.utils import (
    count_tokens_approximately,
    get_buffer_string,
    trim_messages,
)
from langgraph.graph.message import (
    REMOVE_ALL_MESSAGES,
)
from langgraph.runtime import Runtime
from typing_extensions import override

from langchain.agents.middleware.types import AgentMiddleware, AgentState, ContextT, ResponseT
from langchain.chat_models import BaseChatModel, init_chat_model

TokenCounter = Callable[[Iterable[MessageLikeRepresentation]], int]

DEFAULT_SUMMARY_PROMPT = """<role>
Context Extraction Assistant
</role>

<primary_objective>
Your sole objective in this task is to extract the highest quality/most relevant context from the conversation history below.
</primary_objective>

<objective_information>
You're nearing the total number of input tokens you can accept, so you must extract the highest quality/most relevant pieces of information from your conversation history.
This context will then overwrite the conversation history presented below. Because of this, ensure the context you extract is only the most important information to continue working toward your overall goal.
</objective_information>

<instructions>
The conversation history below will be replaced with the context you extract in this step.
You want to ensure that you don't repeat any actions you've already completed, so the context you extract from the conversation history should be focused on the most important information to your overall goal.

You should structure your summary using the following sections. Each section acts as a checklist - you must populate it with relevant information or explicitly state "None" if there is nothing to report for that section:

## SESSION INTENT

What is the user's primary goal or request? What overall task are you trying to accomplish? This should be concise but complete enough to understand the purpose of the entire session.

## SUMMARY

Extract and record all of the most important context from the conversation history. Include important choices, conclusions, or strategies determined during this conversation. Include the reasoning behind key decisions. Document any rejected options and why they were not pursued.

## ARTIFACTS

What artifacts, files, or resources were created, modified, or accessed during this conversation? For file modifications, list specific file paths and briefly describe the changes made to each. This section prevents silent loss of artifact information.

## NEXT STEPS

What specific tasks remain to be completed to achieve the session intent? What should you do next?

</instructions>

The user will message you with the full message history from which you'll extract context to create a replacement. Carefully read through it all and think deeply about what information is most important to your overall goal and should be saved:

With all of this in mind, please carefully read over the entire conversation history, and extract the most important and relevant context to replace it so that you can free up space in the conversation history.
Respond ONLY with the extracted context. Do not include any additional information, or text before or after the extracted context.

<messages>
Messages to summarize:
{messages}
</messages>"""  # noqa: E501
"""Default prompt used to summarize conversation history.

The `<messages>` marker (on its own line) and the `{messages}` placeholder are
part of this constant's public contract, not just cosmetic formatting.
Downstream consumers depend on them: for example, deep agents'
`SummarizationMiddleware` splices an extra instruction block in immediately
before the `<messages>` marker via `str.replace`. Removing, renaming, or
reformatting the marker (or the `{messages}` placeholder) is a breaking change
for those consumers even though it does not alter any function signature, so
treat edits to it accordingly.
"""

_DEFAULT_MESSAGES_TO_KEEP = 20
_DEFAULT_TRIM_TOKEN_LIMIT = 4000
_DEFAULT_FALLBACK_MESSAGE_COUNT = 15

# Some providers tag emitted messages with a `model_provider` string that differs from
# their LangSmith `ls_provider`. The reported-token check below compares the two, so we
# accept known aliases per `ls_provider`.
_LS_PROVIDER_ALIASES: dict[str, frozenset[str]] = {
    "amazon_bedrock": frozenset({"bedrock", "bedrock_converse"}),
}


def _provider_matches(message_provider: str, model_ls_provider: str | None) -> bool:
    if model_ls_provider is None:
        return False
    if message_provider == model_ls_provider:
        return True
    aliases = _LS_PROVIDER_ALIASES.get(model_ls_provider)
    return aliases is not None and message_provider in aliases


ContextFraction = tuple[Literal["fraction"], float]
"""Fraction of model's maximum input tokens.

Example:
    To specify 50% of the model's max input tokens:

    ```python
    ("fraction", 0.5)
    ```
"""

ContextTokens = tuple[Literal["tokens"], int]
"""Absolute number of tokens.

Example:
    To specify 3000 tokens:

    ```python
    ("tokens", 3000)
    ```
"""

ContextMessages = tuple[Literal["messages"], int]
"""Absolute number of messages.

Example:
    To specify 50 messages:

    ```python
    ("messages", 50)
    ```
"""

ContextSize = ContextFraction | ContextTokens | ContextMessages
"""Union type for context size specifications.

Can be either:

- [`ContextFraction`][langchain.agents.middleware.summarization.ContextFraction]: A
    fraction of the model's maximum input tokens.
- [`ContextTokens`][langchain.agents.middleware.summarization.ContextTokens]: An absolute
    number of tokens.
- [`ContextMessages`][langchain.agents.middleware.summarization.ContextMessages]: An
    absolute number of messages.

Depending on use with `trigger` or `keep` parameters, this type indicates either
when to trigger summarization or how much context to retain.

Example:
    ```python
    # ContextFraction
    context_size: ContextSize = ("fraction", 0.5)

    # ContextTokens
    context_size: ContextSize = ("tokens", 3000)

    # ContextMessages
    context_size: ContextSize = ("messages", 50)
    ```
"""


class TriggerClause(TypedDict, total=False):
    """Dictionary-based trigger specification for AND conditions.

    All specified thresholds in a single `TriggerClause` must be met for the clause to
    trigger summarization (AND semantics). When multiple clauses are provided in a list,
    summarization triggers if any clause is met (OR semantics).

    Example:
        ```python
        # AND: Trigger when tokens >= 4000 AND messages >= 10
        trigger_clause: TriggerClause = {"tokens": 4000, "messages": 10}

        # Use in a list for OR semantics:
        trigger_list: list[TriggerClause] = [
            {"tokens": 5000, "messages": 3},
            {"tokens": 3000, "messages": 6},
        ]
        ```
    """

    tokens: int
    """Trigger when the computed (or provider-reported) token count reaches or
    exceeds this value.
    """

    messages: int
    """Trigger when message count reaches or exceeds this value."""

    fraction: float
    """Trigger when the computed (or provider-reported) token count reaches or
    exceeds this fraction of the model's maximum input tokens.
    """


def _get_approximate_token_counter(model: BaseChatModel) -> TokenCounter:
    """Tune parameters of approximate token counter based on model type."""
    if model._llm_type.startswith("anthropic-chat"):  # noqa: SLF001
        # 3.3 was estimated in an offline experiment, comparing with Claude's token-counting
        # API: https://platform.claude.com/docs/en/build-with-claude/token-counting
        return partial(
            count_tokens_approximately, use_usage_metadata_scaling=True, chars_per_token=3.3
        )
    return partial(count_tokens_approximately, use_usage_metadata_scaling=True)


class SummarizationMiddleware(AgentMiddleware[AgentState[ResponseT], ContextT, ResponseT]):
    """Summarizes conversation history when token limits are approached.

    This middleware monitors message token counts and automatically summarizes older
    messages when a threshold is reached, preserving recent messages and maintaining
    context continuity by ensuring AI/Tool message pairs remain together.
    """

    def __init__(
        self,
        model: str | BaseChatModel,
        *,
        trigger: (ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None) = None,
        keep: ContextSize = ("messages", _DEFAULT_MESSAGES_TO_KEEP),
        token_counter: TokenCounter = count_tokens_approximately,
        summary_prompt: str = DEFAULT_SUMMARY_PROMPT,
        trim_tokens_to_summarize: int | None = _DEFAULT_TRIM_TOKEN_LIMIT,
        **deprecated_kwargs: Any,
    ) -> None:
        """Initialize summarization middleware.

        Args:
            model: The language model to use for generating summaries.
            trigger: One or more thresholds that trigger summarization.

                Provide a single
                [`ContextSize`][langchain.agents.middleware.summarization.ContextSize]
                tuple, or a single
                [`TriggerClause`][langchain.agents.middleware.summarization.TriggerClause]
                dict, or a list mixing either form.

                A `ContextSize` tuple expresses one threshold. A `TriggerClause` dict
                expresses multiple thresholds that must *all* be met (AND). When a list is
                provided, summarization runs if *any* item is met (OR).

                !!! example

                    ```python
                    # Trigger summarization when 50 messages is reached
                    ("messages", 50)

                    # Trigger summarization when 3000 tokens is reached
                    ("tokens", 3000)

                    # Trigger summarization either when 80% of model's max input tokens
                    # is reached or when 100 messages is reached (whichever comes first)
                    [("fraction", 0.8), ("messages", 100)]

                    # Trigger when tokens >= 4000 AND messages >= 10
                    {"tokens": 4000, "messages": 10}

                    # Trigger when (tokens >= 5000 AND messages >= 3) OR
                    # (tokens >= 3000 AND messages >= 6)
                    [{"tokens": 5000, "messages": 3}, {"tokens": 3000, "messages": 6}]
                    ```

                    See [`ContextSize`][langchain.agents.middleware.summarization.ContextSize]
                    for more details.
            keep: Context retention policy applied after summarization.

                Provide a [`ContextSize`][langchain.agents.middleware.summarization.ContextSize]
                tuple to specify how much history to preserve.

                Defaults to keeping the most recent `20` messages.

                Does not support multiple values like `trigger`.

                !!! example

                    ```python
                    # Keep the most recent 20 messages
                    ("messages", 20)

                    # Keep the most recent 3000 tokens
                    ("tokens", 3000)

                    # Keep the most recent 30% of the model's max input tokens
                    ("fraction", 0.3)
                    ```
            token_counter: Function to count tokens in messages.
            summary_prompt: Prompt template for generating summaries.
            trim_tokens_to_summarize: Maximum tokens to keep when preparing messages for
                the summarization call.

                Pass `None` to skip trimming entirely.
        """
        # Handle deprecated parameters
        if "max_tokens_before_summary" in deprecated_kwargs:
            value = deprecated_kwargs["max_tokens_before_summary"]
            warnings.warn(
                "max_tokens_before_summary is deprecated. Use trigger=('tokens', value) instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            if trigger is None and value is not None:
                trigger = ("tokens", value)

        if "messages_to_keep" in deprecated_kwargs:
            value = deprecated_kwargs["messages_to_keep"]
            warnings.warn(
                "messages_to_keep is deprecated. Use keep=('messages', value) instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            if keep == ("messages", _DEFAULT_MESSAGES_TO_KEEP):
                keep = ("messages", value)

        super().__init__()

        if isinstance(model, str):
            model = init_chat_model(model)

        self.model = model

        self.trigger: ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None = (
            self._copy_trigger(trigger)
        )

        # Canonical trigger representation: AND within a clause, OR across clauses.
        self._trigger_clauses = self._normalize_trigger(self.trigger)
        # Legacy compatibility view for private consumers that inspected the previous
        # tuple-normalized representation. LangChain behavior is driven by
        # `_trigger_clauses`, not this attribute. Remove in LangChain 2.0.
        self._trigger_conditions = self._legacy_trigger_conditions(self.trigger)

        self.keep = self._validate_context_size(keep, "keep")
        if token_counter is count_tokens_approximately:
            self.token_counter = _get_approximate_token_counter(self.model)
            self._partial_token_counter: TokenCounter = partial(  # type: ignore[call-arg]
                self.token_counter, use_usage_metadata_scaling=False
            )
        else:
            self.token_counter = token_counter
            self._partial_token_counter = token_counter
        self.summary_prompt = summary_prompt
        self.trim_tokens_to_summarize = trim_tokens_to_summarize

        requires_profile = any("fraction" in clause for clause in self._trigger_clauses)
        if self.keep[0] == "fraction":
            requires_profile = True
        if requires_profile and self._get_profile_limits() is None:
            msg = (
                "Model profile information is required to use fractional token limits, "
                "and is unavailable for the specified model. Please use absolute token "
                "counts instead, or pass "
                '`\n\nChatModel(..., profile={"max_input_tokens": ...})`.\n\n'
                "with a desired integer value of the model's maximum input tokens."
            )
            raise ValueError(msg)

    @override
    def before_model(
        self, state: AgentState[Any], runtime: Runtime[ContextT]
    ) -> dict[str, Any] | None:
        """Process messages before model invocation, potentially triggering summarization.

        Args:
            state: The agent state.
            runtime: The runtime environment.

        Returns:
            An updated state with summarized messages if summarization was performed.
        """
        messages = state["messages"]
        self._ensure_message_ids(messages)

        total_tokens = self.token_counter(messages)
        if not self._should_summarize(messages, total_tokens):
            return None

        cutoff_index = self._determine_cutoff_index(messages)

        if cutoff_index <= 0:
            return None

        messages_to_summarize, preserved_messages = self._partition_messages(messages, cutoff_index)

        summary = self._create_summary(messages_to_summarize)
        new_messages = self._build_new_messages(summary)

        return {
            "messages": [
                RemoveMessage(id=REMOVE_ALL_MESSAGES),
                *new_messages,
                *preserved_messages,
            ]
        }

    @override
    async def abefore_model(
        self, state: AgentState[Any], runtime: Runtime[ContextT]
    ) -> dict[str, Any] | None:
        """Process messages before model invocation, potentially triggering summarization.

        Args:
            state: The agent state.
            runtime: The runtime environment.

        Returns:
            An updated state with summarized messages if summarization was performed.
        """
        messages = state["messages"]
        self._ensure_message_ids(messages)

        total_tokens = self.token_counter(messages)
        if not self._should_summarize(messages, total_tokens):
            return None

        cutoff_index = self._determine_cutoff_index(messages)

        if cutoff_index <= 0:
            return None

        messages_to_summarize, preserved_messages = self._partition_messages(messages, cutoff_index)

        summary = await self._acreate_summary(messages_to_summarize)
        new_messages = self._build_new_messages(summary)

        return {
            "messages": [
                RemoveMessage(id=REMOVE_ALL_MESSAGES),
                *new_messages,
                *preserved_messages,
            ]
        }

    @staticmethod
    def _copy_trigger(
        trigger: ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None,
    ) -> ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None:
        """Copy mutable trigger containers so caller mutations do not affect this instance."""
        if isinstance(trigger, Mapping):
            return cast("TriggerClause", dict(trigger))
        if isinstance(trigger, list):
            return [
                cast("TriggerClause", dict(item)) if isinstance(item, Mapping) else item
                for item in trigger
            ]
        return trigger

    def _legacy_trigger_conditions(
        self,
        trigger: ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None,
    ) -> list[ContextSize]:
        """Project tuple-expressible triggers to the legacy private representation."""
        if trigger is None:
            return []
        if isinstance(trigger, tuple):
            return [self._validate_context_size(trigger, "trigger")]
        if isinstance(trigger, Mapping):
            if len(trigger) != 1:
                return []
            kind, value = next(iter(trigger.items()))
            return [self._validate_context_size(cast("ContextSize", (kind, value)), "trigger")]

        conditions: list[ContextSize] = []
        for item in trigger:
            if isinstance(item, tuple):
                conditions.append(self._validate_context_size(item, "trigger"))
            elif isinstance(item, Mapping) and len(item) == 1:
                kind, value = next(iter(item.items()))
                conditions.append(
                    self._validate_context_size(cast("ContextSize", (kind, value)), "trigger")
                )
        return conditions

    def _normalize_trigger(
        self,
        trigger: (ContextSize | TriggerClause | list[ContextSize | TriggerClause] | None),
    ) -> list[TriggerClause]:
        """Normalize supported trigger inputs into list of Trigger clauses.

        - tuple ("tokens", 3000) -> [{"tokens": 3000}]
        - dict {"tokens": 4000, "messages": 10} -> [{"tokens": 4000, "messages": 10}]
        - list of either -> OR across items
        """
        if trigger is None:
            return []

        def _validate_and_convert_tuple(t: ContextSize) -> TriggerClause:
            kind, value = self._validate_context_size(t, "trigger")
            return cast("TriggerClause", {kind: value})

        def _validate_mapping(m: Mapping[str, Any]) -> TriggerClause:
            """Validate and convert a mapping to a TriggerClause.

            Type checks reject silent coercion (booleans, numeric strings, and
            fractional floats for integer metrics) so a misconfigured clause fails loudly
            at construction. Range and positivity checks are delegated to
            `_validate_context_size`, keeping a single source of truth for the rules and
            error messages shared with the tuple form.
            """
            if not m:
                msg = "trigger clause must specify at least one of 'tokens', 'messages', 'fraction'"
                raise ValueError(msg)
            out: dict[str, float | int] = {}
            for k, v in m.items():
                if k not in {"tokens", "messages", "fraction"}:
                    msg = f"Unsupported trigger metric: {k!r}"
                    raise ValueError(msg)
                # `bool` is an `int` subclass; reject it so `{"messages": True}` cannot
                # silently become a threshold of 1. Raise `ValueError` (not `TypeError`)
                # so every trigger-config error stays one catchable type.
                if isinstance(v, bool):
                    msg = f"{k} trigger value must be numeric, got {v!r}"
                    raise ValueError(msg)  # noqa: TRY004
                if k == "fraction":
                    if not isinstance(v, (int, float)):
                        msg = f"Fraction trigger values must be numeric, got {v!r}"
                        raise ValueError(msg)
                elif not isinstance(v, int):
                    # Reject floats and numeric strings rather than truncating/coercing.
                    msg = f"{k} trigger values must be integers, got {v!r}"
                    raise ValueError(msg)
                # Delegate range/positivity validation so dict and tuple forms share
                # identical rules and error messages.
                self._validate_context_size(cast("ContextSize", (k, v)), "trigger")
                out[k] = v
            return cast("TriggerClause", out)

        clauses: list[TriggerClause] = []
        # `trigger` may originate from untyped callers, so dispatch on the runtime type
        # and raise on anything unsupported.
        subject: Any = trigger
        if isinstance(subject, Mapping):
            clauses.append(_validate_mapping(subject))
        elif isinstance(subject, tuple):
            clauses.append(_validate_and_convert_tuple(cast("ContextSize", subject)))
        elif isinstance(subject, list):
            for item in subject:
                if isinstance(item, Mapping):
                    clauses.append(_validate_mapping(item))
                elif isinstance(item, tuple):
                    clauses.append(_validate_and_convert_tuple(cast("ContextSize", item)))
                else:
                    msg = f"Unsupported trigger item type: {type(item)}"
                    raise TypeError(msg)
        else:
            msg = f"Unsupported trigger type: {type(subject)}"
            raise TypeError(msg)
        return clauses

    def _should_summarize_based_on_reported_tokens(
        self, messages: list[AnyMessage], threshold: float
    ) -> bool:
        """Check if reported token usage from last AIMessage exceeds threshold."""
        last_ai_message = next(
            (msg for msg in reversed(messages) if isinstance(msg, AIMessage)),
            None,
        )
        if (  # noqa: SIM103
            isinstance(last_ai_message, AIMessage)
            and last_ai_message.usage_metadata is not None
            and (reported_tokens := last_ai_message.usage_metadata.get("total_tokens", -1))
            and reported_tokens >= threshold
            and (message_provider := last_ai_message.response_metadata.get("model_provider"))
            and _provider_matches(
                message_provider,
                self.model._get_ls_params().get("ls_provider"),  # noqa: SLF001
            )
        ):
            return True
        return False

    def _should_summarize(self, messages: list[AnyMessage], total_tokens: int) -> bool:
        """Determine whether summarization should run for the current token usage."""
        if not self._trigger_clauses:
            return False

        for clause in self._trigger_clauses:
            clause_met = True
            for kind, value in clause.items():
                if kind == "messages" and len(messages) < cast("int", value):
                    clause_met = False
                    break
                if kind == "tokens":
                    threshold_tokens = cast("int", value)
                    # Trigger if total tokens exceed threshold OR reported tokens do
                    if (
                        total_tokens < threshold_tokens
                        and not self._should_summarize_based_on_reported_tokens(
                            messages, float(threshold_tokens)
                        )
                    ):
                        clause_met = False
                        break
                if kind == "fraction":
                    max_input_tokens = self._get_profile_limits()
                    if max_input_tokens is None:
                        clause_met = False
                        break
                    threshold = int(max_input_tokens * cast("float", value))
                    if threshold <= 0:
                        threshold = 1
                    if (
                        total_tokens < threshold
                        and not self._should_summarize_based_on_reported_tokens(
                            messages, float(threshold)
                        )
                    ):
                        clause_met = False
                        break
            if clause_met:
                return True
        return False

    def _determine_cutoff_index(self, messages: list[AnyMessage]) -> int:
        """Choose cutoff index respecting retention configuration."""
        kind, value = self.keep
        if kind in {"tokens", "fraction"}:
            token_based_cutoff = self._find_token_based_cutoff(messages)
            if token_based_cutoff is not None:
                return token_based_cutoff
            # None cutoff -> model profile data not available (caught in __init__ but
            # here for safety), fallback to message count
            return self._find_safe_cutoff(messages, _DEFAULT_MESSAGES_TO_KEEP)
        return self._find_safe_cutoff(messages, cast("int", value))

    def _find_token_based_cutoff(self, messages: list[AnyMessage]) -> int | None:
        """Find cutoff index based on target token retention."""
        if not messages:
            return 0

        kind, value = self.keep
        if kind == "fraction":
            max_input_tokens = self._get_profile_limits()
            if max_input_tokens is None:
                return None
            target_token_count = int(max_input_tokens * value)
        elif kind == "tokens":
            target_token_count = int(value)
        else:
            return None

        if target_token_count <= 0:
            target_token_count = 1

        if self.token_counter(messages) <= target_token_count:
            return 0

        # Use binary search to identify the earliest message index that keeps the
        # suffix within the token budget.
        left, right = 0, len(messages)
        cutoff_candidate = len(messages)
        max_iterations = len(messages).bit_length() + 1
        for _ in range(max_iterations):
            if left >= right:
                break

            mid = (left + right) // 2
            if self._partial_token_counter(messages[mid:]) <= target_token_count:
                cutoff_candidate = mid
                right = mid
            else:
                left = mid + 1

        if cutoff_candidate == len(messages):
            cutoff_candidate = left

        if cutoff_candidate >= len(messages):
            if len(messages) == 1:
                return 0
            cutoff_candidate = len(messages) - 1

        # Advance past any ToolMessages to avoid splitting AI/Tool pairs
        return self._find_safe_cutoff_point(messages, cutoff_candidate)

    def _get_profile_limits(self) -> int | None:
        """Retrieve max input token limit from the model profile."""
        try:
            profile = self.model.profile
        except AttributeError:
            return None

        if not isinstance(profile, Mapping):
            return None

        max_input_tokens = profile.get("max_input_tokens")

        if not isinstance(max_input_tokens, int):
            return None

        return max_input_tokens

    @staticmethod
    def _validate_context_size(context: ContextSize, parameter_name: str) -> ContextSize:
        """Validate context configuration tuples."""
        kind, value = context
        if kind == "fraction":
            if not 0 < value <= 1:
                msg = f"Fractional {parameter_name} values must be between 0 and 1, got {value}."
                raise ValueError(msg)
        elif kind in {"tokens", "messages"}:
            if value <= 0:
                msg = f"{parameter_name} thresholds must be greater than 0, got {value}."
                raise ValueError(msg)
        else:
            msg = f"Unsupported context size type {kind} for {parameter_name}."
            raise ValueError(msg)
        return context

    @staticmethod
    def _build_new_messages(summary: str) -> list[HumanMessage]:
        return [
            HumanMessage(
                content=f"Here is a summary of the conversation to date:\n\n{summary}",
                additional_kwargs={"lc_source": "summarization"},
            )
        ]

    @staticmethod
    def _ensure_message_ids(messages: list[AnyMessage]) -> None:
        """Ensure all messages have unique IDs for the add_messages reducer."""
        for msg in messages:
            if msg.id is None:
                msg.id = str(uuid.uuid4())

    @staticmethod
    def _partition_messages(
        conversation_messages: list[AnyMessage],
        cutoff_index: int,
    ) -> tuple[list[AnyMessage], list[AnyMessage]]:
        """Partition messages into those to summarize and those to preserve."""
        messages_to_summarize = conversation_messages[:cutoff_index]
        preserved_messages = conversation_messages[cutoff_index:]

        return messages_to_summarize, preserved_messages

    def _find_safe_cutoff(self, messages: list[AnyMessage], messages_to_keep: int) -> int:
        """Find safe cutoff point that preserves AI/Tool message pairs.

        Returns the index where messages can be safely cut without separating
        related AI and Tool messages. Returns `0` if no safe cutoff is found.

        This is aggressive with summarization - if the target cutoff lands in the
        middle of tool messages, we advance past all of them (summarizing more).
        """
        if len(messages) <= messages_to_keep:
            return 0

        target_cutoff = len(messages) - messages_to_keep
        return self._find_safe_cutoff_point(messages, target_cutoff)

    @staticmethod
    def _find_safe_cutoff_point(messages: list[AnyMessage], cutoff_index: int) -> int:
        """Find a safe cutoff point that doesn't split AI/Tool message pairs.

        If the message at `cutoff_index` is a `ToolMessage`, search backward for the
        `AIMessage` containing the corresponding `tool_calls` and adjust the cutoff to
        include it. This ensures tool call requests and responses stay together.

        Falls back to advancing forward past `ToolMessage` objects only if no matching
        `AIMessage` is found (edge case).
        """
        if cutoff_index >= len(messages) or not isinstance(messages[cutoff_index], ToolMessage):
            return cutoff_index

        # Collect tool_call_ids from consecutive ToolMessages at/after cutoff
        tool_call_ids: set[str] = set()
        idx = cutoff_index
        while idx < len(messages) and isinstance(messages[idx], ToolMessage):
            tool_msg = cast("ToolMessage", messages[idx])
            if tool_msg.tool_call_id:
                tool_call_ids.add(tool_msg.tool_call_id)
            idx += 1

        # Search backward for AIMessage with matching tool_calls
        for i in range(cutoff_index - 1, -1, -1):
            msg = messages[i]
            if isinstance(msg, AIMessage) and msg.tool_calls:
                ai_tool_call_ids = {tc.get("id") for tc in msg.tool_calls if tc.get("id")}
                if tool_call_ids & ai_tool_call_ids:
                    # Found the AIMessage - move cutoff to include it
                    return i

        # Fallback: no matching AIMessage found, advance past ToolMessages to avoid
        # orphaned tool responses
        return idx

    def _create_summary(self, messages_to_summarize: list[AnyMessage]) -> str:
        """Generate summary for the given messages.

        Args:
            messages_to_summarize: Messages to summarize.
        """
        if not messages_to_summarize:
            return "No previous conversation history."

        trimmed_messages = self._trim_messages_for_summary(messages_to_summarize)
        if not trimmed_messages:
            return "Previous conversation was too long to summarize."

        # Serialize as XML so URL-based multimodal blocks remain visible in the summary
        
# prompt while excluding raw message metadata from the token budget.813        formatted_messages = get_buffer_string(trimmed_messages, format="xml")814815        try:816            response = self.model.invoke(817                self.summary_prompt.format(messages=formatted_messages).rstrip(),818                config={"metadata": {"lc_source": "summarization"}},819            )820            return response.text.strip()821        except Exception as e:822            return f"Error generating summary: {e!s}"823824    async def _acreate_summary(self, messages_to_summarize: list[AnyMessage]) -> str:825        """Generate summary for the given messages.826827        Args:828            messages_to_summarize: Messages to summarize.829        """830        if not messages_to_summarize:831            return "No previous conversation history."832833        trimmed_messages = self._trim_messages_for_summary(messages_to_summarize)834        if not trimmed_messages:835            return "Previous conversation was too long to summarize."836837        # Serialize as XML so URL-based multimodal blocks remain visible in the summary838        # prompt while excluding raw message metadata from the token budget.839        formatted_messages = get_buffer_string(trimmed_messages, format="xml")840841        try:842            response = await self.model.ainvoke(843                self.summary_prompt.format(messages=formatted_messages).rstrip(),844                config={"metadata": {"lc_source": "summarization"}},845            )846            return response.text.strip()847        except Exception as e:848            return f"Error generating summary: {e!s}"849850    def _trim_messages_for_summary(self, messages: list[AnyMessage]) -> list[AnyMessage]:851        """Trim messages to fit within summary generation limits."""852        try:853            if self.trim_tokens_to_summarize is None:854                return messages855            return cast(856                
"list[AnyMessage]",857                trim_messages(858                    messages,859                    max_tokens=self.trim_tokens_to_summarize,860                    token_counter=self.token_counter,861                    start_on="human",862                    strategy="last",863                    allow_partial=True,864                    include_system=True,865                ),866            )867        except Exception:868            return messages[-_DEFAULT_FALLBACK_MESSAGE_COUNT:]
Code quality findings 26

Overuse may indicate design issues; consider polymorphism
L328
isinstance-overuse
if isinstance(model, str):
Ensure functions have docstrings for documentation
L370
missing-docstring
def before_model(
Ensure functions have docstrings for documentation
L408
missing-docstring
async def abefore_model(
Overuse may indicate design issues; consider polymorphism
L450
isinstance-overuse
if isinstance(trigger, Mapping):
Overuse may indicate design issues; consider polymorphism
L452
isinstance-overuse
if isinstance(trigger, list):
Overuse may indicate design issues; consider polymorphism
L454
isinstance-overuse
cast("TriggerClause", dict(item)) if isinstance(item, Mapping) else item
Overuse may indicate design issues; consider polymorphism
L466
isinstance-overuse
if isinstance(trigger, tuple):
Overuse may indicate design issues; consider polymorphism
L468
isinstance-overuse
if isinstance(trigger, Mapping):
Overuse may indicate design issues; consider polymorphism
L476
isinstance-overuse
if isinstance(item, tuple):
Overuse may indicate design issues; consider polymorphism
L478
isinstance-overuse
elif isinstance(item, Mapping) and len(item) == 1:
Overuse may indicate design issues; consider polymorphism
L522
isinstance-overuse
if isinstance(v, bool):
Overuse may indicate design issues; consider polymorphism
L526
isinstance-overuse
if not isinstance(v, (int, float)):
Overuse may indicate design issues; consider polymorphism
L529
isinstance-overuse
elif not isinstance(v, int):
Overuse may indicate design issues; consider polymorphism
L543
isinstance-overuse
if isinstance(subject, Mapping):
Overuse may indicate design issues; consider polymorphism
L545
isinstance-overuse
elif isinstance(subject, tuple):
Overuse may indicate design issues; consider polymorphism
L547
isinstance-overuse
elif isinstance(subject, list):
Overuse may indicate design issues; consider polymorphism
L549
isinstance-overuse
if isinstance(item, Mapping):
Overuse may indicate design issues; consider polymorphism
L551
isinstance-overuse
elif isinstance(item, tuple):
Overuse may indicate design issues; consider polymorphism
L566
isinstance-overuse
(msg for msg in reversed(messages) if isinstance(msg, AIMessage)),
Overuse may indicate design issues; consider polymorphism
L570
isinstance-overuse
isinstance(last_ai_message, AIMessage)
Overuse may indicate design issues; consider polymorphism
L693
isinstance-overuse
if not isinstance(profile, Mapping):
Overuse may indicate design issues; consider polymorphism
L698
isinstance-overuse
if not isinstance(max_input_tokens, int):
Overuse may indicate design issues; consider polymorphism
L773
isinstance-overuse
if cutoff_index >= len(messages) or not isinstance(messages[cutoff_index], ToolMessage):
Overuse may indicate design issues; consider polymorphism
L779
isinstance-overuse
while idx < len(messages) and isinstance(messages[idx], ToolMessage):
Overuse may indicate design issues; consider polymorphism
L788
isinstance-overuse
if isinstance(msg, AIMessage) and msg.tool_calls:
Catch specific exceptions instead of Exception to avoid masking bugs
L867
broad-except
except Exception:
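
The broad-except finding at L867 concerns the fallback in `_trim_messages_for_summary`. A minimal sketch of the narrower pattern follows. This file does not state which exceptions `trim_messages` can raise, so the `ValueError`/`TypeError` pair, the `trim_to_budget` helper and the `FALLBACK_MESSAGE_COUNT` value are assumptions made only for illustration.

```python
import logging

logger = logging.getLogger(__name__)

FALLBACK_MESSAGE_COUNT = 15  # hypothetical value, chosen only for this sketch


def trim_to_budget(messages: list[str], max_tokens: int) -> list[str]:
    """Hypothetical trimmer: keep the newest messages that fit the budget."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    kept: list[str] = []
    used = 0
    for text in reversed(messages):
        cost = len(text.split())  # crude whitespace token count
        if used + cost > max_tokens:
            break
        kept.append(text)
        used += cost
    return kept[::-1]


def trim_with_fallback(messages: list[str], max_tokens: int) -> list[str]:
    try:
        return trim_to_budget(messages, max_tokens)
    except (ValueError, TypeError) as exc:
        # Only the failures we expect trigger the fallback; anything else
        # propagates so that genuine bugs stay visible.
        logger.warning("Trimming failed (%s); keeping the last messages", exc)
        return messages[-FALLBACK_MESSAGE_COUNT:]


msgs = ["one two", "three four five", "six"]
print(trim_with_fallback(msgs, 4))  # prints: ['three four five', 'six']
print(trim_with_fallback(msgs, 0))  # prints: ['one two', 'three four five', 'six']
```

The trade-off is that an unexpected exception type now surfaces to the caller instead of silently degrading to the last few messages. Whether that is acceptable depends on how much the middleware should favor availability over visibility of bugs.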