CLAUDE.md · langchain-ai/langchain

# Global development guidelines for the LangChain monorepo

This document provides context to understand the LangChain Python project and assist with development.

## Project architecture and context

### Monorepo structure

This is a Python monorepo with multiple independently versioned packages that use `uv`.

```txt
langchain/
├── libs/
│   ├── core/             # `langchain-core` primitives and base abstractions
│   ├── langchain/        # `langchain-classic` (legacy, no new features)
│   ├── langchain_v1/     # Actively maintained `langchain` package
│   ├── partners/         # Third-party integrations
│   │   ├── openai/       # OpenAI models and embeddings
│   │   ├── anthropic/    # Anthropic (Claude) integration
│   │   ├── ollama/       # Local model support
│   │   └── ... (other integrations maintained by the LangChain team)
│   ├── text-splitters/   # Document chunking utilities
│   ├── standard-tests/   # Shared test suite for integrations
│   ├── model-profiles/   # Model configuration profiles
├── .github/              # CI/CD workflows and templates
├── .vscode/              # VSCode IDE standard settings and recommended extensions
└── README.md             # Information about LangChain
```

- **Core layer** (`langchain-core`): Base abstractions, interfaces, and protocols. Users should not need to know about this layer directly.
- **Implementation layer** (`langchain`): Concrete implementations and high-level public utilities
- **Integration layer** (`partners/`): Third-party service integrations. Note that this monorepo is not exhaustive of all LangChain integrations; some are maintained in separate repos, such as `langchain-ai/langchain-google` and `langchain-ai/langchain-aws`.
  Usually these repos are cloned at the same level as this monorepo, so if needed, you can refer to their code directly by navigating to `../langchain-google/` from this monorepo.
- **Testing layer** (`standard-tests/`): Standardized integration tests for partner integrations

### Development tools & commands

- `uv` – Fast Python package installer and resolver (replaces pip/poetry)
- `make` – Task runner for common development commands. Feel free to look at the `Makefile` for available commands and usage patterns.
- `ruff` – Fast Python linter and formatter
- `mypy` – Static type checking
- `pytest` – Testing framework

This monorepo uses `uv` for dependency management. Local development uses editable installs: `[tool.uv.sources]`

Each package in `libs/` has its own `pyproject.toml` and `uv.lock`.

Before running your tests, set up all packages by running:

```bash
# For all groups
uv sync --all-groups

# or, to install a specific group only:
uv sync --group test
```

```bash
# Run unit tests (no network)
make test

# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```

```bash
# Lint code
make lint

# Format code
make format

# Type checking
uv run --group lint mypy .
```

#### Environment and dependency management

Use `uv` for all environment and dependency operations in this monorepo. Do not invoke `pip`, `poetry`, or `conda` directly.

- Let `uv` manage the interpreter and virtual environments — `uv sync` and `uv run` operate without manual `source .venv/bin/activate`. Do not create ad-hoc virtual environments outside the package directory.
- Each package targets its own supported Python range via its `pyproject.toml`; do not pin a global Python version.
  If you need an interpreter explicitly, defer to the package's `requires-python` rather than assuming system Python.
- Install dependencies explicitly through `uv sync` (optionally `--group <name>` / `--all-groups`); never let them install implicitly.
- Don't mix environments within a session, and don't add new dependencies unless strictly required — when you do, justify them (recent releases/commits, adoption).

#### Key config files

- `pyproject.toml`: Main workspace configuration with dependency groups
- `uv.lock`: Locked dependencies for reproducible builds
- `Makefile`: Development tasks

#### PR and commit titles

Follow Conventional Commits. See `.github/workflows/pr_lint.yml` for allowed types and scopes. All titles must include a scope with no exceptions — even for the main `langchain` package.

- Start the text after `type(scope):` with a lowercase letter, unless the first word is a proper noun (e.g. `Azure`, `GitHub`, `OpenAI`) or a named entity (class, function, method, parameter, or variable name).
- Wrap named entities in backticks so they render as code. Proper nouns are left unadorned.
- Keep titles short and descriptive — save detail for the body.

Examples:

```txt
feat(langchain): add new chat completion feature
fix(core): resolve type hinting issue in vector store
chore(anthropic): update infrastructure dependencies
feat(langchain): `ls_agent_type` tag on `create_agent` calls
fix(openai): infer Azure chat profiles from model name
```

#### Branch naming

Branches should be prefixed `<github-username>/<scope>/<short-description>`:

- `<github-username>` — the author's GitHub login (e.g.
  `mdrxy`).
- `<scope>` — the same scope used in the Conventional Commit title (`core`, `langchain`, partner name, `infra`, `docs`, etc.).
- `<short-description>` — kebab-case, brief, no trailing slash.

Examples:

```txt
mdrxy/anthropic/normalize-tool-call-ids
mdrxy/core/vector-store-type-hints
mdrxy/infra/agents-md-branch
```

#### PR descriptions

The description *is* the summary — do not add a `# Summary` header.

- When the PR closes an issue, lead with the closing keyword on its own line at the very top, followed by a horizontal rule and then the body:

  ```txt
  Closes #123

  ---

  <rest of description>
  ```

  Only `Closes`, `Fixes`, and `Resolves` auto-close the referenced issue on merge. `Related:` or similar labels are informational and do not close anything.

- Explain the *why*: who benefits, what problem they had, and how this solves it. Prefer a simple user story over a long summary.
- Write for readers who may be unfamiliar with this area of the codebase. Avoid insider shorthand and prefer language that is friendly to public viewers — this aids interpretability.
- Do **not** cite line numbers; they go stale as soon as the file changes.
- Rarely include full file paths or filenames. Reference the affected symbol, class, or subsystem by name instead.
- Wrap class, function, method, parameter, and variable names in backticks.
- For net new features or behavior-changing bugfixes, PR descriptions should include a `## Release note` section that states the user-visible change in release-note-ready language.
- Skip dedicated "Test plan" or "Testing" sections in most cases.
  Mention tests only when coverage is non-obvious, risky, or otherwise notable.
- Call out areas of the change that require careful review.
- Add a brief disclaimer noting AI-agent involvement in the contribution.

## Core development principles

### Maintain stable public interfaces

CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.
You should warn the developer about any function signature changes, regardless of whether they look breaking or not.

**Before making ANY changes to public APIs:**

- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)

Ask: "Would this change break someone's code if they used it last week?"

### Code quality standards

All Python code MUST include type hints and return types.

```python title="Example"
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
    """Single line description of the function.

    Any additional context about the function can go here.

    Args:
        users: List of user identifiers to filter.
        known_users: Set of known/valid user identifiers.

    Returns:
        List of users that are not in the `known_users` set.
    """
```

- Use descriptive, self-explanatory variable names.
- Follow existing patterns in the codebase you're modifying
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense

### Testing requirements

Every new feature or bugfix MUST be covered by unit tests.

- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests:
  `tests/integration_tests/` (network calls permitted)
- We use `pytest` as the testing framework; if in doubt, check other existing tests for examples.
- The testing file structure should mirror the source code structure.

**Checklist:**

- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)

### Security and risk assessment

- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Handle exceptions properly (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Check for race conditions and resource leaks (file handles, sockets, threads)
- Ensure proper resource cleanup (file handles, connections)

### Documentation standards

Use Google-style docstrings with Args section for all public functions.

```python title="Example"
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
    """Send an email to a recipient with specified priority.

    Any additional context about the function can go here.

    Args:
        to: The email address of the recipient.
        msg: The message body to send.
        priority: Email priority level.

    Returns:
        `True` if email was sent successfully, `False` otherwise.

    Raises:
        InvalidEmailError: If the email address format is invalid.
        SMTPConnectionError: If unable to connect to email server.
    """
```

- Types go in function signatures, NOT in docstrings
  - If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions
  concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
- Do NOT use Sphinx-style double backtick formatting (` ``code`` `). Use single backticks (`` `code` ``) for inline code references in docstrings and comments.

#### Model references in docs and examples

Always use the latest generally available (GA) models when referencing LLMs in docstrings and illustrative code snippets. Avoid preview or beta identifiers unless the model has no GA equivalent. Outdated model names signal stale code and confuse users.

Before writing or updating model references, verify current model IDs against the provider's official docs. Do not rely on memorized or cached model names — they go stale quickly.

Changing **shipped default parameter values** in code (e.g., a `model=` kwarg default in a class constructor) may constitute a breaking change — see "Maintain stable public interfaces" above. This guidance applies to documentation and examples, not code defaults.

For model *profile data* (capability flags, context windows), use the `langchain-profiles` CLI described below.

## Model profiles

Model profiles are generated using the `langchain-profiles` CLI in `libs/model-profiles`.
The `--data-dir` must point to the directory containing `profile_augmentations.toml`, not the top-level package directory.

```bash
# Run from libs/model-profiles
cd libs/model-profiles

# Refresh profiles for a partner in this repo
uv run langchain-profiles refresh --provider openai --data-dir ../partners/openai/langchain_openai/data

# Refresh profiles for a partner in an external repo (requires echo y to confirm)
echo y | uv run langchain-profiles refresh --provider google --data-dir /path/to/langchain-google/libs/genai/langchain_google_genai/data
```

Example partners with profiles in this repo:

- `libs/partners/openai/langchain_openai/data/` (provider: `openai`)
- `libs/partners/anthropic/langchain_anthropic/data/` (provider: `anthropic`)
- `libs/partners/perplexity/langchain_perplexity/data/` (provider: `perplexity`)

The `echo y |` pipe is required when `--data-dir` is outside the `libs/model-profiles` working directory.

## CI/CD infrastructure

### Release process

Each partner package is released independently. The full flow is:

1. **Version bump PR.** Create a PR that bumps three files by one line each:
   - `langchain_<partner>/_version.py` — `__version__`
   - `pyproject.toml` — `version`
   - `uv.lock` — run `uv lock` from the package directory. If the diff includes unrelated changes (e.g. environment-dependent marker lines from a different local Python version), revert them and keep only the `version = "..."` line for the package being released

   Title follows Conventional Commits: `release(<partner>): <version>` (e.g. `release(openrouter): 0.2.6`). Use the branch name `release/<partner>-<version>`.

   Patch vs. minor bump follows in-repo precedent: within a `0.x` series, fixes and additive features get a patch bump (e.g. `session_id` field → 0.2.1→0.2.2, `parallel_tool_calls` → 0.2.3→0.2.4).

2. **Merge the PR** to `master`.

3. **Trigger the release workflow.** Run `gh workflow run` against the "🚀 Package Release" workflow (`_release.yml`, file ID `63880841`):

   ```bash
   gh workflow run 63880841 --repo langchain-ai/langchain \
     -f working-directory=<partner> -f release-version=<version>
   ```

   `working-directory` is the short partner name from the workflow's dropdown (e.g. `openrouter`, not `libs/partners/openrouter`).

4. **The workflow handles everything else automatically** — do **not** create a GitHub release or tag manually. The `mark-release` job (using `ncipollo/release-action`) creates the GitHub release, tag, and release notes after PyPI publish succeeds. The release notes body is auto-generated from commit history between the previous tag and HEAD.

   Monitor the run:

   ```bash
   gh run view <run-id> --repo langchain-ai/langchain
   ```

   The full job chain is: build → release-notes → pre-release-checks → TestPyPI publish → PyPI publish → tag GitHub release.

### PR labeling and linting

**Title linting** (`.github/workflows/pr_lint.yml`)

**Auto-labeling:**

- `.github/workflows/pr_labeler.yml` – Unified PR labeler (size, file, title, external/internal, contributor tier)
- `.github/workflows/pr_labeler_backfill.yml` – Manual backfill of PR labels on open PRs
- `.github/workflows/auto-label-by-package.yml` – Issue labeling by package
- `.github/workflows/tag-external-issues.yml` – Issue external/internal classification

### Integration test tracing (LangSmith)

Scheduled and manually dispatched integration tests (`integration_tests.yml`) trace every run to LangSmith so failures link back to the originating Actions run.
(`_release.yml` runs integration tests too, but does not currently configure LangSmith tracing.)

**Env vars set by CI:**

- `LANGSMITH_API_KEY` — authenticates to LangSmith (repo secret, scoped to the "Scheduled testing" GitHub environment in `integration_tests.yml`).
- `LANGSMITH_TRACING: "true"` — enables tracing for the test process.
- `LANGSMITH_PROJECT` — the project traces are sent to. Defaults to `scheduled-testing-py` via a repo variable override: `${{ vars.LANGSMITH_PROJECT || 'scheduled-testing-py' }}`. To change the project, set the `LANGSMITH_PROJECT` repository variable in GitHub settings — do not hardcode it in the workflow.
- `LANGSMITH_TAGS` — comma-separated tags identifying the run: `github-actions`, the matrix working directory (e.g. `libs/partners/openai`), the Python version, and the commit SHA.
- `LANGSMITH_METADATA` — a JSON object built by the "Build LangSmith Metadata" step, containing `github_sha`, `github_run_id`, `github_run_attempt`, `github_run_url`, `github_workflow`, `github_event`, `github_ref`, `working_directory`, and `python_version`.

**The tracing bridge plugin:** The LangSmith SDK does not natively read `LANGSMITH_TAGS` or `LANGSMITH_METADATA` from the environment. The pytest plugin at `libs/standard-tests/langchain_tests/_langsmith_plugin.py` bridges that gap by entering `langsmith.run_helpers.tracing_context` for the duration of the test session. It only activates when `GITHUB_ACTIONS=true`, so local development is unaffected. It is auto-discovered via the `pytest11` entry point in any package that depends on `langchain-tests`.

**Unit test isolation:** Unit tests must never make network calls or send traces. The `make test` target in the `libs/core` Makefile uses `env -u` to unset the tracing vars (`LANGCHAIN_TRACING_V2`, `LANGCHAIN_API_KEY`, `LANGSMITH_API_KEY`, `LANGSMITH_TRACING`, `LANGCHAIN_PROJECT`) before running pytest.
Additionally, `libs/core/tests/unit_tests/runnables/conftest.py` has a session-scoped autouse fixture that explicitly disables tracing for runnable unit tests, restoring the original environment afterward.

### Adding a new partner to CI

When adding a new partner package, update these files:

- `.github/ISSUE_TEMPLATE/*.yml` – Add to package dropdown
- `.github/dependabot.yml` – Add dependency update entry
- `.github/scripts/pr-labeler-config.json` – Add file rule and scope-to-label mapping
- `.github/workflows/_release.yml` – Add API key secrets if needed
- `.github/workflows/auto-label-by-package.yml` – Add package label
- `.github/workflows/check_diffs.yml` – Add to change detection
- `.github/workflows/integration_tests.yml` – Add integration test config
- `.github/workflows/pr_lint.yml` – Add to allowed scopes

## GitHub Actions & Workflows

This repository requires actions to be pinned to a full-length commit SHA. Attempting to use a tag will fail. Use the `gh` CLI to look up the commit SHA behind a tag. Verify that the tag does not resolve to an annotated tag object, which would need dereferencing to reach the underlying commit.

## Additional resources

- **Documentation:** https://docs.langchain.com/oss/python/langchain/overview and source at https://github.com/langchain-ai/docs or `../docs/`. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in `.mcp.json` for programmatic access.
- **Contributing Guide:** [Contributing Guide](https://docs.langchain.com/oss/python/contributing/overview)
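
The SHA-pinning rule under "GitHub Actions & Workflows" can be checked mechanically before pushing a workflow change. The following is a minimal sketch and not part of this repository's tooling: the helper name `find_unpinned_actions`, the regular expressions, and the example SHA are all hypothetical, and the only assumption about Git is that a full-length commit SHA is 40 hexadecimal characters.

```python
import re

# Assumption: a full-length Git commit SHA is exactly 40 hex characters.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")

# Matches workflow lines such as `uses: owner/repo@ref` or `- uses: owner/repo@ref`.
# Local actions (`uses: ./path`) contain no `@` and are therefore skipped.
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*(?P<action>[^@\s#]+)@(?P<ref>[^\s#]+)")


def find_unpinned_actions(workflow_text: str) -> list[str]:
    """Return `action@ref` strings whose ref is not a full-length commit SHA."""
    unpinned: list[str] = []
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        if match and not FULL_SHA.match(match["ref"]):
            unpinned.append(f"{match['action']}@{match['ref']}")
    return unpinned


# Hypothetical workflow fragment; the SHA below is a placeholder, not a real commit.
example = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567  # v5
"""
print(find_unpinned_actions(example))  # ['actions/checkout@v4']
```

This only checks the shape of the reference. It does not confirm that the SHA exists upstream or that it points at a commit rather than an annotated tag object, so that check still has to be done separately.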