# Debugging support in the Rust compiler

This document explains the state of debugging tools support in the Rust compiler (rustc).
It gives an overview of GDB, LLDB, and WinDbg/CDB,
as well as the infrastructure around the Rust compiler for debugging Rust code.
If you want to learn how to debug the Rust compiler itself,
see [Debugging the Compiler].

The material is gathered from the video,
[Tom Tromey discusses debugging support in rustc].

## Preliminaries

### Debuggers

According to Wikipedia:

> A [debugger or debugging tool] is a computer program that is used to test and debug
> other programs (the "target" program).

Writing a debugger from scratch for a language requires a lot of work, especially if
debuggers have to be supported on various platforms.
GDB and LLDB, however, can be extended to support debugging a language.
This is the path that Rust has chosen.
This document's main goal is to document the support for these debuggers in the Rust compiler.

### DWARF

According to the [DWARF] standard website:

> DWARF is a debugging file format used by many compilers and debuggers to support source level
> debugging.
> It addresses the requirements of a number of procedural languages,
> such as C, C++, and Fortran, and is designed to be extensible to other languages.
> DWARF is architecture independent and applicable to any processor or operating system.
> It is widely used on Unix, Linux and other operating systems,
> as well as in stand-alone environments.

A DWARF reader is a program that consumes the DWARF format and creates debugger-compatible output.
This program may live in the compiler itself.
DWARF uses a data structure called the
Debugging Information Entry (DIE), which stores the information as "tags" that denote functions,
variables, and so on, e.g., `DW_TAG_variable`, `DW_TAG_pointer_type`, `DW_TAG_subprogram`.
You can also invent your own tags and attributes.

### CodeView/PDB

[PDB] (Program Database) is a file format created by Microsoft that contains debug information.
PDBs can be consumed by debuggers such as WinDbg/CDB and other tools to display debug information.
A PDB contains multiple streams that describe debug information about a specific binary, such
as types, symbols, and source files used to compile the given binary.
CodeView is another
format, which defines the structure of the [symbol records] and [type records] that appear within
PDB streams.

## Supported debuggers

### GDB

#### Rust expression parser

To be able to show debug output, we need an expression parser.
This (GDB) expression parser is written in [Bison],
and can parse only a subset of Rust expressions.
The GDB parser was written from scratch and has no relation to any other parser,
including that of rustc.

GDB has Rust-like value and type output.
It can print values and types in a way that looks like Rust syntax.
Likewise, when you print a type with [ptype] in GDB,
the output looks like Rust source code.
Check out the documentation in the [manual for GDB/Rust].

#### Parser extensions

The expression parser has a couple of extensions in it to facilitate features that
you cannot do
with Rust.
Some limitations are listed in the [manual for GDB/Rust].
There is some special code in the DWARF reader in GDB to support the extensions.

A couple of examples of the DWARF reader support needed are as follows:

1. Enum: Needed to support enum types.
   The Rust compiler writes the information about an enum into DWARF,
   and GDB reads the DWARF to understand where the tag field is,
   whether there is a tag field at all,
   or whether the tag slot is shared with the non-zero optimization, etc.

2. Dissect trait objects: A DWARF extension where the trait object's description in the DWARF
   also points to a stub description of the corresponding vtable, which in turn points to the
   concrete type for which this trait object exists.
   This means that you can do a `print *object`
   for that trait object, and GDB will understand how to find the correct type of the payload in
   the trait object.

**TODO**: Figure out if the following should be mentioned in the GDB-Rust document rather than
this guide page so there is no duplication.
This is regarding the following comments:

[This comment by Tom](https://github.com/rust-lang/rustc-dev-guide/pull/316#discussion_r284027340)
> gdb's Rust extensions and limitations are documented in the gdb manual:
https://sourceware.org/gdb/onlinedocs/gdb/Rust.html -- however, this neglects to mention that
gdb convenience variables and registers follow the gdb $ convention, and that the Rust parser
implements the gdb @ extension.

[This question by Aman](https://github.com/rust-lang/rustc-dev-guide/pull/316#discussion_r285401353)
> @tromey do you think we should mention this part in the GDB-Rust document rather than this
document so there is no duplication etc.?

### LLDB

#### Rust expression parser

This expression parser is written in C++.
It is a type of [Recursive Descent parser].
It implements slightly less of the Rust language than GDB.
LLDB has Rust-like value and type
output.

#### Developer notes

* LLDB has a plugin architecture but that does not work for language support.
* GDB generally works better on Linux.

### WinDbg/CDB

Microsoft provides [Windows Debugging Tools] such as the Windows Debugger (WinDbg) and
the Console Debugger (CDB), which both support debugging programs written in Rust.
These debuggers parse the debug info for a binary from the `PDB`, if available, to construct a
visualization to serve up in the debugger.

#### Natvis

Both WinDbg and CDB support defining and viewing custom visualizations for any given type
within the debugger using the Natvis framework.
The Rust compiler defines a set of Natvis
files that define custom visualizations for a subset of types in the standard libraries, such
as `std`, `core`, and `alloc`.
These Natvis files are embedded into `PDBs` generated for the
`*-pc-windows-msvc` target triples to automatically enable these custom visualizations when
debugging.
This default can be overridden by setting the `strip` rustc flag to either `debuginfo` or `symbols`.

Rust has support for embedding Natvis files for crates outside of the standard libraries by
using the `#[debugger_visualizer]` attribute.
For more details on how to embed debugger visualizers,
please refer to the section on the [`debugger_visualizer` attribute].

## DWARF and `rustc`

[DWARF] is the standard way compilers generate debugging information that debuggers read.
It is _the_ debugging format on macOS and Linux.
It is a multi-language and extensible format,
and is mostly good enough for Rust's purposes.
Hence, the current implementation reuses DWARF's concepts.
This is true even if some of the concepts in DWARF do not align with Rust semantically because,
generally, there can be some kind of mapping between the two.

We have some DWARF extensions that the Rust compiler emits and the debuggers understand that
are _not_ in the DWARF
standard.

* The Rust compiler emits DWARF for a virtual table, and this `vtable` object has a
  `DW_AT_containing_type` that points to the real type.
  This lets debuggers dissect a trait object pointer to correctly find the payload.
  Here is an example of such a DIE, from a test case in the gdb repository:

  ```asm
  <1><1a9>: Abbrev Number: 3 (DW_TAG_structure_type)
     <1aa>   DW_AT_containing_type: <0x1b4>
     <1ae>   DW_AT_name        : (indirect string, offset: 0x23d): vtable
     <1b2>   DW_AT_byte_size   : 0
     <1b3>   DW_AT_alignment   : 8
  ```

* The other extension is that the Rust compiler can emit a tagless discriminated union.
  See [DWARF feature request] for this item.

### Current limitations of DWARF

* Traits require a bigger change than usual to DWARF, namely deciding how to represent traits in DWARF.
* DWARF provides no way to differentiate between structs and tuples.
  The Rust compiler emits
  fields named `__0`, `__1`, and so on, and debuggers look for a sequence of such names to overcome this limitation.
  For example, in this case the debugger would look at a field via `x.__0` instead of `x.0`.
  This is resolved via the Rust parser in the debugger, so now you can do `x.0`.

DWARF relies on debuggers to know some information about platform ABI.
Rust does not do that all the time.

## Developer notes

This section is from the talk about certain aspects of development.

## What is missing

### Code signing for LLDB debug server on macOS

According to Wikipedia, [System Integrity Protection] is

> System Integrity Protection (SIP, sometimes referred to as rootless) is a security feature
> of Apple's macOS operating system introduced in OS X El Capitan. It comprises a number of
> mechanisms that are enforced by the kernel.
> A centerpiece is the protection of system-owned
> files and directories against modifications by processes without a specific "entitlement",
> even when executed by the root user or a user with root privileges (sudo).

It prevents processes from using the `ptrace` syscall.
If a process wants to use `ptrace`, it has to be code signed.
The certificate that signs it has to be trusted on your machine.

See [Apple developer documentation for System Integrity Protection].

We may need to sign up with Apple and get the keys to do this signing.
Tom looked into whether Mozilla could do this, but it cannot because it is at the maximum number of
keys it is allowed to sign.
Tom does not know if Mozilla could get more keys.

Alternatively, Tom suggests that maybe a Rust legal entity is needed to get the keys via Apple.
This problem is not technical in nature.
If we had such a key, we could sign GDB as well and ship that.

### DWARF and Traits

Rust traits are not emitted into DWARF at all.
The impact of this is that calling a method `x.method()` does not work as-is
when the method is implemented by a trait, as opposed to a type.
That information is not present, so debuggers cannot find trait methods.

DWARF has a notion of interface types (possibly added for Java).
Tom's idea was to use this interface type to represent traits.

DWARF only deals with concrete names, not the reference types.
So, a given implementation of a
trait for a type would be one of these interfaces (a `DW_TAG_interface_type` entry).
Also, the type for which it is implemented would describe all the interfaces this type implements.
This requires a DWARF extension.

Issue on GitHub: [https://github.com/rust-lang/rust/issues/33014]

## Typical process for a Debug Info change (LLVM)

LLVM has Debug Info (DI) builders.
This is the primary thing that Rust calls into.
This is why LLVM needs to change first: rustc emits LLVM debug info metadata first, not DWARF directly.
This
is a kind of metadata that you construct and hand off to LLVM.
For the rustc/LLVM hand-off,
some LLVM DI builder methods are called to construct the representation of a type.

The steps of this process are as follows:

1. LLVM needs changing.

   LLVM does not emit interface types at all, so this needs to be implemented in LLVM first.

   Get sign-off from the LLVM maintainers that this is a good idea.

2. Change the DWARF extension.

3. Update the debuggers.

   Update DWARF readers and expression evaluators.

4. Update the Rust compiler.

   Change it to emit this new information.

### Procedural macro stepping

A deep question is: how do you actually debug a procedural macro?
What is the location you emit for a macro expansion?
Consider some of the following cases:

* You can emit the location of the invocation of the macro.
* You can emit the location of the definition of the macro.
* You can emit locations of the content of the macro.

RFC: [https://github.com/rust-lang/rfcs/pull/2117]

The focus is to let macros decide what to do.
This can be achieved by having some kind of attribute
that lets the macro tell the compiler where the line marker should be.
This affects where you set breakpoints and what happens when you step through the code.

## Source file checksums in debug info

Both DWARF and CodeView (PDB) support embedding a cryptographic hash of each source file that
contributed to the associated binary.

The cryptographic hash can be used by a debugger to verify that the source file matches the
executable.
If the source file does not match, the debugger can provide a warning to the user.

The hash can also be used to prove that a given source file has not been modified since it was
used to compile an executable.
Because MD5 and SHA1 both have demonstrated vulnerabilities,
using SHA256 is recommended for this application.

The Rust compiler stores the hash for each
source file in the corresponding `SourceFile` in
the `SourceMap`.
The hashes of input files to external crates are stored in `rlib` metadata.

A default hashing algorithm is set in the target specification.
This allows the target to
specify the best hash available, since not all targets support all hash algorithms.

The hashing algorithm for a target can also be overridden with the `-Z source-file-checksum=`
command-line option.

#### DWARF 5
DWARF version 5 supports embedding an MD5 hash to validate the source file version in use.
See DWARF 5, Section 6.2.4.1, content type code `DW_LNCT_MD5`.

#### LLVM
LLVM IR supports MD5 and SHA1 (and SHA256 in LLVM 11+) source file checksums in the DIFile node.

[LLVM DIFile documentation](https://llvm.org/docs/LangRef.html#difile)

#### Microsoft Visual C++ Compiler /ZH option
The MSVC compiler supports embedding MD5, SHA1, or SHA256 hashes in the PDB using the `/ZH`
compiler option.

[MSVC /ZH documentation](https://docs.microsoft.com/en-us/cpp/build/reference/zh)

#### Clang
Clang always embeds an MD5 checksum, though this does not appear in documentation.

## Future work

#### Name mangling changes

* New demangler in `libiberty` (gcc source tree).
* New demangler in LLVM or LLDB.

**TODO**: Check the location of the demangler source.
[#1157](https://github.com/rust-lang/rustc-dev-guide/issues/1157)

#### Reuse Rust compiler for expressions

This is an important idea because debuggers by and large do not try to implement type inference.
You need to be much more explicit when you type into the debugger than in your actual source code.
So, you cannot just copy and paste an expression from your source
code into the debugger and expect the same answer, but this would be nice.
Using the compiler can help with this.

It is certainly doable, but it is a large project.
You certainly need a bridge to the debugger because the debugger alone has access to the
memory.
Both GDB (gcc) and LLDB (clang) have this feature.
LLDB uses Clang to JIT-compile code, and GDB can do the same with GCC.

Both debuggers' expression evaluators implement both a superset and a subset of Rust.
They implement just the expression language,
but they also add some extensions, such as GDB's convenience variables.
Therefore, if you are taking this route,
then you not only need to build this bridge,
but may also have to add some mode to let the compiler understand some extensions.

[Tom Tromey discusses debugging support in rustc]: https://www.youtube.com/watch?v=elBxMRSNYr4
[Debugging the Compiler]: compiler-debugging.md
[debugger or debugging tool]: https://en.wikipedia.org/wiki/Debugger
[Bison]: https://www.gnu.org/software/bison/
[ptype]: https://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_109.html
[rust-lang/lldb wiki page]: https://github.com/rust-lang/lldb/wiki
[DWARF]: http://dwarfstd.org
[manual for GDB/Rust]: https://sourceware.org/gdb/onlinedocs/gdb/Rust.html
[GDB Bugzilla]: https://sourceware.org/bugzilla/
[Recursive Descent parser]: https://en.wikipedia.org/wiki/Recursive_descent_parser
[System Integrity Protection]: https://en.wikipedia.org/wiki/System_Integrity_Protection
[https://github.com/rust-dev-tools/gdb]: https://github.com/rust-dev-tools/gdb
[DWARF feature request]: http://dwarfstd.org/ShowIssue.php?issue=180517.2
[https://docs.python.org/3/c-api/stable.html]: https://docs.python.org/3/c-api/stable.html
[https://github.com/rust-lang/rfcs/pull/2117]: https://github.com/rust-lang/rfcs/pull/2117
[https://github.com/rust-lang/rust/issues/33014]: https://github.com/rust-lang/rust/issues/33014
[https://github.com/rust-lang/rust/issues/34457]: https://github.com/rust-lang/rust/issues/34457
[Apple developer documentation for System Integrity Protection]:
https://developer.apple.com/library/archive/releasenotes/MacOSX/WhatsNewInOSX/Articles/MacOSX10_11.html#//apple_ref/doc/uid/TP40016227-SW11
[https://github.com/rust-lang/lldb]: https://github.com/rust-lang/lldb
[https://github.com/rust-lang/llvm-project]: https://github.com/rust-lang/llvm-project
[PDB]: https://llvm.org/docs/PDB/index.html
[symbol records]: https://llvm.org/docs/PDB/CodeViewSymbols.html
[type records]: https://llvm.org/docs/PDB/CodeViewTypes.html
[Windows Debugging Tools]: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/
[`debugger_visualizer` attribute]: https://doc.rust-lang.org/nightly/reference/attributes/debugger.html#the-debugger_visualizer-attribute
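
As a rough illustration of the type-inference gap described under "Reuse Rust compiler for expressions", the sketch below shows the same computation written twice: once relying on rustc's inference, and once with every type spelled out. The function names `parse_inferred` and `parse_explicit` are hypothetical and exist only for this example. The assumption is that an expression evaluator without type inference would need something closer to the explicit form, where it supports such expressions at all.

```rust
// Hypothetical debuggee program; the function names are illustrative only.

/// Relies on inference: the target type of `parse` and the collection type
/// of `collect` are both deduced by rustc from the return type.
fn parse_inferred(input: &str) -> Vec<u32> {
    input.split_whitespace().map(|s| s.parse().unwrap()).collect()
}

/// The same computation with the types written explicitly, which is the kind
/// of detail a tool without type inference cannot fill in for you.
fn parse_explicit(input: &str) -> Vec<u32> {
    input
        .split_whitespace()
        .map(|s| s.parse::<u32>().unwrap())
        .collect::<Vec<u32>>()
}

fn main() {
    let inferred = parse_inferred("1 2 3");
    let explicit = parse_explicit("1 2 3");
    // Both forms produce the same values; only the amount of type
    // information the author had to write differs.
    assert_eq!(inferred, explicit);
    println!("{:?}", inferred); // prints [1, 2, 3]
}
```

Reusing the compiler for expression evaluation would, in principle, let the shorter form be typed into the debugger unchanged, since the compiler could supply the inferred types.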
