Background: CodeView vs. PDB
CodeView is a debug information format invented by Microsoft in the mid-1980s. For various reasons, other debuggers developed an independent format called DWARF, which eventually became standardized and is now widely supported by many compilers and programming languages. CodeView, like DWARF, defines a set of records that describe mappings between source lines and code addresses, as well as the types and symbols that your program uses. The debugger then uses this information to let you set breakpoints by function name, display the value of a variable, etc. But CodeView is only somewhat documented, with the most recent official documentation being at least 20 years old. While some records still have the format described in that documentation, others have evolved, and entirely new records have been introduced that are not documented anywhere.
It’s important to understand, though, that CodeView is just a collection of records. What happens when the user says “show me the value of Foo”? The debugger has to find the record that describes Foo. And now things start getting complicated. What optimizations are enabled? What version of the compiler was used? (These could be important if there are certain ABI incompatibilities between different versions of the compiler, or as a hint when trying to reconstruct a backtrace in heavily optimized code, or if the stack has been smashed.) There are a billion other symbols in the program; how can we find the one named Foo without doing an exhaustive O(n) search? How can we support incremental linking so that it doesn’t take a long time to regenerate debug info when only a small amount of code has actually changed? How can we save space by de-duplicating strings that are used repeatedly? Enter PDB.
PDB (Program Database) is, as you might have guessed from the name, a database. It contains CodeView but it also contains many other things that allow indexing of the CodeView records in various ways. This allows for fast lookups of types and symbols by name or address, the philosophical equivalent of “tables” for individual input files, and various other things that are mostly invisible to you as a user but largely responsible for making the debugging experience on Windows so great. But there’s a problem: While CodeView is at least kind-of documented, PDB is completely undocumented. And it’s highly non-trivial.
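To make the difference between a bare collection of records and an indexed database concrete, here is a minimal sketch in Python. It is not the PDB on-disk format and not lld's implementation: `SymbolRecord`, `SymbolIndex`, and the addresses are all hypothetical names and values made up for illustration. It only contrasts an exhaustive scan with lookups by name and by address through prebuilt indexes.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolRecord:
    """Toy stand-in for a debug record (hypothetical; real records are binary)."""
    name: str
    address: int  # start address of the symbol
    size: int     # size in bytes


def find_by_name_linear(records, name):
    """Naive approach: scan every record until the name matches, O(n)."""
    for record in records:
        if record.name == name:
            return record
    return None


class SymbolIndex:
    """Hypothetical index, built once over the records and queried many times."""

    def __init__(self, records):
        # Name lookup: a hash table from symbol name to record.
        self._by_name = {r.name: r for r in records}
        # Address lookup: records sorted by start address, for binary search.
        self._by_address = sorted(records, key=lambda r: r.address)
        self._starts = [r.address for r in self._by_address]

    def find_by_name(self, name):
        return self._by_name.get(name)

    def find_by_address(self, address):
        # Find the last record whose start address is <= the queried address.
        i = bisect_right(self._starts, address) - 1
        if i < 0:
            return None
        record = self._by_address[i]
        # Reject addresses that fall in a gap past the end of that record.
        if address < record.address + record.size:
            return record
        return None


# Made-up sample data, assumed to have unique names.
records = [
    SymbolRecord("main", 0x1000, 0x40),
    SymbolRecord("Foo", 0x1040, 0x20),
    SymbolRecord("Bar", 0x1080, 0x10),
]
index = SymbolIndex(records)
```

With the index in place, `index.find_by_name("Foo")` is a single hash lookup, and `index.find_by_address(0x1050)` is a binary search that resolves to the record for `Foo`. The cost is the extra space and the work of building and maintaining the indexes, which is roughly the trade a database makes.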
We’re Stuck (Or Are We?)
Several years ago, we decided that the path forward was to abandon any hope of emitting CodeView and PDB, and instead focus on two things:
Make clang-cl emit DWARF debug information on Windows
Port LLDB to Windows and teach it about the Windows ABI, which would be significantly easier than teaching Visual Studio and/or WinDbg to be able to interpret DWARF (assuming this is even possible at all, given that everything would have to be done strictly through the Visual Studio / WinDbg extensibility model)
After about a year and a half of studying the PDB code that Microsoft published on GitHub, hacking away, studying the code some more, hacking away some more, etc., I’m proud to say that lld (the LLVM linker) can finally emit working PDBs. All the basics work: setting breakpoints by line or by name, viewing variables, and searching for symbols or types (minus bugs, of course).
Bring on the Bugs!
So this is where you come in. We’ve tested simple debugging scenarios with our PDBs, but we still consider this alpha in terms of debug info quality. We’d love for you to try it out and report issues on our bug tracker. To get you started, download the latest snapshot of clang for Windows. Here are two simple ways to test out this new functionality:
Have clang-cl invoke lld automatically.
clang-cl -fuse-ld=lld -Z7 -MTd hello.cpp
Invoke clang-cl and lld separately.
clang-cl -c -Z7 -MTd -o hello.obj hello.cpp
lld-link -debug hello.obj
We look forward to the onslaught of bug reports!
We would like to extend a very sincere and deep thanks to Microsoft for their help in getting the code uploaded to the GitHub repository, as we would never have gotten this far without it.
And to leave you with something to get you even more excited for the future, it's worth reiterating that all of this is done without a dependency on any Windows-specific API, DLL, or library. It's 100% portable. Do I hear cross-compilation?
Zach Turner (on behalf of the LLVM Windows Team)