I want to cover a lot of ground in this post: how clang(1) works, and what I
think would be an ideal toolchain for Illumos-based systems. First, how does
clang work? Let's explore that a bit.
alex@meek:/home/alex$ clang -v -target x86_64-pc-hydraos1.0 test.c -o test
clang version 3.8.0 (tags/RELEASE_380/final)
Target: x86_64-pc-hydraos1.0
Thread model: posix
InstalledDir: /home/alex/llvm/bin
"/home/alex/llvm/bin/clang-3.8" -cc1 -triple x86_64-pc-hydraos1.0
-emit-obj -mrelax-all -disable-free -disable-llvm-verifier
-main-file-name test.c -mrelocation-model static -mthread-model posix
-mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases
-munwind-tables -target-cpu x86-64 -v -dwarf-column-info
-debugger-tuning=gdb -resource-dir /home/alex/llvm/bin/../lib/clang/3.8.0
-fdebug-compilation-dir /home/alex -ferror-limit 19 -fmessage-length 80
-fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics
-o /var/tmp/test-84a095.o -xc test.c
There's a lot going on here. x86_64-pc-hydraos1.0 is a target triple that
I've created for this effort, because target triples are more significant
than you might think. The seemingly innocuous triple is a semantic hook that
much of the toolchain behavior hangs upon, and I want a toolchain that behaves
much more like the one on Mac OS X than the one you find on most Linux or BSD
systems.
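To make the "semantic hook" idea concrete, here is a minimal sketch of how a triple string can be split into its parts and used to choose a toolchain flavour. Everything in it (TripleParts, splitTriple, pickToolchain, and the returned labels) is hypothetical and written for illustration. LLVM has its own llvm::Triple class for this job, which the sketch deliberately does not use.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a parsed target triple.
struct TripleParts {
    std::string arch;
    std::string vendor;
    std::string os;
};

// Split "arch-vendor-os" on the first two dashes. Anything after the
// second dash (including a version suffix) stays in the os field.
TripleParts splitTriple(const std::string &triple) {
    const std::string::size_type first = triple.find('-');
    assert(first != std::string::npos && "expected arch-vendor-os");
    const std::string::size_type second = triple.find('-', first + 1);
    assert(second != std::string::npos && "expected arch-vendor-os");
    return TripleParts{triple.substr(0, first),
                       triple.substr(first + 1, second - first - 1),
                       triple.substr(second + 1)};
}

// Hypothetical dispatch: the OS component decides which family of
// toolchain behavior applies. The labels are made up for this sketch.
std::string pickToolchain(const TripleParts &parts) {
    if (parts.os.rfind("hydraos", 0) == 0)  // os starts with "hydraos"
        return "hydraos";
    if (parts.os.rfind("linux", 0) == 0)    // os starts with "linux"
        return "generic-gcc";
    return "unknown";
}
```

Calling pickToolchain(splitTriple("x86_64-pc-hydraos1.0")) selects the "hydraos" branch, which is the point at which a real driver would start making different decisions about assembling, linking, and debug information.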
Why on Earth would I want that? Doesn't Linux work?
When we invoke clang, it is actually running as a so-called compiler driver,
which manages the rest of the compilation, assembly, and linking process. Each
of these steps may or may not be done by an LLVM project! In this case, I am
trying to use Clang's C compiler and integrated assembler with the Illumos
system linker, ld(1), which is something of a departure from Linux or BSD
systems, which use both gcc and libgcc.
This class diagram makes it fairly clear.
http://clang.llvm.org/doxygen/classclang_1_1driver_1_1ToolChain.html
The Linux, BSD, and Solaris toolchains (in LLVM parlance) all inherit from the
Generic_GCC class. What I want here is a clang::driver::toolchains::ELF that
behaves much more like clang::driver::toolchains::MachO in terms of separating
debug information from the final artifact. This concept is explained in more
depth at https://gcc.gnu.org/wiki/DebugFission - it's a good idea! The end goal
is that -g -O2 will become production flags, so you have an optimized artifact
with debugging information in a separate file for consumption by DTrace and MDB.
There is another major semantic hook here, -debugger-tuning=gdb, which tells
clang to emit debugging information for consumption by gdb. As compared to
Illumos systems, post-mortem debugging on Linux systems is an afterthought.
Future developments in this area will feature a -debugger-tuning=mdb option
for the Illumos modular debugger.
Moving along, clang(1) immediately invokes itself again with the -cc1 option,
which is what causes it to behave like a compiler. But Illumos systems won't
build with just any compiler: the post-mortem debugging features require
support from the compiler. The first piece is -msave-args.
To add a cc1 option, you edit tools/clang/include/clang/Driver/CC1Options.td and add the option in the appropriate section.
def msave_args : Flag<["-"], "msave-args">,
  HelpText<"Save the first 6 register-passed args after the frame pointer. "
           "NOTE: This is only meaningful for 64-bit targets.">;
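That's still not enough! The driver has to pass -msave-args through to the
cc1 invocation (in this release the cc1 command line is assembled in
tools/clang/lib/Driver/Tools.cpp), and cc1 has to turn it into a code
generation option. We do the latter by adding code in
tools/clang/lib/Frontend/CompilerInvocation.cpp, much like the other machine
code generation options.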
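The two halves of that plumbing can be modelled in a few lines. This is a toy sketch, not clang's code: ArgList, hasFlag, buildCC1Args, parseCC1Args, and CodeGenOptsSketch are all hypothetical names, and the real driver works with its own option tables rather than raw strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using ArgList = std::vector<std::string>;

// True if the exact flag string appears anywhere in the argument list.
bool hasFlag(const ArgList &args, const std::string &flag) {
    return std::find(args.begin(), args.end(), flag) != args.end();
}

// Driver side (hypothetical): build the cc1 command line and forward
// -msave-args only if the user passed it to the driver.
ArgList buildCC1Args(const ArgList &driverArgs) {
    ArgList cc1Args = {"-cc1"};
    if (hasFlag(driverArgs, "-msave-args"))
        cc1Args.push_back("-msave-args");
    return cc1Args;
}

// cc1 side (hypothetical): the flag becomes a boolean that later
// code generation stages can query.
struct CodeGenOptsSketch {
    bool SaveArgs = false;
};

CodeGenOptsSketch parseCC1Args(const ArgList &cc1Args) {
    assert(!cc1Args.empty() && cc1Args.front() == "-cc1");
    CodeGenOptsSketch opts;
    opts.SaveArgs = hasFlag(cc1Args, "-msave-args");
    return opts;
}
```

If either half is missing, the option silently does nothing: the driver accepts it but cc1 never sees it, or cc1 sees it but never records it.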
So what does -msave-args do? The Illumos libsaveargs code explains it thusly:
* The Sun Studio and GCC (patched for opensolaris/illumos) compilers
* implement a argument saving scheme on amd64 via the -Wu,save-args or
* -msave-args options. When the option is specified, INTEGER type function arguments
* passed via registers will be saved on the stack immediately after %rbp, and
* will not be modified through out the life of the routine.
*
*               +--------+
* %rbp -->      |  %rbp  |
*               +--------+
* -0x8(%rbp)    |  %rdi  |
*               +--------+
* -0x10(%rbp)   |  %rsi  |
*               +--------+
* -0x18(%rbp)   |  %rdx  |
*               +--------+
* -0x20(%rbp)   |  %rcx  |
*               +--------+
* -0x28(%rbp)   |  %r8   |
*               +--------+
* -0x30(%rbp)   |  %r9   |
*               +--------+
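The layout in that comment boils down to one rule: the Nth integer argument register (counting from zero) lands at -8 * (N + 1) from %rbp. A small sketch of that arithmetic, with hypothetical helper names (savedArgOffset, savedArgRegister) that do not come from libsaveargs or LLVM:

```cpp
#include <cassert>

// The six integer argument registers, in the order shown in the diagram.
const char *const kIntArgRegs[6] = {"%rdi", "%rsi", "%rdx",
                                    "%rcx", "%r8",  "%r9"};

// Offset from %rbp, in bytes, of the save slot for argument `index`.
// Index 0 (%rdi) sits at -0x8(%rbp), index 5 (%r9) at -0x30(%rbp).
int savedArgOffset(unsigned index) {
    assert(index < 6 && "only six integer arguments travel in registers");
    return -8 * static_cast<int>(index + 1);
}

// Name of the register whose value is saved in slot `index`.
const char *savedArgRegister(unsigned index) {
    assert(index < 6 && "only six integer arguments travel in registers");
    return kIntArgRegs[index];
}
```

A debugger that knows a function was built this way can read argument N straight from savedArgOffset(N) relative to the frame pointer, without needing the register to still be live.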
This sounds like a job for llvm/lib/CodeGen/PrologEpilogInserter.cpp!
The PrologEpilogInserter makes a call like this:
const TargetFrameLowering *TFI = F.getSubtarget().getFrameLowering();
So if we want to support -msave-args on multiple architectures, we'll need to
add that support in the frame lowering code for each of them. On x86, this is:
http://llvm.org/docs/doxygen/html/X86FrameLowering_8cpp_source.html
On AArch64:
http://llvm.org/docs/doxygen/html/AArch64FrameLowering_8cpp_source.html
The implementation is a little hairy, but not the most complex thing in the
world. We need to detect OPT_msave_args and determine if we're producing a
64-bit executable. If both of those are true, we need to grab all of the
integer arguments passed in registers, store them on the stack immediately
after the saved frame pointer, and pad the result to a 16-byte alignment. More
on this later!
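As a rough illustration of the sizing decision, here is a sketch that computes how many bytes the save area would occupy. It assumes 8-byte slots, at most six register-passed integer arguments, and that the padding is applied to the save area on its own. The function name saveAreaBytes and the constants are hypothetical, and none of this is the actual frame lowering code.

```cpp
#include <algorithm>
#include <cassert>

const unsigned kSlotBytes = 8;        // one 64-bit register per slot
const unsigned kMaxRegisterArgs = 6;  // %rdi, %rsi, %rdx, %rcx, %r8, %r9
const unsigned kStackAlign = 16;      // alignment named in the text above

// Bytes to reserve after the saved frame pointer for the argument
// save area, or 0 when the option is off or the target is not 64-bit.
unsigned saveAreaBytes(unsigned numIntArgs, bool is64Bit,
                       bool saveArgsRequested) {
    if (!is64Bit || !saveArgsRequested)
        return 0;
    // Only the arguments that arrive in registers need saving.
    const unsigned saved = std::min(numIntArgs, kMaxRegisterArgs);
    const unsigned raw = saved * kSlotBytes;
    // Round up to the next multiple of the alignment.
    const unsigned padded =
        (raw + kStackAlign - 1) / kStackAlign * kStackAlign;
    assert(padded % kStackAlign == 0 && padded >= raw);
    return padded;
}
```

With three integer arguments the raw area is 24 bytes, so one extra 8-byte pad brings it to 32. With six arguments the raw area is already 48 bytes and needs no padding.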