0. Why study linking and loading in detail in an OS course?

Linking and loading play an unglamorous but central role in major OS functions. As overall OS (kernel + libraries + service processes) complexity keeps growing (which is often unavoidable, for good reasons), it is imperative that the running image of a process gets assembled from many parts. Some of these parts are functionally different and are created by different tools, or different parts of tools, in the toolchain, yet end up within a single executable or shared object file; others are even written and maintained by different organizations.

The trick, then, is to engineer the *binary format*, the *Application Binary Interface* (ABI), and the corresponding parts of both the development toolchain and the OS ("binutils" in GNU-speak, the binary loader that backs the exec*() family of system calls, and the kernel's own module loader) in such a way that integrating all these parts remains both _tractable_ and _extensible_.

I suggest calling the engineering patterns that go into well-proven linking-and-loading designs, such as the ELF format and the GNU/Linux ABI, *"integration patterns"*, by analogy with the programming idioms known as "design patterns" that help keep code tractable and maintainable (e.g., http://en.wikipedia.org/wiki/Design_pattern_(computer_science) ).

CAVEAT: This view is not very common. There is only one affordable, dedicated book that covers the subject: John Levine's "Linkers and Loaders". Luckily, its almost complete draft copy is freely available online: http://www.iecc.com/linker/

For a collection of links on ELF hacking see: http://www.hackercurriculum.org/elf

We will work through the ELF symbol table and dynamic linking structure next week.

1.
DTrace examples

# dtrace -n 'proc:::exec-success { printf("%s", curpsinfo->pr_psargs); }'
dtrace: description 'proc:::exec-success ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  16132         exec_common:exec-success ls callout.d intr1.d proc1.d
  0  16132         exec_common:exec-success hostname

(output produced by "ls *.d" in another terminal window)

Observe the curpsinfo variable, which points to a special psinfo_t structure filled with info about the "current" process (i.e., the process that caused the probe to fire), as described in "Table 25–1 proc Probes" at http://docs.sun.com/app/docs/doc/817-6223/chp-proc?a=view

Observe that this struct's pr_psargs member contains the argument string of ls after the Bash shell expanded the "*.d" pattern.

---

# Files opened by process,
dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }'

Observe the "copyinstr" used to copy the syscall's string argument from user space into kernel space, where DTrace can read it.

---

# Syscall count by program,
dtrace -n 'syscall:::entry { @num[execname] = count(); }'

Observe the special aggregation syntax:

    @num[execname] = count();

This creates a counting table and increments the count of each individual execname when the probe fires. On exit (^C) DTrace prints the table, nicely formatted and sorted. "num" is the name of the table. If you only have one table, you may omit the name and just write

    @[execname] = count();

Note that execname (or any expression in @[...]) gets evaluated first, and then the count() action is applied to the associated value in the table (i.e., execname is used as a key into the table, and the value for that key is extracted and incremented). count() merely increments by one, whereas sum(x) adds its argument x to the stored value. So

    @[execname] = sum(1);

has the same effect as above.
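
The key-then-action behavior of aggregations described above can be imitated outside DTrace. The following is only a toy Python model, not DTrace itself: the Aggregation class, its method names, and the FIRINGS list of execname values are all made up for illustration. It shows that count() and sum(1) fill the table identically, and that the report comes out sorted by value (ascending order is assumed here).

```python
from collections import defaultdict


class Aggregation:
    """Toy model of a DTrace aggregation such as @num[execname]."""

    def __init__(self):
        # Keys that have never been seen start at 0, as in a counting table.
        self.table = defaultdict(int)

    def count(self, key):
        # Models "@[key] = count();": look up the key, increment its value.
        self.table[key] += 1

    def sum(self, key, value):
        # Models "@[key] = sum(value);": look up the key, add value to it.
        self.table[key] += value

    def report(self):
        # Models the table printed on exit; ascending by value is assumed.
        return sorted(self.table.items(), key=lambda kv: kv[1])


# Hypothetical stream of execname values, one per probe firing.
FIRINGS = ["bash", "ls", "bash", "sshd", "bash", "ls"]

num = Aggregation()
alt = Aggregation()
for execname in FIRINGS:
    num.count(execname)   # @num[execname] = count();
    alt.sum(execname, 1)  # @[execname] = sum(1);

if __name__ == "__main__":
    for name, value in num.report():
        print(f"{name:>10} {value:>8}")
```

The two tables end up equal, which is the point of the sum(1) remark: the only difference between the two actions is what gets added to the stored value.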

---

Suggestion: work through the rest of the examples in http://www.brendangregg.com/DTrace/dtrace_oneliners.txt , using the DTrace Guide (http://docs.sun.com/app/docs/doc/817-6223)

More examples:
http://developers.sun.com/solaris/articles/dtrace_example.html
http://blogs.sun.com/uejio/entry/dtrace_tutorial_for_x_window

---

Extra: OpenSolaris has demo scripts in /usr/demo/dtrace/ . Study them.

---

2. DTrace aggregation functionality

DTrace probes can be used for kernel and application code profiling, such as counting the number of invocations of certain functions and the time spent in them. DTrace provides a built-in datatype for a "counting table".

Aggregations are described briefly in the DTrace Quick Reference: http://developers.sun.com/solaris/articles/dtrace_quickref/dtrace_quickref.html and in the DTrace Guide, Chapter 9: http://docs.sun.com/app/docs/doc/817-6223

(While you are at the Quick Reference, have a look at the built-in variables: http://developers.sun.com/solaris/articles/dtrace_quickref/dtrace_built_in_vars.html . Starting from these, you can unravel the kernel's data structures. For example, you can follow curthread to the process' struct proc through curthread->t_procp, and so on to other elements of the proc structure as depicted in Figure 2.3 on page 57.)

http://www.sun.com/bigadmin/content/dtrace/ provides several DTrace tutorials, DTrace developer blogs and use cases.

Suggestion: go through the "DTrace 1-liners" from the previous lecture.

3. SDT provider and other non-function-boundary probes.

The probes of DTrace's syscall: and fbt: providers align naturally with kernel function boundaries. A simple "symbol table" (a mapping from probe names to function start addresses) can be used to place the probes. However, system events targeted by other providers such as proc:, sched:, and io: have more complex logic that does not easily align with mere function boundaries.
In a word, the respective probes must be placed inside if-then-else branch blocks rather than at function boundaries. For example, the logic behind the event may be in the middle of a function, or almost at the end but not quite.

Look at the definitions of the DTRACE_PROBE-derived macros in http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/sdt.h#77 and the provider-specific macros starting at http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/sdt.h#142 and then search for their uses with "Full Search" (http://src.opensolaris.org/source/ -- since macros are not "symbols").

For example: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/exec.c#495 defines the proc:::exec-success probe. When not activated, it is present in the kernel as a run of 5 NOPs, which you can see by starting "mdb -k" and scrolling down several screens of exec_common::dis disassembly. When activated (e.g., with 'dtrace -n "proc:::"'), it will include the invalid "LOCK NOP" instruction, which will cause the #UD trap to fire. Naturally, DTrace hangs its own clause off of the #UD handler.

Further details on the invalid instruction and the related "F00F bug": http://www.cs.dartmouth.edu/~sergey/cs108/2009/f00f-bug.txt (NOTE: the www.x86.org link seems broken, local copy: http://www.cs.dartmouth.edu/~sergey/cs108/2009/F00FBug.html)

A talk on exploiting processor bugs of the same order as "F00F": http://conference.hitb.org/hitbsecconf2008kl/?page_id=214
author: http://i.zdnet.com/blogs/kris_kaspersky.jpg?tag=col1%3bpost-1492
slides: google "Remote Code Execution Through Intel CPU Bugs", local "D2T1 - Kris Kaspersky - Remote Code Execution Through Intel CPU Bugs.pdf"
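
To make the NOP-patching idea from section 3 concrete, here is a toy Python model of a probe site being enabled and disabled. It is a sketch under loud assumptions, not the OpenSolaris implementation: the ToyKernelText class, the one-byte-per-instruction "CPU", the filler bytes, and the choice to resume just past the patched byte are all invented for illustration. Only the byte values 0x90 (NOP) and 0xF0 (LOCK prefix), and the run of 5 NOPs, come from x86 and from the notes above.

```python
NOP = 0x90    # one-byte x86 NOP
LOCK = 0xF0   # LOCK prefix; F0 90 ("LOCK NOP") is an invalid instruction
SITE_LEN = 5  # a disabled probe site is a run of 5 NOPs


class ToyKernelText:
    """Toy model of kernel text containing patchable probe sites."""

    def __init__(self, text, sites):
        self.text = bytearray(text)
        self.sites = dict(sites)  # probe name -> address of its NOP run
        self.clauses = {}         # patched address -> clause to run
        self.log = []             # what the clauses recorded

    def enable(self, name, clause):
        addr = self.sites[name]
        # Only patch a site that really is a run of NOPs.
        assert self.text[addr:addr + SITE_LEN] == bytes([NOP]) * SITE_LEN
        self.text[addr] = LOCK    # first NOP becomes "LOCK NOP"
        self.clauses[addr] = clause

    def disable(self, name):
        addr = self.sites[name]
        self.text[addr] = NOP     # restore the original byte
        self.clauses.pop(addr, None)

    def ud_trap(self, pc):
        # Stand-in for the #UD handler: find the clause hung off this site.
        clause = self.clauses.get(pc)
        if clause is None:
            raise RuntimeError("invalid opcode at %d with no probe" % pc)
        clause(self.log)
        return pc + 1             # toy assumption: step past the prefix byte

    def run(self):
        # Toy CPU: every byte is one instruction, except LOCK NOP, which traps.
        pc = 0
        while pc < len(self.text):
            if self.text[pc] == LOCK and self.text[pc + 1:pc + 2] == bytes([NOP]):
                pc = self.ud_trap(pc)
            else:
                pc += 1


def demo():
    # Three opaque filler bytes, the probe site, one more filler byte.
    text = bytes([0x55] * 3 + [NOP] * SITE_LEN + [0xC3])
    k = ToyKernelText(text, {"proc:::exec-success": 3})

    k.run()                                   # disabled: nothing fires
    fired_disabled = list(k.log)

    k.enable("proc:::exec-success",
             lambda log: log.append("exec-success fired"))
    patched = bytes(k.text[3:8])
    k.run()                                   # enabled: clause runs once
    fired_enabled = list(k.log)

    k.disable("proc:::exec-success")
    k.run()                                   # disabled again: no new entry
    return fired_disabled, patched, fired_enabled, list(k.log), bytes(k.text)


if __name__ == "__main__":
    print(demo())
```

The design point the model tries to capture is that a disabled probe costs only a few NOPs in the instruction stream, and that enabling it needs a single byte to be overwritten, at an arbitrary place inside a function body rather than at its entry or return.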