1. DTrace aggregation functionality

DTrace probes can be used for kernel and application code profiling, such as counting the number of invocations of certain functions and the time spent in them. DTrace provides a built-in datatype for a "counting table", the aggregation.

Aggregations are briefly described in the DTrace Quick Reference:
  http://developers.sun.com/solaris/articles/dtrace_quickref/dtrace_quickref.html
and in the DTrace Guide, Chapter 9:
  http://docs.sun.com/app/docs/doc/817-6223

(While you are at it, have a look at the built-in variables:
  http://developers.sun.com/solaris/articles/dtrace_quickref/dtrace_built_in_vars.html
Starting from these, you can unravel the kernel's data structures. For example, you can follow curthread to the process's struct proc through curthread->t_procp, and from there to other elements of the proc structure, as depicted in Figure 2.3 on page 57.)

http://www.sun.com/bigadmin/content/dtrace/ provides several DTrace tutorials, DTrace developer blogs, and use cases.

Suggestion: go through the "DTrace 1-liners" from the previous lecture.

2. How DTrace probes work

We take a look at one particular mechanism DTrace uses for the :entry probe into arbitrary functions. In short, the function's first instruction is replaced with an invalid instruction that causes the CPU to trap to the "undefined opcode" handler. DTrace replaces this handler with its own, which does the actual work of matching predicates and performing actions.

Interestingly enough, the particular malformed instruction that DTrace uses is related to the famous Pentium F00F bug (the bug was that the malformed instruction was not caught correctly and froze the processor instead of trapping). See f00f-bug.txt

3. Using the Modular Debugger (MDB) to examine kernel state

"mdb -k" launches the debugger and "attaches" it to the running kernel (just as "mdb -p <pid>" would attach it to a running user process). The debugger has a somewhat different command syntax from GDB.
Its ::help command is a useful entry point. Here is a (larger) tutorial:
  http://learningsolaris.com/docs/chpt_mdb_os.pdf

Tip: You can pipe debugger commands' output through grep when there is too much of it, rather than dealing with the internal pager. E.g.:

  ::dcmds !grep module

catches all commands with names or descriptions that contain "module".

To see the address (in hex) of a symbol (function, variable, or any other thing that the debugger knows the address of):

  <symbol>=X

E.g.:

  getpid=X      -- the address of the getpid function as a 32-bit hex number

To see the contents (in hex) of memory at that address, use / rather than =

  getpid/X      -- the opcodes at the start of the getpid function as one 32-bit hex number (little endian)
  getpid/4B     -- the same, as 4 separate bytes, in order
  getpid/4i     -- the opcodes disassembled into instructions (stops after the first 4)

More about formats: ::formats (e.g., "::formats !grep hex")

  getpid::dis   -- disassembles the whole function

====

Useful commands to look up:

  ::objects
  ::ps
  ::print -t "struct proc"        -- print the definition of a data type
  e1ddb380::print proc_t          -- print the contents of memory at e1ddb380, interpreting it as a proc_t struct type
                                     (I got the above address from the ::ps "walker" command that knows how to find and walk the process control block)
  e1ddb380::print proc_t p_ppid   -- print only the selected part of the data structure at that address

(Remember to use !grep liberally if you don't like scrolling.)

More info on this is in Chapter 2.4 ("Process Structures"). This is what ps and other process reporting utilities extract; we are going to see how.

4. Tracing ps

ps on modern operating systems does little beyond reading /proc and interpreting and pretty-printing its contents. The kernel exposes its process control info in /proc's pseudo-files, and it is the kernel's functions that walk the process control blocks in response to your ps's "open" and "read" system calls.
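To make the "open" and "read" part concrete, here is a minimal sketch in Python of reading one /proc pseudo-file with nothing but the raw system call wrappers. The helper name read_pseudo_file and the candidate paths are assumptions made for illustration, not something ps itself is known to use in this form; on Solaris the per-process psinfo file holds a binary structure (see proc(4)), which this sketch returns as raw bytes without decoding.

```python
import os

def read_pseudo_file(path, chunk_size=4096):
    """Return the full contents of path using raw open/read/close calls."""
    fd = os.open(path, os.O_RDONLY)          # thin wrapper over open(2)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)  # thin wrapper over read(2)
            if not chunk:                    # empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)

if __name__ == "__main__":
    # Assumption: Solaris offers /proc/self/psinfo, Linux offers /proc/self/stat.
    for candidate in ("/proc/self/psinfo", "/proc/self/stat"):
        if os.path.exists(candidate):
            data = read_pseudo_file(candidate)
            print(candidate, len(data), "bytes")
            break
```

From the program's point of view this is ordinary file I/O; everything specific to processes happens inside the kernel when it services those two calls.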
The design that re-dispatches these general file-related system calls to the appropriate worker functions is called VFS. Linux uses a similar design (except that what Linux calls inodes are what Solaris calls vnodes). See Figure 1.5 on p. 31 and Table 1.1 on p. 32 for an overview of VFS.

5. Doing the work of /proc

See what system calls ps makes with "truss ps". The getdents64 function is what lists the contents of a directory (this is what ls uses, too). The directory read by ps is, of course, /proc. See "man getdents".

See procdents.d for the script that exposes the kernel functions called in response to a ps command's getdents call. Look at their code in the Solaris source browser. We will read the revealed functions' code next time to see how they walk the process list.
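For a feel of the user-side half of this, here is a minimal sketch in Python of the directory-listing step: it enumerates the numeric entries of /proc, which is roughly how a ps-like tool discovers the PIDs to report on. The function name list_pids and its proc_root parameter are hypothetical, introduced only for this example; the real ps is written in C and does more.

```python
import os

def list_pids(proc_root="/proc"):
    """Return the sorted PIDs named by the numeric entries of proc_root."""
    pids = []
    # os.listdir reads the directory through the C library, which on
    # Solaris and Linux ends up in the getdents family of system calls.
    for name in os.listdir(proc_root):
        if name.isdecimal():        # skip non-process entries such as "self"
            pids.append(int(name))
    return sorted(pids)

if __name__ == "__main__":
    for pid in list_pids():
        print(pid)
```

Running such a script under truss should show the same kind of getdents calls on /proc that "truss ps" does, which makes it a small, controllable target for the tracing script.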