=========== Address spaces & segments ===========

As you read this, look at the textbook diagram scans in bookpages/.

Address spaces ('struct as') of processes ('proc_t') -- as well as the
kernel's own address space 'kas' -- are made up of segments ('struct seg').
These segments are arranged in an AVL tree (a kind of balanced binary
search tree) to make finding the segment for a faulting address efficient;
see the definitions of 'avl_tree_t' and 'avl_node_t', and of the structures
that contain them, 'struct as' and 'struct seg' respectively.

Kernel (virtual) address layouts:
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/startup.c#380

Within these kernel address ranges, address mappings are created and
managed by several kernel "segment drivers":

Kernel's segments (Ch. 11.1):
  seg_kmem -- normal non-pageable kernel memory management
  seg_kp   -- kernel pageable memory management
  seg_map  -- file cache pages mapped into kernel space (::addr2smap DCMD)
  seg_kpm  -- all physical memory pages mapped into kernel space (only in x64)

Read more about these in 11.1.5-6.

The Smap layer establishes a mapping from a virtual address to a
"page identity":
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/vm/seg_map.h#75
In MDB, it is reported by the ::addr2smap command.

An important observation about "large pages" is found on p. 530, 11.1.2,
regarding the effect of large pages on the kernel's efficiency.
A 10% improvement is a lot.

---[ AVL trees, offsets & embedding ]---

Note that the tree and node structures are _contained_ rather than pointed
to in the respective OS abstractions -- note the offset manipulation used
for translation between embedded nodes and their containing data structures.
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/vm/vm_as.c#as_findseg
  (note caching of the last looked up segment)
hands off to
  http://src.illumos.org/source/xref/illumos-gate/usr/src/common/avl/avl.c#avl_find
  (essentially, a simple binary search)

For every kernel-space data structure, the question "how is access to it
synchronized?" is essential (one cannot write any kernel code using a data
structure without understanding its synchronization model). The comment on
line 26 explains the synchronization:
  http://src.illumos.org/source/xref/illumos-gate/usr/src/common/avl/avl.c#26

---- Kernel AS (kas) example ----

> kas::print
{
    a_contents = {
        _opaque = [ 0 ]
    }
    a_flags = 0
    a_vbits = 0
    a_cv = {
        _opaque = 0
    }
    a_hat = 0xffffff0149476e78
    a_hrm = 0
    a_userlimit = 0
    a_seglast = kmapseg
    a_lock = {
        _opaque = [ 0 ]
    }
    a_size = 0xff3814b000
    a_lastgap = 0
    a_lastgaphl = 0
    a_segtree = {
        avl_root = kvseg+0x20
        avl_compar = as_segcompar
        avl_offset = 0x20
        avl_numnodes = 0x9
        avl_size = 0x60
    }
    ...

(Reminder: use ::print -t to see the types of the struct members, -a to see
hex offsets at which they are stored in the struct).

Observe 9 segments in the kernel address space. At the root of the kernel
AS tree is the kvseg seg_t structure.

> kvseg::print
{
    s_base = 0xffffff0149400000
    s_size = 0xfe76c00000
    s_szc = 0
    s_flags = 0
    s_as = kas
    s_tree = {
        avl_child = [ kpseg+0x20, ktextseg+0x20 ]
        avl_pcb = 0
    }
    s_ops = segkmem_ops
    s_data = kvps
    ...

Abbreviated display of a segment struct with the ::seg command:

> kvseg::seg
             SEG             BASE             SIZE             DATA OPS
fffffffffbc31530 ffffff0149400000       fe76c00000 fffffffffbceea30 segkmem_ops

From the child nodes kpseg and ktextseg you can explore the full AVL tree
of the kernel address space. [Do it! Note the different *_ops and s_data
structs for segments -- their interplay makes up the "segment drivers"
described in the textbook.]
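
The "kvseg+0x20" style pointers above come from embedding the tree node
inside the segment struct. Here is a minimal sketch of that translation in
plain C. Everything named toy_* is a hypothetical stand-in, not the illumos
definitions; the toy struct has fewer fields than 'struct seg', so its
offset will not be 0x20. It only illustrates the "subtract the offset to get
the container, add it to get the node" idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an embedded tree node (not avl_node_t). */
struct toy_node {
    struct toy_node *child[2];
    uintptr_t pcb;
};

/* Hypothetical stand-in for a segment; the node lives INSIDE the struct. */
struct toy_seg {
    uintptr_t base;
    size_t size;
    struct toy_node tree;      /* embedded, not pointed to */
    const char *ops_name;
};

/* The tree remembers one offset, assumed identical for all its nodes. */
struct toy_tree {
    struct toy_node *root;
    size_t offset;             /* offsetof(struct toy_seg, tree) */
};

/* Node pointer -> containing object: step back by the offset. */
static void *toy_node2data(const struct toy_tree *t, struct toy_node *n)
{
    return (void *)((uintptr_t)n - t->offset);
}

/* Containing object -> embedded node: step forward by the offset. */
static struct toy_node *toy_data2node(const struct toy_tree *t, void *d)
{
    return (struct toy_node *)((uintptr_t)d + t->offset);
}

/* Returns 1 if a round trip through both translations is consistent. */
static int toy_roundtrip_ok(void)
{
    struct toy_seg seg = { 0x1000, 0x2000, { { NULL, NULL }, 0 }, "toy_ops" };
    struct toy_tree tree = { NULL, offsetof(struct toy_seg, tree) };

    tree.root = toy_data2node(&tree, &seg);
    struct toy_seg *back = toy_node2data(&tree, tree.root);

    return tree.root == &seg.tree && back == &seg && back->base == 0x1000;
}
```

Calling toy_roundtrip_ok() should return 1. In MDB the same step is what
you do by hand when you type "<node address>-0x20::print 'struct seg'".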

Observe the switching between the avl_node_t embedded into the respective
seg_t structs (at 0x20, the a_segtree's avl_offset) and the actual seg_t
objects. In avl.c it is provided by the macros AVL_NODE2DATA and
AVL_DATA2NODE. The implicit assumption is that in any avl_tree_t the offsets
involved are the same for all tree nodes; the same holds for the comparator
function applied to tree nodes in avl.c.

Suggestion: finish examining 'kas' in kernel space, draw the full tree of
kernel segments. Observe their different *_ops arrays, compare with the
segment drivers above. Check yourself by using the $m walker command (or the
::mappings command).

Suggestion: starting from a proc_t pointer of a process (find a process
you like with ::ps), navigate the AS of that process pointed to by p_as.

For example, here's me looking at the "xeyes" process:

> ::ps ! grep xeyes
R   5620   4470   5620   1349      0 0x4a004200 ffffff015c70e060 xeyes

> ffffff015c70e060::print proc_t p_as
p_as = 0xffffff015c682ce0

> 0xffffff015c682ce0::print -t 'struct as'
struct as {
    kmutex_t a_contents = {
        void *[1] _opaque = [ 0 ]
    }
    uchar_t a_flags = 0
    uchar_t a_vbits = 0
    kcondvar_t a_cv = {
        ushort_t _opaque = 0
    }
    struct hat *a_hat = 0xffffff015c718e98
    struct hrmstat *a_hrm = 0
    caddr_t a_userlimit = 0xfefff000
    struct seg *a_seglast = 0
    krwlock_t a_lock = {
        void *[1] _opaque = [ 0 ]
    }
    size_t a_size = 0x47e000
    struct seg *a_lastgap = 0xffffff015d44c540
    struct seg *a_lastgaphl = 0
    avl_tree_t a_segtree = {        <<================== AVL tree
        struct avl_node *avl_root = 0xffffff015d463dc0   <===== root node
        int (*)() avl_compar = as_segcompar
        size_t avl_offset = 0x20    <<== node is embedded in seg at 0x20
        ulong_t avl_numnodes = 0x34
        size_t avl_size = 0x60
    }
    avl_tree_t a_wpage = {
        struct avl_node *avl_root = 0
        int (*)() avl_compar = wp_compare
        size_t avl_offset = 0
        ulong_t avl_numnodes = 0
        size_t avl_size = 0x38
    }
    uchar_t a_updatedir = 0x1
    timespec_t a_updatetime = {
        time_t tv_sec = 2014 Jan 24 23:48:55
        long tv_nsec = 0x239474fb
    }
    vnode_t **a_objectdir = 0
    size_t a_sizedir = 0
    struct as_callback *a_callbacks = 0
    void *a_xhat = 0
    proc_t *a_proc = 0xffffff015c70e060   <<====== back reference to proc_t
    size_t a_resvsize = 0x47e000
}

Now to examine that root node:

> 0xffffff015d463dc0::print 'struct avl_node'
{
    avl_child = [ 0xffffff015ceba378, 0xffffff015d4634c0 ]
    avl_pcb = 0
}

(pointers look like 64-bit kernel pointers -- good! :))

Now to find the segment of the AS that this node is embedded in (watch the
pointers to check sanity):

> 0xffffff015d463dc0-0x20::print 'struct seg'
{
    s_base = 0xfec30000   <==== segment's start address, will show on ::pmap
    s_size = 0x6000       <==== segment's consecutive length
    s_szc = 0
    s_flags = 0
    s_as = 0xffffff015c682ce0   <=== back reference to the owning process' AS
    s_tree = {
        avl_child = [ 0xffffff015ceba378, 0xffffff015d4634c0 ]   <=== more segments' embedded avl_nodes
        avl_pcb = 0
    }
    s_ops = segvn_ops     <==== This is the "segvn" driver, for file-backed and anon memory regions
    s_data = 0xffffff015d4675a0   <=== this holds the vnode and other "driver"-specific data
    s_pmtx = {
        _opaque = [ 0 ]
    }
    s_phead = {
        p_lnext = 0xffffff015d463df0
        p_lprev = 0xffffff015d463df0
    }
}

Same thing summarized by the specialized dcmd:

> 0xffffff015d463dc0-0x20::seg
             SEG             BASE             SIZE             DATA OPS
ffffff015d463da0         fec30000             6000 ffffff015d4675a0 segvn_ops

Child segments (notice stepping back by 0x20 from the start of the embedded
avl_node struct to the start of the segment struct):

> 0xffffff015ceba378-0x20::seg
             SEG             BASE             SIZE             DATA OPS
ffffff015ceba358         feb74000             5000 ffffff015ceb80c8 segvn_ops

> 0xffffff015d4634c0-0x20::seg
             SEG             BASE             SIZE             DATA OPS
ffffff015d4634a0         fee30000             1000 ffffff015d461500 segvn_ops
>

and so on. Use "::print 'struct segvn_data'" on the data pointers to find
out what these regions of memory are: mapped files or anonymous mappings.
Note that "struct vnode *vp" points to the mapped file's vnode when not zero.

Check with:

> ::ps ! grep xeyes | awk '{print $8}'
ffffff015c70e060

> ffffff015c70e060::pmap
[..do it! :)..]
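
To tie the walk above back to as_findseg and avl_find: here is a rough
sketch of "check the cached last segment first, otherwise binary-search the
tree by address". The toy_* names are hypothetical and this is not the
illumos code (no balancing, no locking, no embedded nodes); the three
segments are the xeyes root and its two children from the ::seg output
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified segment: children are plain pointers here. */
struct toy_seg {
    uintptr_t base;
    size_t size;
    struct toy_seg *child[2];  /* [0] lower addresses, [1] higher */
};

struct toy_as {
    struct toy_seg *root;
    struct toy_seg *last;      /* last successful lookup, tried first */
};

/* -1: addr is below the segment, 0: inside it, 1: at or past its end. */
static int toy_cmp(uintptr_t addr, const struct toy_seg *s)
{
    if (addr < s->base)
        return -1;
    if (addr - s->base >= s->size)
        return 1;
    return 0;
}

/* Find the segment containing addr, or NULL if addr is unmapped. */
static struct toy_seg *toy_findseg(struct toy_as *as, uintptr_t addr)
{
    if (as->last != NULL && toy_cmp(addr, as->last) == 0)
        return as->last;       /* cache hit, no tree walk needed */

    struct toy_seg *s = as->root;
    while (s != NULL) {
        int c = toy_cmp(addr, s);
        if (c == 0) {
            as->last = s;      /* remember for the next lookup */
            return s;
        }
        s = s->child[c > 0];   /* go left if below, right if above */
    }
    return NULL;
}

/* Returns 1 if lookups on the three xeyes segments behave as expected. */
static int toy_findseg_demo(void)
{
    struct toy_seg lo = { 0xfeb74000u, 0x5000, { NULL, NULL } };
    struct toy_seg hi = { 0xfee30000u, 0x1000, { NULL, NULL } };
    struct toy_seg root = { 0xfec30000u, 0x6000, { &lo, &hi } };
    struct toy_as as = { &root, NULL };

    int ok = toy_findseg(&as, 0xfec31000u) == &root;
    ok = ok && toy_findseg(&as, 0xfeb74000u) == &lo && as.last == &lo;
    ok = ok && toy_findseg(&as, 0xfeb79000u) == NULL;  /* one past lo */
    ok = ok && toy_findseg(&as, 0xfee30fffu) == &hi;
    return ok;
}
```

The cache pays off because faults tend to cluster: consecutive faults often
land in the same segment, so the tree walk is skipped entirely.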

==================== Building Up Kernel's Memory Space ====================

For the starting point of creating the kernel memory image:
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/startup.c#670

Also peek at the constant definitions setting the platform limit of linear
addresses, starting at line 189. These will be used to compute everything
else in the address layout (see line 383 for an example layout).

Note also lines 289--363 for the important kernel symbols: these variables
are defined here and are referred to throughout other code via 'extern'
declarations.

(Notice that this is an x86 platform-specific startup; startup has to be
platform-specific until higher-level abstractions like Vmem can be used.
Note that such abstractions still have to work around the so-called
"memory hole", a range of 64-bit addresses that cannot be used on most
systems. You will find checks for the memory hole even in high-level objects
like AS: e.g., seg_alloc, the function that builds the segments of an
address space,
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/vm/vm_seg.c#seg_alloc
calls valid_va_range:
  http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/vm/vm_machdep.c#valid_va_range
)

The kernel is loaded into a set of fixed, platform-specific virtual address
ranges. To reiterate, all addresses are virtual, and so is any address that
is encoded as a part of an instruction.
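
To make the "memory hole" concrete, here is a small sketch of the check in
C. It assumes the usual x86-64 arrangement with 48-bit virtual addresses,
where an address is usable only if bits 63..47 are all equal; that
assumption and all toy_* names are mine. It is not the code of
valid_va_range (which has a different interface); it only shows the rule
that such checks enforce.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits. */
#define TOY_VA_BITS     48
/* First non-canonical address: 0x0000800000000000. */
#define TOY_HOLE_START  ((uint64_t)1 << (TOY_VA_BITS - 1))
/* First canonical address above the hole: 0xffff800000000000. */
#define TOY_HOLE_END    (~(uint64_t)0 << (TOY_VA_BITS - 1))

/* Nonzero if addr falls inside the hole. */
static int toy_in_hole(uint64_t addr)
{
    return addr >= TOY_HOLE_START && addr < TOY_HOLE_END;
}

/*
 * Nonzero if [base, base + len) is non-empty, does not wrap around,
 * and lies entirely below or entirely above the hole.
 */
static int toy_range_ok(uint64_t base, uint64_t len)
{
    if (len == 0)
        return 0;

    uint64_t end = base + len - 1;   /* last byte of the range */
    if (end < base)
        return 0;                    /* wrapped past 2^64 */

    if (base < TOY_HOLE_START)
        return end < TOY_HOLE_START; /* must stay in the low half */

    return base >= TOY_HOLE_END;     /* high half; end >= base holds */
}
```

Under these assumptions the kvseg base from the kas example,
0xffffff0149400000, sits above the hole, while a range that starts at
0x00007ffffffff000 with length 0x2000 is rejected because its second page
would land in the hole.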