			Current lsof data sources
		      and what it does with the data.

    From the proc and user structures:

	process status (p_stat) (e.g., SZOMB)

	    Lsof uses this information to decide if the process
	    is active or is a "zombie" it should ignore.

	command name

	    Lsof reports this information to the user.

	Process IDentification number (PID) (p_pid)

	    Lsof reports this information to the user and uses
	    it in some error messages.

	Process Group ID (PGRP) (p_pgrp)

	    Lsof reports this information to the user, if requested
	    to do so with an option.  Lsof post-processing scripts
	    may use this information to establish process chains.

	Parent PID (PPID) (p_ppid)

	    Lsof reports this information to the user, if requested
	    to do so with an option.

	User ID (UID) (p_uid)

	    Lsof reports this information to the user.  By
	    default it converts UID to login name, but the default
	    mode can be changed with an option.
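
	    The conversion can be sketched with the standard
	    getpwuid() lookup.  This is only an illustration, not
	    lsof's own code: the function name uid_label() and its
	    "numeric" flag are assumptions, the flag standing in
	    for the option that turns the conversion off.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a UID for display.  uid_label() and its "numeric" flag are
 * hypothetical names used only for this sketch.  Returns buf.
 */
char *
uid_label(uid_t uid, int numeric, char *buf, size_t len)
{
	struct passwd *pw;

	assert(buf != NULL && len > 0);
	if (!numeric && (pw = getpwuid(uid)) != NULL)
		(void) snprintf(buf, len, "%s", pw->pw_name);
	else	/* conversion disabled, or no passwd entry for this UID */
		(void) snprintf(buf, len, "%lu", (unsigned long)uid);
	return buf;
}
```

	    A UID with no passwd entry falls back to its number,
	    so the caller always gets something printable.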

	Current working directory vnode pointer (p_cdir)

	    Lsof normally reports CWD, but its reporting can be
	    suppressed.  CWD vnode pointer processing is handled as
	    any other vnode pointer from the file structure would be.

	Root directory vnode pointer (p_rdir)

	    Lsof normally reports a root directory, if it exists,
	    but the reporting can be suppressed.  Root directory
	    vnode pointer processing is handled as any other
	    vnode pointer from the file structure would be.

	Text and loader file information (via VM map)
	
	    Lsof follows the pregion and region chains leading
	    from the process' VM pointer to any vnodes that are
	    shadowed by memory regions.  It then processes their
	    vnode pointers just as it processes any others.  This
	    results in a display of the text file used by the
	    process, and the shared library files it uses.

    From file structures:

	The file structure address (from the open file "chunk")

	    Lsof uses this pointer to read the file structure.

	    It also reports the address for post-processing scripts
	    that need to identify unique file structure usage --
	    e.g., check for file descriptor leakage between child
	    and parent processes.

	The usage count (f_count)

	    Lsof uses this to determine if the file structure is
	    worth further examination.

	The node address (f_data)

	    Lsof uses this to read the file structure successor --
	    e.g., socket, vnode.

	    Lsof also reports this address for post-processing
	    scripts that need to identify unique file usage.

	The access code (f_flag)

	    Lsof uses this code to report whether the file was opened
	    for read access, write access, or read/write access.  Lsof
	    can also be asked to report the full contents of f_flag.
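
	    A minimal sketch of the reduction from f_flag to an
	    access character follows.  The X_FREAD and X_FWRITE
	    names and their values are assumptions made for the
	    example; a real build takes FREAD and FWRITE from the
	    dialect's kernel headers.  The characters r, w and u
	    follow the convention of lsof's FD column.

```c
#include <assert.h>

/*
 * Illustrative flag values.  The X_ names are hypothetical stand-ins
 * for the kernel's FREAD and FWRITE definitions.
 */
#define X_FREAD		0x1
#define X_FWRITE	0x2

/* Reduce f_flag to one access character: r, w, u (both) or space. */
char
access_char(long f_flag)
{
	switch (f_flag & (X_FREAD | X_FWRITE)) {
	case X_FREAD:
		return 'r';
	case X_FWRITE:
		return 'w';
	case X_FREAD | X_FWRITE:
		return 'u';
	}
	return ' ';	/* neither bit set: access mode unknown */
}

/* Usage: other bits in f_flag do not change the access character. */
void
access_char_demo(void)
{
	assert(access_char(X_FREAD) == 'r');
	assert(access_char(X_FREAD | X_FWRITE | 0x100) == 'u');
}
```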

	The file structure type (f_type)

	    Lsof determines the processing to be applied to the
	    file structure successor, pointed at by f_data.

	The file offset (f_offset)

	    Lsof saves this *most* useful information for reporting
	    to the user.  It's particularly interesting, for example,
	    on character device tape files, because when it changes,
	    the tape user knows the tape is moving.

    From socket structures:

	Lsof sometimes uses the file structure alone to identify
	socket files.  At other times it also uses the socket stream
	head pointer.

	Lsof gets the socket structure pointer from f_data in file
	structures whose f_type is DTYPE_SOCKET.  (UNIX dialects
	whose socket implementations are stream based often don't
	have file structure types.)  It then reads the socket
	structure and uses these fields:

	    Socket type (so_type)

		Lsof uses the socket type -- e.g., SOCK_RAW -- for
		special processing of its structures.

	    Socket protocol (so_proto)
	
		Lsof reads the protocol domain switch, protosw,
		to get the domain structure, so it can find out
		to what domain -- e.g., AF_INET -- the socket
		belongs.

	    Socket receive buffer parameters (so_rcv)
	    Socket send buffer parameters (so_snd)

		Lsof reads these buffer parameters to be able to
		report file size or position.  For example, an
		advancing position on an ftp socket can be used
		as an indicator that the transfer is making progress.

	    Socket protocol control block address (so_pcb)

		Lsof gets TCP/IP addresses from the protocol control
		block.

	    Socket stream head (so_sth)

		Lsof sometimes uses the stream head pointer to
		learn information about the socket file.  It reads
		the stream module names, looking for IP, TCP and
		UDP modules and their private q_ptr addresses.

		If an IP q_ptr is located, lsof reads the ipcs_s
		structure to which it points to get TCP/IP connection
		addresses.

		Lsof also reads other TCP and UDP structures to
		get state and read/write buffer information that
		it can report to the lsof user.

    From the vnode:

	Lsof gets the vnode address from the f_data member of file
	structures.

	It saves the vnode address for use in extracting path
	name components from the kernel name cache.

	Reading the vnode:

	    Usually f_data values point to vnodes.  Lsof reads the
	    structure appropriate to the file type.

	Vnode type (derived from v_op)

	    Lsof must know the vnode type in order to be able to
	    read any further information about the open file from
	    the structures that follow the vnode.  It determines
	    the vnode type by comparing the v_op address to a
	    set of *_vnodeops switch addresses it gets from the
	    kernel via nlist().
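
	    The comparison can be sketched as a table lookup.
	    The table below is filled by hand; in practice the
	    addresses would come from nlist().  The symbol names,
	    the N_* type codes and the function name are all
	    assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long kaddr_t;	/* a kernel address (sketch only) */

/* Hypothetical node type codes. */
enum { N_UNKNOWN = 0, N_UFS, N_NFS, N_CDFS };

struct vop_entry {
	const char *sym;	/* e.g. "nfs_vnodeops" (illustrative) */
	kaddr_t addr;		/* address from nlist(), 0 if not found */
	int type;		/* node type to assume on a match */
};

/* Map a vnode's v_op address to a node type; N_UNKNOWN if no match. */
int
vnode_type_from_vop(const struct vop_entry *tab, size_t n, kaddr_t v_op)
{
	size_t i;

	assert(tab != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Skip symbols the kernel name list did not supply. */
		if (tab[i].addr != 0 && tab[i].addr == v_op)
			return tab[i].type;
	}
	return N_UNKNOWN;
}
```

	    An unmatched v_op yields N_UNKNOWN, which tells the
	    caller not to guess at the structure behind v_data.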

	Vnode locks (v_locklist)

	    If the vnode has a lock list pointer, lsof examines
	    the chain it addresses, looking for locks belonging
	    to the process owning the file.

	Vnode virtual file system structure (v_vfsp)

	    Lsof reads the virtual file system structure to
	    determine the file system on which the vnode resides.
	    The vfs structure yields the file system information --
	    mounted-on directory, mounted-on device, NFS device
	    number, etc.  This information is cached by vfs structure
	    address so vfs information for later vnodes can be
	    found quickly.

	Vnode successors (v_data)

	    Based on the vnode type it established from v_op, lsof
	    reads the vnode successor structures addressed by v_data.

	    AFS type vnodes lead to an AFS node structure; VxFS type
	    vnodes lead to a vxnode; CDFS nodes to a cdnode; NFS vnodes
	    to an rnode; etc.

	    There are many special cases -- e.g., FIFOs and pipes.
	    The most special are VCHR nodes.  They can lead to
	    snodes, and from there back to vnodes.  They can lead
	    to device nodes on a special file system -- e.g., /dev
	    nodes on a VxFS file system.  They can lead to simple
	    inodes.

	    Eventually lsof derives the following data from the
	    nodes it reads:

		Vnode address

		File device numbers -- cooked and raw

		File node number

		   Most node structures have a number, but lsof may
		   have to generate a node number from /dev information
		   for some character device files.

		File size

		Link count

		File type (v_type) (e.g., FIFO, VREG)

		    It's a good indicator of the purpose of the file.

	Streams (v_stream)

	    Lsof saves the device numbers for streams and checks
	    their module names for IP, TCP, and UDP.  If any of
	    those three appear, lsof processes the vnode stream as
	    a socket stream.
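
	    The module name check amounts to a scan like the one
	    sketched here.  The function names are assumptions, and
	    the case-insensitive match is a choice made for the
	    example rather than documented lsof behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive string equality. */
static int
same_name(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0') {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;	/* equal only if both strings ended */
}

/* Return 1 if any stream module is named IP, TCP or UDP, else 0. */
int
is_socket_stream(const char *const *mods, size_t n)
{
	static const char *const net[] = { "ip", "tcp", "udp" };
	size_t i, j;

	assert(mods != NULL || n == 0);
	for (i = 0; i < n; i++)
		for (j = 0; j < sizeof(net) / sizeof(net[0]); j++)
			if (same_name(mods[i], net[j]))
				return 1;
	return 0;
}
```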

    The kernel name cache:

	While device and inode numbers uniquely identify files,
	they aren't easy for people who read lsof output to use.
	Lsof output readers want path names.

	Lsof tries to accommodate that by reading the contents of
	the kernel name cache, keyed by vnode addresses, and matching
	that to vnode addresses accumulated while reading open file
	structures and vnodes.

	Since the kernel cache for a particular vnode also points
	to the vnode of its parent, lsof is often successful in
	constructing an entire path name from the kernel cache.
	Sometimes it can't construct the entire name, only some
	trailing components, but even that is helpful.
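
	The walk from a vnode through its parents can be sketched
	as follows.  The nc_entry layout, the function names and
	the use of a known root vnode address to decide whether the
	path is complete are all assumptions made for the example;
	the real cache layout differs from dialect to dialect.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long kaddr_t;	/* a kernel address (sketch only) */

struct nc_entry {		/* hypothetical local copy of a cache entry */
	kaddr_t vp;		/* vnode this entry names */
	kaddr_t parent;		/* vnode of the parent directory */
	const char *name;	/* one path name component */
};

static const struct nc_entry *
nc_find(const struct nc_entry *tab, size_t n, kaddr_t vp)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tab[i].vp == vp)
			return &tab[i];
	return NULL;
}

/*
 * Build the path for vp right to left in buf.  Sets *full to 1 if the
 * walk reached root_vp, 0 if only trailing components were found.
 * Returns a pointer into buf at the start of the path.
 */
char *
nc_path(const struct nc_entry *tab, size_t n, kaddr_t vp, kaddr_t root_vp,
    char *buf, size_t len, int *full)
{
	const struct nc_entry *e;
	char *p;
	size_t depth, l;

	assert(buf != NULL && len >= 2 && full != NULL);
	p = buf + len - 1;
	*p = '\0';
	*full = 0;
	/* At most n components; the bound also stops a looping cache. */
	for (depth = 0; depth <= n; depth++) {
		if (vp == root_vp) {
			*full = 1;
			break;
		}
		if ((e = nc_find(tab, n, vp)) == NULL)
			break;			/* parent not cached */
		l = strlen(e->name);
		if ((size_t)(p - buf) < l + 1)
			break;			/* out of buffer space */
		p -= l;
		memcpy(p, e->name, l);
		*--p = '/';
		vp = e->parent;
	}
	if (*full && *p == '\0')
		*--p = '/';			/* vp was the root itself */
	else if (!*full && *p == '/')
		p++;				/* partial: no leading slash */
	return p;
}
```

	With entries for usr, bin and ls chained to the root, the
	sketch yields "/usr/bin/ls".  If the usr entry is missing,
	it yields the partial name "bin/ls" and clears *full.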

    The mount table:

	Lsof reads the mount table via the setmntent()/getmntent()
	interface and stores the information for decoding file
	system residence, especially when printing open file results.
	The mounted-on directory of the file system, for example,
	supplies the first component of a full path name, even
	when lsof finds all other path name components in the
	kernel name cache.
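
	Reading the table can be sketched as below, assuming a
	system that provides <mntent.h> in the Linux style.  The
	mnt_info structure and load_mounts() are names invented for
	the example, and the table path is a parameter because its
	location varies from dialect to dialect.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

struct mnt_info {		/* hypothetical local copy of one entry */
	char dir[256];		/* mounted-on directory */
	char fsname[256];	/* mounted-on device */
};

/*
 * Read up to max entries from the mount table file named by table.
 * Returns the number of entries stored, or -1 if it can't be opened.
 */
int
load_mounts(const char *table, struct mnt_info *out, int max)
{
	FILE *fp;
	struct mntent *m;
	int n = 0;

	assert(table != NULL && out != NULL);
	if ((fp = setmntent(table, "r")) == NULL)
		return -1;
	while (n < max && (m = getmntent(fp)) != NULL) {
		/* Copy now: getmntent() reuses its static storage. */
		(void) snprintf(out[n].dir, sizeof(out[n].dir),
		    "%s", m->mnt_dir);
		(void) snprintf(out[n].fsname, sizeof(out[n].fsname),
		    "%s", m->mnt_fsname);
		n++;
	}
	(void) endmntent(fp);
	return n;
}
```

	The copies matter because the entry getmntent() returns is
	overwritten by the next call.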

    The device table:

	Lsof reads the device table (/dev) using opendir(), readdir(),
	and stat() calls, so it can report full device names on
	VCHR files.

	Since the above operations are sometimes time-consuming
	and the information seldom changes, lsof caches the
	information and tries to get it first from a cache file,
	located in the user's home directory.  (The cache file may
	be stored globally, but it's much more restricted [for
	security reasons] in that case.)
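
	The scan itself can be sketched with the calls named above.
	This version looks through a single directory for a
	character device with a given raw device number; it does
	not descend into subdirectories and does no caching.  The
	function name chr_dev_name() is an assumption made for the
	example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Find a character device in dir whose st_rdev equals rdev.
 * Returns 0 and stores its path in out, or -1 if none is found.
 */
int
chr_dev_name(const char *dir, dev_t rdev, char *out, size_t len)
{
	DIR *dp;
	struct dirent *de;
	struct stat sb;
	char path[1024];
	int n, rc = -1;

	assert(dir != NULL && out != NULL && len > 0);
	if ((dp = opendir(dir)) == NULL)
		return -1;
	while (rc != 0 && (de = readdir(dp)) != NULL) {
		n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;		/* name too long */
		/* stat() follows symbolic links; dangling ones fail. */
		if (stat(path, &sb) != 0 || !S_ISCHR(sb.st_mode))
			continue;
		if (sb.st_rdev == rdev) {
			(void) snprintf(out, len, "%s", path);
			rc = 0;
		}
	}
	(void) closedir(dp);
	return rc;
}
```

	Because stat() follows symbolic links, the path returned
	may be a link to the device rather than the device node
	itself.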


    Vic Abell <abe@purdue.edu>
    March 15, 1999