An Lsof API
This lsof API definition is based on obtaining the information
lsof currently delivers in its /dev/kmem configuration. There
may be better ways to get the necessary information, and I'll
try to indicate where options are possible.
I've not tried to determine if any current API already provides
any of the following information.
Lsof needs information on processes and their open files, so
I'll divide the API discussion into those two sections, and
add a section on how lsof might be provided path name information.
Process Information
===================
For each process lsof needs the following information:
The name of the command executing for the process.
The process PID.
The process parent PID.
The process group ID.
The UID of the owner of the process.
All open file information on:
The current working directory
The root directory
The executing text file
Share libraries
Any memory-mapped files
Open File Information
=====================
For each open file, lsof needs this information:
File table address -- this information is used for
checking for file descriptor leaks, for example.
Any sort of unique ID that would enable that check
would suffice.
Current file position
Number of open instances of the file
File size
Link count
Device (raw and cooked) major and minor numbers
File system identification -- mounted-on directory
and mounted-on device name
Open state -- read, write, or read/write
File type -- e.g., regular, directory, fifo, pipe,
socket, etc.
Vnode address -- this information is used in two ways:
1) to identify files shared by processes; and 2) to
locate path name components in the kernel name cache.
Sometimes the vnode has a "capability" ID that connects
it to the cache entry. Lsof might need that.
A unique identifier could satisfy the first need.
A full path name could satisfy the second.
Node number -- CD node number, NFS node number, inode
number, etc.
Full path names or a way to get path names from the
kernel's name cache
For sockets:
Protocol, including AF_*, PF_*, and IPPROTO_*
values, IPv4 or IPv6, etc.
Local and remote IP (v4 or v6) addresses and ports
Unix domain information -- peer ID, file system
object (path name, device, node number)
For pipes:
Pipe ID
Peer ID
Pipe status -- read or write, amount in the pipe, etc.
Path Names
Lsof currently reports path names by using vnode addresses to
assemble their components from the kernel name cache.
I see two options here: 1) report full path names directly
in the API; or 2) provide an API that delivers the kernel
name cache data to lsof that it can connect to open files
-- e.g., via vnode address, or (device,node_number) doublet.
Here's what lsof extracts from the kernel name cache that
an API would need to supply:
A connection between open files and cache entries, such
a vnode address or a (device,node_number) doublet
The path entry name component -- e.g., the "stat.h"
part of "/usr/include/sys/stat.h".
A connection from the cache entry to its immediate
directory parent -- e.g., from stat.h to sys.
In this cache this is usually done with a vnode
pointer.
Any unique ID (i.e., a "capability" ID from the vnode)
that might distinguish the cache entry.
Vic Abell <abe@purdue.edu>
March 18, 1999