### source-git / iproute

 Packit Service 3880ab Notes about distribution tables from Nistnet Packit Service 3880ab ------------------------------------------------------------------------------- Packit Service 3880ab I. About the distribution tables Packit Service 3880ab Packit Service 3880ab The table used for "synthesizing" the distribution is essentially a scaled, Packit Service 3880ab translated, inverse to the cumulative distribution function. Packit Service 3880ab Packit Service 3880ab Here's how to think about it: Let F() be the cumulative distribution Packit Service 3880ab function for a probability distribution X. We'll assume we've scaled Packit Service 3880ab things so that X has mean 0 and standard deviation 1, though that's not Packit Service 3880ab so important here. Then: Packit Service 3880ab Packit Service 3880ab  F(x) = P(X <= x) = \int_{-inf}^x f Packit Service 3880ab Packit Service 3880ab where f is the probability density function. Packit Service 3880ab Packit Service 3880ab F is monotonically increasing, so has an inverse function G, with range Packit Service 3880ab 0 to 1. Here, G(t) = the x such that P(X <= x) = t. (In general, G may Packit Service 3880ab have singularities if X has point masses, i.e., points x such that Packit Service 3880ab P(X = x) > 0.) Packit Service 3880ab Packit Service 3880ab Now we create a tabular representation of G as follows: Choose some table Packit Service 3880ab size N, and for the ith entry, put in G(i/N). Let's call this table T. Packit Service 3880ab Packit Service 3880ab The claim now is, I can create a (discrete) random variable Y whose Packit Service 3880ab distribution has the same approximate "shape" as X, simply by letting Packit Service 3880ab Y = T(U), where U is a discrete uniform random variable with range 1 to N. Packit Service 3880ab To see this, it's enough to show that Y's cumulative distribution function, Packit Service 3880ab (let's call it H), is a discrete approximation to F. But Packit Service 3880ab Packit Service 3880ab  H(x) = P(Y <= x) Packit Service 3880ab  = (# of entries in T <= x) / N -- as Y chosen uniformly from T Packit Service 3880ab  = i/N, where i is the largest integer such that G(i/N) <= x Packit Service 3880ab  = i/N, where i is the largest integer such that i/N <= F(x) Packit Service 3880ab  -- since G and F are inverse functions (and F is Packit Service 3880ab  increasing) Packit Service 3880ab  = floor(N*F(x))/N Packit Service 3880ab Packit Service 3880ab as desired. Packit Service 3880ab Packit Service 3880ab II. How to create distribution tables (in theory) Packit Service 3880ab Packit Service 3880ab How can we create this table in practice? In some cases, F may have a Packit Service 3880ab simple expression which allows evaluating its inverse directly. The Packit Service 3880ab pareto distribution is one example of this. In other cases, and Packit Service 3880ab especially for matching an experimentally observed distribution, it's Packit Service 3880ab easiest simply to create a table for F and "invert" it. Here, we give Packit Service 3880ab a concrete example, namely how the new "experimental" distribution was Packit Service 3880ab created. Packit Service 3880ab Packit Service 3880ab 1. Collect enough data points to characterize the distribution. Here, I Packit Service 3880ab collected 25,000 "ping" roundtrip times to a "distant" point (time.nist.gov). Packit Service 3880ab That's far more data than is really necessary, but it was fairly painless to Packit Service 3880ab collect it, so... Packit Service 3880ab Packit Service 3880ab 2. Normalize the data so that it has mean 0 and standard deviation 1. Packit Service 3880ab Packit Service 3880ab 3. Determine the cumulative distribution. The code I wrote creates a table Packit Service 3880ab covering the range -10 to +10, with granularity .00005. Obviously, this Packit Service 3880ab is absurdly over-precise, but since it's a one-time only computation, I Packit Service 3880ab figured it hardly mattered. Packit Service 3880ab Packit Service 3880ab 4. Invert the table: for each table entry F(x) = y, make the y*TABLESIZE Packit Service 3880ab (here, 4096) entry be x*TABLEFACTOR (here, 8192). This creates a table Packit Service 3880ab for the ("normalized") inverse of size TABLESIZE, covering its domain 0 Packit Service 3880ab to 1 with granularity 1/TABLESIZE. Note that even with the granularity Packit Service 3880ab used in creating the table for F, it's possible not all the entries in Packit Service 3880ab the table for G will be filled in. So, make a pass through the Packit Service 3880ab inverse's table, filling in any missing entries by linear interpolation. Packit Service 3880ab Packit Service 3880ab III. How to create distribution tables (in practice) Packit Service 3880ab Packit Service 3880ab If you want to do all this yourself, I've provided several tools to help: Packit Service 3880ab Packit Service 3880ab 1. maketable does the steps 2-4 above, and then generates the appropriate Packit Service 3880ab header file. So if you have your own time distribution, you can generate Packit Service 3880ab the header simply by: Packit Service 3880ab Packit Service 3880ab  maketable < time.values > header.h Packit Service 3880ab Packit Service 3880ab 2. As explained in the other README file, the somewhat sleazy way I have Packit Service 3880ab of generating correlated values needs correction. You can generate your Packit Service 3880ab own correction tables by compiling makesigtable and makemutable with Packit Service 3880ab your header file. Check the Makefile to see how this is done. Packit Service 3880ab Packit Service 3880ab 3. Warning: maketable, makesigtable and especially makemutable do Packit Service 3880ab enormous amounts of floating point arithmetic. Don't try running Packit Service 3880ab these on an old 486. (NIST Net itself will run fine on such a Packit Service 3880ab system, since in operation, it just needs to do a few simple integral Packit Service 3880ab calculations. But getting there takes some work.) Packit Service 3880ab Packit Service 3880ab 4. The tables produced are all normalized for mean 0 and standard Packit Service 3880ab deviation 1. How do you know what values to use for real? Here, I've Packit Service 3880ab provided a simple "stats" utility. Give it a series of floating point Packit Service 3880ab values, and it will return their mean (mu), standard deviation (sigma), Packit Service 3880ab and correlation coefficient (rho). You can then plug these values Packit Service 3880ab directly into NIST Net.