USING THE STATS PROGRAM

The easiest way to use the program is to dedicate a directory to hold all
your variable files. (A variable file is a file of real numbers.) Then run
the program by typing "/public/bin/stats" in an xterm window. If /public/bin
is in your path (you can check with "echo $PATH"), you can run the program
by typing just "stats".

***If you run the program in the directory containing your variable files,
the command "vars" will list the correct files. (This behavior can be
changed by putting an edited copy of statsrc in the directory from which you
run the program.)

In the sample below, I temporarily made a directory named "junk" to hold
some files and ran the program in that directory.

A sample session:
_________________________________________________________________________
linux63:friedman/junk> /public/bin/stats

*******************WELCOME TO STATS*******************
*****(All programming and code by Chas. Friedman)*****
******************************************************

Type ? to see a list of commands.
cmd>vars
data1  data2  data2-2  data3  data3.bak  data4  data5  data6  data7  data8

VARIABLES LOADED:

cmd>load data5
cmd>count data5
number of data values = 1000  [variable: data5]
min = 0.002533  max = 0.997375
cmd>fcount data5
FREQUENCY COUNT FOR data5
number of data values in [0.002533, 0.102017] = 115  [freq = 0.115000]
number of data values in (0.102017, 0.201501] = 100  [freq = 0.100000]
number of data values in (0.201501, 0.300986] = 92   [freq = 0.092000]
number of data values in (0.300986, 0.400470] = 85   [freq = 0.085000]
number of data values in (0.400470, 0.499954] = 102  [freq = 0.102000]
number of data values in (0.499954, 0.599438] = 103  [freq = 0.103000]
number of data values in (0.599438, 0.698922] = 111  [freq = 0.111000]
number of data values in (0.698922, 0.798407] = 116  [freq = 0.116000]
number of data values in (0.798407, 0.897891] = 93   [freq = 0.093000]
number of data values in (0.897891, 0.997375] = 82   [freq = 0.082000]
cmd>mean data5
mean = 0.494395  [variable: data5]
A confidence interval for the mean with confidence coeff 1-a is given by
0.494395 +- z_{a/2}(0.009059) where z_{a/2} = xval_n 1-a/2.
A 90 % confidence interval is: 0.494395 +- 0.014893
A 95 % confidence interval is: 0.494395 +- 0.017747
A 99 % confidence interval is: 0.494395 +- 0.023327
cmd>f_n 2
f_n(2.0000) = 0.9772
cmd>xval_n .9772
xval_n(p=0.977200)=1.999000
cmd>quit
linux63:friedman/junk>
_________________________________________________________________________

The following is the output of the ? command:

The following are implemented commands:

?                     for help
!! command            to execute shell command
sh                    to start a shell
edit                  start editor (on file if given)
vars                  to see directory of variables for stats program
load variable         to load a variable file into the program
unload variable       to unload a variable from the program
mean variable         find mean of (loaded) variable
                      Also computes some *confidence intervals*
means var1 var2       find means and sample stdev of var1, var2
                      and some *confidence intervals* for the diff
                      of the means
var variable          find variance and st.dev. of (loaded) variable
svar variable         find sample variance and sample st.dev.
                      of (loaded) variable
                      Also computes some *confidence intervals*
count variable        for count of (loaded) variable
count a b variable    for count of (loaded) variable in (a,b]
fcount variable       for frequency count of (loaded) variable
nth variable n        exhibit entry n of (loaded) variable
fact n                to compute n!
perm n r              to compute (n)(n-1)...(n-r+1)  [permutations]
comb n r              to compute (n)(n-1)...(n-r+1)/r!  [combinations]
bin x n p             to compute the pf of the binomial distribution
                      [pf: comb(n,x)p^x(1-p)^{n-x}]
f_bin x n p           to compute the cdf of the binomial distribution
                      [the sum of bin i n p for i=0,...,x]
negbin x r p          to compute the pf of the negative binomial dist.
                      [pf: comb(x-1,r-1)p^r(1-p)^{x-r};
                      when r=1 this is the geometric distribution]
f_negbin x r p        to compute the cdf of the negative binomial dist.
                      [the sum of negbin i r p for i=r,...,x]
hgeom x r N n         to compute the pf of the hypergeometric dist.
                      [pf: comb(r,x)comb(N-r,n-x)/comb(N,n)]
f_hgeom x r N n       to compute the cdf of the hypergeometric dist.
                      [the sum of hgeom i r N n for i=0,...,x]
poisson x l           to compute the pf of the Poisson distribution
                      [pf: e^{-l}l^x/x!]
f_poisson x l         to compute the cdf of the Poisson distribution
                      [the sum of poisson i l for i=0,...,x]
gamma x               to compute gamma function of x.
                      [gamma(k)=(k-1)!]
f_gamma x l r         to compute cdf of gamma distribution
                      [pdf: (l^r/gamma(r))x^{r-1}e^{-lx}]
f_chisqr x k          to compute cdf of chisqr distribution
                      [pdf: ((1/2)^{k/2}/gamma(k/2))x^{k/2-1}e^{-x/2}]
f_n x                 to compute the cumulative normal distribution of x
                      [standard pdf - mean 0, variance 1]
xval_n p              to compute 1st x for which f_n(x)=p
xval_t p k            to compute 1st x for which f_t(x,k)=p
xval_f p m n          to compute 1st x for which f_f(x,m,n)=p
xval_chisqr p k       to compute 1st x for which f_chisqr(x,k)=p
xval_bin P n p        to compute 1st x for which f_bin(x,n,p) >= P
xval_negbin P r p     to compute 1st x for which f_negbin(x,r,p) >= P
xval_hgeom P r N n    to compute 1st x for which f_hgeom(x,r,N,n) >= P
xval_poisson P l      to compute 1st x for which f_poisson(x,l) >= P
f_f x m n             to compute cdf of F distribution
                      [pdf: (G((m+n)/2)/(G(m/2)G(n/2)))(m/n)^{m/2}
                      x^{(m-2)/2}(1+mx/n)^{-(m+n)/2} where G=gamma]
f_t x k               to compute cdf of t distribution
                      [pdf: (G((k+1)/2)/(G(k/2)sqrt(k*pi)))
                      (1+x^2/k)^{-(k+1)/2} where G=gamma]
regress               to do a linear least squares regression and fit
                      var_0 = b_0 + b_1(var_1) + ... + b_n(var_n)
anova1                to do a one way anova
chisqr var1 var2      to compute the chisquare statistic for goodness
                      of fit to a multinomial distribution.
                      var1 should be a file of nonnegative integers ni,
                      var2 should be a file containing probabilities pi.
                      The chisquare statistic is the sum of the terms
                      (ni-n*pi)^2/(n*pi) where n is the sum of the ni.
quit                  to exit the program
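
The confidence intervals printed by the "mean" command in the sample session, and the statistic described under "chisqr", can be reproduced outside the program. What follows is only a minimal Python sketch, not the program's own code. The function names are hypothetical, the normal quantile comes from Python's standard library rather than from xval_n, and dividing the sample st.dev. by sqrt(n) to get the standard error is an assumption (the help text does not say which st.dev. the program uses).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_half_width(std_error, confidence):
    """Half-width z_{a/2} * std_error of the interval, with a = 1 - confidence."""
    a = 1.0 - confidence
    # inv_cdf(1 - a/2) plays the role of "xval_n 1-a/2" in the stats program.
    z = NormalDist().inv_cdf(1.0 - a / 2.0)
    return z * std_error


def mean_confidence_interval(values, confidence):
    """Return (mean, half_width) for a list of real numbers.

    Assumption: standard error = sample st.dev. / sqrt(n).
    """
    m = mean(values)
    se = stdev(values) / sqrt(len(values))
    return m, z_half_width(se, confidence)


def chisquare_statistic(counts, probs):
    """Sum of (ni - n*pi)^2 / (n*pi), where n is the sum of the counts ni."""
    n = sum(counts)
    return sum((ni - n * pi) ** 2 / (n * pi) for ni, pi in zip(counts, probs))


if __name__ == "__main__":
    # 0.494395 and 0.009059 are the mean and standard error from the session.
    for conf in (0.90, 0.95, 0.99):
        print(f"{conf:.0%}: 0.494395 +- {z_half_width(0.009059, conf):.6f}")
    # Hypothetical counts for 60 rolls of a fair die (illustration only).
    print(chisquare_statistic([8, 12, 10, 9, 11, 10], [1 / 6] * 6))
```

The half-widths computed this way agree with those in the session to roughly four decimal places. Small differences are expected, since the program's xval_n appears to be approximate (in the session, "xval_n .9772" returns 1.999 rather than 2).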