/* This program carries out a simple calculation. The programs after
   this will begin to do more substantial heavy lifting.

   Specifically, this program considers two samples of numbers:

      Sample I  (Xs):  9.19 9.54 8.65 7.31 8.47 9.78
      Sample II (Ys):  8.73 8.17 6.40 6.31 7.09 7.99 5.89 6.38 8.24

   and carries out the classical two-sample t-test for the hypothesis

      H0: E(X)=E(Y)

   The two non-comment lines below tell the C compiler to include
   standard C ``header'' files that have prototypes for standard
   functions. #include <stdio.h> is needed for printf().
   #include <math.h> is needed to introduce common math functions like
   sine (sin(x)), cosine (cos(x)), log (log(x) for base e, log10(x) for
   base 10), sqrt (square root), and many others. Of these, we use
   sqrt(x) and fabs(x), the latter of which gives the absolute value of
   a floating-point number.  */

#include <stdio.h>
#include <math.h>

/* Variables in C:

   C has many different types of variables, and also has ways to create
   additional variable types of your own. This program uses integer
   variables for integer values and `doubles' for real numbers that
   might have fractional parts. `Double' is short for `double-precision
   floating-point' number. These are the most common floating-point
   variables in C.

   Variables in C must be ``declared'' before they are used, so that the
   compiler will know what kind of code to generate for them. They can
   also be ``initialized'' (set equal to a starting value) at the same
   time. Two typical variable declarations are

      int i;
      double ff;

   which tell the compiler that `i' will be an integer variable and that
   `ff' is a double (that is, a double-precision floating-point
   variable).

   The syntax of variable names in C is that their names can be
   arbitrary strings of letters (a-z, A-Z), digits (0-9), and the
   underscore (_), such that a digit does not come first. For example,
   x, ytop, xval, Xval, x2, x22, and f2g37zz are legal variable names.
   21xtop is not. Case is significant, so that xval, Xval, and XVAL are
   three different variables.

   Examples of declarations with initializations are

      int i=5;
      double ff=37.1371;

   These mean that the integer variable i starts out with the value 5
   and ff starts out with the value 37.1371.

   In C, the statement i=5 (or x=y) means that the value on the right
   (5 or y) is stored in the variable on the left (i or x). In contrast,
   i==5 and x==y are logical statements, as in

      if (i==5) x=y;

   Here the statement `x=y' is only carried out if i==5 beforehand. The
   value of i (whether it was equal to 5 or not) is unchanged. Confusing
   the two is a common cause of programming errors, but most modern C
   compilers will warn about a likely mix-up if warnings are enabled.

   An ARRAY of variables is a list of variables that are kept together
   to make them easier to work with. The declarations

      int ii[5];
      double gg[10];

   define ``ii'' as a collection of 5 integer-value variables and ``gg''
   as a collection of 10 doubles. An oddity of C is that array indices
   begin with 0 and not 1. That is, the first integer in the array `ii'
   is ii[0], the second is ii[1], and so forth.

   Examples of declarations of arrays with initializations are

      int ii[5] = { 2, 12, 151, -1, -2 };
      double gg[10] = { 1.0, 2, 5.1, 37.1, 22 };

   After this initialization, the 5 integers in int ii[5]; (with their
   starting values) are

      ii[0]=2, ii[1]=12, ii[2]=151, ii[3]=-1, and ii[4]=-2

   The array gg[] has only 5 initial values for its 10 entries, so that
   gg[0] through gg[4] take the listed values and gg[5] through gg[9]
   are set to zero.

   In particular, even though the array ii[] is of length 5, the value
   ii[5] is not defined, since it points to one integer beyond the end
   of the array ii[]. Referring to a value outside of an array (like
   `ii[5]=7;') can cause a program to crash with what is called a
   ``segmentation fault''. These are caused by references to memory
   locations that have not been allocated to the program by the
   operating system and thus may not exist (or else may exist but
   belong to some other program).

   The first of the two samples of data is defined by  */

int mm=6;
double xx[20] = { 9.19, 9.54, 8.65, 7.31, 8.47, 9.78 };

/* These statements define and initialize an integer mm, which is the
   number of values in the first sample, and the first sample itself,
   stored in the array xx[]. For simplicity, we declare more memory for
   xx[] than we will actually need.

   The second sample size (nn) and the second sample (yy[]) are  */

int nn=9;
double yy[20] = { 8.73, 8.17, 6.40, 6.31, 7.09, 7.99, 5.89, 6.38, 8.24 };

/* The main() (starting) function of this program is:  */

int main(void)
{
     /* main() begins with declarations of two integer variables:  */
  int i, degfree;

     /* and follows with several declarations of doubles, most of
        which will be scratch variables in the computations to
        follow:  */
  double sum1,sum2, xmean,ymean, ss1,ss2, xss,yss;
  double hmn,pooledss,tt;

     /* This is a declaration of a double with an initial value
        that we will need below.  */
  double pval=0.00985;

     /* By C convention, adjacent literal text strings
        "xxxxx" "yyyy" are combined to form one text string
        "xxxxxyyyy", even if the two text strings are on different
        lines. Thus we can print (display) the following lines with
        one printf statement. We do it this way to make the source
        file (this program) easier to read.  */
  printf (
    "\nThis is the program TWOSAMP written by XXXXX.\n"
    "We want to test the hypothesis H_0:mu_X=mu_Y using data from\n"
    "  the two samples:\n\n");

     /* We now display the two samples before carrying out the test,
        so that someone running the program will know what two samples
        we were talking about.

        The first argument in a printf() statement with variables is
        the `format string'. In the format string, `%d' means that
        printf() expects an integer value (here mm) at this point in
        the argument list of printf() following the format string, and
        will substitute the value of that integer variable for %d.
        `%f' and `%g' mean doubles. `\n' is the EOL character (for
        end-of-line), which starts a new line.

        The `for' loop `for(i=0; i< ...

    ... >= %g) = %g (two-sided)\n\n", degfree, fabs(tt), pval);
  printf ("so that we reject H_0 and conclude mu_X != mu_Y.\n");

     /* In this example, we found `pval' separately and stored its
        value in the double variable `pval'. In most of this course we
        will use C computer code that computes Student-t P-values
        directly without having to refer to tables or other computers.

        INTEGER DIVISION:

        In C, ``integer division'' is quite different from
        ``floating-point division''. If you divide two integers, you
        get another integer with the remainder thrown away, so that

           3/3=1, 4/3=1, 5/3=1, 6/3=2, 7/3=2, etc.

        This may look strange, but is useful for computing array
        indices and other purposes. Floating-point division works as
        expected, so that in contrast

           3.0/3=1.0, 4.0/3=1.33333..., 5.0/3=1.66666..., etc.

        If either value in a binary expression (like x*y or x/y) is a
        floating-point value, the other value is ``promoted'' to a
        floating-point value and C uses floating-point operations.

        By C convention, a literal number with a decimal point is a
        floating-point constant (like 3.0 or 3.) and a number without
        a decimal point (like 3 or 33) is an integer. Integers in
        expressions with floats are first automatically converted to
        floats, so that 3.0/2 = 3.0/2.0 = 1.50 (a floating-point
        value).

        In particular, if we had said `hmn=(1/mm)+(1/nn);' above,
        then we would have gotten `hmn=0.0', since the compiler would
        have assumed that we meant integer division on the right-hand
        side of the equation, and `x=0' for floating point x and
        integer 0 means that x=0.0 (i.e., floating point 0).

        The program has now done its task, and is free to go home.  */
  return 0;
}