**********************************************************; * Two groups of rats are given two different protective treatments and * then exposed to a carcinogen, DMBA. Survival times (actually, times * until cancer diagnosis) are given in days. There were no observed * failures (cancer diagnoses) until 142 days after exposure, then * about half of the rats were diagnosed in the next 83 days. A lag * or threshold time of 100 days was assumed before fitting a Weibull * distribution. * * The main question is whether the two groups differ significantly in * their survival times. A second question is the Weibull shape * parameter describing the survival times. * * It will turn out that the survival times differ significantly * (P=0.0455) assuming a Weibull distribution but are not significantly * different (P=0.5935 or P=0.5926) if exponential survival times are * assumed. In general, guessing the shape of the survival time * distribution incorrectly can have a very strong effect on the * P-values for significance of covariates. * * The two groups are called `Agroup' `Bgroup' rather than Group 1 and * Group 2 to make the surival plots clearer. * * Data from Example 6.2, p142, in E.T.Lee & J.W.Wang, ``Statistical * Methods for Survival Data Analysis'', 3rd edn. **********************************************************; title 'TWO GROUPS OF RATS EXPOSED TO A CARCINOGEN'; options ls=75 ps=60 pageno=1 nocenter; data rats; retain group; input zz$ @@; lagtime=100; * Lag time; if substr(zz,2,6)='group' then group=zz; else do; days=input(zz,12.0); status=(days<0); days=abs(days) - lagtime; output; end; datalines; Agroup 143 164 188 188 190 192 206 209 213 216 220 227 230 234 246 265 304 -216 -244 Bgroup 142 156 173 198 205 232 232 233 233 233 233 239 240 261 280 280 296 296 323 -204 -344 run; proc print; title2 'The data as SAS sees it'; run; **********************************************************; * Draw `ls' and `lls' plots for the two groups to check visually for * fit to Exponential and Weibull distributions, and to check if the * LogRank and Wilcoxon tests are significant. * * Note that the `-2log(LR)' P-value is very different from the LogRank * and Wilcoxon P-values. This is consistent with the survival-time * distribution being far from exponential. **********************************************************; options ps=35; * For neater-looking lineprinter plots; proc lifetest plots=(s,ls,lls) notable lineprinter; title2 'S, LS, and LLS PLOTS AND NONPARAMETRIC TESTS'; strata group; time days*status(1); run; options ps=60; * Restore page size; **********************************************************; * Carry out a Weibull AFT model with lagtime=100. The Weibull * distribution is * * Prob(Y>=t) = S(t) = exp(-(lambda t)^alpha ) * * which is equivalent to (see the comments in lr1samp.sas) * * log Y = log(1/lambda) + (1/alpha)log(Exp(1)) * * where P(log(Exp(1))>=y) = exp(-exp(y)) has what is called an * ``extreme-value distribution''. * * Given a covariate or covariates X, the AFT (for ``Accelerated Failure * Time'') model for the Weibull distribution is * * (*) log Y_X = log(1/lambda) + beta X + (1/alpha)log(Exp(1)) * * exactly as in a classical regression, except that here the ``error * distribution'' log(Exp(1)) has an extreme-value rather than a normal * distribution. If beta>0 and X is increased, then Y_X (the failure * time) also increases. * * The effect of X in (*) is to change lambda-->lambda exp(-beta X), or * * S_X(t) = Prob(Y_X>=t) = exp(-[lambda exp(-beta X)t]^alpha ) * * The hazard rate for S_X(t) is * * (**) h_X(t) = alpha[lambda exp(-beta X)]^alpha t^{alpha-1} * * If beta>0, then increasing X decreases the hazard rate and thus * increases the lifetime, which is consistent with (*). * * The SAS AFT procedure `proc lifereg' with / dist=Weibull uses maximum * likelihood to estimate the parameters in the regression * * (***) log(Y) = INTERCEPT + BETA X + SCALE log(Exp(1)) * * As in lr1samp.sas, the fitted parameters INTERCEPT and SCALE can be * translated back to the Weibull parameters alpha and lambda by * * alpha=1/SCALE, lambda=exp(-INTERCEPT) * * SAS's `proc lifereg' output has `Weibull Scale' = 1/lambda * = exp(INTERCEPT) (as opposed to `Scale') instead of lambda. Lambda * has the units of rate, whereas `Weibull scale' has the units of time. * * The covariates X in (***) take on two values for the two different * samples. Thus the two-sample Weibull AFT model treats `log(days)' * like a two-sample t-test with `group' as a class variable, except * that (i) errors are extreme-value distributed rather than normally * distributed and (ii) parameters are estimated by maximum likelihood. * * The two groups of rats are assumed to have the same Weibull shape * parameter alpha, so that testing H_0:beta=0 is equivalent to * testing whether the two groups have the same Survival-time * distribution. * * The `lls' plots in proc lifetest suggest that the survival times of * both groups of rats have the same Weibull shape parameter alpha, * so that the Weibull AFT model is reasonable. We also carry out * a likelihood ratio test for the hypothesis that the two groups * have the same Weibull shape parameter. This test is nowhere near * being significant, which is also consistent with the same alpha * for the two groups. * * The default in `proc lifereg' is / dist=Weibull, so that this option * could have been left out. * **********************************************************; proc lifereg; title2 'FITTING A WEIBULL AFT MODEL TO BOTH GROUPS TOGETHER'; class group; model days*status(1) = group / dist=Weibull; run; **********************************************************; * Are the Weibull shape parameters the same between the two groups? * This can be determined by fitting different Weibull distributions to * the two groups and then carrying out a likelihood ratio test by hand * by comparing the maximum log likelihoods. * * Specifically, this can be done by (i) fitting independent Weibull models * within each group (`by group' below), (ii) adding the two fitted * maximum likelihood values for the two groups (from the lines * `Log Likelihood for WEIBULL' in the output), (iii) subtracting the * fitted maximum likelihood for the AFT Weibull model above, and * (iv) comparing the result with a chi-square distribution with one * degree of freedom. * * Note that the stratified model (i.e., independent Weibull distributions * in each group) has 4 parameters (two intercepts, two alphas) while * the Weibull AFT model has 3 parameters (intercept, beta, one alpha). * Thus the stratified model has one more parameter, so that the test * statistic is asymptotically chi-square with one degree of freedom. * * In this case, the difference (ii)-(iii) is 0.161, so that twice the log * likelihood difference is 0.322, which leads to P=0.5704 for a * chi-square distribution with one degree of freedom. In other words, * there is no evidence that the two groups of rats have different * Weibull shape parameters. **********************************************************; proc lifereg; title2 'FITTING WEIBULL DISTRIBUTIONS WITH TWO DIFFERENT SHAPE'; title3 ' PARAMETERS TO THE TWO GROUPS SEPARATELY'; by group; model days*status(1) = / dist=Weibull; run; **********************************************************; * Note that the two groups are (barely) significantly different using the * Weibull AFT model, whereas they were not significantly different for * the log rank and Wilcoxon tests (again, barely). * * For comparison, let's also try an exponential distribution. Comparing * log likelihoods will show that a Weibull distribution (with the same * shape parameter for the two groups) fits the data significantly better * than the exponential. Note that the effect of `group' is not significant * in the exponential model, and that the P-value in the exponential model * for GROUP is almost exactly the same as that for the `-2log(LR)' test * in `proc lifetest'. **********************************************************; proc lifereg; class group; title2 'FITTING AN EXPONENTIALLY DISTRIBUTED AFT MODEL'; model days*status(1) = group / dist=exponential; run;