BOSTON COLLEGE
LYNCH SCHOOL OF EDUCATION

ED/PY 668 - Multivariate Statistical Analysis
Spring 2001
 

Prof. Larry H. Ludlow

TEXTS:

TEXTS AS NEEDED:

COMPUTING REFERENCES:

COURSE DESCRIPTION:

This course is a continuation of ED/PY 667. It is assumed that you are familiar with: a) the General Linear Model approach to data analysis and variance partitioning (specifically with respect to simple and multiple regression models), b) diagnostic analyses of residuals, c) statistical effects due to multicollinearity, d) purposes and advantages of different categorical coding strategies, e) interpretive consequences due to different variable selection procedures (i.e., forced selection, whether as single variables or in blocks, versus algorithmic selection: forward, backward, stepwise), and f) the distinctions between confirmatory and exploratory models and their interpretations. Furthermore, it is assumed that you are familiar with SPSS on either the VAX/ALPHA computing system or a personal computer version (Mac, IBM, etc.).

This course takes those topics and computing skills and applies them to practical research situations involving multiple dependent variables. For example, one purpose of multivariate analysis is to reduce some large set of correlated variables to some smaller set of relatively uncorrelated variables--as would be done if you factor analyzed a measurement instrument you constructed for another course.

Relevant readings will draw upon the various texts and special purpose articles. This course assumes that you will display persistence in meeting computing assignments and creativity in interpreting computing results.

GRADING

Grading will be based on competency demonstrated through your written assignments, data analyses, and class participation. Each data analysis is designed to demonstrate: understanding of the statistical foundations of the particular technique; the extent to which model assumptions have been met; the extent to which the application of the model was appropriate; how to interpret the results; and how to coherently and thoroughly report the results.

The same data set might be appropriate for all analyses. Such data may have been collected by you for your own research, obtained from a project you are working on with someone else, or obtained from one of the many databases maintained by the BC computing center. Alternatively, I have seven data sets that you may use: 1) Massachusetts Tests for Educator Licensure (MTEL) data for Boston College and Wheelock College teacher candidates; 2) Massachusetts Comprehensive Assessment System (MCAS) data for grade-school students from districts throughout the Commonwealth; 3) horse racing data used for differentiating between winners and losers; 4) faculty course evaluation data; 5) academic anxiety data from high school students; and 6 & 7) masculine and feminine conformity-to-norms attitudinal data from college-level students and respondents.


COURSE SCHEDULE:

1. Introduction to Multivariate Statistical Analysis
Review (1 night):

Data and variable types
Normality and homogeneity assumptions
The GLM
The ordinary least squares criterion
simple and multiple regression
multicollinearity and variable selection
Diagnostic residual analysis
Overview of multivariate procedures
the multivariate distribution
centroids

Required readings:
T&F: Ch 1-5 and G&Y(I): Ch 1

2. Analysis of Covariance
Lecture (1 night with SPSS output interpretation)
Purpose for which it was originally intended, in contrast to how it is usually used
Statistical model and its assorted assumptions
Homogeneity of regression
test for homogeneity
Johnson-Neyman technique as a solution to non-parallel slopes
Observed and adjusted means
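
As a small illustration of the observed versus adjusted means listed above, the sketch below applies the usual ANCOVA adjustment, adjusted mean = group mean of Y minus b_w times (group mean of X minus grand mean of X), where b_w is the pooled within-group slope. The data, group names, and variable names are made up for illustration, and the sketch assumes homogeneity of regression (parallel within-group slopes).

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical data: x is the covariate, y is the outcome.
groups = {
    "control":   {"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 3.0, 5.0, 6.0]},
    "treatment": {"x": [3.0, 4.0, 5.0, 6.0], "y": [6.0, 7.0, 9.0, 10.0]},
}

# Pooled within-group slope: summed within-group cross-products
# divided by summed within-group sums of squares on the covariate.
sum_products = 0.0
sum_squares = 0.0
for g in groups.values():
    mx, my = mean(g["x"]), mean(g["y"])
    sum_products += sum((a - mx) * (b - my) for a, b in zip(g["x"], g["y"]))
    sum_squares += sum((a - mx) ** 2 for a in g["x"])
b_w = sum_products / sum_squares

# Grand mean of the covariate across all cases.
grand_x = mean([a for g in groups.values() for a in g["x"]])

observed = {name: mean(g["y"]) for name, g in groups.items()}
adjusted = {
    name: mean(g["y"]) - b_w * (mean(g["x"]) - grand_x)
    for name, g in groups.items()
}

for name in groups:
    print(name, "observed:", observed[name], "adjusted:", round(adjusted[name], 3))
```

In this made-up example the groups differ on the covariate, so the difference between adjusted means is smaller than the difference between observed means.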

Required Readings:
T&F: Ch 8
Ancova (Huitema-Ch 3)
SPSS.ancova chapter

Suggested Readings:
Wildt & Ahtola. Analysis of Covariance. Sage
Thompson, B. Editorial Comment: Misuse of ancova and related "statistical
control" procedures. (E&PM, 98)
Keselman, et al. Statistical practices of educational researchers: An analysis of
their ANOVA, MANOVA, and ANCOVA analyses. (RER, '98)
Russell, M. ANCOVA: A magic wand or agent of deceit.
Winer, B.J. (1991/1971). Statistical principles of experimental design. Ch 10-
Analysis of Covariance.
Elashoff, J.D. Analysis of covariance: A delicate instrument.
Evans, W & Anastasio, E. Misuse of analysis of covariance when treatment effect
and covariate are confounded.
Johnson, P.O., & Fay, L.C. The Johnson-Neyman technique, its theory and application.
Rogosa, D. On the relationship between the Johnson-Neyman region of significance and
statistical test of parallel within-group regression.
Example: ancova.example
Assignment


3. Logistic Regression
Lecture (2 nights with output interpretation)
The logistic regression criterion
Maximum likelihood estimation and Newton-Raphson iterations
Interpretation of probability, odds, log-odds, b coefficients, exp(b), -2LL, goodness-of-fit, Nagelkerke R²;
Forced, forward, backward entry;
Wald, LR, Conditional selection criteria;
Specificity and sensitivity.
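
The quantities listed above are linked by a few short formulas. The sketch below is illustrative only: the coefficients b0 and b1 and the function names are hypothetical, not taken from the course materials or from SPSS. It shows how log-odds, odds, and probability convert into one another, why exp(b) is an odds ratio, and how -2LL and Nagelkerke R² are computed.

```python
import math

def logistic_summary(b0, b1, x):
    """Return (log-odds, odds, probability) at predictor value x."""
    log_odds = b0 + b1 * x        # linear predictor (logit)
    odds = math.exp(log_odds)     # odds = p / (1 - p)
    prob = odds / (1.0 + odds)    # back-transform to a probability
    return log_odds, odds, prob

def neg2_log_likelihood(y, p):
    """-2LL for 0/1 outcomes y and fitted probabilities p."""
    ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
             for yi, pi in zip(y, p))
    return -2.0 * ll

def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Cox-Snell R-squared rescaled by its maximum attainable value."""
    cox_snell = 1.0 - math.exp((neg2ll_model - neg2ll_null) / n)
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell / max_cox_snell

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.5
lo0, odds0, p0 = logistic_summary(b0, b1, 2.0)
lo1, odds1, p1 = logistic_summary(b0, b1, 3.0)

# A one-unit increase in x multiplies the odds by exp(b1).
odds_ratio = odds1 / odds0
print(round(odds_ratio, 6), round(math.exp(b1), 6))
```

The two printed values agree, which is the sense in which exp(b) is read as the multiplicative change in the odds per unit change in the predictor.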

Required Reading:
T&F: Ch 12 and G&Y(I): Ch 7
Menard. Logistic Regression. Sage
SPSS Regression Models (Ch 8)
Ludlow Handout: Logistic regression and its statistical foundations.

Other References:
Hosmer, D.W. & Lemeshow, S. (1989). Applied Logistic Regression. Wiley.
Eliason, S. (1993). Maximum Likelihood Estimation. Sage #96. (pp.1-20).
Pedhazur, E.J. (1997). Multiple Regression in Behavioral Research. (Ch 17).
Walsh, A. (1987). Teaching, Understanding and Interpretation of Logit Regression.
Ludlow & Haley. Modeling ventilator recovery with logistic regression.
SPSS Logistic Regression: Statistical Algorithms.
Assignment.


4. Discriminant analysis and classification techniques
Lectures (3 nights with output interpretation of SPSS)
The discriminant function criterion
The discriminant function as a vector projection through a multivariate space
The relationships between eigenvalue, Wilks' lambda, Hotelling's T², Mahalanobis's D²,
and canonical correlation
Variable selection techniques:
Wilks' lambda, Rao's V, Mahalanobis D², maxF, minresid
Number of significant discriminant functions
Standardized and unstandardized coefficients, and structure matrix;
Classification procedures:
discriminant function cut-scores, distance techniques, and probability techniques.
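
For the two-group case, the relationships among the eigenvalue, Wilks' lambda, Hotelling's T², Mahalanobis D², and the canonical correlation can be checked numerically. The sketch below uses made-up scores on two variables and hypothetical helper names. It relies on the standard two-group identities, with W the pooled within-groups SSCP matrix, T the total SSCP matrix, and d the vector of mean differences.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def sscp(rows, centre):
    """2x2 sums of squares and cross-products about `centre`."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        dev = [r[0] - centre[0], r[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                m[i][j] += dev[i] * dev[j]
    return m

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def quad_form_inv(m, v):
    """v' M^{-1} v for a 2x2 matrix M."""
    det = det2(m)
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return sum(v[i] * inv[i][j] * v[j] for i in range(2) for j in range(2))

# Made-up scores for two groups (illustration only).
g1 = [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0)]
g2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (9.0, 9.0)]
n1, n2 = len(g1), len(g2)

m1, m2 = mean_vector(g1), mean_vector(g2)
w1, w2 = sscp(g1, m1), sscp(g2, m2)
W = [[w1[i][j] + w2[i][j] for j in range(2)] for i in range(2)]
T = sscp(g1 + g2, mean_vector(g1 + g2))
d = [m1[0] - m2[0], m1[1] - m2[1]]

q = quad_form_inv(W, d)                          # d' W^{-1} d
eigenvalue = (n1 * n2 / (n1 + n2)) * q           # nonzero root of W^{-1} B
wilks_lambda = 1.0 / (1.0 + eigenvalue)
canonical_r = math.sqrt(eigenvalue / (1.0 + eigenvalue))
mahalanobis_d2 = (n1 + n2 - 2) * q               # uses S = W / (n1 + n2 - 2)
hotelling_t2 = (n1 * n2 / (n1 + n2)) * mahalanobis_d2

# Wilks' lambda is also the determinant ratio |W| / |T|.
wilks_from_dets = det2(W) / det2(T)
print(round(wilks_lambda, 6), round(wilks_from_dets, 6))
```

With more than two groups there are several eigenvalues, and Wilks' lambda becomes the product of 1 / (1 + eigenvalue) over all of them.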

Required Readings:
T&F: Ch 11 and G&Y(I): Ch 9, 2
Ludlow Handout: Discriminant analysis and its statistical foundations.
Klecka. Discriminant Analysis. Sage.
Review your GLM regression notes.

Suggested Readings:
Tatsuoka, M. (1971). Multivariate Analysis. Pp 111-125, 157-165.
Ludlow. 2-group discriminant analysis.
Ludlow. An empirical cross-validation of alternative classification
strategies applied to harness racing data for win bets.
Rulon & Brooks. On statistical tests of group differences.
Thompson, B. Stepwise regression and stepwise discriminant analysis need not
apply here: A guidelines editorial.
Huberty & Lowman. Discriminant analysis via statistical packages. (note the www
locations for on-line documentation)
Dattalo, P. (1994). A comparison of discriminant analysis and logistic regression.
Overall & Klett. Decision procedures for assigning individuals among several groups.
Lachenbruch. Chapter from Discriminant Analysis.
Huberty. Chapter from Applied Discriminant Analysis.
Arnold. Undergraduate aspirations and career outcomes of academically talented women: A
discriminant analysis.

Readings of historical interest:
Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Annals
of Eugenics, 7, 179-188.
Fisher, R.A. (1938). The statistical utilization of multiple measurements. Annals of
Eugenics, 8, 376-386.
Rulon, P.J. (1951). Distinctions between discriminant and regression analyses and a
geometric interpretation of the discriminant function. Harvard Ed. Review.
Tatsuoka & Tiedeman. (1954). Discriminant analysis. Review of Ed. Research.
Porebski, O. (1966). Discriminatory and canonical analysis of Technical College data.

Examples: descriptive DA.output, predictive DA.output
Assignment


5. Principal component analysis and common factor analysis
Lectures (3 to 4 nights), interpretation of SPSS output (1 night).
Purpose and difference between PC and FA
Estimation and factor extraction techniques
Number of components/factors
Interpretation of factor loadings and factors
Rotation strategies
Orthogonal and oblique

Matrix algebra will be included as a special subtopic. In particular, this will cover:
Determinants as generalized variance and "ill-conditioned matrices"
Spearman's "tetrad difference"
Geometric representation of multivariate data
Eigenvalues as variances and eigenvectors as projections through n-space
Relationship between distance between pairs of points, correlations between variables, and angles between vectors
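
Several of the matrix algebra ideas above can be seen in a two-variable example. The sketch below uses made-up data and closed-form 2x2 formulas rather than a matrix library. It illustrates that the eigenvalues of the covariance matrix are the variances of the component scores, that the determinant (generalized variance) is the product of the eigenvalues, and that the correlation between two variables is the cosine of the angle between their centred vectors.

```python
import math

# Made-up scores on two variables (illustration only).
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
dx = [v - mx for v in x]      # centred variable vectors
dy = [v - my for v in y]

# Sample covariance matrix S = [[sxx, sxy], [sxy, syy]].
sxx = sum(a * a for a in dx) / (n - 1)
syy = sum(b * b for b in dy) / (n - 1)
sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
trace = sxx + syy
det = sxx * syy - sxy ** 2    # generalized variance
root = math.sqrt(trace ** 2 - 4.0 * det)
lam1 = (trace + root) / 2.0
lam2 = (trace - root) / 2.0

# First eigenvector (valid here because sxy != 0), scaled to unit length.
v0, v1 = sxy, lam1 - sxx
length = math.hypot(v0, v1)
v0, v1 = v0 / length, v1 / length

# Projecting the centred data onto the eigenvector gives component scores
# whose variance equals the first eigenvalue.
scores = [a * v0 + b * v1 for a, b in zip(dx, dy)]
var_scores = sum(s * s for s in scores) / (n - 1)

# Correlation as the cosine of the angle between the centred vectors.
r = sxy / math.sqrt(sxx * syy)
cos_angle = sum(a * b for a, b in zip(dx, dy)) / (
    math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy)))

print(round(lam1, 4), round(var_scores, 4))
print(round(lam1 * lam2, 4), round(det, 4))
print(round(r, 4), round(cos_angle, 4))
```

A determinant near zero relative to the variances signals an ill-conditioned matrix, which here would mean the smaller eigenvalue is close to zero.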

Required Readings:
T&F: Ch 13 and G&Y(I) Ch 4
Kim/Mueller. Factor Analysis. Sage
Ludlow Handout: The eigenvalue solution.

Suggested Readings:
Faria, Ludlow & Frankel. An alternative scaling model for the prediction of body density in female athletes.
Haley, Ludlow, et al. TAMP: An empirical approach to identifying motor performance
categories.
Ludlow & Guida. The Test Anxiety Scale for Children as a generalized measure of academic anxiety.
Ludlow & Guida. A cross-cultural study of test anxiety.
Cureton & D'Agostino. Factor Analysis: An Applied Approach.
Veldman. Fortran Programming for the Behavioral Sciences.
SPSS. LISREL 7. Confirmatory Factor Analysis
Thompson, B. & Daniel, L. Factor analytic evidence for the construct validity of scores: A
historical overview and some guidelines.
Gorsuch, R. Common factor analysis versus component analysis: Some well and little known facts.
Harman, H. Modern Factor Analysis. Ch 2.

Readings of historical interest:
Pearson, K. (1901) On lines and planes of closest fit to systems of points in space.
Dodd, S. (1928). The theory of factors, I and II.
Mulaik, S. (1987). A brief history of the philosophical foundations of exploratory factor
analysis.

Example
Assignment

6. Multivariate analysis of variance/repeated measures
Lectures (2 nights), interpretation of SPSS output (1 night)
Required readings:
T&F: Ch 9-10 and G&Y(I) Ch 8

Suggested Readings:
Bray/Maxwell. Multivariate Analysis of Variance. Sage.
Cooley & Lohnes. Multivariate Data Analysis.
Olson. On choosing a test statistic in MANOVA.
Huberty & Morris. Multivariate analysis versus multiple univariate analysis
Example

7. Multidimensional Scaling
Lecture, examples and interpretation of SPSS output (1 night)
Required readings:
G&Y(I) Ch 5
Kruskal/Wish. Multidimensional Scaling. Sage.

Suggested readings:
Ludlow. A multidimensional comparison of short and long-term shape imagery.
Ludlow & Levy. Personal space as a function of infant illness: An application of multidimensional scaling.
Ludlow & Howard. The Family Map: A graphical representation of family systems theory.
Example



STATISTICS REFERENCES

1. Discriminant Analysis:
Lachenbruch. Discriminant Analysis. Hafner Press, 1975.
Cacoullos. Discriminant Analysis and Applications. Academic Press, 1973.
Hand. Discrimination and Classification. Wiley, 1981.
Klecka. Discriminant Analysis. Sage, 1980.
Gordon. Classification. Chapman & Hall, 1981.
Huberty. Applied Discriminant Analysis. Wiley, 1994.

2. Factor Analysis:
Cattell. The Scientific Use of Factor Analysis. Plenum Press, 1978.
Mulaik. The Foundations of Factor Analysis. McGraw-Hill, 1972.
Fruchter. Introduction to Factor Analysis. D. Van Nostrand Co., 1954.
Gorsuch. Factor Analysis. L. Erlbaum Associates, 1983.
Horst. Factor Analysis of Data Matrices. HRW, 1965.
Thurstone. Multiple Factor Analysis. The University of Chicago Press, 1947.
Thurstone. The Vectors of Mind. The University of Chicago Press, 1935.
Kim & Mueller. Factor Analysis. Sage, 1978.
Cureton & D'Agostino. Factor Analysis: An Applied Approach. LEA, 1987.
Harman, H. Modern Factor Analysis. The University of Chicago Press, 1976.
Holzinger & Harman. Factor Analysis: A Synthesis of Factorial Methods.
The University of Chicago Press, 1941.
Rummel. Applied Factor Analysis. Northwestern University Press, 1970.

3. Multivariate Analysis of Variance:
Marascuilo & Levin. Multivariate Statistics in the Social Sciences. Brooks/Cole, 1983.
Gnanadesikan. Methods for Statistical Data Analysis of Multivariate Observations.
Wiley, 1977.
Cooley & Lohnes. Multivariate Data Analysis. Wiley, 1971.
Tatsuoka. Multivariate Analysis. Macmillan, 1988.
Finn. A General Model for Multivariate Analysis. Holt, Rinehart & Winston, 1974.
Green. Mathematical Tools for Applied Multivariate Analysis. Academic Press, 1976.
Overall & Klett. Applied Multivariate Analysis. McGraw-Hill, 1972.
Morrison. Multivariate Statistical Methods. McGraw-Hill, 1976.
Anderson. An Introduction to Multivariate Statistical Analysis. Wiley, 1958.
Bock. Multivariate Statistical Methods in Behavioral Research. McGraw-Hill, 1975.
Namboodiri, Carter & Blalock. Applied Multivariate Analysis and Experimental
Designs. McGraw-Hill, 1975.
Chatfield & Collins. Introduction to Multivariate Analysis. Chapman & Hall, 1980.
Dunteman. Introduction to Multivariate Analysis. Sage, 1984.

4. Multidimensional Scaling:
Coombs. A Theory of Data. Mathesis Press, 1976.
Torgerson. Theory and Methods of Scaling. Wiley, 1958.
Maranell. SCALING: A Sourcebook for Behavioral Scientists. Aldine, 1974.
Shepard, Romney & Nerlove. Multidimensional Scaling, Volume I. Seminar Press, 1972.
Shepard, Romney & Nerlove. Multidimensional Scaling, Volume II. Seminar Press, 1972.
Davies & Coxon. Key Texts in Multidimensional Scaling. Heinemann, 1982.
Borg. Multidimensional Data Representations: When & Why. Mathesis Press, 1981.
Lingoes, Roskam, & Borg. Geometric Representations of Relational Data. Mathesis Press, 1979.
Schiffman, Reynolds & Young. Introduction to Multidimensional Scaling. Academic Press, 1981.
Davison. Multidimensional Scaling. Wiley, 1983.
Young & Hamer. Multidimensional Scaling: History, Theory and Applications. LEA, 1987.
Borg & Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer, 1997.

5. Cluster Analysis:
Hartigan. Clustering Algorithms. Wiley, 1975.
Everitt. Cluster Analysis. Heinemann, 1980.


GENERAL STATISTICS REFERENCES:
*Atkinson, A. Plots, Transformations, and Regression. Clarendon Press.
*Barnett & Lewis. Outliers in Statistical Data. Wiley.
Belsley, Kuh & Welsch. Regression Diagnostics. Wiley.
Bryant, E & Atchley. Multivariate Statistical Methods: Within-Groups Covariation. DHR.
Cattell, R. Handbook of Multivariate Experimental Psychology. McNally.
Chatterjee & Price. Regression Analysis by Example. Wiley.
Cohen & Cohen. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. LEA.
Cook & Weisberg. Residuals and Influence in Regression. Chapman & Hall.
Daniel & Wood. Fitting Equations to Data. Wiley.
Edwards. Multiple Regression and the Analysis of Variance and Covariance. Freeman.
Gnanadesikan, R. Methods for Statistical Data Analysis of Multivariate Observations. Wiley.
**Green, P. Mathematical Tools for Applied Multivariate Analysis. Academic Press.
**Hammer, A. Elementary Matrix Algebra for Psychologists and Social Scientists. Pergamon Press.
Hair, Anderson, Tatham & Black. Multivariate Data Analysis. Prentice Hall.
**Hanushek & Jackson. Statistical Methods for Social Scientists. Academic Press.
Harris. A Primer of Multivariate Statistics. LEA.
Hawkins. Identification of Outliers. Chapman & Hall.
Huitema, B.E. The Analysis of Covariance and Alternatives. Wiley.
Neter & Wasserman. Applied Linear Statistical Models. Irwin.
**Pedhazur, E.J. Multiple Regression in Behavioral Research. HRW.
Stevens, J. Applied Multivariate Statistics for the Social Sciences. LEA.
**Tatsuoka, M. Multivariate Analysis. Macmillan.
*Wickens, T.D. The Geometry of Multivariate Statistics. LEA.
Yates, A. Multivariate Exploratory Data Analysis. SUNY Press.