A score of zero would mean there’s really no change: the patient is no better off and no worse off. So those are our two outcomes. We’re going to measure what effect the technique has on recovery time (presumably certain techniques will have a faster recovery time than others), and we’re also measuring this global rating of change (presumably certain techniques will have a greater level of improvement versus worsening). We’re going to compare those two things. So we have one categorical independent variable with three levels, our three different surgical techniques, and we have two (we could have more) continuous or quantitative outcome variables that we presume are somewhat related to one another. By intuition we would assume that recovery time and improvement, or lack of improvement, will be related to one another, and we’re going to test that.

We’ve got several assumptions that we need to test with MANOVA. Some of these are going to sound familiar from doing ANOVAs, but there are some additional assumptions we need to look at as well, and I’ll talk about these in a little more detail as we go through our example. The first assumption is sample size. We need to test the normality of the outcome variables. We need to make sure we don’t have any outliers present. We need to make sure our variables are linearly related. We need to make sure there is not any multicollinearity, and I’ll explain that in a little bit. And we need to make sure we have homogeneity of variance as well as homogeneity of covariance.

Okay, so let’s look at the assumptions. Sample size is the first one, and here we typically use a rule of thumb: we need to have more cases, or more subjects, in each group than we have dependent variables. Ideally we’ll have many more than this, but this is the absolute minimum. Having a larger sample can help us
get away with violations of some of the other assumptions, like normality. The minimum required number of cases in each group in this example would be more than two, because we have two dependent or outcome variables. Now, we have a total of six cells: three levels of our independent variable (surgical technique one, two, and three) and two dependent variables. The number of cases in each cell is provided as part of the MANOVA output, and then we’ll be able to examine whether or not we’ve met that assumption.

The next is normality. The significance tests of MANOVA are based on the multivariate normal distribution, but in practice MANOVA is pretty robust: it’s able to deal with modest violations of normality, except where the violations are due to outliers. When we have a sample size of at least 20 in each cell, we assume the test is robust enough to overcome modest departures from normality. We still need to check univariate normality of each outcome, which we do through the Explore function and by looking at the skewness score, and we’re also going to check multivariate normality; I’ll show you how to do that a little bit later. These procedures will also help us determine whether we have any outliers, and we can use the outlier labeling technique, as I discuss in one of the other videos, to help us identify any potential outliers.

Now, talking more about outliers: MANOVA is very sensitive to outliers, that is, data points or scores that are much different from the rest of the scores. So we need to check for univariate outliers, again using the outlier labeling technique for each of our dependent variables separately, and we’re also going to look for multivariate outliers. Multivariate outliers are participants with a
strange, unusual combination of scores on the various dependent variables. In other words, they might have a very high score on one variable but a very low score on another. So again, we’re going to check for univariate outliers by using the Explore function and the outlier labeling technique. To check for multivariate outliers, I’m going to walk you through the procedure.

To test for multivariate normality or outliers, we’re going to use something called the Mahalanobis distance, and SPSS is going to help us calculate that; we’ll be using a regression function to do it. The Mahalanobis distance is the distance of a particular case from the centroid of the remaining cases, where the centroid is the point created by the means of all the variables, so it’s a kind of center point created by all the scores. This analysis will help us pick up on any cases that have a strange pattern of scores across both of the dependent variables. The procedure will create a new variable in our data file labeled MAH_1, and you’ll see that when we run the analysis. Each person or subject receives a value on this variable that indicates the degree to which their pattern of scores differs from the remainder of the sample.

To decide whether a case is an outlier, we need to compare this distance value against a critical value, and we use a chi-square critical value table to do that. In the description of this video I’ll post the table that you can use as a reference to help determine whether you’ve got distances that are outside the acceptable level. If an individual’s MAH_1 score exceeds this critical value, it’s considered an outlier. MANOVA can tolerate a handful of outliers, particularly if their scores are not too extreme, not too far away from this critical value, and if we have a reasonably sized data file, in other words a reasonable sample size. If we’ve got close to 20 subjects in each cell and we’ve got a few outliers, we’re probably going to need to delete those, but if we’ve got 50 or 75 or 100 subjects per cell, we can easily tolerate a handful of outliers. If we have too many outliers or very extreme scores, in other words a Mahalanobis score that is quite a bit above the critical value, we’re probably going to need to consider deleting
those cases, or maybe transforming the variables that are involved.

So let me show you how to run this procedure to test for multivariate normality. The first thing we want to do is go to the Analyze menu and select Regression and then Linear. First we enter the variable that identifies our cases, in this case surgical technique, which is our categorical or independent variable. We put that in the Dependent box, which seems counterintuitive, but that’s how we run this analysis. Then in the Independent(s) box we put our outcome variables, in this case weeks until return to normal and the global rating of change score. We click on the Save button, and under the Distances box we check the Mahalanobis option; you can see I’ve checked that. Once we’ve done that, we click on the Continue button and then we click on OK.

Now if we go back to our data file, you’ll see that we have created this new variable, MAH_1. That’s the variable we’re going to use to help us determine whether we have any outliers. If we go back to our output file, the table we’re interested in is the very last one, labeled Residuals Statistics, and what we want to look for is the maximum value listed in the Mahalanobis distance row. Here we see the Mahalanobis distance, and our maximum score is 11.218. We’re going to take note of that value, in this case 11.2, which doesn’t seem like a very high value. We’re going to compare this number to that critical value I mentioned earlier. This critical value is again determined by using a chi-square table, with the number of outcome or dependent variables we have serving as our degrees of freedom, and the alpha value, or p-value, we’ll use to
determine our critical value is .001. Again, I’ll put a table in the description section of this video, and it will include critical values for studies with up to ten outcome or dependent variables.

In this case we have two dependent variables, so our critical value for determining whether we have an outlier is 13.82. If we compare that to our maximum score of 11.218, we can see our maximum score does not exceed the critical value of 13.82, which means we do not have any multivariate outliers. If this maximum score did exceed 13.82, that would indicate we do have outliers. If we wanted to find those outliers and take a look at what their scores are, and how many there are relative to this critical value, we would have to sort the cases. Let me show you how to do that, just for demonstration purposes.

In our example we did not exceed the critical value, so we would conclude that we met the assumption of no multivariate outliers. But if this value did exceed the critical value, we’d need to find those values and how many there are. So we go back to our data file, where we can see our MAH_1 values. What we want to do is sort the cases by this variable. We go to Data and then Sort Cases, which rearranges the data so the highest MAH_1 scores are at the top and we can see what those scores are and how many there are. We move the new variable we created, the Mahalanobis distance, into Sort by, and we choose a sort order of Descending so it lists the highest scores first. Now when we go back to our data file, we can see our highest scores listed at the top, descending to the lower scores. Here’s where we’d be able to examine what those actual Mahalanobis distance scores are: first, how large each score is relative to the critical value, and second, how many subjects have scores that exceed it. Again, in this example we don’t have any, but this is where we’d be able to see them. So, for example, if
one of the subjects had a Mahalanobis distance score that was maybe one or two times larger than our critical value, that would be a large difference, and I’d probably want to figure out whether to transform the variable or delete that individual. If only one or two individuals exceeded that critical value, I’d probably leave them. If more than about two percent of my total subjects exceeded that critical value, then I would probably have to consider removing those subjects or transforming the variable. That’s how I would deal with multivariate outliers, but again, in this case we don’t have any, so we’ve met that assumption.

Okay, the next assumption we have to examine is linearity. This refers to the presence of a straight-line, or linear, relationship between each pair of dependent or outcome variables, in this case only one pair. If we had more than two outcome variables, we’d have multiple pairs of variables to check for a linear relationship. We can assess this in a number of ways, and probably the most straightforward, though somewhat subjective, is to generate a matrix of scatterplots between each pair of variables (again, in this case only one pair), separated out by our groups, in this case type of surgery. To do this, we go to the Graphs menu, select Legacy Dialogs, and then Scatter/Dot. We click on Matrix Scatter and then click Define. All of our outcome variables go into the box labeled Matrix Variables, so our recovery time in weeks and our global rating of change. Then in the Rows box goes our independent variable, in this case surgical technique. The next thing we want to do is click on the Options button and make sure that we have selected Exclude cases
variable by variable. In our example here we don’t have any missing cases, but if you did have missing cases, they would be excluded variable by variable so that they won’t be included in the analysis. All right, then we click on the Continue button and then we click on OK.
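
The multivariate outlier screen described earlier (a Mahalanobis distance for each case, compared against a chi-square critical value at alpha = .001) can also be sketched outside SPSS. The following is a minimal illustration under stated assumptions, not the SPSS implementation: the data are made up, the function names `mahalanobis_sq_2d` and `chi2_critical_df2` are hypothetical, the distance is the squared distance from the centroid of all cases using the sample covariance, and the closed-form critical value is valid only for exactly two dependent variables (two degrees of freedom).

```python
import math


def mahalanobis_sq_2d(xs, ys):
    """Squared Mahalanobis distance of each (x, y) case from the sample centroid."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sample (n - 1) variances and covariance of the two outcome variables.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # d' S^-1 d, with the 2x2 inverse covariance written out explicitly.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_critical_df2(alpha):
    """Upper-tail chi-square critical value for exactly 2 degrees of freedom."""
    # With df = 2 the chi-square CDF is 1 - exp(-x / 2), so it inverts directly.
    return -2.0 * math.log(alpha)


# Hypothetical scores for 20 cases (not the data set used in the video).
weeks = [6, 7, 8, 6, 9, 10, 7, 8, 9, 11, 6, 7, 8, 9, 10, 12, 7, 8, 9, 30]
grc = [3, 4, 2, 5, 1, 0, 4, 3, 2, -1, 5, 4, 3, 2, 1, -2, 3, 4, 2, 7]

d2 = mahalanobis_sq_2d(weeks, grc)
critical = chi2_critical_df2(0.001)  # about 13.82 for two dependent variables

# Cases whose distance exceeds the critical value would be treated as
# multivariate outliers; list them largest first, like a descending sort.
flagged = sorted((i for i, d in enumerate(d2) if d > critical),
                 key=lambda i: d2[i], reverse=True)
print(f"critical value: {critical:.2f}, largest distance: {max(d2):.3f}")
print("flagged case indexes:", flagged)
```

For more than two dependent variables the critical value would come from a chi-square table or a statistics library rather than the closed form used here.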

And so what we’re going to examine down here at the bottom are these scatterplots. What we’re looking for is whether there’s a generally linear relationship. If we were to draw an imaginary line of best fit on each of these, we can see that they generally trend from lower left to upper right. Now, some of these look almost more like squares than bubbles that trend, so it’s a little questionable that we have a true linear relationship. What we’re really watching for is a trend that has more of a curvilinear shape, something that looks like an upside-down U or maybe a right-side-up U. In this case we have a generally linear relationship, but it’s a little suspect because, especially here, we have a spread of scores that makes this look more like a square than an oblong object. This one is spread out, but it’s a little more linear. This one doesn’t quite look as linear. So you could argue that we’ve met the assumption, but we’re going to be a little conservative here and say it doesn’t look like we’ve met it. That’s okay; we don’t need to stop, but we are going to change how we assess the outcomes of our actual MANOVA, and I’ll talk about that when we get to that point. But this is how we assess linearity of our outcome variables.

Now, the next assumption we have to look at is multicollinearity. MANOVA works best when the dependent variables are moderately correlated: we don’t want them to have weak or low correlations, and we don’t want them to be too highly correlated. If we have low correlations, we should probably consider running separate univariate analyses of variance for each of our outcome variables. If the dependent variables are too highly correlated, this is referred to as multicollinearity. This can occur when one of your variables is a combination of other variables, for example the total score of a
measurement scale that is made up of subscales that are also being examined as outcome variables. Also, if we’ve got two variables that are redundant, in other words they’re measuring pretty much the same construct, we can have this very high correlation, or multicollinearity. So what we want to do is assess whether we have either too low or too high a relationship; what we want is that middle range of relationship between these variables.

To test this, we’re going to run a bivariate correlation. We go to the Analyze menu, click Correlate and then Bivariate, and move our two outcome variables into the Variables box, since we want to examine the relationship between these two variables, and then we click the OK button. What we want to examine here is the r value, which gives us an idea of the strength of the relationship between these two variables. Now, r values can range between -1 and +1, so the closer the r value is to -1 or +1, the stronger the relationship between the two variables, whether it’s negative or positive. What’s optimal is an r value somewhere between 0.2 and around 0.8. If it’s less than 0.2, the two variables aren’t related enough; if it’s greater than 0.8 or 0.9, depending on who you read, that’s too much of a relationship. So we like the value to be in that middle range of 0.2 to 0.8. In this case the r value for our two variables is 0.349, which is close to the bottom end, but it sits in that middle range. So these two variables are related to one another, but not too related. It appears we’ve met that assumption, then, of not having multicollinearity and of not having too weak a relationship.

So the last assumption we need to examine is this homogeneity of variance and
covariance matrices. This is generated as part of our actual MANOVA analysis, and the test we use to assess it is known as Box’s M test. We’re going to look at that when we look at our MANOVA output to determine that last