Depending upon the nature of the health status outcome variable, I would suggest either a simple ordinary least squares regression (for a continuous outcome), where the risk factors are basically dummy coded (already done if they are coded 0/1), or a logistic regression (for a binary outcome).
Post by Glenn Fleisig, Ph.D. on Apr 23, 2008 10:02:53 GMT -6
Let me explain a little more about the risk factor variables. Each one is actually a biomechanical measurement (such as knee angle). We've previously established a "proper" range for each risk factor variable, and consider an athlete to have an injury risk if his data is either below or above the proper range. The data is close to a normal distribution, so about 2/3 of the subjects fall in the "proper" or "normal" range for each variable.
So we have several choices for coding the risk factor data:
Dummy code: 0 = not in normal range, 1 = within normal range
Dummy code: -1 = below normal range, 0 = within normal range, 1 = above normal range
Data: Use actual measurement value (for example, knee angle in degrees). The difficulty with this is, once again, that we expect a two-tailed effect; a high value is bad and a low value is bad.
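As a rough illustration of the first two coding options, here is a minimal Python sketch. The function names and the knee-angle range (40 to 50 degrees) are made up for the example and are not the group's actual "proper" ranges.

```python
def code_binary(value, low, high):
    """Option 1: 1 if the value is within the normal range [low, high], else 0."""
    return 1 if low <= value <= high else 0


def code_signed(value, low, high):
    """Option 2: -1 below the normal range, 0 within it, +1 above it."""
    if value < low:
        return -1
    if value > high:
        return 1
    return 0


# Hypothetical normal range for knee angle, in degrees (illustrative only).
LOW, HIGH = 40.0, 50.0
angles = [35.0, 45.0, 55.0]

binary_codes = [code_binary(a, LOW, HIGH) for a in angles]
signed_codes = [code_signed(a, LOW, HIGH) for a in angles]
```

The binary code treats "too low" and "too high" as the same kind of risk, while the signed code keeps the direction. Note that entering the signed code as a single numeric predictor would assume that low and high values push the outcome in opposite directions, which is not the two-tailed effect described above; it would more likely need to be treated as a three-level category.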
I would like to inquire a little more on the proposed risk factors. What are the six? Why only six?
In addition, how are you quantifying injury? Is it simply "injured or not injured"? Would there be some type of logical categorization for the injuries (e.g., elbow vs. shoulder)? Or would it be appropriate to categorize injuries based on something like anterior vs. posterior shoulder injuries? If so, this may be a way to strengthen your statistical argument with the categorical data analyses that have been suggested. When I evaluate pitchers, I believe there may be biomechanical measures that hint at the idea of anterior vs. posterior injuries.
Within the injury data you have, is there a possibility to quantify the severity of the injury in terms of some quantitative measure such as time lost to injury? If this is possible, then you may be able to use some type of statistical analysis that is based on two continuous data measures (i.e., the amount of deviation [above or below] one or two standard deviations from the mean, and a continuous injury severity measure). This opens things up a little :-)
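One way to picture the "two continuous measures" idea is to express each measurement as a distance from the group mean in standard deviation units, then relate that distance to a severity measure. The sketch below is only an illustration: the angle values and days-lost figures are invented, and a Pearson correlation is just one possible choice of analysis.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Deviation of each value from the group mean, in sample-SD units."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented data: one kinematic measurement and days lost to injury per athlete.
angles = [40.0, 45.0, 50.0]
days_lost = [10.0, 0.0, 10.0]

z = z_scores(angles)
abs_z = [abs(v) for v in z]  # distance from the mean, ignoring direction
r = pearson_r(abs_z, days_lost)
```

Using the absolute z-score keeps the two-tailed idea (both high and low values count as deviation) while staying continuous.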
Just one note of caution: the discussion is focused on linear models and does not get at the notion of interaction effects. That is, it assumes the combination of risk factors is additive, which may be, or likely is, untrue. From the initial question, it may be that interaction terms are most appropriate. This does not require changing the models, but does involve thinking about the strategy for including interactions.
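To make the additive-versus-interaction point concrete, here is a small sketch of how an interaction term enters the predictors for two risk factors. The function name and the 0/1 inputs are hypothetical; the model itself (OLS or logistic) stays the same and simply receives one extra column.

```python
def design_row(x1, x2, interaction=True):
    """Predictor row: intercept, the two risk factors, and optionally their product."""
    row = [1.0, x1, x2]
    if interaction:
        row.append(x1 * x2)  # nonzero only when both factors are nonzero
    return row


# With 0/1 risk flags, the interaction column is 1 only when both flags are 1.
rows = [design_row(x1, x2) for x1 in (0, 1) for x2 in (0, 1)]
additive_only = design_row(1, 1, interaction=False)
```

A purely additive model fits the first three columns only; the fourth column lets the effect of one risk factor depend on whether the other is present. With six risk factors there are 15 possible pairwise interactions, so some strategy for choosing which to include is needed.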
Post by Glenn Fleisig, Ph.D. on May 5, 2008 10:59:17 GMT -6
Thank you for your input. This gives our group some good issues to consider. I think the first thing we will do is just look at the data, in a big spreadsheet, to see if any patterns or trends just jump out at us.
I will post here later what approach we take, and how it works out.
Post by Glenn Fleisig, Ph.D. on Sept 29, 2008 14:45:39 GMT -6
Each subject had a numerical value for six kinematic parameters. These parameters are referred to here as the "potential injury risk factors."
We did six chi-squared analyses (one for each potential injury risk factor). For each chi-squared analysis, the value for each person's data was categorized as "good" or "bad," and their health was categorized as "injured" or "healthy."
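For readers who want to see what one of these analyses involves, here is a minimal sketch of a Pearson chi-squared test on a 2x2 table ("good"/"bad" by "injured"/"healthy"). The counts are invented, and the sketch assumes no continuity correction; the post does not say which variant was used.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: rows = "bad" / "good" mechanics, columns = injured / healthy.
chi2, p = chi2_2x2(20, 10, 10, 20)
```

With small samples, where expected cell counts can be low, Fisher's exact test is often preferred over the chi-squared approximation.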
Then we did a logistic regression. This gave an equation for calculating the odds of injury, depending on the person's values for the six kinematic parameters. Some parameters in the regression were significant (p<0.05), and some were not.
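The general form of such an equation is odds = exp(b0 + b1*x1 + ... + b6*x6). A minimal sketch of evaluating it follows; the coefficients and the 0/1 inputs are made up for illustration and are NOT the study's fitted values.

```python
from math import exp


def injury_odds(intercept, coefs, values):
    """Odds of injury from a logistic model: exp(b0 + sum of b_i * x_i)."""
    linear = intercept + sum(b * x for b, x in zip(coefs, values))
    return exp(linear)


def injury_probability(intercept, coefs, values):
    """Convert odds to a probability: odds / (1 + odds)."""
    odds = injury_odds(intercept, coefs, values)
    return odds / (1.0 + odds)


# Made-up coefficients for six parameters (illustrative only).
b0 = -1.0
b = [0.5, 0.0, 0.8, 0.0, 0.3, 0.0]
x = [1, 0, 1, 0, 0, 1]  # hypothetical athlete, parameters coded 0/1

odds = injury_odds(b0, b, x)  # exp(-1.0 + 0.5 + 0.8) = exp(0.3)
prob = injury_probability(b0, b, x)
```

Each exp(b_i) can be read as an odds ratio: the factor by which the odds of injury change for a one-unit increase in that parameter, holding the others fixed.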
For what it's worth, the findings of the logistic regression matched up well with the chi-squared analyses. However, as pointed out above, there are several limitations and assumptions with the approach we used.
I'd consider multiple regression. The variables can be expressed either as 1) the absolute value of the deviation from the mean, or 2) two separate variables for each original variable: one for positive deviation from the mean and the other for negative deviation from the mean. It might be instructive to do both of these: is it just the magnitude of the deviation from the mean that matters, or does the direction matter also?
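The two codings suggested here can be sketched as follows. The function names and the sample numbers are hypothetical, chosen only to show the transformation.

```python
def abs_distance(value, center):
    """Coding 1: how far the value is from the mean, ignoring direction."""
    return abs(value - center)


def split_distance(value, center):
    """Coding 2: (amount above the mean, amount below the mean); one of the two is always 0."""
    above = max(value - center, 0.0)
    below = max(center - value, 0.0)
    return above, below


# Hypothetical group mean of 45 degrees and two athletes on either side of it.
CENTER = 45.0
high_athlete = split_distance(55.0, CENTER)
low_athlete = split_distance(40.0, CENTER)
```

Under coding 2, a regression estimates separate slopes for being above and being below the mean. If the two slopes come out similar, coding 1 (magnitude only) is probably adequate; if they differ, direction matters.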
1. Lead foot position @ foot contact (stepping open or across one's body)
2. Throwing shoulder external rotation @ foot contact
3. Separation timing (the difference between the time of peak pelvis rotation velocity and peak upper trunk rotation velocity)
4. Maximum throwing shoulder horizontal adduction
5. Throwing shoulder abduction @ ball release
6. Lead knee extension (the difference between the lead knee's angle at foot contact and ball release)
Although Dr. Fleisig said that this study included 109 people, we actually were forced to reduce this group to 54 because of incomplete data and other exclusion criteria. We had a difficult time tracking some people down and getting the necessary information from them. Unfortunately, this meant that our preliminary results were inconclusive, but we are hoping to build on this with a stronger methodological approach in the future. We are working on developing a specific follow-up questionnaire to make sure we get the appropriate information from our subjects. We also have a few other ideas for modifying the methodology to better attack the questions we're asking.
One other very important thing that we continue to believe is that mechanics are one of a number of contributing factors to injury. Other significant "role players" are the amount of pitching, physical fitness (strength and conditioning), and anatomy and genetic pre-dispositions to injury. Our future research will try to control for some of these things to reduce the number of confounding factors.