Update 09/23/17: I am switching to two-proportion z-tests. I am setting the population proportion to 0.5 in the variance term, since p(1 - p) is largest at p = 0.5 and this prevents an underestimation of the variance.
This is a bit of a technical post; I will have a better explanation later.
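As a rough illustration of the update above, here is a minimal sketch of a two-proportion z-test with the variance term fixed at 0.5. The function names and the counts in the usage note are hypothetical and are not taken from the exit poll data; the pooled standard error is included only for comparison.

```python
import math

def z_test_fixed_p(x1, n1, x2, n2, p_var=0.5):
    """Two-proportion z-test with the variance term fixed at p_var.

    x1, x2 are success counts and n1, n2 are sample sizes (hypothetical inputs).
    Returns the z statistic and the one-sided p-value for H1: p1 > p2.
    """
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p_var * (1 - p_var) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
    return z, p_upper

def pooled_se(x1, n1, x2, n2):
    """Standard error using the usual pooled proportion, for comparison."""
    p = (x1 + x2) / (n1 + n2)
    return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

def fixed_se(n1, n2, p_var=0.5):
    """Standard error with the proportion fixed at p_var."""
    return math.sqrt(p_var * (1 - p_var) * (1 / n1 + 1 / n2))
```

For example, `z_test_fixed_p(300, 1000, 250, 1000)` uses a standard error that is never smaller than `pooled_se(300, 1000, 250, 1000)`, so the fixed-0.5 version is the more conservative of the two.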
Post election, I have been working on a paper and thinking about what to do next. I am really interested in breaking down voter behavior in the swing states. I have collected exit poll data from the 11 swing states. I want to test if voter behavior across the swing states was consistent with the national vote or the swing state average.
For phase 1 of this experiment, I will run a Chi-Square Test of Homogeneity comparing each swing state to the average of the other swing states and to the national vote. I will look at each category four different ways: Trump vs. not Trump, Clinton vs. not Clinton, Other vs. Clinton and Trump, and overall. This will probably be around 1500 tests. I will have an initial alpha level of 0.05. I will then run two-proportion z-tests on the tests where the p-value was less than 0.05. I will run each z-test in the direction that matches the data.
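The first step of phase 1 can be sketched as follows for a single 2x2 comparison (for example, Trump vs. not Trump in one state against a comparison group). This is a minimal sketch using only the standard library; the function names and the counts in the usage note are made up for illustration, and the three-way "overall" comparison would need a version with more columns and 2 degrees of freedom.

```python
import math

def chi_square_2x2(table):
    """Chi-square test of homogeneity for a 2x2 table of counts.

    table = [[a, b], [c, d]], with rows as groups (e.g. state, comparison)
    and columns as outcomes (e.g. Trump, not Trump).
    Returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def follow_up_direction(table, alpha=0.05):
    """Return None if not significant, else the direction the data point in."""
    _, p_value = chi_square_2x2(table)
    if p_value >= alpha:
        return None
    p_state = table[0][0] / sum(table[0])
    p_other = table[1][0] / sum(table[1])
    return "greater" if p_state > p_other else "less"
```

With the hypothetical table `[[300, 700], [250, 750]]`, `follow_up_direction` returns `"greater"`, so the follow-up z-test would check whether the state's proportion is higher than the comparison group's.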
For phase 2, I will collect data from 2008 and 2012 in states where a statistically significant share of the phase 1 tests were significant. Then I will compare voting behavior with a Chi-Square Test of Homogeneity on: 2008 vs. 2012, 2008 vs. 2016, and 2012 vs. 2016. Significant results will then be tested using a two-proportion z-test.
I am going with the Chi-Square test first for two reasons. The Chi-Square test is not subject to errors in the direction of an effect, and the Chi-Square test is less sensitive than a two-proportion z-test. I have to be very careful in my interpretation of the results, since an analysis this large has a big potential for false positives and false negatives. This analysis will probably take me most of next year. I’ll give an update on my progress in December.
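To put a number on the false positive concern above, here is a small sketch of the arithmetic for roughly 1500 tests at an alpha of 0.05. It assumes, purely for illustration, that every null hypothesis is true and that the tests are independent, which will not hold exactly for overlapping exit poll categories.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

# About 75 of 1500 tests would be expected to come back significant by chance alone.
print(expected_false_positives(1500, 0.05))
```

Under these assumptions, a count of significant results close to 75 would be about what chance alone produces.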