Analysis of NES 2008 Data Set

1) Using the NES 2008 data set, analyze the measures of central tendency for the “age” variable. In your interpretation of this variable, include an analysis of the mean and median, the standard deviation and the distribution of the data (Hint: obtain a histogram from SPSS). After analyzing the histogram and measures of central tendency, do you have any concerns about the composition of the sample with regard to age? Are there age groups that are oversampled or undersampled in your opinion? 8 points.

The statistics table for the age of respondent shows 2280 valid and 43 missing values. It also reports the measures of central tendency and spread: the mean is 46.85, the median is 46.00, and the standard deviation is 17.724. In the frequency table, the frequency column shows the number of respondents at each age and the percentage column gives the share of the sample at that age. The histogram shows that the data are skewed and the distribution is not normal. Because the mean is slightly higher than the median, the skew is to the right, and with skewed data the median is a safer summary of the typical respondent’s age than the mean. In my opinion, no age groups are oversampled or undersampled, because the histogram shows no obvious bias in how the age data were collected.
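As a rough check on the direction and size of the skew, the reported summary statistics can be plugged into Pearson's median skewness coefficient. This is only a sketch built from the three figures quoted above; it is not part of the SPSS output, and the variable names are illustrative.

```python
# Summary statistics reported in the SPSS output for "age".
mean_age = 46.85
median_age = 46.00
sd_age = 17.724
n_valid = 2280
n_missing = 43

# Pearson's median skewness coefficient: 3 * (mean - median) / sd.
# A positive value points to a right (positive) skew.
pearson_skew = 3 * (mean_age - median_age) / sd_age

# Share of respondents with no recorded age.
missing_share = n_missing / (n_valid + n_missing)

print(f"Pearson median skewness: {pearson_skew:.3f}")
print(f"Share of missing values: {missing_share:.1%}")
```

The coefficient comes out positive but small, which suggests a mild right skew rather than a severe one.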

2) Using the GSS 2008 data, construct a Cross-Tab and interpret the results for whether or not education and partisanship play a role in people’s opinions about the role of women in society. Be sure to assess the table results and their statistical significance. The two variable names you will use are “educ4” and “femrole2,” which provide a four-level measure of education and ask people whether they believe a woman’s place is in the home or the workplace. For the second Cross-Tab compare “partyid3” and “femrole2.” 12 points.

The case processing summary table indicates how many observations had valid and missing values for both education and female role. In this case, there are 1300 valid observations and 724 missing cases. The crosstab places Education: 4 categories in the rows and Female role: Home or Work? in the columns, so each row shows how many respondents at that education level believe a woman’s place is in the home or in the workplace. From the table, the following conclusions can be made:

a) A total of 225 respondents have 0-11 years of education. Of these, 158 believe that a woman’s place is at home and 67 believe it is at work.

b) A total of 356 respondents have 12 years of education. Of these, 194 believe that a woman’s place is at home and 162 believe it is at work.

c) A total of 361 respondents have 13-15 years of education. Of these, 145 believe that a woman’s place is at home and 216 believe it is at work.

d) A total of 358 respondents have 16-plus years of education. Of these, 129 believe that a woman’s place is at home and 229 believe it is at work.
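Raw counts are hard to compare across rows of different sizes, so a crosstab is usually read through row percentages. The sketch below recomputes them from the home and work counts listed above; the dictionary and function names are illustrative and do not come from the SPSS output.

```python
# Counts (home, work) by years of education, as reported in the crosstab.
education_counts = {
    "0-11 years": (158, 67),
    "12 years": (194, 162),
    "13-15 years": (145, 216),
    "16-plus years": (129, 229),
}

def row_percentages(counts):
    """Return {category: (pct_home, pct_work)} computed within each row."""
    result = {}
    for category, (home, work) in counts.items():
        total = home + work
        result[category] = (100 * home / total, 100 * work / total)
    return result

education_pct = row_percentages(education_counts)
for category, (pct_home, pct_work) in education_pct.items():
    print(f"{category:>14}: home {pct_home:5.1f}%  work {pct_work:5.1f}%")
```

With these counts, the share answering "home" falls from roughly 70% among respondents with 0-11 years of education to roughly 36% among those with 16-plus years.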

The case processing summary table for Party ID: 3 cats and Female role: Home or Work? shows that there are 1276 valid observations and 747 missing cases. The crosstab table indicates that:

a) The Dem category has a sample size of 475. Of these, 199 believe that a woman’s place is at home and 276 believe it is at work.

b) The Ind category has a sample size of 468. Of these, 241 believe that a woman’s place is at home and 227 believe it is at work.

c) The Rep category has a sample size of 333. Of these, 175 believe that a woman’s place is at home and 158 believe it is at work.
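The question also asks about statistical significance, which for a crosstab is normally assessed with a Pearson chi-square test. The sketch below computes that statistic by hand from the cell counts listed in this answer and compares it with the standard 0.05 critical values. It is an illustration, not a reproduction of the SPSS chi-square table, and the function and variable names are assumptions.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df

# Rows are categories, columns are (home, work), as listed in the text.
educ_table = [[158, 67], [194, 162], [145, 216], [129, 229]]
party_table = [[199, 276], [241, 227], [175, 158]]

# Standard chi-square critical values at alpha = 0.05.
CRITICAL_05 = {2: 5.991, 3: 7.815}

educ_stat, educ_df = chi_square(educ_table)
party_stat, party_df = chi_square(party_table)

for name, stat, df in [("educ4", educ_stat, educ_df),
                       ("partyid3", party_stat, party_df)]:
    print(f"{name}: chi-square = {stat:.2f}, df = {df}, "
          f"exceeds critical value: {stat > CRITICAL_05[df]}")
```

On these counts both statistics exceed their critical values, so both education and party identification are significantly associated with views on women's role, with education showing the much larger statistic.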

3) Does liking the Democratic Party mean you dislike the Republicans? Assess the answer to this question using the NES 2008 data set to estimate a correlation model between the feeling thermometers towards Democrats and Republicans. Use the variables “dem_therm” and “rep_therm.” Interpret your correlation coefficient and its statistical significance to answer this question. (10 points).

The descriptive statistics show that, for 2269 valid responses, Feeling Thermometer Democratic Party has a mean of 57.16 and a standard deviation of 25.536. Feeling Thermometer Republican Party, with 2268 valid responses, has a mean of 47.74 and a standard deviation of 25.736. In the correlation table, the correlation of each thermometer with itself is 1 (r=1), which carries no information. The correlation between the Democratic and Republican thermometers is -0.475 (r=-0.475) for 2259 observations. The significance value for this correlation is reported as 0.000, that is, p<0.001, so the correlation is statistically significant. Based on the results, there is a moderate negative linear relationship between the two thermometers: respondents who rate the Democratic Party warmly tend to rate the Republican Party more coolly. Hence, liking the Democratic Party tends to go together with disliking the Republicans, but because the correlation is well short of -1, it does not guarantee it.
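Two quantities help in reading a correlation of this size: the shared variance (r squared) and the t statistic behind the significance test. The sketch below derives both from the reported r and the number of paired observations. It uses the textbook formula for testing a Pearson correlation against zero and is not taken from the SPSS output.

```python
import math

r = -0.475   # reported correlation between dem_therm and rep_therm
n = 2259     # reported number of paired observations

# Share of variance the two thermometers have in common.
r_squared = r ** 2

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r^2 = {r_squared:.3f}")
print(f"t({n - 2}) = {t_stat:.2f}")
```

With these inputs the two thermometers share a little under a quarter of their variance, and the t statistic is far larger in magnitude than 1.96, which is consistent with the reported significance value of 0.000.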

4) In the last 20 years the negative connotation of being on welfare has increased, despite increasing wealth disparities and little to no growth in real household incomes. For this question, use the NES 2008 data set to run a linear regression model assessing what effect the following variables have on people’s opinions of welfare recipients. Your dependent variable is “welfare_therm” and your independent variables are “age,” “hh_kids,” “Hispanic,” “income5,” “gender,” “partyid7,” and “educ_r2.” Explain the effects of the significant coefficients and assess the overall explanatory power of the model (e.g. the r-squared value and what that means). Remember you must know how the variables are measured to assess their impact. (20 points).

This linear regression model estimates the effect of seven predictors on people’s opinions of welfare recipients. In the model summary table, R is the correlation between the predicted and observed feeling thermometer scores for people on welfare. In this case, R=0.29, which is small; hence, the model does not predict the feeling thermometer for people on welfare precisely. R squared is the square of R, and it gives the proportion of variance in the feeling thermometer that the seven predictors explain, here 0.083. This coefficient of determination shows that the independent variables (“age,” “hh_kids,” “Hispanic,” “income5,” “gender,” “partyid7,” and “educ_r2”) together explain 8.3% of the variability of the dependent variable (“welfare_therm”). The adjusted R square gives a more realistic indication of the predictive power of the regression model because it penalizes the number of predictors. In this case, the adjusted R square is 0.079, only slightly lower than the R square of 0.083; thus, the model shows little shrinkage even though it includes seven predictors.
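The relationship between R square and adjusted R square can be checked with the standard adjustment formula. In the sketch below, the sample size is an assumption inferred from the residual degrees of freedom of 1904 reported in the ANOVA table (n = 1904 + 7 + 1); the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

r_squared = 0.083    # reported R square (rounded to three decimals)
n_predictors = 7     # number of independent variables
# Assumption: n is inferred from the residual df of 1904 in the ANOVA table.
n_obs = 1904 + n_predictors + 1

adj = adjusted_r_squared(r_squared, n_obs, n_predictors)
print(f"adjusted R^2 = {adj:.4f}")
```

The result lands within about 0.001 of the reported 0.079; the small gap is what one would expect from starting with an R square already rounded to three decimals.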

In the ANOVA table, the F-ratio tests whether the overall regression model predicts the dependent variable better than a model with no predictors. The output gives an F-ratio of 24.469 with 7 and 1904 degrees of freedom (df) and a significance level of 0.000. These results are reported as F(7, 1904)=24.469, p<0.001. The independent variables, taken together, are therefore statistically significant predictors of the dependent variable. The model as a whole is statistically significant, although, as the R square shows, it explains only a small share of the variance.
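The overall F-ratio is tied directly to R square, so the two reported figures can be cross-checked against each other. The sketch below applies the standard identity using the rounded R square of 0.083 and the degrees of freedom from the ANOVA table; the function name is illustrative.

```python
def f_from_r_squared(r_squared, df_model, df_residual):
    """Overall F = (R^2 / df_model) / ((1 - R^2) / df_residual)."""
    return (r_squared / df_model) / ((1 - r_squared) / df_residual)

# Reported values: R square 0.083, df 7 and 1904.
f_ratio = f_from_r_squared(0.083, 7, 1904)
print(f"F(7, 1904) = {f_ratio:.2f}")
```

The value computed this way is close to the reported 24.469. It will not match exactly because the R square used here is rounded.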

In the regression table, each b coefficient tells us by how many units the dependent variable (Feeling thermometer: PEOPLE ON WELFARE) changes when the predictor increases by a single unit, holding the other predictors constant; for instance, a 1-year increase in age of respondent corresponds to a 0.081-point increase on the feeling thermometer. Therefore, the feeling thermometer score for people on welfare can be predicted by adding the constant to each predictor score multiplied by its b coefficient.

The b coefficients contain a mixture of positive and negative values, indicating that some predictors are associated with warmer feelings toward people on welfare and others with cooler feelings. Using a significance level of 0.05, the significance values for Age of Respondent (0.04), R income quintile (0.000), summary Party ID (0.000), and RECODE of educ_r (0.002) show that these b coefficients are statistically significant. The predictors whose significance values exceed 0.05 are not statistically significant: Number of children in HH (0.892), Hispanic Ethnicity (0.359), and R gender (0.059).

September 25, 2023