SOCI332 Statistics for Social Science

 

American Public University System (APUS)

 

SOCI332 Week 1 Discussion Choosing a Topic

Welcome everyone!  This week's Discussion requires you to respond to both the introduction prompt (Part 1) and the project topic prompt (Part 2) to receive full credit.

PART 1: First, for your initial INTRODUCTION post, write a brief introduction about yourself using the prompts below.

  1. What name would you like to use in class? In what time zone are you currently located?
  2. Statistics can be an intimidating course for many students! The great thing about statistics is that we engage with them every day without really realizing it (Consumer Reports data, the studies that support our physician's health plans for us, comparison of schools in our district, and more). How might learning basic statistics connect with your educational or career goals (share your major and/or career field with us)?
  3. Read the Syllabus for this course. Please state that you have read the syllabus and understand the course policies, expectations, and due dates.

PART 2: Be sure to read the Content for Week 1 prior to responding.

Choosing a topic: There are so many things around us that it can be difficult to focus on just one for a research project. Here are a few things to think about to find yours. First, we are in a sociology class, so your topic has to be sociological in nature. Wondering if a new diet helps people lose weight, for instance, wouldn't work. Instead, think back on some of the topics you covered in other sociology classes (Intro Soc, Marriage and the Family, Soc Theory, etc.). Was there something in there that sparked your interest? You can also build on previous research that you have completed for a former class in the program or closely related field. This project will be the focus of your discussions for the next several weeks. It is highly recommended that you choose something that is of interest to you and can keep your attention for that long.

We will be using the General Social Survey (GSS) 2018 data set for the Weekly Discussions, Assignment 1, and the Final Project (paper and presentation). You should NOT collect your own data. All variables and data must come from the GSS 2018 data set. To access and download the data, please read through the Week 1 Overview (Content tab - Week 1). To learn more about the GSS, you may visit its main website http://gss.norc.org/. You can find the GSS variables online via the GSS Data Explorer. See the attached handout.

The point of this discussion is to share your topic idea for your project, specifying the two GSS variables you want to analyze, so that other students will ask you questions or make suggestions that may help you define your project better. Your instructor will also interact with each of you individually in this module to help you refine your topic. Remember to check your thread regularly!

As you present your topic in this discussion, think about how you would study it. What is your research question and your theory behind it? After writing your introduction, tell the class what your topic is, phrasing it as a research question. Your research question should preferably be more general and open-ended than a hypothesis. (For example, what affects people's happiness?) Then, identify two variables found in the GSS 2018 dataset. You are choosing one independent variable and one dependent variable. Be sure to identify each variable name AND the questions asked in the survey. See screenshots tutorial (attached) for more details. Wrap up by explaining why you chose these variables for your project and why you think there is a correlation or a relationship. Be sure to reference at least one academic source that relates to your topic.

In your replies to at least two posts from your classmates, think critically about what they are trying to do with their project, and offer them constructive feedback. This can be asking for clarification about their proposed topic, suggesting a direction for their research, suggesting sources they may want to check, or contributing your personal experience about this topic. Be sure to also answer at least one peer who responded to your initial post (and interact with the instructor as needed).

Reiteration: For your Week 1 "Choose a topic" initial posting, please list everything in the following list:

Describe what your topic is, phrasing it as a research question. (You might say: Does _______ affect __________ ? For example, does the number of children people have affect their happiness?)

  • Identify variables (one DV, one IV) that you have found in the GSS dataset (see the attachment below). All variables in your project MUST come from this 2018 data set.
    • identify variable names; for example, "childs" is a variable name. It stands for "Number of children."
    • identify the question related to this variable that was asked in the survey (verbatim). For example, the GSS survey question for the variable "childs" reads as follows (verbatim):

How many children have you ever had? Please count all that were born alive at any time (including any you had from a previous marriage).

  • Explain why you chose these variables for your project;
  • Explain why you think there is a correlation or a relationship.
  • Include a reference (including link) to an academic source related to your topic.

        

 

SOCI332 Week 2 Discussion Frequency Tables and Charts

This week's main Discussion requires you to complete three tasks. 

Task I: Frequency table

Now that you have imported the GSS 2018 dataset into SPSS and have learned how to use the GSS Data Explorer to look up GSS variable information, you are going to create and post a frequency table for each of your variables.  Complete the following steps:

Give your discussion title a unique label specific to your study/variables.  Post a brief explanation of your topic that includes a bit of information about your variables: level of measurement, answer categories (yes/no, strongly agree, disagree, etc.), and the survey question used to collect data for each variable (refer to the Week 1 Discussion). Include a frequency table for each of your variables. Since you have two variables, one DV and one IV, you need to run a frequency table for BOTH of them. When you are done, explain your outputs in no more than 5 sentences for each variable. Cite numbers in the outputs to support your conclusion. When you cite percentages, use those reported in the "Valid Percent" column. This column excludes all missing values and is therefore "clean."

To create a frequency table in SPSS

  1. Open SPSS and open your GSS data file
  2. Select Analyze
  3. Select Descriptive Statistics
  4. Select Frequencies
  5. Select Statistics
  6. Make sure that mean, median, mode, standard deviation, and variance are chosen and select "Continue"
  7. Choose the variable that you want to make a frequency table of and click the arrow (this will move it into the right 'Variable' box)
  8. Select OK
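
If it helps to see what the "Percent" and "Valid Percent" columns mean, here is a minimal Python sketch of the same arithmetic. It is not part of the SPSS workflow: the answers are made up (they are not GSS data), `None` stands in for a missing value, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: {"count", "percent", "valid_percent"}} for a list of answers.

    None marks a missing answer (an assumption of this sketch).
    """
    total = len(responses)                      # all cases, missing included
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    table = {}
    for category, count in counts.items():
        table[category] = {
            "count": count,
            "percent": 100 * count / total,             # base: all cases
            "valid_percent": 100 * count / len(valid),  # base: non-missing cases
        }
    return table

# Made-up answers to a yes/no question; two respondents did not answer.
responses = ["yes", "no", "yes", None, "yes", "no", None, "yes"]
table = frequency_table(responses)
for category, row in table.items():
    print(f"{category}: n={row['count']}, "
          f"percent={row['percent']:.1f}, valid percent={row['valid_percent']:.1f}")
```

The two percentages differ only in their denominator, which is why the valid percent is the one to cite when some respondents skipped the question.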

Task II. Describe the measures of central tendency (mean, median, mode) and dispersion (variance, standard deviation) for each of your variables.

Based on what you have learned in the readings and lessons this week, identify the measures for each variable and explain what they tell us. Keep in mind that the mean is more meaningful for interval/ratio variables, the median or mode for ordinal variables, and the mode for nominal variables. What do these measures summarize for us about the variable's data?
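
As a rough illustration of what these five measures are, the sketch below computes them with Python's standard `statistics` module. The values are invented for an interval/ratio variable and are not GSS data. The variance here is the sample variance (dividing by n - 1).

```python
import statistics

# Made-up values for an interval/ratio variable (for example, number of children).
values = [0, 1, 2, 2, 3, 2, 4, 0]

mean = statistics.mean(values)           # arithmetic average
median = statistics.median(values)       # middle value of the sorted data
mode = statistics.mode(values)           # most frequent value
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)       # square root of the sample variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.3f}, standard deviation={std_dev:.3f}")
```

For a nominal variable only the mode would be meaningful, even though the software will still print a mean if the categories are stored as numbers.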

Task III. Create charts (bar chart, pie chart, or histogram depending on your variables' level of measurement)

Presenting your data in graphic form is also important when conducting quantitative research. Based on what you have learned from the reading and the weekly lesson, create a graphic representation of your data. Your choice of graphing tool is purely based on a variable's level of measurement. When you are done, explain your outputs in no more than 5 sentences for each variable. It is OK if your explanation is similar to the frequency table interpretation, since chart is a different data presentation on the SAME variable. Cite numbers in the outputs to support your conclusion.

Basic rules:

Nominal: bar chart or pie chart

Ordinal: bar chart or histogram

Interval/Ratio: histogram or line chart

To Create a Chart

  1. Follow steps 1-4 above (without worrying about the statistics).
  2. Select Charts
  3. Select choice of format (depending on your variable's level of measurement)
  4. Select Continue
  5. Continue with steps 7-8 above (choose your variable, then select OK)

Copy and paste all of the frequency tables and charts into a document (PDF, MS Word) and attach it to the discussion. If your table/chart does not fit on the page, choose "Copy Special" and then "Images." Paste the images into the Word document and the problem will be solved.


 

 

SOCI332 Week 3 Assignment 1

Assignment 1: Research Guidelines

Complete the following assignment by filling in all pertinent areas of research.  You will need to utilize SPSS and the GSS dataset specified in the class for this assignment. You should complete this assignment using the variables and topic that you have chosen for your Final Portfolio Project.  You will then be able to follow this as a guide, as well as a check-point, with your instructor.  It is essential that you read through all of the feedback regardless of your score.  You will be required to submit:

  1. This Word document with blanks filled and SPSS outputs inserted.  Throughout the assignment you will see places where your tables, charts, and graphs can be placed. 
  2. An SPSS output file (spv) with this assignment for credit.

You may need to go back through the document to address formatting issues that shift as you begin to input your data. Points will be deducted for sloppiness. Use a different, but legible, color font for your responses.

This assignment is to be completed and submitted no later than the Sunday of Week 3 by 11:55 pm ET.  This assignment is worth 100 points.  Save the Word file as follows [your last name_SOCI332_A1] and submit it to Assignments for feedback. Label the SPV file as [your last name_SOCI332_A1output].

 

(A)   My Purpose (research question) (10 pts)

My research question is: _______________________________________________________.

I chose this topic because ____________________________________________________________________________________________________________________________________________________________.

APA citation of an academic resource that relates to your topic:

____________________________________________________________________________________________________________________________________________________________

 

(B)   All About the GSS (10 pts) ***Reference Lesson 1 and http://gss.norc.org/faq***

1.      Who are the participants? ___________________________________________________________________________

2.      What population does the sample represent? ___________________________________________________________________________

3.      Who is funding the research? ___________________________________________________

4.      When is the data collected? __________________________________________________

5.      How is the data collected? ___________________________________________________

 

(C)   Variables (You are expected to have only one dependent variable (DV) and one independent variable (IV).) (15 pts)


My IV: Provide information for the IV using the format below.

IV Variable name in SPSS: ___________________

IV Question (as asked to the respondent verbatim) __________________________________________
__________________________________________________________________________________________________________________________________________________________________________

IV Answer categories: ___________________________________________________________________
__________________________________________________________________________________________________________________________________________________________________________

IV Level of Measurement (nominal, ordinal, interval/ratio): ___________________

 

My DV: Provide information for the DV using the format below.

DV variable name in SPSS:  ______________________________

DV Question (as asked to the respondent verbatim)- __________________________________________
__________________________________________________________________________________________________________________________________________________________________________

DV Answer categories: _____________________________________________________________________
__________________________________________________________________________________________________________________________________________________________________________

DV Level of Measurement (nominal, ordinal, interval/ratio): __________________________

 

(D)   Frequency Tables (15 pts)

Run frequencies for each variable listed above.  Summarize your findings in a paragraph or two below.  What do the counts and valid percents tell you about each variable? Cite numbers in the frequency tables to support your conclusion. Be sure to insert your tables (copy and paste from SPSS) into this document.

[Insert SPSS frequency tables here]

 

 

 

 

(E)    Graphs and Charts (10 pts)

Run the appropriate graphs/charts for each of your variables listed above.  Summarize your findings briefly in a paragraph or two.  Cite numbers in the graph/charts to support your conclusion. How does the visual representation help us understand the data? Include a title on each of your charts and other pertinent labels. 

[Insert SPSS graphs/charts here]

 

 

 

 

(F)    Measures of Central Tendency and Dispersion (15 pts)   

Run the measures of central tendency (mean, median, mode) and dispersion (variance, standard deviation) for each of your variables. Summarize your findings briefly in a paragraph or two. Which measures are appropriate for nominal, ordinal, or interval/ratio variables? What do these measures tell us about each variable?

[Insert SPSS output here]

 

(G)   Recoding (15 pts)   

Choose one of your variables to recode. If you have an interval/ratio variable, you may recode it into an ordinal variable. If you have two nominal/ordinal variables, recode the one with the most categories into fewer categories, or check with your instructor on the best option.

[Insert the following items: SPSS syntax for the recoding process; the frequency table for the original variable; and the frequency table for the recoded variable] 
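
To make the idea of recoding concrete, here is a minimal Python sketch that collapses an interval/ratio variable (age in years) into an ordinal one. It only illustrates the logic; it is not SPSS syntax and cannot be submitted in place of it. The cut points, the numeric codes, and the name `recode_age` are all hypothetical choices.

```python
def recode_age(age):
    """Collapse age in years into four ordered groups (cut points are hypothetical)."""
    if age is None:
        return None   # keep missing values missing
    if age < 30:
        return 1      # 18-29
    if age < 50:
        return 2      # 30-49
    if age < 65:
        return 3      # 50-64
    return 4          # 65 and older

labels = {1: "18-29", 2: "30-49", 3: "50-64", 4: "65+"}

# Made-up ages, including one missing value.
ages = [18, 29, 30, 49, 50, 64, 65, 89, None]
recoded = [recode_age(a) for a in ages]
for age, code in zip(ages, recoded):
    print(age, "->", labels.get(code, "missing"))
```

Comparing the frequency table before and after recoding is a quick way to confirm that every original value landed in exactly one new category.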

 

 

 

 

(H)   Included SPV file (SPSS output of all syntax, tables and charts) – (10 pts)   

 

SOCI332 Week 4 Discussion Crosstabs

This week's main Discussion requires you to answer the question completely and correctly to receive full credit. 

This week we talk about the uses of a crosstabulation (crosstabs) and the benefits of creating this "snapshot" of your data.

For this forum, provide a brief introduction to your study to remind your classmates what we are reading about here.  Include:

  1. Your overall research question
  2. The research hypothesis and null hypothesis

Next, create a crosstab for your data and include it in the post.  Be sure to explain your findings, including a description of the data, a calculation of the epsilons, and a discussion of the 10% rule. The epsilons, in short, are the differences between the highest and lowest column % in any given row.  As long as one epsilon meets the 10% threshold, we'll deem the two variables to have "enough" to do with each other to warrant further statistical analysis.
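
The epsilon calculation can be sketched in a few lines of Python. The counts below are invented, the category labels are placeholders, and treating "meets the 10% threshold" as "at least 10 percentage points" is an assumption of this sketch. The DV is in the rows and the IV is in the columns.

```python
def column_percents(crosstab):
    """crosstab[row][col] holds counts; return the same layout as column percentages."""
    columns = next(iter(crosstab.values())).keys()
    col_totals = {c: sum(row[c] for row in crosstab.values()) for c in columns}
    return {
        r: {c: 100 * n / col_totals[c] for c, n in row.items()}
        for r, row in crosstab.items()
    }

def epsilons(crosstab):
    """Epsilon for each row = highest column % minus lowest column % in that row."""
    pct = column_percents(crosstab)
    return {r: max(row.values()) - min(row.values()) for r, row in pct.items()}

# Made-up counts: DV (happiness) in rows, IV (has children) in columns.
crosstab = {
    "happy":     {"no children": 30, "has children": 45},
    "not happy": {"no children": 20, "has children": 55},
}
eps = epsilons(crosstab)
meets_10_percent_rule = any(e >= 10 for e in eps.values())
print(eps, meets_10_percent_rule)
```

Note that the percentages are taken within each column (each IV category), so every column sums to 100% before the rows are compared.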

  

 

SOCI332 Week 5 Assignment 2

Assignment 2: Tests of Significance

 

Throughout this assignment you will review six mock studies. Follow the step-by-step instructions:   

 

a.  Mock Studies 1 – 3 require you to enter data from scratch. You need to create a data set for each of the three mock studies by yourself. (Refresh the data entry skill acquired in Week 1.)

b. Mock Studies 4 – 6 require you to use the GSS 2018 dataset. The variables are specified in each Mock Study.

c. Go through the five steps of hypothesis testing (below) for EVERY mock study.  

d.  All calculations should come from SPSS. You will need to submit the SPSS output file (.spv) to get credit for this assignment. 

 

The five steps of hypothesis testing when using SPSS are as follows:

  1. State your research hypothesis (H1) and null hypothesis (H0).
  2. Identify your significance level (alpha) at .05 or .01, based on the mock study. In Mock Study One, you are required to use BOTH .05 and .01 to test your hypotheses. For the remaining mock studies, you only need to use ONE level of significance (either .05 or .01) as specified in the instructions.
  3. Conduct your analysis using SPSS.
  4. Look for the valid score for comparison.  This score is usually under 'Sig 2-tail' or 'Sig. 2' or 'Asymptotic Sig.'  We will call this "p."
  5. Compare the two and apply the following rule:
    1. If "p" is less than or equal to alpha, you reject the null; otherwise, you fail to reject the null.
    2. Please explain what this decision means in regards to this mock study. (Ex: Will you recommend counseling services?)
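
The Step 5 rule is simple enough to write as a one-line function. In this sketch `decide` is a hypothetical name and the p value is invented; it shows how the same p can lead to different decisions at the .05 and .01 levels, which is the situation Mock Study One asks you to consider.

```python
def decide(p, alpha):
    """Apply the Step 5 rule: reject H0 when p <= alpha, otherwise fail to reject."""
    return "reject H0" if p <= alpha else "fail to reject H0"

p = 0.03  # hypothetical value read from the 'Sig.' column of an SPSS output
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p, alpha)}")
```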

 

Please make sure your answers are clearly distinguishable.  Perhaps you could bold your font or use a different color.

 

This assignment is due no later than Sunday of Week 5 by 11:55 pm ET.  Save this Word file in the following format: [your last name_SOCI332_A2].  Your spv (SPSS output) file should be labeled [your last name_SOCI332_A2Output].

 

t-Tests  (50 points)

Mock Study 1: t-Test for a Single Sample (20 points)

 

  1. Researchers are interested in whether depressed people undergoing group therapy will perform a different number of activities of daily living (ADL) after group therapy than the average for depressed people. More ADL is a positive outcome. The researchers randomly selected 15 depressed clients to undergo a 6-week group therapy program.

 

Use the five steps of hypothesis testing to determine whether the average number of activities of daily living (shown below in the table) obtained after therapy is significantly different from a mean number of activities of 17 that is typical for depressed people. (Clearly list each step).

 

Test the difference at both the .05 and .01 levels of significance.

 

As part of Step 5, indicate whether the behavioral scientists should recommend group therapy for all depressed people based on evaluation of the null hypothesis at both levels of significance (.05 and .01).

 

Data to be entered in SPSS (instructions below)

 

CLIENT    AFTER THERAPY ADL
A         18
B         14
C         11
D         25
E         24
F         17
G         14
H         10
I         23
J         11
K         22
L         19
M         15
N         17
O         23

 

Step 1: Data managing

 

1. Open a blank SPSS data file: File → New → Data

2. In the blank SPSS data file, create your SPSS data set by entering the number of activities of daily living performed by the depressed clients (numbers listed under AFTER THERAPY - see above) in the Data View window.

3. In the Variable View window, change the variable name to "ADL." Set the decimals to zero.

 

Step 2: SPSS execution

 

a. Click: Analyze → Compare Means → One-Sample T Test → use the arrow to move "ADL" to the Variable(s) window on the right.

b. Enter the population mean (17) in "Test Value"

c. Click OK.
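
If you want a hand check on the "t" and "df" columns of the SPSS output, the one-sample t statistic can be reproduced with Python's standard library. This is only a sketch: `one_sample_t` is a hypothetical helper name, and it does not compute the p value, which you still need to read from SPSS.

```python
import math
import statistics

# AFTER THERAPY ADL scores from the table above and the test value of 17.
adl = [18, 14, 11, 25, 24, 17, 14, 10, 23, 11, 22, 19, 15, 17, 23]
test_value = 17

def one_sample_t(sample, mu):
    """Return (t, df) for a one-sample t-test of the sample mean against mu."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(n)
    return (mean - mu) / standard_error, n - 1

t, df = one_sample_t(adl, test_value)
print(f"t = {t:.3f}, df = {df}")
```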

 

Mock Study 2: t-Test for Paired Samples (15 points)

  1. Researchers are interested in whether depressed people undergoing group therapy will perform a different number of activities of daily living before and after group therapy. The researchers randomly selected 10 depressed clients in a 6-week group therapy program.

 

Use the five steps of hypothesis testing to determine whether the observed differences in the numbers of activities of daily living obtained before and after therapy are statistically significant at the .05 level of significance. (Clearly list each step).

 

As part of Step 5, indicate whether the researchers should recommend group therapy for all depressed people based on evaluation of the null hypothesis.

 

      Data to be entered in SPSS (instructions below)

 

CLIENT    BEFORE THERAPY    AFTER THERAPY
A         11                17
B         7                 12
C         10                12
D         13                21
E         11                12
F         12                15
G         9                 16
H         8                 17
I         13                17
J         12                8

 

 

Step 1: Managing data

 

1. Open a blank SPSS data file: File → New → Data

2. In the blank SPSS data file, create your SPSS data set by entering the number of activities of daily living performed by the depressed clients (see above) in the Data View window. Enter the "before therapy" scores in the first column and the "after therapy" scores in the second column.

3. In the Variable View window, change the variable name for the first variable to "ADLPRE" and the second variable to "ADLPOST." Set the decimals for both variables to zero.

 

Step 2: SPSS execution

 

a. Click: Analyze → Compare Means → Paired-Samples T Test → use the arrow to move ADLPRE under "variable 1" inside the Paired Variable(s) window → then use the arrow to move ADLPOST under "variable 2" inside the Paired Variable(s) window.

b. Click OK.
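
The paired-samples t statistic is a one-sample t-test on the difference scores, which the following Python sketch makes explicit. `paired_t` is a hypothetical helper name, and the differences are taken as ADLPRE minus ADLPOST on the assumption that SPSS subtracts variable 2 from variable 1. The p value is not computed here.

```python
import math
import statistics

# BEFORE THERAPY and AFTER THERAPY scores from the table above.
adl_pre = [11, 7, 10, 13, 11, 12, 9, 8, 13, 12]
adl_post = [17, 12, 12, 21, 12, 15, 16, 17, 17, 8]

def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

t, df = paired_t(adl_pre, adl_post)
print(f"t = {t:.3f}, df = {df}")
```

The sign of t depends only on the order of subtraction, so a negative value here means the "after" scores are higher on average.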

 

Mock Study 3: t-Test for Independent Samples (15 points)

 

  1. Six months after an industrial accident, a researcher has been asked to compare the job satisfaction of employees who participated in counseling sessions with those who chose not to participate. The job satisfaction scores for both groups are reported in the table below.

 

Use the five steps of hypothesis testing to determine whether the job satisfaction scores of the group that participated in counseling sessions are statistically different from the scores of employees who chose not to participate in counseling sessions at the .01 level of significance. (Clearly list each step).

 

As part of Step 5, indicate whether the researcher should recommend counseling as a method to improve job satisfaction following industrial accidents based on evaluation of the null hypothesis. 

 

Data to be entered in SPSS (instructions below)

 

PARTICIPATED IN COUNSELING    DID NOT PARTICIPATE IN COUNSELING
36                            38
39                            36
41                            36
36                            32
37                            30
35                            39
37                            41
39                            35
42                            33

 

 

Step 1: Data managing

 

1. Open a blank SPSS data file: File → New → Data

2. In the blank SPSS data file, create your SPSS data set by entering the job satisfaction scores of those who participated/did not participate in the counseling sessions (reported in the table above). Please create two columns. Column 1 is the test variable, where you enter ALL 18 scores in the table. Column 2 is the grouping variable, where you use "1" to indicate that a score is from someone who participated in the counseling sessions and "0" to indicate that a score is from someone who chose not to participate. The data set will look like this in the SPSS Data View window:

 

36    1

39    1

……….

38    0

36    0

……….

 

3. After data entry, go to the Variable View window and change the name of the first variable (test variable) to "ADL" and the second variable (grouping variable) to "group." Set decimals for both variables to zero.

 

Step 2: SPSS execution

 

  1. Click: Analyze → Compare Means → Independent-Samples T Test → use arrow to move ADL to "Test Variable" → use arrow to move "group" to "Grouping Variable" → when two (? ?) appear, click Define Groups. On the next pop-up window, enter "1" for "Group 1" and "0" for "Group 2."
  2. Click OK.
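
For a hand check, the sketch below computes the independent-samples t statistic with a pooled variance, which should correspond to the "Equal variances assumed" row in the SPSS output. `pooled_t` is a hypothetical helper name. It does not compute the p value or the unequal-variances version of the test.

```python
import math
import statistics

# Job satisfaction scores from the table above.
counseling = [36, 39, 41, 36, 37, 35, 37, 39, 42]     # group = 1
no_counseling = [38, 36, 36, 32, 30, 39, 41, 35, 33]  # group = 0

def pooled_t(group1, group2):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / standard_error, df

t, df = pooled_t(counseling, no_counseling)
print(f"t = {t:.3f}, df = {df}")
```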

 

ANOVA (15 points)

Mock study 4: One-Way ANOVA

 

  1. An advertising firm has been hired to assess whether different demographics have different rates of TV watching to help determine their advertising strategy. Using the GSS 2018 data, determine whether hours of TV watched differ by race.

 

Use the five steps of hypothesis testing to determine whether the observed differences in the number of hours watching TV across the three groups are statistically significant at the .05 level of significance. (Clearly list each step).

As part of Step 5, indicate whether the advertising firm should target each racial group differently (if their habits differ) based on evaluation of the null hypothesis. 

 

Variables from GSS 2018 dataset to be used (instructions below):

 

RACE – race of respondent
1 = WHITE

2 = BLACK

3 = OTHER

 

TVHOURS – hours per day watching TV

 

 

Step 1: Data managing

 

1. Open the GSS 2018 data file: File → Open Data → GSS2018.sav (from wherever you have it saved)

 

Step 2: SPSS execution

 

  1. Click: Analyze → Compare Means → One-Way ANOVA → use arrow to move TVHOURS to "Dependent Variable list" → use arrow to move RACE to "Factor," which instructs SPSS to conduct the analysis of variance on the number of hours of TV watched by race.
  2. Click: Options → Descriptive (to obtain descriptive statistics).
  3. Click: Continue
  4. Click: OK.
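
To see where the F value in the ANOVA table comes from, here is a minimal Python sketch of the between-groups and within-groups calculation. The three groups and their scores are invented for illustration and are not GSS data, and `one_way_anova_f` is a hypothetical helper name. The p value is not computed.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of {group name: list of scores}."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = statistics.mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2
                     for s in groups.values())
    # Variation of the scores around their own group mean.
    ss_within = sum(sum((x - statistics.mean(s)) ** 2 for x in s)
                    for s in groups.values())
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up hours of TV per day for three hypothetical groups.
tv_hours = {"group A": [2, 3, 4], "group B": [4, 5, 6], "group C": [1, 2, 3]}
f, df_between, df_within = one_way_anova_f(tv_hours)
print(f"F({df_between}, {df_within}) = {f:.2f}")
```

A large F means the group means differ from one another by more than the spread inside the groups would lead you to expect.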

 

 

Additional question based on Mock Study 4

 

  1. Describe the circumstances under which you should use ANOVA instead of t-Tests. Explain why t-Tests are inappropriate in these circumstances.

 

Chi-Square (20 points)

Mock study 5-1: Chi-Square Test for Goodness of Fit

 

  1. Researchers are interested in whether US adults have different levels of confidence in Congress (legislative branch of the federal government).

 

Following the five steps of hypothesis testing, conduct "goodness of fit" chi-square test to determine whether the observed frequencies are significantly different from the expected frequencies at the .01 level of significance. (Clearly list each step).

 

As part of Step 5, indicate whether the observed frequencies are significantly different from the expected frequencies when an equal number of adults in each confidence category is assumed (100%/3 ≈ 33.3%), and what this means in regard to this mock study.

 

Variable from GSS 2018 dataset to be used (instructions below):

 

CONLEGIS – confidence in congress
1 = A GREAT DEAL

2 = ONLY SOME

3 = HARDLY ANY

 

 

Step 1: Data managing

 

1. Open the GSS 2018 data file: File → Open Data → GSS2018.sav (from wherever you have it saved)

 

Step 2: SPSS execution

 

  1. Click: Analyze → Non-Parametric Tests → Legacy Dialogs → Chi-Square → use the arrow to move CONLEGIS to "Test Variable list."

     This procedure instructs SPSS that the chi-square for goodness of fit should be performed on the confidence in Congress variable. Note that "All categories equal" is the default selection in the "Expected Values" box, which means that SPSS will conduct the goodness of fit test using equal expected frequencies for each of the different levels of confidence; in other words, SPSS will assume that the proportions of adults in each level are equal.

  2. Click OK.
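
The goodness of fit statistic with equal expected frequencies can be sketched as follows. The observed counts are invented and are not the GSS results, and `chi_square_goodness_of_fit` is a hypothetical helper name. The p value is not computed.

```python
def chi_square_goodness_of_fit(observed):
    """Return (chi-square, df) assuming equal expected counts in every category."""
    n = sum(observed)
    expected = n / len(observed)   # "All categories equal"
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, len(observed) - 1

# Made-up counts for three answer categories.
observed = [30, 50, 40]
chi2, df = chi_square_goodness_of_fit(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```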

 

Mock study 5-2: Chi-Square Test for Independence

 

  2. Next, researchers categorized the same group from the previous study based on the level of confidence in Congress and how strongly that person identifies with a specific political party. These data are presented below.

 

Following the five steps of hypothesis testing, conduct chi-square test for independence at the .05 level of significance.  (Clearly list each step).

 

As part of Step 5, indicate whether the observed frequencies are significantly different from the expected frequencies, and what that means in regard to this mock study. In other words, does political party affiliation affect one's confidence in Congress?

 

Variables from GSS 2018 dataset to be used (instructions below):

 

CONLEGIS – confidence in congress (legislative branch of government)
1 = A GREAT DEAL

2 = ONLY SOME

3 = HARDLY ANY

 

PARTYID – political party affiliation

0 = STRONG DEMOCRAT

1 = NOT STR DEMOCRAT

2 = IND NEAR DEMOCRAT

3 = INDEPENDENT

4 = IND NEAR REPUBLICAN

5 = NOT STR REPUBLICAN

6 = STRONG REPUBLICAN

7 = OTHER PARTY

 

 

Step 1: Data managing

1. Continue to work on the data set already opened in Mock Study 5-1: goodness of fit Chi-square test.

 

Step 2: SPSS execution

 

  1. Click: Analyze → Descriptive Statistics → Crosstabs → use arrow to move "PARTYID" to "Column(s)" → use arrow to move "CONLEGIS" to "Row(s)." (Recall in crosstab, DV is always in the row and IV is always in the column.)
  2. Click: Statistics → check "Chi-Square."
  3. Click: Continue.
  4. Click: Cells → check "Expected."
  5. Click: Continue.
  6. Click: OK.
  5. Click: Continue.
  6. Click: OK.
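
The "Expected" counts you asked SPSS to display come from the row and column totals, and the Pearson chi-square compares them with the observed counts. The Python sketch below shows that arithmetic on an invented 2 x 3 table (not the CONLEGIS by PARTYID results). `chi_square_independence` is a hypothetical helper name, and the p value is not computed.

```python
def chi_square_independence(table):
    """Return (Pearson chi-square, df) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Made-up counts: DV categories in rows, IV categories in columns.
table = [
    [10, 20, 30],
    [30, 20, 10],
]
chi2, df = chi_square_independence(table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```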

 

Regression (15 points)

Mock study 6: Linear Regression

 

  1. Researchers in the field of gerontology are researching the effects of age on mental health. They are using GSS data to gather some preliminary findings.

 

Following the five steps of hypothesis testing, conduct a linear regression analysis to determine whether age affects number of poor mental health days at the .05 level of significance. (Clearly list each step).

 

As part of Step 5, indicate whether there is a significant relationship between age and mental health at the .05 level and what this means in regard to this mock study. Should the researchers continue their study?

 

Variables from GSS 2018 dataset to be used (instructions below):

 

AGE – age of respondent

 

MNTLHLTH – Days of poor mental health past 30 days

 

 

Step 1: Data managing

 

1. Open the GSS 2018 data file: File → Open Data → GSS2018.sav (from wherever you have it saved)

 

Step 2: SPSS execution

 

  1. Click: Analyze → Regression → Linear → use arrow to move MNTLHLTH to "Dependent list" → use arrow to move AGE to "Independent," which instructs SPSS to conduct the linear regression on the relationship of age to poor mental health.
  2. Click: OK.
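
The slope (B), constant, and R Square that SPSS reports for a regression with one independent variable come from the least-squares formulas sketched below. The five data points are invented and say nothing about the actual GSS relationship. `simple_linear_regression` is a hypothetical helper name, and the significance test on the slope is left to SPSS.

```python
def simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) for a least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Made-up ages (IV) and days of poor mental health (DV).
ages = [20, 30, 40, 50, 60]
poor_days = [5, 7, 4, 6, 3]
slope, intercept, r_squared = simple_linear_regression(ages, poor_days)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R squared = {r_squared:.3f}")
```

The slope is read as the expected change in the dependent variable for each one-unit increase in the independent variable.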

 

          

 

 
