Week 4: Project Assignment Paper.
Part 1
Table 1 presents the results of a hypothetical study of the link between deaths caused by coronary heart disease (CHD) and cigarette sales. The data were collected from ten states. A review of the data shows that high cigarette sales are typically accompanied by high CHD death rates, and low sales by low death rates. The null hypothesis is that there is no linear correlation between cigarette sales and CHD death rates. The alternative hypothesis is that the two variables are linearly correlated. The hypothesis was tested with Pearson’s correlation coefficient, which measures the direction and strength of a linear relationship, to determine whether the relationship between cigarette sales and the number of deaths resulting from CHD is statistically significant (Goos & Meintrup, 2016). The test yielded r = 0.826 (p = 0.003). The coefficient indicates a strong positive linear correlation between cigarette sales and CHD death rates, and the result is statistically significant because the p-value is less than 0.05. The null hypothesis is therefore rejected. The result is consistent with the hypothesis that cigarette smoking increases fatalities from CHD, although a correlation in state-level data cannot by itself establish causation.
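As a cross-check on the reported coefficient, the calculation can be sketched in plain Python. This is a minimal illustration that assumes the Table 1 values as listed in the assignment; the function name `pearson_r` is hypothetical and this is not the Minitab procedure itself.

```python
import math

# Table 1: per capita cigarette sales and CHD death rates for ten states.
sales = [102, 149, 165, 159, 112, 78, 112, 174, 101, 191]
deaths = [5, 6, 6, 5, 3, 2, 5, 7, 4, 6]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(sales, deaths)
df = len(sales) - 2
# t statistic for the null hypothesis of zero correlation, with n - 2 degrees of freedom.
t = r * math.sqrt(df / (1 - r ** 2))
print(f"r = {r:.3f}, t = {t:.2f}, df = {df}")
```

The p-value is not computed here because it requires the t distribution, which the standard library does not provide; a statistics package would supply it.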
Part 2
Table 2. Blood pressure and fat intake
Individual | Blood Pressure | Fat Intake |
1 | 135 | 1 |
2 | 130 | 1 |
3 | 135 | 1 |
4 | 128 | 0 |
5 | 121 | 0 |
6 | 133 | 0 |
7 | 145 | 1 |
8 | 137 | 1 |
9 | 148 | 1 |
10 | 134 | 0 |
11 | 150 | 0 |
12 | 121 | 0 |
13 | 117 | 1 |
14 | 128 | 1 |
15 | 121 | 0 |
16 | 124 | 1 |
17 | 132 | 0 |
18 | 121 | 0 |
19 | 120 | 0 |
20 | 124 | 0 |
One-way ANOVA: Blood pressure versus Fat intake
Method
Null hypothesis | All means are equal |
Alternative hypothesis | At least one mean is different |
Significance level | α = 0.05 |
Equal variances were assumed for the analysis.
Factor Information
Factor | Levels | Values |
Fat intake | 2 | 0, 1 |
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Fat intake | 1 | 149.5 | 149.46 | 1.68 | 0.211 |
Error | 18 | 1599.7 | 88.87 | | |
Total | 19 | 1749.2 | | | |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
9.42732 | 8.54% | 3.46% | 0.00% |
Means
Fat intake | N | Mean | StDev | 95% CI |
0 | 11 | 127.73 | 9.14 | (121.76, 133.70) |
1 | 9 | 133.22 | 9.77 | (126.62, 139.82) |
Pooled StDev = 9.42732
Table 2 presents the blood pressure and fat intake values for 20 individuals. In this case, fat intake is the independent variable and blood pressure is the dependent variable. The intention of the study was to determine whether there is a relationship between fat intake and blood pressure, that is, whether mean blood pressure differs between the high and low fat intake groups. To test this, a one-way ANOVA was conducted (Vik, 2014). The null hypothesis is that the two group means are equal, and the alternative hypothesis is that they are not equal. The test yielded F = 1.68 (p = 0.211). Because the p-value is greater than 0.05, the difference in blood pressure between the high and low fat intake groups is not statistically significant. As such, the null hypothesis is not rejected, and the data do not support the alternative hypothesis that at least one mean is different. These results are visualized in the interval plot. In conclusion, there is no statistically significant difference in blood pressure between the fat intake groups (F(1, 18) = 1.68, p = .211).
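The F statistic in the Minitab output can be reproduced by hand from the sums of squares. The sketch below is a minimal illustration in plain Python, assuming the Table 2 values split by the fat intake code; the function name `one_way_anova` is hypothetical.

```python
# Table 2: blood pressure grouped by fat intake code (1 = high, 0 = low).
high = [135, 130, 135, 145, 137, 148, 117, 128, 124]
low = [128, 121, 133, 134, 150, 121, 121, 132, 121, 120, 124]


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group (error) sum of squares: squared deviations from each group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df1, df2 = one_way_anova(high, low)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

With only two groups, this one-way ANOVA is equivalent to a pooled two-sample t-test, so the F statistic equals the square of that t statistic.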
Week 4: Project Assignment
Regression and Correlation Methods: Correlation, ANOVA, and Least Squares
This is another way of assessing the possible association between a normally distributed variable y and a categorical variable x. These techniques are special cases of linear regression methods. The purpose of the assignment is to demonstrate methods of regression and correlation analysis in which two different variables in the same sample are related.
The following are three important statistics, or methodologies, for using correlation and regression:
- Pearson’s correlation coefficient
- ANOVA
- Least squares regression analysis
In this assignment, solve problems related to these three methodologies.
Part 1: Pearson’s Correlation Coefficient
For the problem that demonstrates Pearson’s coefficient, you will use measures that represent characteristics of entire populations to describe disease in relation to some factor of interest, such as age; utilization of health services; or consumption of a particular food, medication, or other product. To describe a pattern of mortality from coronary heart disease (CHD) in year X, hypothetical death rates from ten states were correlated with per capita cigarette sales in dollar amount per month. Death rates were highest in states with the most cigarette sales, lowest in those with the least sales, and intermediate in the remainder. This observation contributed to the formulation of the hypothesis that cigarette smoking causes fatal CHD. The correlation coefficient, denoted by r, is the descriptive measure of association in correlational studies.
Table 1: Hypothetical Analysis of Cigarette Sales and Death Rates Caused by CHD
State | Cigarette sales | Death rate |
1 | 102 | 5 |
2 | 149 | 6 |
3 | 165 | 6 |
4 | 159 | 5 |
5 | 112 | 3 |
6 | 78 | 2 |
7 | 112 | 5 |
8 | 174 | 7 |
9 | 101 | 4 |
10 | 191 | 6 |
Using the Minitab statistical procedure:
- Calculate Pearson’s correlation coefficient.
- Create a two-way scatter plot.
In addition to the above:
- Explain the meaning of the resulting coefficient, paying particular attention to factors that affect the interpretation of this statistic, such as the normality of each variable.
- Provide a written interpretation of your results in APA format.
Refer to the Assignment Resources: Dot Plots and Correlation and Resources: Performing Regression Analysis to view an example of Pearson’s correlation coefficient. These same resources are also available under lecture Correlation and Regression Methods.
Submission Details:
- Name your Minitab output file mtw.
- Name your document SU_PHE5020_W4_A2b_LastName_FirstInitial.doc.
- Submit your document to the Submissions Area by the due date assigned.
Part 2: ANOVA
Let’s take hypothetical data presenting blood pressure and high fat intake (less than 3 grams of total fat per serving) or low fat intake (less than 1 gram of saturated fat) of an individual.
Table 2: Blood Pressure and Fat Intake
Individual | Blood Pressure | Fat Intake |
1 | 135 | 1 |
2 | 130 | 1 |
3 | 135 | 1 |
4 | 128 | 0 |
5 | 121 | 0 |
6 | 133 | 0 |
7 | 145 | 1 |
8 | 137 | 1 |
9 | 148 | 1 |
10 | 134 | 0 |
11 | 150 | 0 |
12 | 121 | 0 |
13 | 117 | 1 |
14 | 128 | 1 |
15 | 121 | 0 |
16 | 124 | 1 |
17 | 132 | 0 |
18 | 121 | 0 |
19 | 120 | 0 |
20 | 124 | 0 |
Using the Minitab statistical procedure:
- Calculate a one-way ANOVA to test the null hypothesis that the mean of each group is the same.
- Use different variables as grouping variables (fat intake high 1; fat intake low 0) and compare the results.
- Calculate an F-test for an overall comparison of means to see whether any differences are significant.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
Visit the media Resources: One-Way ANOVA on lecture Correlation and Regression Methods to view an example of ANOVA.
Submission Details:
- Name your Minitab output file mtw.
- Name your document doc.
- Submit your document to the Submissions Area by the due date assigned.
Part 3: Least Squares
The following are hypothetical data on the number of doctors per 10,000 inhabitants and the rate of prematurely delivered newborns for different countries of the world.
Table 3: Number of Doctors Versus the Rate of Prematurely Delivered Newborns
Country | Doctors per 100,000 | Early births per 100,000 |
1 | 3 | 92 |
2 | 5 | 88 |
3 | 5 | 85 |
4 | 6 | 86 |
5 | 7 | 89 |
6 | 7 | 75 |
7 | 7 | 70 |
8 | 8 | 68 |
9 | 8 | 69 |
10 | 10 | 50 |
11 | 12 | 45 |
12 | 12 | 41 |
13 | 15 | 38 |
14 | 18 | 35 |
15 | 19 | 30 |
16 | 23 | 6 |
Using the Minitab statistical procedure:
- Apply least squares analysis to fit a regression line to the data.
- Calculate an F-test and a t-test to test for the significance of the regression.
- Test for goodness of fit using R².
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
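The three steps listed above (fitting the line, testing its significance, and checking goodness of fit) can be sketched in plain Python. This is a minimal illustration that assumes the Table 3 values with doctors as the predictor and early births as the response; the variable names are hypothetical, and the sketch stands in for, rather than reproduces, the Minitab regression procedure.

```python
import math

# Table 3: doctors and early births per country, units as given in the table.
doctors = [3, 5, 5, 6, 7, 7, 7, 8, 8, 10, 12, 12, 15, 18, 19, 23]
early_births = [92, 88, 85, 86, 89, 75, 70, 68, 69, 50, 45, 41, 38, 35, 30, 6]

n = len(doctors)
mean_x = sum(doctors) / n
mean_y = sum(early_births) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in doctors)
syy = sum((y - mean_y) ** 2 for y in early_births)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doctors, early_births))

# Least squares estimates of the regression line y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Partition the total variation into regression and residual parts.
ss_regression = slope * sxy
ss_residual = syy - ss_regression
ms_residual = ss_residual / (n - 2)

r_squared = ss_regression / syy              # goodness of fit
f_stat = ss_regression / ms_residual         # F-test with (1, n - 2) df
t_slope = slope / math.sqrt(ms_residual / sxx)  # t-test for the slope, n - 2 df

print(f"early births = {intercept:.2f} + ({slope:.2f}) * doctors")
print(f"R^2 = {r_squared:.3f}, F(1, {n - 2}) = {f_stat:.2f}, t = {t_slope:.2f}")
```

In simple linear regression with one predictor, the F statistic equals the square of the slope's t statistic, so the two tests lead to the same conclusion.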
Submission Details:
- Name your Minitab output file mtw.
- Name your document doc.
- Submit your document to the Submissions Area by the due date assigned.
Additional Materials
- Dot Plots and Correlation
- Performing Regression Analysis