Hw06
Multiple Regression of 4 independent variables + 1 dependent variable
Download the following iris data set. Make sure the file is unchanged and is named "iris.data"
You will have to manipulate the file on the fly. When you extract the fifth column of data from the file, map each of the Iris classes to a numerical value. Here are the values that I used in this assignment:
- Iris-setosa -> 0
- Iris-versicolor -> 1
- Iris-virginica -> 2
Here are the descriptions of the first four attributes of the file:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
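The column layout and class mapping above can be sketched as follows. This is a minimal sketch, not the instructor's solution: the names `CLASS_MAP` and `parse_iris` are made up for illustration, and it assumes each line of "iris.data" holds four comma-separated numbers followed by the class name, with possible blank lines that should be skipped.

```python
import numpy as np

# Class-to-number mapping taken from the assignment text.
CLASS_MAP = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}


def parse_iris(lines):
    """Return (X, y): X holds the four measurements, y the mapped class.

    `lines` is any iterable of text lines (hypothetical helper).
    """
    features, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:  # assumption: blank lines carry no data
            continue
        *values, name = line.split(",")
        features.append([float(v) for v in values])
        labels.append(CLASS_MAP[name])
    return np.array(features), np.array(labels)


# Usage, assuming iris.data sits in the working directory:
# with open("iris.data") as fh:
#     X, y = parse_iris(fh)
```

Keeping the file untouched and mapping the class names while reading is one way to do the manipulation on the fly.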
Your goal in this regression problem is to compute regression coefficients whose predictions, when rounded to the nearest integer, match your Iris class mappings. Compute the regression coefficients and the adjusted R^2 value for each of the following combinations of attributes:
- sepal length and petal length
- sepal length and sepal width
- sepal length, petal length, petal width
- sepal length, sepal width, petal length, petal width
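One possible way to get the coefficients and the adjusted R^2 for a given combination is an ordinary least-squares fit with an intercept, sketched below. The helper name `fit_linear` and the `COMBINATIONS` dictionary are assumptions made for illustration, as is the column order (sepal length, sepal width, petal length, petal width at indices 0 to 3). The adjusted value uses the usual formula 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of rows and p the number of attributes.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    Returns (coeffs, adj_r2); coeffs[0] is the intercept.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])  # prepend the intercept column
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coeffs
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coeffs, adj_r2


# Column indices for each combination (assumed column order, see above).
COMBINATIONS = {
    "sepal length, petal length": [0, 2],
    "sepal length, sepal width": [0, 1],
    "sepal length, petal length, petal width": [0, 2, 3],
    "sepal length, sepal width, petal length, petal width": [0, 1, 2, 3],
}

# Usage, assuming X and y were already loaded from iris.data:
# for name, cols in COMBINATIONS.items():
#     coeffs, adj_r2 = fit_linear(X[:, cols], y)
#     print(name, coeffs, adj_r2)
```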
Which combination produces the best adjusted R^2 value?
Take the predictions of your best regression equation and round them to the nearest integer (you can use "np.round()" to do the trick). Count how many match the actual classes and print your score out of 150 Iris classifications.
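The rounding and counting step might look like the sketch below. The function name `score` is hypothetical, and it assumes the predictions and the actual class values are NumPy arrays of the same length.

```python
import numpy as np


def score(predictions, actual):
    """Count predictions that equal the actual class after rounding."""
    rounded = np.round(predictions).astype(int)
    return int(np.sum(rounded == actual))


# Usage, assuming A is the design matrix (intercept column included)
# and coeffs are the fitted coefficients:
# correct = score(A @ coeffs, y)
# print(f"{correct} / {len(y)}")
```

Note that "np.round()" rounds exact halves to the nearest even integer, so a prediction of exactly 0.5 becomes 0 and 1.5 becomes 2.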
Your instructor was able to create an equation to get 146 correct. Using this approach, this is the best possible answer. There are no graphs required for this homework assignment.