# Clustering Analysis of Homeowner 1 and Homeowner 2

Average revolving utilization = 0.531304348 = 53.13%

The cluster analysis indicates that homeowner 2 has a better chance of obtaining credit, owing to a lower average credit history (10.09 years) than homeowner 1 (12.67 years). Homeowner 2 also carries a higher average revolving balance (19,523.33) than homeowner 1 (14,063.70). The revolving utilization of homeowner 2 is likewise higher (53.13%) than that of homeowner 1, which the analysis takes to mean that homeowner 2 has better credit security. The average Euclidean distances reported by the cluster analysis cannot be used to cluster the data further, but final results can still be obtained for the two sets of data (homeowner 1 and homeowner 2). Hence, in this case, cluster analysis is efficient and applicable.

Q13. K-NN method.

Dif 1 = $X_i$ value in cluster $i$ − $X_{i+1}$ value in cluster $i$.

Dif 2 = $X_j$ value in cluster $j$ − $X_{j+1}$ value in cluster $j$.

Dif 3 = $X_k$ value in cluster $k$ − $X_{k+1}$ value in cluster $k$.

Dif 4 = $X_l$ value in cluster $l$ − $X_i$ value in cluster $i$.

Sorted Di in ascending order

| Homeowner | Credit Score | Years of Credit History | Revolving Balance | Revolving Utilization |
|---|---|---|---|---|
| 1 | 742 | 15 | \$16,700.00 | 18% |
| 0 | 520 | 1 | \$4,000.00 | 90% |
| 1 | 650 | 10 | \$8,500.00 | 25% |
| 0 | 602 | 7 | \$16,300.00 | 70% |
| 0 | 549 | 2 | \$2,500.00 | 90% |
| 1 | 700 | 8 | \$21,000.00 | 15% |

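| Homeowner | Dif 1 | Dif 2 | Dif 3 | Dif 4 |
|---|---|---|---|---|
| 1 | -742 | -15 | -16700 | -0.18 |
| 0 | 130 | 9 | 4500 | -0.65 |
| 1 | -48 | -3 | 7800 | 0.45 |
| 0 | -53 | -5 | -13800 | 0.2 |
| 0 | 151 | 6 | 18500 | -0.75 |
| 1 | 700 | 8 | 21000 | 0.15 |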

| Homeowner | Di |
|---|---|
| 1 | 16715.98304 |
| 0 | 4502.38612 |
| 1 | 7800.648284 |
| 0 | 13799.60268 |
| 0 | 18501.11718 |
| 1 | 21012.16467 |
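As a rough illustration of how a per-row distance can be derived from the four differences, the sketch below takes the tabulated Dif 1 to Dif 4 values and computes a plain Euclidean norm for each row. The exact formula behind the Di column is not stated, so the Euclidean norm is an assumption here, and the output may differ from the tabulated Di values in the decimals. The names `DIF_ROWS` and `euclidean_norm` are illustrative only.

```python
import math

# Tabulated rows: (homeowner class, (Dif 1, Dif 2, Dif 3, Dif 4)).
DIF_ROWS = [
    (1, (-742, -15, -16700, -0.18)),
    (0, (130, 9, 4500, -0.65)),
    (1, (-48, -3, 7800, 0.45)),
    (0, (-53, -5, -13800, 0.2)),
    (0, (151, 6, 18500, -0.75)),
    (1, (700, 8, 21000, 0.15)),
]


def euclidean_norm(diffs):
    """Square root of the sum of squared differences (assumed formula)."""
    return math.sqrt(sum(d * d for d in diffs))


# One (class, distance) pair per row, in the same order as the table.
distances = [(label, euclidean_norm(diffs)) for label, diffs in DIF_ROWS]

for label, dist in distances:
    print(label, round(dist, 2))
```

Because the features are not rescaled, the revolving balance difference (Dif 3) dominates every distance; standardizing the columns first would be one way to give the other features more weight.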

K = 3

Top 3 rows are indicated in blue.

The most frequent class among these rows has credit scores in the range 650-742 and 10-15 years of credit history.

Based on the K-NN analysis, homeowner 1 is best suited for credit.
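For comparison, a textbook K-NN majority vote can be sketched as follows. This is a generic illustration rather than the spreadsheet procedure used above: it treats the six tabulated rows as labelled training data, measures Euclidean distance on the raw (unscaled) features, and classifies a query point by the most common class among its K = 3 nearest rows. The `applicant` values are hypothetical and chosen only to exercise the function.

```python
import math
from collections import Counter

# Six tabulated rows: ((credit score, years, balance, utilization), class).
TRAIN = [
    ((742, 15, 16700.0, 0.18), 1),
    ((520, 1, 4000.0, 0.90), 0),
    ((650, 10, 8500.0, 0.25), 1),
    ((602, 7, 16300.0, 0.70), 0),
    ((549, 2, 2500.0, 0.90), 0),
    ((700, 8, 21000.0, 0.15), 1),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, train, k=3):
    """Return the majority class among the k training rows nearest to query."""
    ranked = sorted(train, key=lambda row: euclidean(query, row[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical applicant, not taken from the data set.
applicant = (680, 9, 15000.0, 0.30)
print(knn_predict(applicant, TRAIN, k=3))
```

With an odd K and two classes the vote cannot tie, which is one common reason for choosing K = 3.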

Q15. Logistic regression

$$P = \frac{e^{\beta_0 + \beta}}{e^{\beta_0 + \beta} + 1}$$

$$y = \beta_0 + \beta$$

$$P = \frac{e^{y}}{e^{y} + 1} = \frac{1}{1 + e^{-(\beta_0 + \beta)}}$$
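The two equivalent forms of the logistic function can be checked numerically with a short sketch. The coefficient values `beta0` and `beta` below are hypothetical placeholders, not fitted values from this analysis, and the function names are illustrative.

```python
import math


def logistic_ratio(y):
    """Logistic function written as e^y / (e^y + 1)."""
    e = math.exp(y)
    return e / (e + 1.0)


def logistic(y):
    """Equivalent form 1 / (1 + e^(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


# Hypothetical coefficients, used only to show that both forms agree.
beta0, beta = -1.5, 0.8
y = beta0 + beta

p = logistic(y)
print(p, logistic_ratio(y))
```

Both forms return a probability strictly between 0 and 1, and a linear predictor of y = 0 corresponds to P = 0.5.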

Mean = 11.4

a= 11.4

-0.03

Values

| P |
|---|
| 0.173773943 |
| 0.203773943 |
| 0.233773943 |
| 0.263773943 |
| 0.293773943 |
| 0.326773943 |
| 0.359773943 |
| 0.392773943 |
| 0.425773943 |
| 0.458773943 |
| 0.491773943 |
| 0.524773943 |
| 0.558273943 |
| 0.591773943 |
| 0.625273943 |
| 0.658773943 |
| 0.859773943 |
| 0.893273943 |
| 0.895273943 |
| 0.897273943 |
| 0.899273943 |
| 0.901273943 |
| 0.903273943 |
| 0.905273943 |
| 0.907273943 |
| 0.909273943 |
| 0.911273943 |
| 0.913273943 |
| 0.915273943 |
| 0.917273943 |
| 0.919273943 |
| 0.921273943 |
| 0.923273943 |
| 0.925273943 |
| 0.927273943 |
| 0.929273943 |
| 0.931273943 |
| 0.931473943 |
| 0.931673943 |
| 0.942973943 |
| 0.954273943 |
| 0.965573943 |
| 0.976873943 |
| 0.988173943 |
| 0.999473943 |
| 0.558273943 |
| 0.591773943 |
| 0.625273943 |
| 0.658773943 |
| 0.692273943 |
| 0.725773943 |
| 0.759273943 |
| 0.792773943 |
| 0.826273943 |

Average value of credit score = 652.28

Average years of credit history = 11.24

Tryfos, P. (2001). Cluster Analysis. London: Oxford University.

