ANALYST/SCIENTIST

Machine Learning

Unsupervised and Supervised Learning

Clustering - K-means and Gaussian Mixture Models

 
 
FICO SCORE v. Margin in BPS Un-clustered Plot


FICO SCORE v. Margin in bps Clustered, Applied Linear Equation and Black Stars to Represent the Center of the Clusters


Clustering

This project used an unsupervised approach to measure, by product type, the true FICO scores by division. I found that K-means clustering wasn't the best way to cluster the data, which instead followed a more linear distribution with outliers. However, I still completed the project and tested a Gaussian mixture approach to clustering the data points, followed by a silhouette analysis on the number of clusters. Ultimately, the silhouette analysis found that for n_clusters = 3 the average silhouette_score was 0.50, which suggests 3 clusters in the FICO-to-margin comparison rather than the 9 that were previously considered for this product.
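A minimal sketch of this kind of silhouette analysis is shown below. It is not the project's actual code: the data are synthetic (three made-up groups of FICO score and margin values), and the helper names `make_synthetic_data`, `standardize`, and `silhouette_by_k` are hypothetical. The range of 2 to 9 clusters is an assumption chosen to cover the 9 groups mentioned above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture


def make_synthetic_data(seed=0):
    """Build three made-up (FICO score, margin in bps) groups. Illustrative only."""
    rng = np.random.default_rng(seed)
    centers = np.array([[620.0, 400.0], [700.0, 300.0], [780.0, 200.0]])
    groups = [c + rng.normal(scale=8.0, size=(100, 2)) for c in centers]
    return np.vstack(groups)


def standardize(X):
    """Put both features on a comparable scale before measuring distances."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def silhouette_by_k(X, k_values):
    """Return the average silhouette score of a K-means fit for each k."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


X = standardize(make_synthetic_data())
scores = silhouette_by_k(X, range(2, 10))

# The k with the highest average silhouette score is the preferred cluster count.
best_k = max(scores, key=scores.get)

# Fit a Gaussian mixture with the same number of components for comparison.
gmm = GaussianMixture(n_components=best_k, random_state=0).fit(X)
gmm_labels = gmm.predict(X)

for k, score in scores.items():
    print(f"For n_clusters = {k} the average silhouette_score is: {score:.2f}")
```

The silhouette score ranges from -1 to 1, and higher values indicate tighter, better-separated clusters, so comparing the score across candidate values of n_clusters is one common way to choose the cluster count.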

 

Libraries used: matplotlib.pyplot, numpy, numpy.polynomial, sklearn.cluster, sklearn.metrics, sklearn.mixture, matplotlib.patches
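Since the data followed a roughly linear distribution with outliers, a straight-line fit with numpy.polynomial is one way the linear equation in the clustered plot could be produced. The sketch below is an assumption, not the project's code: the data are synthetic, the slope and intercept used to generate them are made up, and the 3-standard-deviation outlier rule is an illustrative choice.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic, made-up data: margin falls roughly linearly as FICO score rises.
rng = np.random.default_rng(1)
fico = rng.uniform(600, 820, size=200)
margin_bps = 1000.0 - 1.0 * fico + rng.normal(scale=10.0, size=200)

# Fit a degree-1 polynomial; convert() maps the coefficients back to the
# original FICO scale (Polynomial.fit works in a rescaled domain internally).
line = Polynomial.fit(fico, margin_bps, deg=1).convert()
intercept, slope = line.coef  # coefficients are ordered lowest degree first

# Flag points that sit far from the fitted line as candidate outliers.
residuals = margin_bps - line(fico)
outliers = np.abs(residuals) > 3 * residuals.std()

print(f"margin_bps = {slope:.3f} * fico + {intercept:.1f}")
print(f"candidate outliers: {int(outliers.sum())}")
```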

Figure_2.png
Figure_3.png
 
Figure_4.png
Figure_5.png
Figure_6.png