Machine Learning

Unsupervised and Supervised Learning

Clustering - K-means and Gaussian Mixture Models

FICO Score vs. Margin in bps, Unclustered Plot

FICO Score vs. Margin in bps, Clustered, with the Fitted Linear Equation and Black Stars Marking the Cluster Centers

Clustering

This project used an unsupervised approach to measure, by product type, how the true FICO scores group by division. I found that K-means clustering wasn't the best way to cluster the data, which instead followed a more linear distribution with outliers. However, I still completed the project and tested a Gaussian mixture approach to clustering the data points, followed by a silhouette analysis on the number of clusters. Ultimately, the silhouette analysis found that for n_clusters = 3 the average silhouette_score is 0.50, which suggests there are 3 clusters in the comparison of FICO score to margin, rather than the 9 that were previously considered for this product.
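To make the silhouette step concrete, here is a minimal sketch of how such a comparison could be run with scikit-learn. It is not the project's actual code: the data is a synthetic stand-in for the real FICO and margin values, and the helper name silhouette_by_k, the standardization step, and the range of 2 to 9 clusters are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the real data (assumption): three loose groups
# of (FICO score, margin in bps) points.
rng = np.random.default_rng(0)
centers = np.array([[620.0, 450.0], [700.0, 300.0], [780.0, 150.0]])
X = np.vstack([c + rng.normal(scale=[12.0, 20.0], size=(100, 2)) for c in centers])

# Standardize so FICO points and bps contribute on a comparable scale.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)


def silhouette_by_k(X, k_values, seed=0):
    """Return {k: (kmeans_score, gmm_score)} of average silhouette scores.

    Hypothetical helper, not part of scikit-learn.
    """
    scores = {}
    for k in k_values:
        km_labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        gmm_labels = GaussianMixture(n_components=k, random_state=seed).fit_predict(X)
        scores[k] = (
            silhouette_score(X, km_labels),
            silhouette_score(X, gmm_labels),
        )
    return scores


# Try every cluster count from 2 up to the 9 groups previously assumed.
scores = silhouette_by_k(X_scaled, range(2, 10))
for k, (km, gmm) in scores.items():
    print(f"n_clusters = {k}: K-means {km:.2f}, GMM {gmm:.2f}")

# The cluster count with the highest average K-means silhouette score.
best_k = max(scores, key=lambda k: scores[k][0])
print("Best n_clusters by K-means silhouette:", best_k)
```

The average silhouette score ranges from -1 to 1, and higher values mean points sit closer to their own cluster than to the nearest other cluster, so the cluster count with the highest score is the usual pick.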

Libraries used: matplotlib.pyplot, numpy, numpy.polynomial, sklearn.cluster, sklearn.metrics, sklearn.mixture, matplotlib.patches
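The linear trend described in the write-up could be fitted with numpy.polynomial from the list above. The sketch below is illustrative only: the data is synthetic, the slope and intercept used to generate it are made up, and the 3-standard-deviation outlier rule is an assumption rather than the method used in the project.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic stand-in data (assumption): margin falls as FICO rises, plus noise.
rng = np.random.default_rng(1)
fico = rng.uniform(600.0, 800.0, size=200)
margin_bps = 1500.0 - 1.7 * fico + rng.normal(scale=25.0, size=fico.size)

# Polynomial.fit works in a scaled domain internally; convert() maps the
# coefficients back to the original FICO units.
line = Polynomial.fit(fico, margin_bps, deg=1).convert()
intercept, slope = line.coef  # coefficients are ordered lowest degree first

# Flag points far from the fitted line as candidate outliers
# (the 3-sigma threshold is an illustrative choice).
residuals = margin_bps - line(fico)
outliers = np.abs(residuals) > 3 * residuals.std()

print(f"margin_bps ~ {intercept:.1f} + {slope:.3f} * FICO")
print("Candidate outliers:", int(outliers.sum()))
```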

Python Code

ANALYST/SCIENTIST
