Hierarchical Clustering Using Python Programming
Name: Tejas Sahoo
Roll No: K057
Branch: BTech Cyber Security
Aim:
To implement the hierarchical clustering algorithm in Python on the Iris dataset.
Introduction:
Clustering is an unsupervised technique that groups similar objects together. Hierarchical clustering builds a tree of nested clusters, usually drawn as a dendrogram, that shows how individual points and clusters relate to one another.
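To make the tree structure concrete, here is a minimal sketch using SciPy's linkage function. The four toy points and the choice of single linkage are assumptions made only for illustration and are not part of the experiment. Each row of the returned matrix describes one merge in the tree.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical toy data: four points on a line (not from the Iris dataset)
points = np.array([[0.0], [1.0], [5.0], [7.0]])

# Each row of Z records one merge: [cluster_a, cluster_b, distance, size].
# Original points are numbered 0..3; merged clusters get new ids 4, 5, ...
Z = linkage(points, method='single')

for a, b, dist, size in Z:
    print(f"merge {int(a)} and {int(b)} at distance {dist:.1f} "
          f"-> cluster of {int(size)} points")
```

With n points the matrix always has n - 1 rows, one per merge, and the last row corresponds to the root of the tree.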
Types of Hierarchical Clustering:
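- Agglomerative Hierarchical Clustering: Bottom-up approach that starts with each data point as its own cluster and repeatedly merges the two closest clusters.
- Divisive Hierarchical Clustering: Top-down approach that starts with all data points in a single large cluster and repeatedly splits it.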
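The bottom-up approach can be sketched in a few lines of plain Python. This is a simplified illustration, not the method used later in this report: it assumes 1-D points and single linkage (distance between the closest members of two clusters), and the function name agglomerative_merges is hypothetical.

```python
from itertools import combinations

def agglomerative_merges(points):
    """Return the merge history of single-linkage clustering on 1-D points."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    history = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        history.append((merged, d))
        # Replace the two merged clusters with their union
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history

history = agglomerative_merges([1.0, 2.0, 6.0, 7.0, 15.0])
for members, dist in history:
    print(members, dist)
```

The loop stops when a single cluster remains, so the history lists every merge from the individual points up to the root.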
Advantages:
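- Produces a dendrogram that shows the order and the distance at which clusters merge.
- Helps identify natural groupings in the data: the number of clusters need not be fixed in advance and can be chosen by cutting the dendrogram at a suitable height.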
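As a rough sketch of how groupings can be read from the tree, SciPy's fcluster cuts a linkage matrix either into a requested number of clusters or at a distance threshold. The toy points and the threshold of 1.0 below are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy data: two tight groups that are far apart
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

Z = linkage(points, method='ward')

# Option 1: ask for at most two flat clusters
labels_two = fcluster(Z, t=2, criterion='maxclust')

# Option 2: cut the tree at a chosen height (assumed threshold)
labels_cut = fcluster(Z, t=1.0, criterion='distance')

print("maxclust labels:", labels_two)
print("distance-cut labels:", labels_cut)
```

Both cuts should separate the two groups here, because merges within a group happen at small distances while the final merge between the groups happens at a much larger one.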
Code and Output:
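# Data handling and plotting libraries
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
# Clustering tools: scikit-learn for the model, SciPy for the dendrogram
from sklearn.cluster import AgglomerativeClustering
import scipy.cluster.hierarchy as sch
from sklearn import datasets

# Load the Iris dataset into a DataFrame with named feature columns
iris = datasets.load_iris()
iris_data = pd.DataFrame(iris.data, columns=iris.feature_names)
# Add the flower type label (0, 1 or 2) as the last column
iris_data['flower_type'] = iris.target
iris_data.head(10)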
Output:
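# Feature matrix: the four measurement columns
iris_X = iris_data.iloc[:, [0, 1, 2, 3]].values
# Target vector: the flower_type column
iris_Y = iris_data.iloc[:, 4].values
# Print both arrays (a bare expression only displays the last one in a cell)
print(iris_X)
print(iris_Y)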
Output:
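# Scatter plot of sepal length vs. sepal width, colored by flower type
plt.figure(figsize=(15,7))
plt.scatter(iris_X[iris_Y == 0,0],iris_X[iris_Y == 0,1],s=100,c='blue',label='Type 1')
plt.scatter(iris_X[iris_Y == 1,0],iris_X[iris_Y == 1,1],s=100,c='yellow',label='Type 2')
plt.scatter(iris_X[iris_Y == 2,0],iris_X[iris_Y == 2,1],s=100,c='red',label='Type 3')
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.legend()  # needed for the label= entries to appear on the plot
plt.show()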
Output:
# Build the Ward linkage on all four features and plot the dendrogram
plt.figure(figsize=(25,10))
sch.dendrogram(sch.linkage(iris_X, method='ward'))
plt.title('Dendrogram')
plt.xlabel('Data Points')
plt.ylabel('Euclidean Distance')
plt.show()
Output:
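The dendrogram shows the hierarchy but does not assign cluster labels to the flowers. A possible next step, sketched here under the assumption that three clusters are wanted (one per flower type), is to fit scikit-learn's AgglomerativeClustering with Ward linkage and compare the result with the known labels. The comparison with adjusted_rand_score is an addition for illustration and was not part of the original experiment.

```python
from sklearn import datasets
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

# Reload the data so this sketch runs on its own
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Assumption: three clusters, matching the three flower types
model = AgglomerativeClustering(n_clusters=3, linkage='ward')
cluster_labels = model.fit_predict(X)

# Cluster ids are arbitrary, so use a permutation-invariant score
ari = adjusted_rand_score(y, cluster_labels)
print("Cluster sizes:", [int((cluster_labels == k).sum()) for k in range(3)])
print("Adjusted Rand index vs. flower_type:", round(ari, 3))
```

An adjusted Rand index near 1 means the clusters closely match the flower types, while a value near 0 means the agreement is no better than chance.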
Conclusion:
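Hierarchical clustering was successfully implemented in Python on the Iris dataset, and the resulting cluster hierarchy was visualized as a dendrogram built with Ward linkage.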