
Read the data file moon.csv. Part of the data is shown below:

X1         X2         Label
0.682988   -0.3452    1
2.04099    -0.13161   1
-0.13975   0.454391   1
0.885748   0.286342   0
1.182459   -0.49771   1
1.019208   -0.42547   1
0.389362   -0.24712   1
-0.56038   0.712681   0
0.901844   0.053931   0
0.206748   0.006208   1
1.323523   -0.2891...

Below is example Python code that carries out the task you described. Make sure the required libraries (pandas, matplotlib, and scikit-learn) are installed.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.metrics import silhouette_score

# 1. Read the data and draw a scatter plot, colored by the original labels
data = pd.read_csv('moon.csv')
labels = data['Label']
plt.scatter(data['X1'], data['X2'], c=labels)
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Scatter Plot of Data')
plt.show()

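# 2. Cluster with K-means, then report the correct-classification ratio and silhouette coefficient
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)  # fixed seed for repeatable results
kmeans.fit(data[['X1', 'X2']])
new_labels_kmeans = kmeans.labels_
data['New_Label_KMeans'] = new_labels_kmeans

# Cluster ids are arbitrary, so score both ways of matching them to the 0/1 labels
correct_ratio_kmeans = max((new_labels_kmeans == labels).mean(),
                           (new_labels_kmeans == 1 - labels).mean())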
silhouette_coef_kmeans = silhouette_score(data[['X1', 'X2']], new_labels_kmeans)

print(f'Correct ratio (K-means): {correct_ratio_kmeans:.4f}')
print(f'Silhouette coefficient (K-means): {silhouette_coef_kmeans:.4f}')

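# 3. Cluster with hierarchical (agglomerative) clustering, then report the same two metrics
hierarchical_clustering = AgglomerativeClustering(n_clusters=2)
hierarchical_clustering.fit(data[['X1', 'X2']])
new_labels_hierarchical = hierarchical_clustering.labels_
data['New_Label_Hierarchical'] = new_labels_hierarchical

# Cluster ids are arbitrary, so score both ways of matching them to the 0/1 labels
correct_ratio_hierarchical = max((new_labels_hierarchical == labels).mean(),
                                 (new_labels_hierarchical == 1 - labels).mean())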
silhouette_coef_hierarchical = silhouette_score(data[['X1', 'X2']], new_labels_hierarchical)

print(f'Correct ratio (Hierarchical): {correct_ratio_hierarchical:.4f}')
print(f'Silhouette coefficient (Hierarchical): {silhouette_coef_hierarchical:.4f}')

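# 4. Cluster with DBSCAN, then report the correct-classification ratio and silhouette coefficient
dbscan = DBSCAN(eps=0.3, min_samples=5)
dbscan.fit(data[['X1', 'X2']])
new_labels_dbscan = dbscan.labels_  # -1 marks noise points
data['New_Label_DBSCAN'] = new_labels_dbscan

# Cluster ids are arbitrary; noise (-1) and any extra clusters count as incorrect
correct_ratio_dbscan = max((new_labels_dbscan == labels).mean(),
                           (new_labels_dbscan == 1 - labels).mean())
# silhouette_score raises ValueError unless there are at least two distinct labels
if len(set(new_labels_dbscan)) > 1:
    silhouette_coef_dbscan = silhouette_score(data[['X1', 'X2']], new_labels_dbscan)
else:
    silhouette_coef_dbscan = float('nan')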

print(f'Correct ratio (DBSCAN): {correct_ratio_dbscan:.4f}')
print(f'Silhouette coefficient (DBSCAN): {silhouette_coef_dbscan:.4f}')

# Write the new labels back to the original data file
data.to_csv('moon.csv', index=False)
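
The cluster ids returned by K-means, hierarchical clustering, and DBSCAN are arbitrary, so a cluster numbered 0 may correspond to Label 1. The sketch below shows one possible way to compute a correct-classification ratio that does not depend on the numbering. It assumes NumPy and SciPy are installed; the helper name matched_accuracy and the toy labels are hypothetical and not part of the original answer.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def matched_accuracy(true_labels, cluster_labels):
    """Fraction of points correct under the best one-to-one cluster-to-class mapping."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    classes = np.unique(true_labels)
    # DBSCAN noise (-1) is never matched to a class, so noise points count as wrong
    clusters = np.unique(cluster_labels[cluster_labels != -1])
    if clusters.size == 0:
        return 0.0
    # counts[i, j] = number of points in cluster i whose true class is j
    counts = np.array([[np.sum((cluster_labels == k) & (true_labels == c))
                        for c in classes] for k in clusters])
    # Pick the cluster-to-class pairing that maximizes the number of agreeing points
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / len(true_labels)


y_true = [0, 0, 0, 1, 1, 1]
swapped = matched_accuracy(y_true, [1, 1, 1, 0, 0, 0])     # same grouping, ids swapped
with_noise = matched_accuracy(y_true, [0, 0, -1, 1, 1, 1])  # one point marked as noise
print(swapped, with_noise)
```

With the moon.csv data, a call such as matched_accuracy(labels, new_labels_dbscan) could stand in for the element-by-element comparison; when DBSCAN finds more than two clusters, only the two best-matching clusters contribute correct points.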

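Note: with hierarchical clustering and DBSCAN you may need to tune the parameters (for example eps and min_samples for DBSCAN) to get better results. The parameter values in the code above are for reference only.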
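
As a rough illustration of such tuning, the sketch below scans a few eps values for DBSCAN and reports how many clusters and noise points each one produces. Since moon.csv is not available here, it uses scikit-learn's make_moons as a synthetic stand-in; the sample size, noise level, and the eps grid are assumptions chosen for illustration only.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Synthetic stand-in for moon.csv (assumption: the real file has a similar two-moons shape)
X, y = make_moons(n_samples=300, noise=0.05, random_state=0)

results = []
for eps in (0.05, 0.1, 0.2, 0.3, 0.5):
    cluster_ids = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    n_clusters = len(set(cluster_ids) - {-1})  # -1 is DBSCAN's noise label
    n_noise = int((cluster_ids == -1).sum())
    results.append((eps, n_clusters, n_noise))
    print(f'eps={eps}: {n_clusters} clusters, {n_noise} noise points')
```

On the real data, replacing X with data[['X1', 'X2']] and picking an eps that yields two clusters with few noise points would be a reasonable starting point; the same kind of loop could be used to try different min_samples values.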


Content provided by the 零声教学 AI assistant; the question was submitted by a student.

When reposting, please cite the source: https://golang.0voice.com/?id=14434
