
Read the data in moon.csv. A sample of the data is shown below:

X1          X2          Label
0.682988    -0.3452     1
2.04099     -0.13161    1
-0.13975    0.454391    1
0.885748    0.286342    0
1.182459    -0.49771    1
1.019208    -0.42547    1
0.389362    -0.24712    1
-0.56038    0.712681    0
0.901844    0.053931    0
0.206748    0.006208    1
1.323523...
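If moon.csv is not at hand, a stand-in file with the same three columns can be generated for trying out the code. The sketch below assumes the data resembles scikit-learn's two-moons toy dataset, which the sample rows suggest but the question does not confirm. The function name make_stand_in, the sample count, the noise level, and the seed are all illustrative choices, not values from the original file.

```python
import pandas as pd
from sklearn.datasets import make_moons


def make_stand_in(n_samples=200, noise=0.1, seed=0):
    """Build a synthetic two-moons table with columns X1, X2, Label.

    This is NOT the original moon.csv; the sizes and noise are assumptions.
    """
    X, y = make_moons(n_samples=n_samples, noise=noise, random_state=seed)
    return pd.DataFrame({"X1": X[:, 0], "X2": X[:, 1], "Label": y})


stand_in = make_stand_in()
print(stand_in.head())
# Uncomment to create a file that the solution can read:
# stand_in.to_csv("moon.csv", index=False)
```

Results obtained on such a stand-in will differ from those on the real file.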

To complete these tasks, you will need data-analysis and machine-learning libraries such as pandas, matplotlib, and sklearn. Here is one possible solution:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.metrics import silhouette_score

# Read the CSV file
data = pd.read_csv("moon.csv")

# Draw a scatter plot, one color per original label
plt.scatter(data[data['Label'] == 0]['X1'], data[data['Label'] == 0]['X2'], color='blue', label='Label 0')
plt.scatter(data[data['Label'] == 1]['X1'], data[data['Label'] == 1]['X2'], color='red', label='Label 1')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()

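# Cluster with K-means and generate new labels
# (n_init and random_state are set explicitly so repeated runs give the same result)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(data[['X1', 'X2']])
new_labels_kmeans = kmeans.labels_
data['New_Label_KMeans'] = new_labels_kmeans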

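# Correct-classification ratio and silhouette coefficient (K-means)
# Cluster ids are arbitrary (0/1 may be swapped relative to Label),
# so take the better of the two possible id-to-label mappings
match_count_kmeans = sum(data['New_Label_KMeans'] == data['Label'])
correct_count_kmeans = max(match_count_kmeans, len(data) - match_count_kmeans)
accuracy_kmeans = correct_count_kmeans / len(data)
silhouette_coef_kmeans = silhouette_score(data[['X1', 'X2']], new_labels_kmeans)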

# Cluster with hierarchical (agglomerative) clustering and generate new labels
hierarchical_clustering = AgglomerativeClustering(n_clusters=2)
new_labels_hierarchical = hierarchical_clustering.fit_predict(data[['X1', 'X2']])
data['New_Label_Hierarchical'] = new_labels_hierarchical

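# Correct-classification ratio and silhouette coefficient (hierarchical clustering)
# Cluster ids are arbitrary (0/1 may be swapped relative to Label),
# so take the better of the two possible id-to-label mappings
match_count_hierarchical = sum(data['New_Label_Hierarchical'] == data['Label'])
correct_count_hierarchical = max(match_count_hierarchical, len(data) - match_count_hierarchical)
accuracy_hierarchical = correct_count_hierarchical / len(data)
silhouette_coef_hierarchical = silhouette_score(data[['X1', 'X2']], new_labels_hierarchical)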

# Cluster with DBSCAN and generate new labels (-1 marks noise points)
dbscan = DBSCAN(eps=0.3, min_samples=5)
new_labels_dbscan = dbscan.fit_predict(data[['X1', 'X2']])
data['New_Label_DBSCAN'] = new_labels_dbscan

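# Correct-classification ratio and silhouette coefficient (DBSCAN)
# DBSCAN may find any number of clusters plus noise (-1), so map each cluster
# to its most common Label and count noise points as misclassified
label_table_dbscan = pd.crosstab(data['New_Label_DBSCAN'], data['Label'])
correct_count_dbscan = label_table_dbscan.drop(index=-1, errors='ignore').max(axis=1).sum()
accuracy_dbscan = correct_count_dbscan / len(data)
# silhouette_score needs at least two distinct labels; it treats noise (-1) as its own group
if len(set(new_labels_dbscan)) > 1:
    silhouette_coef_dbscan = silhouette_score(data[['X1', 'X2']], new_labels_dbscan)
else:
    silhouette_coef_dbscan = float('nan')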

# Write the new cluster labels back to the original CSV file
data.to_csv("moon.csv", index=False)

print(f"K-means accuracy: {accuracy_kmeans}, Silhouette coefficient: {silhouette_coef_kmeans}")
print(f"Hierarchical clustering accuracy: {accuracy_hierarchical}, Silhouette coefficient: {silhouette_coef_hierarchical}")
print(f"DBSCAN accuracy: {accuracy_dbscan}, Silhouette coefficient: {silhouette_coef_dbscan}")

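Make sure the required libraries are installed and that the code and the data file are in the same directory. Running the code displays the scatter plot and prints the correct-classification ratio and silhouette coefficient for each clustering algorithm.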
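One caveat when reading the correct-classification ratio: cluster ids are arbitrary, so a clustering that reproduces the original groups exactly can still score 0 if its ids happen to be swapped. The sketch below illustrates one way to handle this, by mapping each cluster to its most common original label before counting matches. The helper name aligned_accuracy and the tiny arrays are made up for illustration, and the choice to count DBSCAN noise points (-1) as misclassified is an assumption, not something the question specifies.

```python
import numpy as np
import pandas as pd


def aligned_accuracy(true_labels, cluster_labels):
    """Share of points whose cluster's majority label equals their own label.

    Points labelled -1 (DBSCAN noise) are counted as misclassified.
    """
    table = pd.crosstab(pd.Series(cluster_labels, name="cluster"),
                        pd.Series(true_labels, name="label"))
    table = table.drop(index=-1, errors="ignore")  # ignore the noise row
    return table.max(axis=1).sum() / len(true_labels)


true = np.array([0, 0, 0, 1, 1, 1])
flipped = np.array([1, 1, 1, 0, 0, 0])       # same grouping, ids swapped
with_noise = np.array([0, 0, -1, 1, 1, 1])   # one point marked as noise

print((flipped == true).mean())              # naive comparison
print(aligned_accuracy(true, flipped))       # comparison after alignment
print(aligned_accuracy(true, with_noise))
```

Because several clusters can map to the same label, this measure can look optimistic when an algorithm returns many small clusters. A permutation-invariant score such as sklearn.metrics.adjusted_rand_score is a reasonable alternative in that case.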


Content provided by the 零声教学 AI assistant; the question was submitted by a student.

When reposting, please cite the source: https://golang.0voice.com/?id=14432
