Parallel K-Means Algorithm based on Hadoop-MapReduce for Mining

Paperback, 2025
English
This work investigates a parallel K-Means clustering algorithm, based on the MapReduce programming model, as a way to improve the response time of data mining. The algorithm's performance was evaluated in terms of SpeedUp and ScaleUp. To this end, experiments were run on a Hadoop cluster of six computers with standard hardware. The clustered data are measurements from flux towers in agricultural regions, belonging to AmeriFlux. The experiments used 3, 4, and 6 machines. Performance improved as machines were added: the best time was obtained with six machines, reaching a SpeedUp of 3.25. The application also scaled well when the data size and the number of machines in the cluster were increased proportionally, achieving similar performance across the tests.
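For readers unfamiliar with the approach, the following is a minimal single-process sketch of how one K-Means iteration can be split into a map step and a reduce step. It is an illustration under stated assumptions, not the book's implementation: it does not use Hadoop, and every function name here (`map_phase`, `reduce_phase`, `kmeans_iteration`, `speedup`) is hypothetical. The `speedup` helper uses the common definition of baseline time divided by parallel time; the description does not say which baseline the reported figure uses.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each centroid index."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx in sums:
            sums[idx] = [a + b for a, b in zip(sums[idx], p)]
        else:
            sums[idx] = list(p)
        counts[idx] += n
    # Assumption: a centroid that received no points keeps its old position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans_iteration(points, centroids):
    """One full iteration: assign points (map), then recompute centroids (reduce)."""
    return reduce_phase(map_phase(points, centroids), centroids)


def speedup(t_baseline, t_parallel):
    """Assumed definition: baseline run time divided by parallel run time."""
    return t_baseline / t_parallel
```

In a real MapReduce job the map step would run on separate machines over separate data blocks, and the iteration would be repeated until the centroids stop moving.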
ISBN
9786209114083
Language
English
Weight
86 grams
Publication date
17 October 2025
Number of pages
56