Clustering example
We now turn our attention to some of the analytics algorithms in RevoScaleR
, starting with rxKmeans
which is a parallel implementation of the k-means algorithm. Unlike the kmeans
function, rxKmeans
can run on an XDF file, a flat file, or data stored on HDFS or on SQL Server.
As we will see, when working with large datasets, tuning algorithms can have a great effect on performance.