Package org.apache.sedona.core.utils
Class RDDSampleUtils
java.lang.Object
org.apache.sedona.core.utils.RDDSampleUtils
Utility class for deciding how many records to sample from an RDD before partitioning it.
-
Constructor Summary
Constructor: RDDSampleUtils()
Method Summary
Modifier and Type: static int
Method: getSampleNumbers(int numPartitions, long totalNumberOfRecords, int givenSampleNumbers)
Description: Returns the number of samples to take to partition the RDD into the specified number of partitions.
-
Constructor Details
-
RDDSampleUtils
public RDDSampleUtils()
-
Method Details
-
getSampleNumbers
public static int getSampleNumbers(int numPartitions, long totalNumberOfRecords, int givenSampleNumbers)
Returns the number of samples to take to partition the RDD into the specified number of partitions. The number of partitions cannot exceed half the number of records in the RDD.
Returns the total number of records if it is less than 1000. Otherwise, returns 1% of the total number of records or twice the number of partitions, whichever is larger. Never returns a number greater than Integer.MAX_VALUE.
If the desired number of samples (givenSampleNumbers) is not -1, returns that number instead.
- Parameters:
numPartitions - the number of partitions
totalNumberOfRecords - the total number of records
givenSampleNumbers - the desired number of samples, or -1 to let the method choose
- Returns:
- the number of samples to take
- Throws:
IllegalArgumentException - if the requested number of samples exceeds the total number of records, or if the requested number of partitions exceeds half of the total number of records
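The rules described for getSampleNumbers can be restated as a short standalone sketch. This is not Sedona's source code: the class name SampleSizeSketch and the method name sampleCount are hypothetical, and the order in which the checks run (explicit request first, then the partition limit) is an assumption, since the description above does not specify it.

```java
// Hypothetical, standalone restatement of the documented rules.
// It does not call or reproduce Sedona's implementation.
final class SampleSizeSketch {
    private SampleSizeSketch() {}

    static int sampleCount(int numPartitions, long totalNumberOfRecords, int givenSampleNumbers) {
        // Assumption: an explicit request (anything other than -1) is handled first.
        if (givenSampleNumbers != -1) {
            if (givenSampleNumbers > totalNumberOfRecords) {
                throw new IllegalArgumentException(
                    "Requested samples exceed total number of records");
            }
            return givenSampleNumbers;
        }

        // The number of partitions cannot exceed half the number of records.
        if (2L * numPartitions > totalNumberOfRecords) {
            throw new IllegalArgumentException(
                "Requested partitions exceed half of total number of records");
        }

        // Small inputs: use every record as a sample.
        if (totalNumberOfRecords < 1000) {
            return (int) totalNumberOfRecords;
        }

        // Otherwise take the larger of 1% of the records and twice the
        // partition count, capped at Integer.MAX_VALUE.
        long onePercent = totalNumberOfRecords / 100;
        long wanted = Math.max(onePercent, 2L * numPartitions);
        return (int) Math.min(wanted, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(sampleCount(4, 500L, -1));        // fewer than 1000 records
        System.out.println(sampleCount(4, 100_000L, -1));    // 1% of the records
        System.out.println(sampleCount(1_000, 10_000L, -1)); // twice the partitions
        System.out.println(sampleCount(4, 100_000L, 250));   // explicit request
    }
}
```

Under these assumptions the four calls in main print 500, 1000, 2000 and 250. The intermediate values are kept as long so that doubling a large partition count or taking 1% of a very large record count cannot overflow before the final cap is applied.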
-