Class DuplicatesFilter<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>

java.lang.Object
org.apache.sedona.core.joinJudgement.DuplicatesFilter<U,T>
Type Parameters:
U - type of the left geometry in each Pair<U,T>
T - type of the right geometry in each Pair<U,T>
All Implemented Interfaces:
Serializable, org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>>

public class DuplicatesFilter<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry> extends Object implements org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>>
Provides optional de-duplication logic for join results. Because spatial partitioning can assign one geometry to several partitions, the same pair of geometries may appear in more than one partition. If that pair satisfies the join condition, it is included in the join results multiple times. This duplication can be avoided by (1) choosing a spatial partitioning whose partition extents do not overlap and (2) reporting a pair of matching geometries only from the partition whose extent contains the reference point of the intersection of the two geometries.
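The reference-point rule in (2) can be illustrated with a minimal, dependency-free sketch. It is not this class's implementation: the `Rect` type, the `shouldReport` helper, the choice of the lower-left corner of the envelope intersection as the reference point, and the half-open containment test are all assumptions made for illustration. Half-open containment is used so that a point lying on a shared partition edge belongs to exactly one partition.

```java
public class ReferencePointDedupSketch {

    /** Hypothetical axis-aligned rectangle; stands in for a geometry envelope or a partition extent. */
    public static final class Rect {
        final double minX, minY, maxX, maxY;

        public Rect(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /** Half-open containment: min edges are inclusive, max edges are exclusive. */
        boolean containsHalfOpen(double x, double y) {
            return x >= minX && x < maxX && y >= minY && y < maxY;
        }

        /** Returns the overlap of two rectangles, or null when they are disjoint. */
        Rect intersection(Rect other) {
            double loX = Math.max(minX, other.minX);
            double loY = Math.max(minY, other.minY);
            double hiX = Math.min(maxX, other.maxX);
            double hiY = Math.min(maxY, other.maxY);
            if (loX > hiX || loY > hiY) {
                return null;
            }
            return new Rect(loX, loY, hiX, hiY);
        }
    }

    /**
     * Decides whether a partition should report a matching pair.
     * Assumption: the reference point is the lower-left corner of the
     * intersection of the two envelopes.
     */
    public static boolean shouldReport(Rect partitionExtent, Rect left, Rect right) {
        Rect overlap = left.intersection(right);
        if (overlap == null) {
            return false; // disjoint envelopes: nothing to report
        }
        return partitionExtent.containsHalfOpen(overlap.minX, overlap.minY);
    }

    public static void main(String[] args) {
        // Two non-overlapping partition extents that share the edge x = 10.
        Rect[] partitions = { new Rect(0, 0, 10, 10), new Rect(10, 0, 20, 10) };

        // Both envelopes straddle x = 10, so the pair is present in both partitions.
        Rect left = new Rect(8, 2, 12, 6);
        Rect right = new Rect(9, 3, 14, 5);

        // The envelope intersection is (9,3)-(12,5), so the reference point is (9,3).
        for (int id = 0; id < partitions.length; id++) {
            System.out.println("partition " + id + " reports pair: "
                    + shouldReport(partitions[id], left, right));
        }
        // prints "partition 0 reports pair: true"
        // prints "partition 1 reports pair: false"
    }
}
```

Because the partition extents do not overlap and containment is half-open, the reference point falls in exactly one extent, so the pair is emitted once no matter how many partitions hold a copy of it.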
  • Constructor Details

    • DuplicatesFilter

      public DuplicatesFilter(org.apache.spark.broadcast.Broadcast<DedupParams> dedupParamsBroadcast)
  • Method Details

    • call

      public Iterator<org.apache.commons.lang3.tuple.Pair<U,T>> call(Integer partitionId, Iterator<org.apache.commons.lang3.tuple.Pair<U,T>> geometryPair) throws Exception
      Specified by:
      call in interface org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>>,Iterator<org.apache.commons.lang3.tuple.Pair<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>>>
      Throws:
      Exception