spark.timeseries

BucketRDD

class BucketRDD [T] extends RDD[Array[T]]

Represents an RDD of Arrays of a specified type. Each Array is a partition, or "bucket", of items that belong together. Usage:

import spark._
import spark.SparkContext._
import spark.timeseries._

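// Create a SparkContext that runs locally with a single thread.
val sc = TimeSeriesSpark.init("local[1]", "default")

// Read the input and split each line into at most 8 space-separated fields.
val file = sc.textFile("fileName")
val mapped = file.map(_.split(" ", 8))

// Group the records into buckets using an HourBucketDetector.
val det = new HourBucketDetector()
val buckets = new BucketRDD(mapped, det)

buckets.cache()

// Collect and print the number of items in each bucket.
val sizes = buckets.map(_.length).collect()

for (k <- sizes)
  println(k)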

See spark.timeseries.BucketLogsByHour for a more realistic example.

Accepts a BucketDetector to assign data items into buckets.
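
To illustrate how a detector might split an ordered stream of items into buckets, here is a minimal sketch in plain Scala with no Spark dependency. The names `bucketize` and `sameBucket` are hypothetical and exist only for this illustration; the real BucketDetector interface is not shown on this page and may look quite different. The sketch assumes that items arrive in order and that a new bucket starts whenever two adjacent items are judged not to belong together.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.reflect.ClassTag

object BucketSketch {
  // Hypothetical helper, not part of spark.timeseries: groups consecutive
  // items into arrays, starting a new bucket whenever
  // sameBucket(previous, current) returns false.
  def bucketize[T: ClassTag](items: Iterator[T])(sameBucket: (T, T) => Boolean): List[Array[T]] = {
    val buckets = ArrayBuffer.empty[Array[T]]
    val current = ArrayBuffer.empty[T]
    for (item <- items) {
      if (current.nonEmpty && !sameBucket(current.last, item)) {
        buckets += current.toArray // close the finished bucket
        current.clear()
      }
      current += item
    }
    if (current.nonEmpty) buckets += current.toArray // flush the last bucket
    buckets.toList
  }

  // Assumed record shape for the example: (hour of day, message).
  val records = List((9, "a"), (9, "b"), (10, "c"), (11, "d"), (11, "e"))

  // Two adjacent records share a bucket when their hours match.
  val byHour = bucketize(records.iterator)((x, y) => x._1 == y._1)
  val sizes = byHour.map(_.length)
}
```

Here `BucketSketch.sizes` plays the same role as `sizes` in the usage example: one length per bucket.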

Linear Supertypes
RDD[Array[T]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new BucketRDD (prev: RDD[T], detector: BucketDetector[T])(implicit arg0: ClassManifest[T])
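
The implicit ClassManifest[T] parameter carries the runtime class of T, which the JVM needs in order to create an Array[T] for a generic T. The sketch below shows the same mechanism in isolation. It uses ClassTag, which replaced ClassManifest from Scala 2.10 onwards, so it is an adaptation for current compilers rather than the signature documented here; `toBucket` is a hypothetical helper name.

```scala
import scala.reflect.ClassTag

object ManifestSketch {
  // The context bound [T: ClassTag] is shorthand for an extra implicit
  // parameter list, similar in role to (implicit arg0: ClassManifest[T])
  // in the constructor signature.
  def toBucket[T: ClassTag](items: Seq[T]): Array[T] = {
    val bucket = new Array[T](items.length) // requires the runtime class of T
    items.copyToArray(bucket)
    bucket
  }

  val ints = toBucket(Seq(1, 2, 3))  // the compiler supplies ClassTag[Int]
  val strs = toBucket(Seq("x", "y")) // the compiler supplies ClassTag[String]
}
```

Callers normally never pass the implicit argument themselves; the compiler fills it in wherever the element type is known.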

Type Members

  1. type _$1

    Attributes
    abstract
    Definition Classes
    RDD

Value Members

  1. def != (arg0: AnyRef): Boolean

    Attributes
    final
    Definition Classes
    AnyRef
  2. def != (arg0: Any): Boolean

    Attributes
    final
    Definition Classes
    Any
  3. def ## (): Int

    Attributes
    final
    Definition Classes
    AnyRef → Any
  4. def ++ (other: RDD[Array[T]]): RDD[Array[T]]

    Definition Classes
    RDD
  5. def == (arg0: AnyRef): Boolean

    Attributes
    final
    Definition Classes
    AnyRef
  6. def == (arg0: Any): Boolean

    Attributes
    final
    Definition Classes
    Any
  7. def asInstanceOf [T0] : T0

    Attributes
    final
    Definition Classes
    Any
  8. def cache (): RDD[Array[T]]

    Definition Classes
    RDD
  9. def cartesian [U] (other: RDD[U])(implicit arg0: ClassManifest[U]): RDD[(Array[T], U)]

    Definition Classes
    RDD
  10. def clone (): AnyRef

    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws()
  11. def collect (): Array[Array[T]]

    Definition Classes
    RDD
  12. def compute (split: Split): Iterator[Array[T]]

    Assigns log messages to buckets, using the spark.timeseries.BucketDetector[T] given.

    Definition Classes
    BucketRDD → RDD
  13. def context : SparkContext

    Definition Classes
    RDD
  14. def count (): Long

    Definition Classes
    RDD
  15. val dependencies : List[OneToOneDependency[T]]

    Definition Classes
    BucketRDD → RDD
  16. def eq (arg0: AnyRef): Boolean

    Attributes
    final
    Definition Classes
    AnyRef
  17. def equals (arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  18. def filter (f: (Array[T]) ⇒ Boolean): RDD[Array[T]]

    Definition Classes
    RDD
  19. def finalize (): Unit

    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws()
  20. def first (): Array[T]

    Definition Classes
    RDD
  21. def flatMap [U] (f: (Array[T]) ⇒ Traversable[U])(implicit arg0: ClassManifest[U]): RDD[U]

    Definition Classes
    RDD
  22. def foreach (f: (Array[T]) ⇒ Unit): Unit

    Definition Classes
    RDD
  23. def getClass (): java.lang.Class[_]

    Attributes
    final
    Definition Classes
    AnyRef → Any
  24. def glom (): RDD[Array[Array[T]]]

    Definition Classes
    RDD
  25. def groupBy [K] (f: (Array[T]) ⇒ K)(implicit arg0: ClassManifest[K]): RDD[(K, Seq[Array[T]])]

    Definition Classes
    RDD
  26. def groupBy [K] (f: (Array[T]) ⇒ K, numSplits: Int)(implicit arg0: ClassManifest[K]): RDD[(K, Seq[Array[T]])]

    Definition Classes
    RDD
  27. def hashCode (): Int

    Definition Classes
    AnyRef → Any
  28. val id : Int

    Definition Classes
    RDD
  29. def isInstanceOf [T0] : Boolean

    Attributes
    final
    Definition Classes
    Any
  30. def iterator (split: Split): Iterator[Array[T]]

    Attributes
    final
    Definition Classes
    RDD
  31. def map [U] (f: (Array[T]) ⇒ U)(implicit arg0: ClassManifest[U]): RDD[U]

    Definition Classes
    RDD
  32. def mapPartitions [U] (f: (Iterator[Array[T]]) ⇒ Iterator[U])(implicit arg0: ClassManifest[U]): RDD[U]

    Definition Classes
    RDD
  33. def ne (arg0: AnyRef): Boolean

    Attributes
    final
    Definition Classes
    AnyRef
  34. def notify (): Unit

    Attributes
    final
    Definition Classes
    AnyRef
  35. def notifyAll (): Unit

    Attributes
    final
    Definition Classes
    AnyRef
  36. val partitioner : Option[Partitioner]

    Definition Classes
    RDD
  37. def pipe (command: Seq[String]): RDD[String]

    Definition Classes
    RDD
  38. def pipe (command: String): RDD[String]

    Definition Classes
    RDD
  39. def preferredLocations (split: Split): Seq[String]

    Definition Classes
    RDD
  40. val prevSplits : Array[Split]

  41. def reduce (f: (Array[T], Array[T]) ⇒ Array[T]): Array[T]

    Definition Classes
    RDD
  42. def sample (withReplacement: Boolean, fraction: Double, seed: Int): RDD[Array[T]]

    Definition Classes
    RDD
  43. def saveAsObjectFile (path: String): Unit

    Definition Classes
    RDD
  44. def saveAsTextFile (path: String): Unit

    Definition Classes
    RDD
  45. def splits : Array[Split]

    Definition Classes
    BucketRDD → RDD
  46. def synchronized [T0] (arg0: ⇒ T0): T0

    Attributes
    final
    Definition Classes
    AnyRef
  47. def take (num: Int): Array[Array[T]]

    Definition Classes
    RDD
  48. def toArray (): Array[Array[T]]

    Definition Classes
    RDD
  49. def toString (): String

    Definition Classes
    AnyRef → Any
  50. def union (other: RDD[Array[T]]): RDD[Array[T]]

    Definition Classes
    RDD
  51. def wait (): Unit

    Attributes
    final
    Definition Classes
    AnyRef
    Annotations
    @throws()
  52. def wait (arg0: Long, arg1: Int): Unit

    Attributes
    final
    Definition Classes
    AnyRef
    Annotations
    @throws()
  53. def wait (arg0: Long): Unit

    Attributes
    final
    Definition Classes
    AnyRef
    Annotations
    @throws()
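
Because the element type is Array[T], the inherited transformations listed above (map, filter, flatMap and so on) operate on whole buckets rather than on individual items. The following sketch shows the effect using an ordinary Scala `Seq[Array[String]]` as a stand-in for the RDD. That substitution is an assumption made so the example runs without Spark, and the sample data is invented.

```scala
object BucketOpsSketch {
  // Stand-in for a BucketRDD[String]: each inner array is one bucket.
  val buckets: Seq[Array[String]] = Seq(
    Array("09:01 start", "09:30 ping"),
    Array("10:15 ping"),
    Array("11:02 ping", "11:40 ping", "11:59 stop")
  )

  // map sees one bucket at a time, as in buckets.map(_.length)
  // from the usage example.
  val sizes: Seq[Int] = buckets.map(_.length)

  // filter keeps or drops entire buckets.
  val busy: Seq[Array[String]] = buckets.filter(_.length > 1)

  // flatMap turns the buckets back into a flat sequence of items.
  val items: Seq[String] = buckets.flatMap(_.toSeq)
}
```

On a real BucketRDD the same calls would return RDDs instead of Seqs, and nothing would be evaluated until an action such as collect() or count() is invoked.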

Inherited from RDD[Array[T]]

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any