Class com.intersys.spark.DataFrameReaderEx

implicit class DataFrameReaderEx extends AnyRef

Extends the given DataFrameReader with IRIS-specific methods.

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new DataFrameReaderEx(reader: DataFrameReader)

    reader

    A DataFrame reader.
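Because this is an implicit class, importing it makes the extension methods available on any ordinary DataFrameReader. A minimal sketch of enabling and using it (the import path for the implicits and the query text are assumptions, not confirmed by this page):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import com.intersys.spark._ // assumed location of the DataFrameReaderEx implicit

val spark = SparkSession.builder()
  .appName("iris-example")
  .getOrCreate()

// spark.read yields a plain DataFrameReader; the implicit conversion
// wraps it in DataFrameReaderEx, adding the address() and iris() methods below.
val df: DataFrame = spark.read.iris("SELECT * FROM Owls")
```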

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def address(url: String, user: String = "", password: String = ""): DataFrameReader

    Specifies the connection details of the cluster to read from.

    Overrides the default cluster specified in the Spark configuration for the duration of this read operation.

    url

    A string of the form "Cache://host:port/namespace" that specifies the cluster to read from.

    user

    The user account with which to make the connection to the cluster named in the "url" option above.

    password

    The password for the given user account.

    returns

    The same DataFrameReader on which this method was invoked.
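For example, a read directed at a specific cluster might look like the following sketch (host, port, namespace, and credentials are placeholders; an active SparkSession `spark` with the com.intersys.spark implicits in scope is assumed):

```scala
// Override the configured default cluster for this one read.
val df = spark.read
  .address("Cache://localhost:1972/USER", "spark", "secret") // placeholder details
  .iris("SELECT * FROM Owls")
```

Since address() returns the same DataFrameReader, it chains naturally with iris() or with an explicit format(...).load().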

  5. def address(address: Address): DataFrameReader

    Specifies the connection details of the cluster to read from.

    Overrides the default cluster specified in the Spark configuration for the duration of this read operation.

    address

    The connection details of the cluster to read from.

    returns

    The same DataFrameReader on which this method was invoked.

  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  13. def iris(text: String, column: String, lo: Long, hi: Long, partitions: Int): DataFrame

    Executes a query on the given cluster to compute a suitably partitioned DataFrame.

    This enables one to write, for example:

    spark.read.iris("SELECT * FROM Owls","column",0,10000,2)

    as a convenient shorthand for the more explicit:

    spark.read
         .format("com.intersys.spark")
         .option("query","SELECT * FROM Owls")
         .option("partitionCol","column")
         .option("lowerBound",0)
         .option("upperBound",10000)
         .option("numPartitions",2)
         .load()

    The following options affect how the operation is performed:

    • url: A string of the form "Cache://host:port/namespace" that specifies the cluster from which the data is to be read. If omitted, the default cluster specified via the "spark.isc.master.url" configuration setting is used instead.
    • user: The account with which to make the connection to the cluster named in the "url" option above.
    • password: The password for the given user account.
    • fetchsize: The number of rows to fetch per server round trip. Default = 1000.

    text

    The text of a query to be executed on the cluster or the name of an existing table in the cluster to load.

    column

    The name of the integral valued column in the result set with which to further partition the query.

    lo

    The lower bound of the partitioning column.

    hi

    The upper bound of the partitioning column.

    partitions

    The number of partitions per instance to create.

    returns

    The results of the query in the form of a suitably partitioned DataFrame.

    Exceptions thrown

    SQLException if a database access error occurs.

    See also

    JDBC to Other Databases for more on the semantics of the column, lo, hi, and partitions parameters.
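Putting the pieces together, a partitioned read with an explicit fetch size might look like this sketch (the table, the partitioning column "id", and the bounds are illustrative; an active SparkSession `spark` with the implicits in scope is assumed):

```scala
// Partition on the integral column "id" over [0, 10000],
// creating 2 partitions per instance and fetching 5000 rows per round trip.
val owls = spark.read
  .option("fetchsize", 5000)
  .iris("SELECT * FROM Owls", "id", 0L, 10000L, 2)

// The partitioning is visible on the underlying RDD:
println(owls.rdd.getNumPartitions)
```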

  14. def iris(text: String, mfpi: Int = 1): DataFrame

    Executes a query on the given cluster to compute a suitably partitioned DataFrame.

    This enables one to write, for example:

    spark.read.iris("SELECT * FROM table",2)

    as a convenient shorthand for the more explicit:

    spark.read
         .format("com.intersys.spark")
         .option("query","SELECT * FROM table")
         .option("mfpi",2)
         .load()

    The following options affect how the operation is performed:

    • url: A string of the form "Cache://host:port/namespace" that specifies the cluster from which the data is to be read. If omitted, the default cluster specified via the "spark.isc.master.url" configuration setting is used instead.
    • user: The account with which to make the connection to the cluster named in the "url" option above.
    • password: The password for the given user account.
    • fetchsize: The number of rows to fetch per server round trip. Default = 1000.

    text

    The text of a query to be executed on the cluster or the name of an existing table in the cluster to load.

    mfpi

    The maximum number of factors per distinct instance to include in the factorization implicitly performed by the server, or 0 if no limit is necessary.

    returns

    The results of the query in the form of a suitably partitioned DataFrame.

    Exceptions thrown

    SQLException if a database access error occurs.
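For instance (table name illustrative; an active SparkSession `spark` with the implicits in scope is assumed):

```scala
// Allow at most 4 factors per distinct instance in the server-side factorization.
val sample = spark.read.iris("SELECT * FROM Owls", 4)

// Pass 0 to place no limit on the factorization.
val all = spark.read.iris("SELECT * FROM Owls", 0)
```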

  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. final def notify(): Unit

    Definition Classes
    AnyRef
  18. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  20. def toString(): String

    Definition Classes
    AnyRef → Any
  21. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
