Document Overview
Because of continued improvements in both Apache Spark and the InterSystems JDBC driver, the InterSystems Spark Connector no longer provides significant advantages over the standard Spark JDBC connector. The Spark Connector is deprecated and will be discontinued in an upcoming 2022 release.
See the Table of Contents for a detailed listing of the subjects covered in this document.
The InterSystems Spark Connector (com.intersystems.spark) is a Scala module that implements a plug-compatible replacement for the standard Spark jdbc data source. This allows the Spark data processing engine to make optimal use of the InterSystems IRIS® data platform and its distributed data capabilities.
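Because the Connector is plug-compatible with the standard jdbc data source, switching between the two is largely a matter of changing the format name and its options. A minimal sketch, assuming the short format name iris registered by the Connector, a placeholder server address, and a hypothetical table name:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("IrisExample").getOrCreate()

// Read via the InterSystems Spark Connector (deprecated).
// "iris" is the short name for the com.intersystems.spark data source.
val viaConnector = spark.read
  .format("iris")
  .option("url", "IRIS://localhost:51773/USER")  // placeholder address
  .option("dbtable", "Sample.Person")            // hypothetical table
  .load()

// Equivalent read via the standard Spark jdbc data source,
// using the InterSystems JDBC driver directly.
val viaJdbc = spark.read
  .format("jdbc")
  .option("url", "jdbc:IRIS://localhost:51773/USER")
  .option("dbtable", "Sample.Person")
  .option("driver", "com.intersystems.jdbc.IRISDriver")
  .load()
```

Both reads return a DataFrame with the same rows; exact option names and connection details should be checked against the reference sections listed below.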
The following topics are discussed in this document:
- Introduction — introduces Apache Spark and the InterSystems Spark Connector, and describes how to configure the Spark Connector on your system.
- Spark Connector Data Source Options — provides a detailed description of all Spark Connector data source options.
- Using Spark Connector Extension Methods — demonstrates the Spark Connector method interface.
- Spark Connector Best Practices — describes ways to optimize Spark Connector hardware and software.
- Spark Connector Internals — provides useful information not covered elsewhere.
- Spark Connector Quick Reference — provides a quick reference to InterSystems-specific Scala extension methods provided by the Spark Connector API.
The following documents contain related material:
- “Apache Spark Support” in the Implementation Reference for Third Party APIs provides some technical details about the InterSystems Spark Connector.
- The “Apache Spark” InterSystems overview page lists other resources for the InterSystems Spark Connector.
- Using Java with InterSystems Software provides an overview of all InterSystems Java technologies enabled by the InterSystems JDBC driver, and describes how to use the driver.