Python

Starting Jupyter with this mode will configure the Python Kernel.

Development in Jupyter

The Jupyter notebook will behave identically to a notebook server started locally on your machine with a Python kernel.

Experiments

Starting Jupyter with this mode will configure the PySpark Kernel.

Wrap your training code in a Python function and run it on PySpark to perform parallel hyperparameter optimization or distributed training, orchestrated on the PySpark executors. There are three experiment types: Experiment, Parallel Experiments, and Distributed Training.

Experiment

The simple Experiment abstraction corresponds to a single Python experiment: any hyperparameters or other configuration are hard-coded in the code itself.
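
As a minimal sketch, a single Experiment could look like the following, assuming the hops Python library's experiment.launch helper (see the linked example and docs for the exact API):

    from hops import experiment

    def train():
        # Hyperparameters are hard-coded inside the wrapper function
        learning_rate = 0.001
        batch_size = 64
        # ... build and train the model here ...
        accuracy = 0.90  # placeholder: return the metric you want logged
        return accuracy

    # Run the wrapper function once on a PySpark executor
    experiment.launch(train, name='single_experiment')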

Want to learn more? See an example and docs.

Parallel Experiments

Hyperparameter optimization is critical for achieving the best accuracy for your model. With HopsML, hyperparameter optimization is easier than ever: we provide grid search and state-of-the-art evolutionary optimization, which automatically search for the best hyperparameters and iteratively improve metrics such as model accuracy.
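
As a sketch, a grid search could look like the following, again assuming the hops library's experiment.grid_search helper (the exact signature may differ between versions; see the linked docs):

    from hops import experiment

    # Each list holds the values to try; grid search runs every combination
    args_dict = {
        'learning_rate': [0.001, 0.0001],
        'dropout': [0.25, 0.5],
    }

    def train(learning_rate, dropout):
        # ... build and train the model with the given hyperparameters ...
        accuracy = 0.90  # placeholder: return the metric to optimize
        return accuracy

    # Evaluate all four combinations in parallel on PySpark executors,
    # maximizing the returned metric
    experiment.grid_search(train, args_dict, direction='max')

In the same spirit, the evolutionary optimizer would take search boundaries per hyperparameter instead of explicit value lists.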

Want to learn more? See an example and docs.

Distributed Training

Compared to Experiment and Parallel Experiments, Distributed Training makes use of multiple machines, each with potentially multiple GPUs, to train the model.

HopsML supports all of TensorFlow's Distribution Strategies, making distributed training with TensorFlow or Keras as simple as invoking a function with your code in order to set up the cluster and start the training.
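
As a sketch, distributed training with a multi-worker strategy could look like the following; the launcher name experiment.collective_all_reduce below is an assumption based on the hops library, so check the linked docs for the exact names in your version:

    from hops import experiment

    def train():
        import tensorflow as tf

        # The launcher provisions the workers and sets TF_CONFIG on each,
        # so the strategy can discover its peers automatically
        strategy = tf.distribute.MultiWorkerMirroredStrategy()

        with strategy.scope():
            # Build and compile the model inside the strategy scope
            model = tf.keras.Sequential([
                tf.keras.layers.Dense(64, activation='relu'),
                tf.keras.layers.Dense(10),
            ])
            model.compile(
                optimizer='adam',
                loss=tf.keras.losses.SparseCategoricalCrossentropy(
                    from_logits=True),
                metrics=['accuracy'])

        # model.fit(dataset, epochs=10)  # plug in your tf.data pipeline

    # Launch the wrapper function across the cluster
    experiment.collective_all_reduce(train, name='distributed_training')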

Want to learn more? See an example and docs.

Spark

Starting Jupyter with this mode will configure the Spark, PySpark, and SparkR Kernels.

Spark applications can run with either static or dynamic allocation of executors.

Spark Static

Spark is a general-purpose distributed data processing engine.

When using Spark static you need to set a fixed number of executors. This means that your application keeps the resources it is allocated even when it is not using them.
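
For illustration, static allocation corresponds to standard Spark settings such as the following (on Hopsworks these are typically set through the Jupyter configuration UI rather than in code):

    from pyspark.sql import SparkSession

    # Static allocation: a fixed number of executors for the whole application
    spark = (SparkSession.builder
             .appName('static-allocation-example')
             .config('spark.dynamicAllocation.enabled', 'false')
             .config('spark.executor.instances', '4')  # fixed executor count
             .config('spark.executor.memory', '4g')
             .config('spark.executor.cores', '2')
             .getOrCreate())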

Spark Dynamic

With Spark dynamic you can set a minimum and a maximum number of executors. This means that your application can give resources back to the cluster when they are no longer used, and request them again later when demand increases.
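
The equivalent standard Spark settings for dynamic allocation look like this (again, typically set through the Jupyter configuration UI on Hopsworks):

    from pyspark.sql import SparkSession

    # Dynamic allocation: the executor count scales between the configured
    # minimum and maximum as demand changes
    spark = (SparkSession.builder
             .appName('dynamic-allocation-example')
             .config('spark.dynamicAllocation.enabled', 'true')
             .config('spark.dynamicAllocation.minExecutors', '1')
             .config('spark.dynamicAllocation.maxExecutors', '10')
             # giving executors back requires an external shuffle service
             # (or shuffle tracking in newer Spark versions)
             .config('spark.shuffle.service.enabled', 'true')
             .getOrCreate())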