Installation

Requirements

  • Debian Linux hosts
  • 120 GB of disk on each cluster machine (may need to increase depending on dataset size)
  • 8 GB of RAM on each cluster machine (may need to increase depending on dataset size)
  • Docker Engine installed on every machine in your cluster
  • Cluster configured in swarm mode (check creating a swarm, and see the example commands after this list)
  • Docker Compose installed on the manager machine of your cluster
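
To verify these prerequisites, the commands below are a minimal sketch: they check the Docker installation on each machine and put the cluster in swarm mode. MANAGER_IP is a placeholder for the manager machine's reachable IP address.

# On every machine: confirm Docker Engine is installed and the daemon is running
docker --version
docker info

# On the manager machine only: initialize swarm mode
docker swarm init --advertise-addr MANAGER_IP

# Print the join command that each worker machine must run to enter the swarm
docker swarm join-token worker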

Ensure that your cluster environment does not block traffic between machines, for example through firewall rules in your network or on your hosts.

If you do have firewalls or other traffic blockers, add learningOrchestra as an exception.

For example, on Google Cloud Platform each VM must allow both HTTP and HTTPS traffic.
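
On the hosts themselves, if you use ufw, the commands below are a minimal sketch of allowing the web ports referenced in this guide; learningOrchestra may expose additional ports, so treat this as an illustration rather than a complete rule set.

sudo ufw allow 8000/tcp
sudo ufw allow 8080/tcp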

Deployment

On the manager machine of your Docker swarm, clone the repository:

git clone https://github.com/riibeirogabriel/learningOrchestra.git

Navigate into the learningOrchestra directory and run:

cd learningOrchestra
sudo ./run.sh

That's it! learningOrchestra has been deployed in your swarm cluster!
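
To confirm the deployment, you can list the stacks and services running in the swarm from the manager machine; the exact stack and service names depend on what run.sh deploys.

docker stack ls
docker service ls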

Cluster State

  • CLUSTER_IP:8000 - Visualize the cluster state (deployed microservices and the cluster's machines).
  • CLUSTER_IP:8080 - Visualize the Spark cluster state.

  • CLUSTER_IP is the external IP of a machine in your cluster.
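
If you are unsure which address to use, one option is to list the swarm nodes and inspect one from the manager machine; NODE_HOSTNAME is a placeholder, and on cloud providers the externally reachable IP may differ from the address Docker reports.

docker node ls
docker node inspect NODE_HOSTNAME --format '{{ .Status.Addr }}'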

Spark Microservices

Some learningOrchestra features rely on the Spark microservice.

By default, this microservice has only one instance. If your data processing requires more computing power, you will need to scale this microservice.

To do this, with learningOrchestra already deployed, run the following on the manager machine of your Docker swarm cluster:

docker service scale microservice_sparkworker=NUMBER_OF_INSTANCES

  • NUMBER_OF_INSTANCES is the number of Spark microservice instances you require. Choose it according to your cluster's resources and your processing needs.
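
After scaling, you can check the replica count and see on which machines the instances are running; the service name microservice_sparkworker matches the command above.

docker service ls --filter name=microservice_sparkworker
docker service ps microservice_sparkworker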