Apache Spark WordCount Scala Example
Apache Spark is an open-source cluster computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Prerequisites
1) A machine running the Ubuntu 14.04 LTS operating system
2) Apache Hadoop 2.6.4 preinstalled (How to install Hadoop on Ubuntu 14.04)
3) Apache Spark 1.6.1 preinstalled (How to install Spark on Ubuntu 14.04)
Spark WordCount Scala Example
Step 1 - Change the directory to /usr/local/spark/sbin.
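Step 2 - Start all Spark daemons by running ./start-all.sh.
Step 3 - Run jps to confirm that the daemons started; in standalone mode the Master and Worker processes should be listed. The jps (Java Virtual Machine Process Status) tool reports only on JVMs for which it has access permissions.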
Step 4 - Create a jar file containing the compiled WordCount application.
Step 5 - Run the application.
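The source of the application packaged in Step 4 is not shown in this post. As a rough illustration only, a word count written against the Spark 1.6 RDD API might look like the sketch below. The object name WordCount, the tokenize helper, and the two command-line arguments (input path and output path) are assumptions made for this example, not the original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical application object; the name is an assumption for this sketch.
object WordCount {

  // Split a line on runs of whitespace and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.trim.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    // Assumed arguments: args(0) = input path, args(1) = output directory.
    if (args.length != 2) {
      System.err.println("Usage: WordCount <input path> <output path>")
      sys.exit(1)
    }

    // The master URL is left unset here so it can be supplied at launch time.
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)

    try {
      val counts = sc.textFile(args(0))   // one record per line of input
        .flatMap(tokenize)                // one record per word
        .map(word => (word, 1))           // pair each word with a count of 1
        .reduceByKey(_ + _)               // sum the counts for each word

      // Writes one part file per partition; the output directory must not already exist.
      counts.saveAsTextFile(args(1))
    } finally {
      sc.stop()
    }
  }
}
```

Once compiled and packaged into a jar, a class like this would typically be launched with spark-submit, passing --class WordCount along with the jar and the two paths. The master URL and the paths depend on your own setup.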