Setting up a Spark development environment in IntelliJ IDEA, with a sample test

Just create the following build.sbt and let sbt resolve the dependencies:

name := "sparkstreaming"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.6.0"

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.6.0"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"

libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0"
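
If the jar will be submitted to a cluster with spark-submit (as at the end of this post), a common optional refinement is to mark the Spark modules as "provided" so they are compiled against but not bundled into the jar. A sketch of that variant (spark-streaming-kafka is left unscoped because the Kafka integration is not shipped with the Spark runtime):

// Hypothetical "provided" variant of the dependencies above: the cluster
// already supplies the Spark runtime, so these jars are compile-time only.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-mllib"     % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-sql"       % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0"
)

Note that with "provided" scope the app no longer launches via sbt run, since the Spark classes are then missing at runtime.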

Sample test:

/**
  * Created by yangze on 2017/7/6.
  */

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    if (args.length < 1) {
      System.err.println("please give the correct params")
      System.exit(1)
    }

    // Path to the input file, passed as the first program argument.
    val logFile = args(0)
    //val logFile = "E:\\code\\sparkstreaming\\file\\text.log" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")//.setMaster("local")
    val sc = new SparkContext(conf)

    // Cache the RDD: it is scanned twice by the two counts below.
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))

    sc.stop()  // release the SparkContext before the JVM exits
  }
}
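
To try the job inside IDEA before packaging, the commented-out setMaster("local") line above is the key; alternatively, here is a minimal self-contained sketch (the SimpleAppLocal name and the Windows path are illustrative, not part of the original project):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical local-mode variant for quick tests inside IDEA: the master
// is hard-coded to local[2], so neither a cluster nor spark-submit is needed.
object SimpleAppLocal {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("Simple Application (local)")
      .setMaster("local[2]") // two worker threads inside the IDE's JVM
    val sc = new SparkContext(conf)

    // Same counting logic as SimpleApp; substitute a file that exists locally.
    val logData = sc.textFile("E:\\code\\sparkstreaming\\file\\text.log", 2).cache()
    println("Lines with a: " + logData.filter(_.contains("a")).count())
    println("Lines with b: " + logData.filter(_.contains("b")).count())

    sc.stop()
  }
}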

--Packaging

1. Menu: File -> Project Structure

2. In the leftmost pane of the dialog, select Artifacts -> "+", choose JAR -> From modules with dependencies. A configuration window then appears; once it is filled in, check Build on make and click OK to save.
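
As an alternative to the Artifacts dialog, the jar can also be built from the command line with the sbt-assembly plugin (the plugin version below is an assumption; pick one compatible with your sbt):

// project/plugins.sbt -- hypothetical command-line alternative to IDEA's
// Artifacts dialog: with this plugin, `sbt assembly` builds a fat jar.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

Running sbt assembly then writes the jar under target/scala-2.10/.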


Submit and run on the server:

sudo -u spark spark-submit \
  --class SimpleApp \
  --master yarn \
  --deploy-mode client \
  --num-executors 1 \
  --executor-cores 1 \
  --executor-memory 2g \
  --driver-memory 5g \
  /mobankerdata1/yz/sparkstreaming.jar text.log
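
Before involving the cluster, the same jar can be smoke-tested with spark-submit in local mode; a sketch with placeholder paths:

spark-submit --class SimpleApp --master local[2] sparkstreaming.jar /path/to/text.log

On YARN in client mode, also note that a bare relative path such as text.log is typically resolved against the submitting user's HDFS home directory (e.g. /user/spark), not the local filesystem.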
