Spark Week 4 Homework

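Assignment 1: Briefly describe how Spark SQL executes a query

Assignment 2: Use Spark SQL to run the following calculations on the order transaction data in data.tar.gz
1: Total amount of all orders per year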

hiveContext.hql("select c.theyear,sum(b.amount) from tblStock a join tblStockDetail b on a.ordernumber=b.ordernumber join tbldate c on a.dateid=c.dateid group by c.theyear order by c.theyear").collect().foreach(println)

2: Top ten monthly sales across all orders

Step 1: sum the amount for each month and item:

hiveContext.hql("select c.theyearmonth,b.itemid,sum(b.amount) as sumofamount from tblStock a join tblStockDetail b on a.ordernumber=b.ordernumber join tbldate c on a.dateid=c.dateid group by c.theyearmonth,b.itemid").collect().foreach(println)

Step 2: sort by the summed amount in descending order and keep the first ten rows:

hiveContext.hql("select c.theyearmonth,b.itemid,sum(b.amount) as sumofamount from tblStock a join tblStockDetail b on a.ordernumber=b.ordernumber join tbldate c on a.dateid=c.dateid group by c.theyearmonth,b.itemid order by sumofamount desc limit 10").collect().foreach(println)
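
To make the join, group by and top-N logic of this query easier to follow, here is a minimal sketch of the same steps using plain Scala collections instead of Spark. The case classes, the three sample sequences and the `topN` helper are hypothetical stand-ins for illustration only; the real tables come from data.tar.gz and have more columns.

```scala
// Hypothetical miniature rows; the real tables come from data.tar.gz.
case class Stock(ordernumber: String, dateid: String)
case class StockDetail(ordernumber: String, itemid: String, amount: Double)
case class DateRow(dateid: String, theyearmonth: Int)

val stocks = Seq(Stock("o1", "d1"), Stock("o2", "d1"), Stock("o3", "d2"))
val details = Seq(
  StockDetail("o1", "itemA", 100.0),
  StockDetail("o1", "itemB", 30.0),
  StockDetail("o2", "itemA", 50.0),
  StockDetail("o3", "itemB", 70.0)
)
val dates = Seq(DateRow("d1", 200401), DateRow("d2", 200402))

// Inner join of the three tables on ordernumber and dateid.
val joined = for {
  a <- stocks
  b <- details if a.ordernumber == b.ordernumber
  c <- dates if a.dateid == c.dateid
} yield (c.theyearmonth, b.itemid, b.amount)

// group by (theyearmonth, itemid), then sum(amount).
val sumByMonthItem: Map[(Int, String), Double] =
  joined.groupBy { case (month, item, _) => (month, item) }.map {
    case (key, rows) => key -> rows.map(_._3).sum
  }

// order by sumofamount desc limit n.
def topN(n: Int): Seq[((Int, String), Double)] =
  sumByMonthItem.toSeq.sortBy { case (_, total) => -total }.take(n)

topN(10).foreach(println)
```

With only three (month, item) groups in the sample data, `topN(10)` returns all three, largest total first.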

3: Number of orders per year

hiveContext.hql("select c.theyear,count(distinct a.ordernumber) as numoforders from tblStock a join tblStockDetail b on a.ordernumber=b.ordernumber join tbldate c on a.dateid=c.dateid group by c.theyear").collect().foreach(println)
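
Counting orders per year is a different aggregation from summing `amount`: an order with several detail lines adds several rows to the join, so it has to be counted once by its `ordernumber`. The sketch below contrasts the two on made-up rows using plain Scala collections. All names in it (`OrderRow`, `DetailRow`, `YearRow` and the sample values) are hypothetical and only mirror the columns used in the query.

```scala
// Hypothetical miniature rows; the real tables come from data.tar.gz.
case class OrderRow(ordernumber: String, dateid: String)
case class DetailRow(ordernumber: String, amount: Double)
case class YearRow(dateid: String, theyear: Int)

val orders = Seq(OrderRow("o1", "d1"), OrderRow("o2", "d2"), OrderRow("o3", "d3"))
val orderDetails = Seq(
  DetailRow("o1", 100.0), DetailRow("o1", 30.0), // o1 has two detail lines
  DetailRow("o2", 50.0),
  DetailRow("o3", 70.0)
)
val years = Seq(YearRow("d1", 2004), YearRow("d2", 2004), YearRow("d3", 2005))

// Same three-way inner join as in the query.
val joinedRows = for {
  a <- orders
  b <- orderDetails if a.ordernumber == b.ordernumber
  c <- years if a.dateid == c.dateid
} yield (c.theyear, a.ordernumber, b.amount)

// sum(b.amount) per year: a total amount, not an order count.
val amountPerYear: Map[Int, Double] =
  joinedRows.groupBy(_._1).map { case (year, rows) => year -> rows.map(_._3).sum }

// count(distinct a.ordernumber) per year: each order is counted once,
// even when it has several detail lines.
val ordersPerYear: Map[Int, Int] =
  joinedRows.groupBy(_._1).map { case (year, rows) => year -> rows.map(_._2).distinct.size }

ordersPerYear.toSeq.sortBy(_._1).foreach(println)
```

In the sample data 2004 has three joined rows but only two distinct orders, which is why the sketch counts distinct order numbers rather than rows.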
