
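Spark SQL follows the SQL standard: a union matches columns by position, not by name, so the two tables must list their fields in the same order. The same applies when unioning two Parquet datasets. If the field order differs and the column types are compatible, Spark reports no error and the result is silently wrong.
The two examples below illustrate this:
1. Both Parquet datasets have the same field order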
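import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
// Local Spark context for the demo
val conf = new SparkConf().setAppName("UDAF").setMaster("local")
val idtype = "idfa:imei"
val sc = new SparkContext(conf)
sc.setLogLevel("ERROR")
val sqlContext = new SQLContext(sc)
import org.apache.spark.sql.functions._
// Brings toDF into scope for RDDs of tuples
import sqlContext.implicits._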
val names = Array((1L,"peter","peter",""),(2L,"Leo","Leo",""),
(3L,"Marry","","Marry"), (4L,"Jack","","Jack"),
(5L,"Tom","Tom",""), (6L,"id1","id1",""),
(5L,"Tom","Tom",""), (2L,"Leo","","Leo"),
(2L,"Leo","Leo",""))
val numsDF = sc.parallelize(names, 1).toDF("offset","mcId","idfa","imei")
val names2 = Array((11L,"peter2","peter2",""),(12L,"Leo2","Leo2",""))
val numsDF2 = sc.parallelize(names2, 1).toDF("offset","mcId","idfa","imei")
numsDF2.show(20,false)
numsDF.unionAll(numsDF2).show(20,false)
Output:
+------+------+------+----+
|offset|mcId  |idfa  |imei|
+------+------+------+----+
|11    |peter2|peter2|    |
|12    |Leo2  |Leo2  |    |
+------+------+------+----+
+------+------+------+-----+
|offset|mcId  |idfa  |imei |
+------+------+------+-----+
|1     |peter |peter |     |
|2     |Leo   |Leo   |     |
|3     |Marry |      |Marry|
|4     |Jack  |      |Jack |
|5     |Tom   |Tom   |     |
|6     |id1   |id1   |     |
|5     |Tom   |Tom   |     |
|2     |Leo   |      |Leo  |
|2     |Leo   |Leo   |     |
|11    |peter2|peter2|     |
|12    |Leo2  |Leo2  |     |
+------+------+------+-----+
2. The two Parquet datasets have different field orders
val names = Array((1L,"peter","peter",""),(2L,"Leo","Leo",""),
(3L,"Marry","","Marry"), (4L,"Jack","","Jack"),
(5L,"Tom","Tom",""), (6L,"id1","id1",""),
(5L,"Tom","Tom",""), (2L,"Leo","","Leo"),
(2L,"Leo","Leo",""))
val numsDF = sc.parallelize(names, 1).toDF("offset","mcId","idfa","imei")
val names2 = Array((10L,"peter2","","peter2"),(11L,"Leo2","","Leo2"))
val numsDF2 = sc.parallelize(names2, 1).toDF("offset","idfa","imei","mcId")
numsDF2.show(20,false)
numsDF.unionAll(numsDF2).show(20,false)
Output:
+------+------+----+------+
|offset|idfa  |imei|mcId  |
+------+------+----+------+
|10    |peter2|    |peter2|
|11    |Leo2  |    |Leo2  |
+------+------+----+------+
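The error shows up in the union result below. Rows 10 and 11 are matched by position, so the idfa value from numsDF2 lands under mcId, its imei value under idfa, and its mcId value under imei.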
+------+------+-----+------+
|offset|mcId  |idfa |imei  |
+------+------+-----+------+
|1     |peter |peter|      |
|2     |Leo   |Leo  |      |
|3     |Marry |     |Marry |
|4     |Jack  |     |Jack  |
|5     |Tom   |Tom  |      |
|6     |id1   |id1  |      |
|5     |Tom   |Tom  |      |
|2     |Leo   |     |Leo   |
|2     |Leo   |Leo  |      |
|10    |peter2|     |peter2|
|11    |Leo2  |     |Leo2  |
+------+------+-----+------+
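One way to avoid the mismatch in example 2 is to reorder the second DataFrame's columns by name before the union. The sketch below is a minimal illustration, not part of the original post: the object name UnionColumnOrder and the variable aligned are made up for the example, and it assumes the same SQLContext-based API and the same numsDF2 rows as the examples above (numsDF is cut down to one row for brevity).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.col

// Hypothetical demo object; the name is an assumption for this sketch.
object UnionColumnOrder {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("UnionColumnOrder").setMaster("local")
    val sc = new SparkContext(conf)
    sc.setLogLevel("ERROR")
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // First DataFrame: offset, mcId, idfa, imei
    val numsDF = sc.parallelize(Seq((1L, "peter", "peter", "")), 1)
      .toDF("offset", "mcId", "idfa", "imei")

    // Second DataFrame: same fields, different order
    val numsDF2 = sc.parallelize(Seq((10L, "peter2", "", "peter2"), (11L, "Leo2", "", "Leo2")), 1)
      .toDF("offset", "idfa", "imei", "mcId")

    // Select numsDF2's columns by name, in numsDF's order,
    // so the positional union lines the fields up correctly.
    val aligned = numsDF2.select(numsDF.columns.map(col): _*)

    numsDF.unionAll(aligned).show(20, false)

    sc.stop()
  }
}
```

With the columns aligned this way, rows 10 and 11 should keep peter2 and Leo2 under idfa and an empty imei, as in the source data. On Spark 2.3 and later, Dataset.unionByName resolves columns by name and may be a simpler alternative if that version is available.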