Start Hadoop, then start Spark.
Create a simple test data file, customers.txt. For convenience, I put it in the spark/bin directory:
100, John Smith, Austin, TX, 78727
200, Joe Johnson, Dallas, TX, 75201
300, Bob Jones, Houston, TX, 77028
400, Andy Davis, San Antonio, TX, 78227
500, James Williams, Austin, TX, 78727
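One way to create this file from the shell is a here-document. This is only a minimal sketch: it writes customers.txt into the current directory, so run it from spark/bin if you want the file where the later load step looks for it.

```shell
# Write the five sample rows to customers.txt in the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside the block.
cat > customers.txt <<'EOF'
100, John Smith, Austin, TX, 78727
200, Joe Johnson, Dallas, TX, 75201
300, Bob Jones, Houston, TX, 77028
400, Andy Davis, San Antonio, TX, 78227
500, James Williams, Austin, TX, 78727
EOF

# Quick sanity check: count the rows that were written.
wc -l < customers.txt
```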
Start Spark SQL:
./spark-sql.sh
Map the data to a database table:
create table Customer(
  id string,
  name string,
  city string,
  company string,
  num string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
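A note on the delimiter: the table splits each row on the bare comma, while the sample rows have a space after every comma. As far as I know, delimited text tables do not trim string values, so a field such as the name would keep its leading space. The sketch below reproduces that split on one sample row with awk and shows one possible cleanup with sed. The variable names are only for illustration, and the cleanup is optional.

```shell
# One sample row in the same format as customers.txt (comma followed by a space).
row='100, John Smith, Austin, TX, 78727'

# Splitting on the bare comma, as the table definition does, keeps the leading space.
raw_name=$(printf '%s\n' "$row" | awk -F',' '{ print $2 }')
printf '[%s]\n' "$raw_name"

# Removing the spaces that follow each comma gives clean field values.
clean_row=$(printf '%s\n' "$row" | sed 's/, */,/g')
clean_name=$(printf '%s\n' "$clean_row" | awk -F',' '{ print $2 }')
printf '[%s]\n' "$clean_name"
```

The same sed expression could be applied to the whole file before loading it, or the values could be trimmed at query time instead.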
Load the data:
load data local inpath './customers.txt' overwrite into table Customer;
Query the data:
select * from Customer;
Done!
Time: 2024-09-28 21:29:19