Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理表空间

6.2.创建与管理表空间

表空间建立在文件空间之上,文件空间建立在一系列文件系统之上。关于gpfilespace的所有说明如下所示:


[[email protected] gpfs]$ gpfilespace --help

COMMAND NAME: gpfilespace

Creates a filespace using a configuration file that defines per-segment file system locations. Filespaces describe the physical file system resources to be used by a tablespace.

*****************************************************

SYNOPSIS

*****************************************************

gpfilespace [<connection_option> ...] [-l <logfile_directory>]  [-o [<output_fs_config_file>]]

gpfilespace [<connection_option> ...] [-l <logfile_directory>]    -c <fs_config_file>

gpfilespace movetempfilespace {<filespace_name>|default}

gpfilespace movetransfilespace {<filespace_name>|default}

gpfilespace --showtempfilespace

gpfilespace --showtransfilespace

gpfilespace -v | -?

*****************************************************

DESCRIPTION

*****************************************************

A tablespace requires a file system location to store its database files. In Greenplum Database, the master and each segment (primary and mirror) needs its own distinct storage location. This collection of file system locations for all components in a Greenplum
system is referred to as a filespace. Once a filespace is defined, it can be used by one or more tablespaces.

When used with the -o option, the gpfilespace utility looks up your system configuration information in the Greenplum Database catalog tables and prompts you for the appropriate file system locations needed to create the filespace. It then outputs a configuration
file that can be used to create a filespace. If a file name is

not specified, a gpfilespace_config_<#> file will be created in the current directory by default.

Once you have a configuration file, you can run gpfilespace with the -c option to create the filespace in Greenplum Database.

You will need to create a filespace before you can use the gpfilespace --movetempfilespace or --movetransfilespace option to move your temporary or transaction files to the new location.

Use either gpfilespace --showtempfilespace or --showtransfilespace options to show the name of the filespace currently associated with temporary or transaction files.

Note: If segments are down due to a power or nic failure, you may see inconsistencies during filespace creation. You may not be able to bring up the Greenplum Database.

*****************************************************

OPTIONS

*****************************************************

-c | --config <fs_config_file>

A configuration file containing:

* An initial line denoting the new filespace name. For example: filespace:myfs

* One line each for the master, the primary segments, and the    mirror segments. A line describes a file system location that  a particular segment database instance should use as its data  directory location to store database files associated with a  tablespace.
Each line is in the format of:

<hostname>:<dbid>:/<filesystem_dir>/<seg_datadir_name>

-l | --logdir <logfile_directory>

The directory to write the log file. Defaults to ~/gpAdminLogs.

-o | --output <output_file_name>

The directory location and file name to output the generated  filespace configuration file. You will be prompted to enter a  name for the filespace, a master file system location, the primary segment file system locations, and the mirror segment file system
locations. For example, if your configuration has  2 primary and 2 mirror segments per host, you will be prompted for a total of 5 locations (including the master). The file system locations must exist on all hosts in your system prior  to running the gpfilespace
utility. The utility will designate  segment-specific data directories within the location(s) you  specify, so it is possible to use the same location for multiple  segments. However, primaries and mirrors cannot use the same  location. After the utility creates
the configuration file, you  can manually edit the file to make any required changes to the  filespace layout before creating the filespace in Greenplum Database.

--movetempfilespace {<filespace_name>|default}

Moves temporary files to a new filespace or to the default location.

--movetransfilespace {<filespace_name>|default}

Moves transaction files to a new filespace or to the default location.

--showtempfilespace

Show the name of the filespace currently associated with temporary  files. This option checks that all primary and mirror segments, master  and master standby are using the same filespace or temporary files.You will receive a warning message and an email
if any inconsistencies  exist.

--showtransfilespace

Show the name of the filespace currently associated with transaction  files. This option checks that all primary and mirror segments, master  and master standby are using the same filespace or transaction files. You will receive a warning message and an
email if any inconsistencies exist.

-v | --version (show utility version)

Displays the version of this utility.

-? | --help (help)

Displays the utility usage and syntax.

****************************

CONNECTION OPTIONS

****************************

-h host | --host host

The host name of the machine on which the Greenplum master  database server is running. If not specified, reads from  the environment variable PGHOST or defaults to localhost.

-p port | --port port

The TCP port on which the Greenplum master database server is listening for connections. If not specified, reads from  the environment variable PGPORT or defaults to 5432.

-U username | --username superuser_name

The database superuser role name to connect as. If not specified, reads from the environment variable PGUSER or  defaults to the current system user name. Only database  superusers are allowed to create filespaces.

-W | --password

Force a password prompt.

*****************************************************

EXAMPLES

*****************************************************

Create a filespace configuration file. You will be prompted to enter a name for the filespace, a master file system location, the primary segment file system locations, and the mirror segment file system locations. For example, if your configuration has
2 primary and 2 mirror segments per host, you will be prompted for a total of 5 locations (including the master). The file system locations must exist on all hosts in your system prior to running the gpfilespace utility:

$ gpfilespace -o .

Enter a name for this filespace

> fastdisk

Checking your configuration:

Your system has 2 hosts with 2 primary and 2 mirror segments per host.

Configuring hosts: [sdw1, sdw2]

Please specify 2 locations for the primary segments, one per line:

primary location 1> /gp_pri_filespc

primary location 2> /gp_pri_filespc

Please specify 2 locations for the mirror segments, one per line:

mirror location 1> /gp_mir_filespc

mirror location 2> /gp_mir_filespc

Enter a file system location for the master:

master location> /gp_master_filespc

***************************************

Example filespace configuration file:

filespace:fastdisk

mdw:1:/gp_master_filespc/gp-1

sdw1:2:/gp_pri_filespc/gp0

sdw1:3:/gp_mir_filespc/gp1

sdw2:4:/gp_mir_filespc/gp0

sdw2:5:/gp_pri_filespc/gp1

Execute the configuration file to create the filespace

in Greenplum Database:

$ gpfilespace -c gpfilespace_config_1

*****************************************************

SEE ALSO

*****************************************************

CREATE FILESPACE, CREATE TABLESPACE

6.2.1.创建文件空间:在所有GP HOST主机上准备好逻辑文件系统


[[email protected] greenplum]# gpssh -f seg_hosts -e ‘mkdir -p /gpfs2/seg‘;

[slave1] mkdir -p /gpfs2/seg

[slave2] mkdir -p /gpfs2/seg

[[email protected] greenplum]# gpssh -f seg_hosts -e ‘chown gpadmin /gpfs2/seg‘;

[slave1] chown gpadmin /gpfs2/seg

[slave2] chown gpadmin /gpfs2/seg

[[email protected] greenplum]# mkdir -p /gpfs/master

[[email protected] greenplum]#  chown gpadmin /gpfs/master/

[[email protected] greenplum]# ls -ltr /gpfs/master

total 0

使用gpfilespace创建文件系统步骤如下:

  • 创建文件系统的配置文件:$gpfilespace –o gpfilespace_config
  • 根据提示输入文件空间名称和位置
  • 生成配置文件,检查配置内容
  • 基于配置文件创建文件空间:$gpfilespace –c gpfilespace_config

[[email protected] ~]$ gpfilespace -o gpfilespace_config

20150327:23:40:17:004809 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

20150327:23:40:17:004809 gpfilespace:master:gpadmin-[INFO]:-getting config

Enter a name for this filespace

> fs01

Checking your configuration:

Your system has 2 hosts with 1 primary and 0 mirror segments per host.

Your system has 1 hosts with 0 primary and 0 mirror segments per host.

Configuring hosts: [slave1, slave2]

Please specify 1 locations for the primary segments, one per line:

primary location 1> /gpfs2/seg

Configuring hosts: [master]

Enter a file system location for the master

master location> /gpfs/master

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-Creating configuration file...

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-[created]

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-

To add this filespace to the database please run the command:

gpfilespace --config /home/gpadmin/gpfilespace_config

[[email protected] ~]$ cat gpfilespace_config

filespace:fs01

master:1:/gpfs/master/gpseg-1

slave1:2:/gpfs2/seg/gpseg0

slave2:3:/gpfs2/seg/gpseg1

[[email protected] ~]$ gpfilespace -c gpfilespace_config

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-getting config

Reading Configuration file: ‘gpfilespace_config‘

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-Performing validation on paths

..............................................................................

20150327:23:43:33:005042 gpfilespace:master:gpadmin-[INFO]:-Connecting to database

20150327:23:43:33:005042 gpfilespace:master:gpadmin-[INFO]:-Filespace "fs01" successfully created

6.2.2.转移临时文件或事务文件的位置

6.2.2.1.作用

改善DB的查询性能.备份性能.连续存储数据的性能;

6.2.2.2.关于临时文件和事务文件
  • 只有具备超级管理员权限的用户才可以移动位置,只有gpfilespace工具可以写该文件;
  • 只能为临时文件或事务文件指定一个文件空间;在生产系统中规划存储时,需要专门为临时文件或者事务文件创建固定的分区;
  • 如果文件空间被临时文件使用,不能删除;
6.2.2.3.使用gpfilespace移动临时文件
  • 确保文件空间存在,且与存储其他用户数据的文件空间不同;但如果空间有限也可以相同;
  • 停止GPDB;
  • 将GPDB启动为限制模式:$gpfilespace –movetempfilespace test_fs,限制模式是为了保证没有任何用户可以连接到GP数据库中;movetempfilespace参数会检查当前GP数据库是否为限制模式,如果是启动状态会重新启动到限制模式;如果是停止状态则启动到限制模式;
  • 注意:临时文件位置在Instance中配合共享内存使用,在创建、打开、删除临时文件用到;

[[email protected] gpfs]$ psql -d testdw             连接到任一数据库中并查看当前所有的文件空间

psql (8.2.15)

Type "help" for help.

testdw=# select * from pg_filespace;

fsname   | fsowner

-----------+---------

pg_system |      10

fs01      |      10

(2 rows)

testdw=# \q

[[email protected] gpfs]$ gpfilespace --movetempfilespace fs01;

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-Database was started in NORMAL mode

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:52:06:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

20150329:16:52:12:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:52:15:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database in master only mode

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace fs01 exists

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace is same as current filespace

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database in master only mode

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Checking for connectivity

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace information

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining segment information ...

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Creating RemoteOperations list

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Moving TEMPORARY_FILES filespace from pg_system to fs01 ...

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

testdw=# \db              查看当前默认的文件空间

List of tablespaces

Name    |  Owner  | Filespae Name

------------+---------+---------------

pg_default | gpadmin | pg_system

pg_global  | gpadmin | pg_system

(2 rows)

6.2.2.4.使用gpfilespace移动事务文件
  • 确保文件空间存在,且与存储其他用户数据的文件空间不同
  • 停止GPDB
  • 将GPDB启动为限制模式:$gpfilespace –movetransfilespace test_fs
  • 注意:事务文件位置在Instance中配合共享内存使用,在创建.打开.删除事务文件时用到。

[[email protected] gpfs]$ gpfilespace --movetransfilespace fs01;

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-Database was started in NORMAL mode

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:57:23:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

20150329:16:57:30:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:57:33:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database in master only mode

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace fs01 exists

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace is same as current filespace

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database in master only mode

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Checking for connectivity

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace information

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining segment information ...

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Creating RemoteOperations list

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Moving TRANSACTION_FILES filespace from pg_system to fs01 ...

20150329:16:57:44:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

6.2.3.创建表空间

  • 必须是由超级用户创建,且在Master节点进行创建;
  • 使用CREATETABLESPACE命令;
  • 创建完成后授予使用权限给普通用户;

testdw=# \h create tablespace

Command:     CREATE TABLESPACE

Description: define a new tablespace

Syntax:

CREATE TABLESPACE tablespace_name [OWNER username] FILESPACE filespace_name

testdw=# \db

List of tablespaces

Name    |  Owner  | Filespae Name

------------+---------+---------------

pg_default | gpadmin | pg_system

pg_global  | gpadmin | pg_system

(2 rows)

testdw=# CREATE TABLESPACE testspace FILESPACE fs01;

CREATE TABLESPACE

testdw=# \db

List of tablespaces

Name    |  Owner  | Filespae Name

------------+---------+---------------

pg_default | gpadmin | pg_system

pg_global  | gpadmin | pg_system

testspace  | gpadmin | fs01

(3 rows)

testdw=# grant create on tablespace testspace to admin;

GRANT

6.2.4.使用表空间存储DB对象

  • 使用指定表空间创建对象;
  • 使用缺省表空间参数default_tablespace来设定:SETdefault_tablespace = testspace后创建表,此时不需要指定表空间;
  • 如果一个表空间与DB关联,那么其将存储所有该DB的系统日志.临时文件等;

testdw=# create table tb01(id int) tablespace testspace;

NOTICE:  Table doesn‘t have ‘DISTRIBUTED BY‘ clause -- Using column named ‘id‘ as the Greenplum Database data distribution key for this table.

HINT:  The ‘DISTRIBUTED BY‘ clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.

CREATE TABLE

testdw=# \d

List of relations

Schema |    Name     | Type  |  Owner  | Storage

--------+-------------+-------+---------+---------

public | tb01        | table | gpadmin | heap

public | tb1_test_01 | table | gpadmin | heap

(2 rows)

testdw=# set default_tablespace=testspace;

SET

testdw=# create table tb02(id int);

NOTICE:  Table doesn‘t have ‘DISTRIBUTED BY‘ clause -- Using column named ‘id‘ as the Greenplum Database data distribution key for this table.

HINT:  The ‘DISTRIBUTED BY‘ clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.

CREATE TABLE

testdw=# select * from pg_tables where tablespace=‘testspace‘;

schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers

------------+-----------+------------+------------+------------+----------+-------------

public     | tb01      | gpadmin    | testspace  | f          | f        | f

public     | tb02      | gpadmin    | testspace  | f          | f        | f

(2 rows)

  • 查看现有的表空间和文件空间

2个缺省表空间和1个缺省文件空间:

pg_global:存储系统日志信息

pg_default:存储template1和template0模版DB

pg_system:系统初始化时使用的数据目录

获取文件空间的信息,查询语句如下所示:


SELECT spcname as tblspc, fsname as filespc, fsedbid as seg_dbid, fselocation as datadir

FROM pg_tablespace pgts, pg_filespace pgfs, pg_filespace_entry pgfse

WHERE pgts.spcfsoid=pgfse.fsefsoid

AND pgfse.fsefsoid=pgfs.oid

ORDER BY tblspc, seg_dbid;

  • 删除表空间和文件空间

对象被删除前,表空间不能被删除,表空间不能进行级联删除,需要先删除该表空间中的对象;

表空间被删除前,文件空间不能被删除;

通过DROP TABLESPACE命令删除表空间;


testdw-# \h DROP TABLESPACE

Command:     DROP TABLESPACE

Description: remove a tablespace

Syntax:

DROP TABLESPACE [ IF EXISTS ] tablespacename

通过DROP FILESPACE 命令删除文件空间=#DROP FILESPACE test_fs;


testdw-# \h drop filespace

Command:     DROP FILESPACE

Description: remove a filespace

Syntax:

DROP FILESPACE [IF EXISTS] filespacename

时间: 2024-10-02 09:37:23

Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理表空间的相关文章

Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理表

6.4.创建与管理表 6.4.1.创建表 通过查询CREATETABLE命令帮助如下所示: Command:     CREATE TABLE Description: define a new table Syntax: CREATE [[GLOBAL | LOCAL] {TEMPORARY | TEMP}] TABLE table_name (     -->指定表类型:全局|本地临时 [ { column_name data_type [ DEFAULT default_expr ]   

Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理数据库

6.定义数据库对象 6.1.创建与管理数据库 通过\h命令查看创建数据库的语法,如下所示: testdw-# \h create database Command:     CREATE DATABASE Description: create a new database Syntax: CREATE DATABASE name [ [ WITH ] [ OWNER [=] dbowner ] [ TEMPLATE [=] template ] [ ENCODING [=] encoding

Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理模式

6.3.创建与管理模式 概述:DB内组织对象的一种逻辑结构,一个DB内可以有多个模式,在未指定模式时默认放置在public中,可以通过"\dn"方式查看数据库中现有模式: testdw=# \dn List of schemas Name        |  Owner --------------------+--------- gp_toolkit         | gpadmin information_schema | gpadmin pg_aoseg           |

Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理序列、索引以及视图

6.5.创建与管理序列 序列常用于在新增记录时自动生成唯一标识符,序列的管理包括创建序列.使用序列.修改序列以及删除序列. 6.5.1.创建序列 使用CREATESEQUENCE命令来创建并初始化一个给定名称的单列序列表: devdw=# \h CREATE SEQUENCE                        查看创建序列的帮助 Command:     CREATE SEQUENCE Description: define a new sequence generator Synt

Greenplum中定义数据库对象之创建与管理模式

创建与管理模式 概述:DB内组织对象的一种逻辑结构.一个DB内能够有多个模式.在未指定模式时默认放置在public中.能够通过"\dn"方式查看数据库中现有模式. testdw=# \dn List of schemas Name        |  Owner --------------------+--------- gp_toolkit         | gpadmin information_schema | gpadmin pg_aoseg           | gpa

数据库对象的创建和管理

--数据库对象的创建和管理 DDL(数据定义语言) --表(table): 数据库存储的基本单元; --约束条件(constraint):用来确保数据库中数据的完整性,确保数据满足某些特定的商业规则 --视图(view):一个或多个表的逻辑表示或虚拟表示,主要用于简化查询操作 --索引(index):用于加速数据访问数据库对象,提高访问效率 --序列(sequence):用于生成唯一数字值的数据库对象,序列的生成机制会自动生成顺序递增的数字,可以用来作为数据表的主键值 --同义词(synonym

Greenplum+Hadoop学习笔记-15-管理数据

7.管理数据 7.1.关于GP的并发控制 1)        GP使用多版本控制模型(Mutltiversion Concurrency Control/MVCC)保持数据一致性: 可将MVCC看成行级别锁的一种妥协,它在许多情况下避免了使用锁,同时可以提供更小的开销.根据实现的不同,它可以允许非阻塞式读,在写操作进行时只锁定必要的记录.MVCC会保存某个时间点上的数据快照.这意味着事务可以看到一个一致的数据视图,不管他们需要跑多久.这同时也意味着不同的事务在同一个时间点看到的同一个表的数据可能

JDBC学习笔记(14):数据库的元数据信息与参数的元数据信息

数据库的元数据信息: 1 package com.xxyh.jdbc; 2 import java.sql.DatabaseMetaData; 3 import java.sql.SQLException; 4 public class DBMD { 5 6 public static void main(String[] args) throws SQLException { 7 java.sql.Connection conn = JdbcUtils.getConnection(); 8 D

Greenplum+Hadoop学习笔记-10-Greenplum安装

2.1.评估存储能力 2.1.1.计算可用的空间 步骤1:初始存储能力=硬盘大小*硬盘数 步骤2:配置RAID10,格式化磁盘空间=(初始存储能力*0.9)/2 步骤3:可用磁盘空间=格式化磁盘空间*0.7 步骤4:用户数据使用空间 使用镜像:(2*用户数据)+用户数据/3=可用磁盘空间 不使用镜像:用户数据+用户数据/3=可用磁盘空间 2.1.2.计算用户数据大小 平均来说,实际占用磁盘空间大小=用户数据*1.4 页面开销:32KB页面需要20 bytes 行开销:每行24 bytes,'ap