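Kettle: running jobs and transformations from the shell
A job or transformation developed in Kettle can exist as a standalone file or be stored in a repository. Either kind can be invoked from a shell script; the commands are recorded here:
A job stored in a repository: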
./kitchen.sh -rep ZYFS_REP -user admin -pass admin -param:file_name=/home/etluser/etl_data/test/etl_test.csv -dir /test -job JB_ETL_TEST
A job in a standalone file:
./kitchen.sh -file /home/rdb/JB_QFPD.kjb
A transformation in a standalone file:
./pan.sh -file /home/rdb/TR_QFPD.ktr
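When these commands are called from a scheduler, it usually helps to check the exit status. The following is a minimal sketch, assuming that kitchen.sh exits with 0 on success and a nonzero status on failure; the KETTLE_HOME default and the run_job_file helper are illustrative names, not part of Kettle.

```shell
#!/bin/sh
# KETTLE_HOME is an assumed install location; adjust it to your environment.
KETTLE_HOME="${KETTLE_HOME:-/opt/data-integration}"

# run_job_file FILE: run a standalone .kjb with kitchen.sh and report the result.
# (Hypothetical helper for illustration.)
run_job_file() {
    job_file="$1"
    status=0
    "$KETTLE_HOME/kitchen.sh" -file "$job_file" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "OK: $job_file"
    else
        echo "FAILED ($status): $job_file" >&2
    fi
    return "$status"
}

# Example call (commented out so the file can be sourced safely):
# run_job_file /home/rdb/JB_QFPD.kjb
```

The same pattern applies to pan.sh for transformations.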
kitchen.sh options
Options:
  -rep = Repository name
  -user = Repository username
  -pass = Repository password
  -job = The name of the job to launch
  -dir = The directory (don't forget the leading /)
  -file = The filename (Job XML) to launch
  -level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Minimal, Nothing)
  -logfile = The logging file to write to
  -listdir = List the directories in the repository
  -listjobs = List the jobs in the specified directory
  -listrep = List the available repositories
  -norep = Do not log into the repository
  -version = Show the version, revision and build date
  -param = Set a named parameter <NAME>=<VALUE>. For example -param:FILE=customers.csv
  -listparam = List information concerning the defined parameters in the specified job.
  -export = Exports all linked resources of the specified job. The argument is the name of a ZIP file.
  -custom = Set a custom plugin specific option as a String value in the job using <NAME>=<Value>, for example: -custom:COLOR=Red
  -maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
  -maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)
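As an illustration of combining -level and -logfile with the repository options, here is a sketch that writes one log file per run. The repository name and credentials are the ones from the example command earlier in this post; the KITCHEN and LOG_DIR defaults and the run_repo_job helper are assumptions made for this example.

```shell
#!/bin/sh
# Assumed locations; neither path comes from the article.
KITCHEN="${KITCHEN:-./kitchen.sh}"
LOG_DIR="${LOG_DIR:-/tmp/kettle_logs}"

# run_repo_job DIR JOB [LEVEL]: run a repository job and write a per-run log file.
# (Hypothetical helper for illustration.)
run_repo_job() {
    dir="$1"
    job="$2"
    level="${3:-Basic}"
    mkdir -p "$LOG_DIR"
    # Timestamped name so that successive runs do not overwrite each other.
    logfile="$LOG_DIR/${job}_$(date +%Y%m%d_%H%M%S).log"
    "$KITCHEN" -rep ZYFS_REP -user admin -pass admin \
        -dir "$dir" -job "$job" \
        -level "$level" -logfile "$logfile"
}

# Example call (commented out so the file can be sourced safely):
# run_repo_job /test JB_ETL_TEST Detailed
```

Passing the password on the command line, as the original command does, makes it visible in the process list, so a restricted account is advisable for scheduled runs.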
pan.sh options
Options:
  -rep = Repository name
  -user = Repository username
  -pass = Repository password
  -trans = The name of the transformation to launch
  -dir = The directory (don't forget the leading /)
  -file = The filename (Transformation in XML) to launch
  -level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing)
  -logfile = The logging file to write to
  -listdir = List the directories in the repository
  -listtrans = List the transformations in the specified directory
  -listrep = List the available repositories
  -exprep = Export all repository objects to one XML file
  -norep = Do not log into the repository
  -safemode = Run in safe mode: with extra checking enabled
  -version = Show the version, revision and build date
  -param = Set a named parameter <NAME>=<VALUE>. For example -param:FOO=bar
  -listparam = List information concerning the defined named parameters in the specified transformation.
  -metrics = Gather metrics during execution
  -maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
  -maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)
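To show how the -param:NAME=VALUE form can be generated from a script, here is a small sketch that turns each NAME=VALUE argument into a named parameter for pan.sh. The PAN default and the run_trans helper are hypothetical names introduced for this example.

```shell
#!/bin/sh
PAN="${PAN:-./pan.sh}"   # assumed path to pan.sh

# run_trans FILE [NAME=VALUE ...]: run a .ktr file, passing each NAME=VALUE
# as a named parameter in the -param:NAME=VALUE form.
# (Hypothetical helper for illustration.)
run_trans() {
    trans_file="$1"
    shift
    # Rebuild the positional parameters one by one so that values
    # containing spaces stay intact as single arguments.
    for pair in "$@"; do
        shift
        set -- "$@" "-param:$pair"
    done
    "$PAN" -file "$trans_file" "$@"
}

# Example call (commented out so the file can be sourced safely):
# run_trans /home/rdb/TR_QFPD.ktr FOO=bar
```

Running pan.sh with -listparam against the same file shows which named parameters the transformation actually defines.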
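Database repository vs. file repository:
A database repository is easier to share and to use across platforms, but it is weaker than a file repository for version control. It also depends on the network, so a network connection failure can cause a job to fail.
The main drawback of a file repository is cross-platform use; it is usually combined with a version control tool such as svn.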
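Since a database repository makes a job vulnerable to network failures, one possible mitigation is to retry the command a few times. This is a generic shell sketch, not a Kettle feature; the retry helper is a hypothetical name, and rerunning is only safe when the job can be executed repeatedly without side effects.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD up to MAX times, sleeping DELAY seconds
# between attempts; returns the exit status of the last attempt.
# (Hypothetical helper for illustration.)
retry() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while :; do
        status=0
        "$@" || status=$?
        [ "$status" -eq 0 ] && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts (status $status)" >&2
            return "$status"
        fi
        echo "attempt $attempt failed (status $status); retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (commented out): retry the repository job from this post up to 3 times.
# retry 3 60 ./kitchen.sh -rep ZYFS_REP -user admin -pass admin -dir /test -job JB_ETL_TEST
```

The sketch retries on any nonzero status, so it cannot tell a network error from a genuine job failure; checking the log file before retrying would be more selective.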
Original article: https://www.cnblogs.com/jnba/p/10677722.html
Time: 2024-10-09 15:15:04