Shell脚本-l良好的习惯 / 憋错料

原文：	http://www.javacodegeeks.com/2013/10/shell-scripting-best-practices.html#BP11
翻译：	aven

大多数编程语言都有一系列使用该语言编码需要遵循良好的编程习惯。然而，对于shell脚本我没有找到一个比较全面的，所以我决定编写一个我自己的基于我多年编写shell经验的编程习惯。

移植性的注意：自从主要编写shell脚本在安装了Bash 4.2的系统上运行，我从来不担心可移植性，你也不需要担心！下面的列表都是使用Bash 4.2（和其他现代化的shell）编写的。如果你要编写一个可移植的脚本，有些点可能不适用。无需多说，在你按照这个列表改变之后，你应该进行充足的测试。

下面是我关于shell脚本的良好的编程习惯（没有特殊的顺序）：

1. 使用函数

除非你编写非常小的脚本，使用函数将你的代码模块化，并使它易读、可重复使用和好维护。我的所有脚本使用的模板如下。如你所见，所有代码写在函数里面。脚本以main的调用开头。

#!/bin/bash    
set -e    

usage() {    
}    

my_function() {    
}    

main() {    
}    

main "[email protected]"

2. 为函数编写注释

为你的函数添加充足的文档，指定它们是敢什么的，调用它们需要哪些参数。

下面是一个例子:

# Processes a file.    
# $1 - the name of the input file    
# $2 - the name of the output file    
process_file(){    
}

3. 使用shift读取函数参数

不是使用$1,$2来获取函数参数，使用shift如下所示。这会更容易记录参数，如果你后来改变了想法。

# Processes a file.    
# $1 - the name of the input file    
# $2 - the name of the output file    
process_file(){    
    local -r input_file="$1";  shift    
    local -r output_file="$1"; shift    
}

4. 声明你的变量

如果你的变量是整型，像下面这样声明。还有，使你的变量只读，除非后面你想要在你的脚本里面修改它的值。对函数里面声明的变量使用local。这会有助于表达你的意图。如果考虑可移植性，使用typeset而不是delcare。下面是一些例子：

declare -r -i port_number=8080    
declare -r -a my_array=( apple orange )    

my_function() {    
    local -r name=apple    
}

5. 为所有变量扩展加双引号

为了防止word-splitting和文件扩展，你必须为所有变量扩展加上双引号。尤其是当你需要处理包含空白字符（或其他特殊字符）的文件名时，你必须这么做。考虑这个例子；

# create a file containing a space in its name
touch "foo bar"

declare -r my_file="foo bar"

# try rm-ing the file without quoting the variable
rm  $my_file
# it fails because rm sees two arguments: "foo" and "bar"
# rm: cannot remove `foo‘: No such file or directory
# rm: cannot remove `bar‘: No such file or directory

# need to quote the variable
rm "$my_file"

# file globbing example:
mesg="my pattern is *.txt"
echo $mesg
# this is not quoted so *.txt will undergo expansion
# will print "my pattern is foo.txt bar.txt"

# need to quote it for correct output
echo "$msg"

为你所有的变量加上双引号是一个好习惯。如果你需要word-splitting，考虑数组替换。参考下一个点。

6. 适当的时候使用数组

不要在字符串里面存储元素的集合。使用一个数组替换。例如：

# using a string to hold a collection
declare -r hosts="host1 host2 host3"
for host in $hosts  # not quoting $hosts here, since we want word splitting
do
    echo "$host"
done

# use an array instead!
declare -r -a host_array=( host1 host2 host3 )
for host in "${host_array[@]}"
do
    echo "$host"
done

7. 使用"[email protected]"引用所有变量

不要使用$*。参考我之前的文章 Difference between $*, [email protected], “$*” and “[email protected]”。这里是一个例子：

main() {
    # print each argument
    for i in "[email protected]"
    do
        echo "$i"
    done
}
# pass all arguments to main
main "[email protected]"

8. 仅对环境变量使用大写

我的个人偏好是所有变量都使用小写，除非是环境变量。例如：

declare -i port_number=8080

# JAVA_HOME and CLASSPATH are environment variables
"$JAVA_HOME"/bin/java -cp "$CLASSPATH" app.Main "$port_number"

9. 倾向于使用shell自带命令而不是扩展程序

shell有能力去处理字符串和简单的算术，所以你不必调用像cut和sed这样的程序。这里是一些例子：

declare -r my_file="/var/tmp/blah"

# instead of dirname, use:
declare -r file_dir="{my_file%/*}"

# instead of basename, use:
declare -r file_base="{my_file##*/}"

# instead of sed ‘s/blah/hello‘, use:
declare -r new_file="${my_file/blah/hello}"

# instead of bc <<< "2+2", use:
echo $(( 2+2 ))

# instead of grepping a pattern in a string, use:
[[ $line =~ .*blah$ ]]

# instead of cut -d:, use an array:
IFS=: read -a arr <<< "one:two:three"

注意，当出来大文件或输入的时候，扩展程序将会表现的更好。

10. 避免非必须管道

管道会增加而外的开销，所以尽量保持你的管道数量最少。通常没用的例子是cat和echo，如下所示：

1. 避免非必须的cat

如果你不熟悉没用cat的判定，看这里。cat命令只应该被使用于一系列的文件，而不是把输出发送给另一个命令。

# instead of
cat file | command
# use
command < file

2. 避免使用非必须的echo

仅当你想要输出文本到stdout、stderr、文件等。如果你想要发送文本到另外一个命令，不要使用echo通过管道发送。使用一个here-string替换。注意here-string不是可移植的（但是大多数现代shell支持它们）。所以使用一个注释，如果编写一个可移植的脚本。（参考我之前的文章：Useless Use of Echo。）

# instead of
echo text | command
# use
command <<< text

# for portability, use a heredoc
command << END
text
END

3. 避免非必须的grep

从grep到awk或sed的管道是没必要的。自从awk和sed都能grep，你就不必再用管道来grep了。（参考我之前的文章：Useless Use of Grep）

# instead of
grep pattern file | awk ‘{print $1}‘
# use
awk ‘/pattern/{print $1}‘

# instead of
grep pattern file | sed ‘s/foo/bar/g‘
# use
sed -n ‘/pattern/{s/foo/bar/p}‘ file

4. 其他非必要的管道

这里是一些例子：

# instead of
command | sort | uniq
# use
command | sort -u

# instead of
command | grep pattern | wc -l
# use
command | grep -c pattern

11. 避免解析ls

ls的问题是以新的行输出文件名，所以如果你的文件名包含一个换行字符，你将不能正确的解析它。如果ls能够输出没有分隔符的文件名就更好了，不幸的是，它不能。除了ls，使用文件扩展或者一个能输出没有分隔符的替换命令，比如find -print0.

12. 使用globbing

Globbing（或文件扩展）是shell产生匹配模式的一系列文件的方法。在bash，你可以通过打开扩展模式匹配符使用extglob选项来使globbing更加强大。还有，你可以打开nullglob，来得到一个空列表如果没有匹配找到。在一些场合可以使用globbing而不是find，再次声明，不要解析ls！这里是一些例子：

shopt -s nullglob
shopt -s extglob

# get all files with a .yyyymmdd.txt suffix
declare -a dated_files=( *.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].txt )

# get all non-zip files
declare -a non_zip_files=( !(*.zip) )

13. 尽可能使用无分隔输出

为了正确处理包含空白字符和换行符的文件名，你需要使用无分隔输出，每一行以NUL (00)分隔而不是换行符。大多数程序支持这个。例如，find -pirnt0 输出以null字符结尾的文件名，还有xargs -0读取以null字符分隔的参数。

# instead of
find . -type f -mtime +5 | xargs rm -f
# use
find . -type f -mtime +5 -print0 | xargs -0 rm -f

# looping over files
find . -type f -print0 | while IFS= read -r -d $‘‘ filename; do
    echo "$filename"
done

14. 不要使用反斜杠

使用$(command)而不是`command`，因为更容易嵌套多个命令和更易读。这是一个简单的例子：

# ugly escaping required when using nested backticks
a=`command1 \`command2\``

# $(...) is cleaner
b=$(command1 $(command2))

15. 使用进程替换而不是使用临时文件

大多数情况下，如果命令使用一个文件作为输入，文件可以被替代为另一个命令使用：<(command).这将使你避免写到一个临时文件，传递临时文件到一个命令，最后删除临时文件。如下所示：

# using temp files
command1 > file1
command2 > file2
diff file1 file2
rm file1 file2

# using process substitution
diff <(command1) <(command2)

16. 使用mktemp如果必须创建临时文件

尽量避免创建临时文件。如果必须的话，使用mktemp来创建临时文件夹，然后把文件写进里面。保证在你执行完后目录被移除。

# set up a trap to delete the temp dir when the script exits
unset temp_dir
trap ‘[[ -d "$temp_dir" ]] && rm -rf "$temp_dir"‘ EXIT

# create the temp dir
declare -r temp_dir=$(mktemp -dt myapp.XXXXXX)

# write to the temp dir
command > "$temp_dir"/foo

17. 对于判断条件使用((和[[

使用[[ ... ]]而不是[ ... ]，因为它更安全，并且提供更丰富的特性。对于算术条件使用(( ... ))，因为它运行你使用更相似的数学操作符比如<和>,而不是-lt和-gt. 注意，如果你想要为可移植设计，你必须保留旧的方式[ ... ].这是一些例子：

[[ $foo == "foo" ]] && echo "match"  # don‘t need to quote variable inside [[
[[ $foo == "a" && $bar == "a" ]] && echo "match"

declare -i num=5
(( num < 10 )) && echo "match"       # don‘t need the $ on $num in ((

18. 在判断条件里面使用命令，而不是使用退出状态

如果你想要检查一个命令在做其他事情之前是否返回成功，直接在判断条件里面使用命令，而不是检查命令的退出状态。

# don‘t use exit status
grep -q pattern file
if (( $? == 0 ))
then
    echo "pattern was found"
fi

# use the command as the condition
if grep -q pattern file
then
    echo "pattern was found"
fi

19. 使用set -e

把这个放在你脚本的最上面。这个是告诉shell脚本尽快地退出如果任何语句返回非0的退出码

20. 将错误信息输出到stderr

错误信息属于stderr而不是stdout

echo "An error message" >&2

时间： 2025-01-15 16:07:21

Shell脚本-l良好的习惯