Today I wrote an awk script that extracts HTTP addresses from text. An address is assumed to contain only digits, letters, slashes and dots. The script is as follows:
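#! /bin/awk -f
{
    httpIndex = index($0, "http://")
    if (httpIndex > 0)
    {
        match($0, /http:\/\/[[:alnum:].\/]+/)                    # "http://" plus letters, digits, dots, slashes
        httpstr = substr($0, RSTART, RSTART + RLENGTH - 1)       # third argument is a length, so this overshoots when RSTART > 1
        match(httpstr, /http:\/\/[[:alnum:].\/]+/)               # line 8
        httpstr = substr(httpstr, RSTART, RSTART + RLENGTH - 1)  # line 9: RSTART is 1 here, so the length equals RLENGTH
        print httpstr
    }
}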
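I tested the script on 1,000 lines of text and found that the two lines marked line 8 and line 9 are needed; without them, some addresses are always followed by extra characters, such as a space. This happens because the third argument of substr() is a length, not an end position, so the first call returns RSTART - 1 characters too many. In the second pass the match starts at position 1, so the same expression equals RLENGTH and the extra characters are dropped.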
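The second pass should not be needed if substr() is given a length directly. Here is a minimal sketch of that idea, assuming the same character set as the script (letters, digits, dots and slashes). The name extract_http is made up for this illustration, and the awk program is wrapped in a shell function only so it can be tried from the command line:

```shell
#!/bin/sh
# extract_http is a hypothetical helper name, not part of the original script.
# It prints the first http:// address found on each input line.
extract_http() {
    awk '
    # match() returns the start position (0 when nothing matches)
    # and sets RSTART and RLENGTH as a side effect.
    match($0, /http:\/\/[[:alnum:].\/]+/) {
        # The third argument of substr() is a length, so RLENGTH is passed as is.
        print substr($0, RSTART, RLENGTH)
    }'
}
```

For example, `printf 'see http://example.com/a/b.html now\n' | extract_http` prints `http://example.com/a/b.html`. Using match() as the pattern also makes the separate index() check unnecessary, since lines without an address are simply skipped.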
Time: 2024-10-15 03:41:00