Recently, I have been studying Go's net/url package.
I was puzzled by how URL strings are escaped and unescaped.
Then I found a clear and accurate explanation at: http://www.sislands.com/coin70/week6/encoder.htm
The unescape code in Go:
// unescape unescapes a string; the mode specifies
// which section of the URL string is being unescaped.
func unescape(s string, mode encoding) (string, error) {
	// Count %, check that they're well-formed.
	n := 0
	hasPlus := false
	for i := 0; i < len(s); {
		switch s[i] {
		case '%':
			n++
			if i+2 >= len(s) || !ishex(s[i+1]) || !ishex(s[i+2]) {
				s = s[i:]
				if len(s) > 3 {
					s = s[0:3]
				}
				return "", EscapeError(s)
			}
			i += 3
		case '+':
			hasPlus = mode == encodeQueryComponent
			i++
		default:
			i++
		}
	}

	if n == 0 && !hasPlus {
		return s, nil
	}

	t := make([]byte, len(s)-2*n)
	j := 0
	for i := 0; i < len(s); {
		switch s[i] {
		case '%':
			t[j] = unhex(s[i+1])<<4 | unhex(s[i+2])
			j++
			i += 3
		case '+':
			if mode == encodeQueryComponent {
				t[j] = ' '
			} else {
				t[j] = '+'
			}
			j++
			i++
		default:
			t[j] = s[i]
			j++
			i++
		}
	}
	return string(t), nil
}
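The function above is unexported, so it cannot be called directly. To watch the mode make a difference, here is a minimal sketch that goes through the exported wrappers url.QueryUnescape and url.PathUnescape. Note that decodeBoth is a hypothetical helper name I made up for this illustration; it is not part of net/url.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeBoth is a hypothetical helper (not part of net/url).
// It decodes the same input twice: once as a query component,
// where '+' means a space, and once as a path segment, where
// '+' is kept as a literal plus sign.
func decodeBoth(s string) (query, path string, err error) {
	query, err = url.QueryUnescape(s)
	if err != nil {
		return "", "", err
	}
	path, err = url.PathUnescape(s)
	if err != nil {
		return "", "", err
	}
	return query, path, nil
}

func main() {
	q, p, err := decodeBoth("a+b%20c")
	fmt.Printf("query=%q path=%q err=%v\n", q, p, err)
	// query="a b c" path="a+b c" err=<nil>

	// A '%' that is not followed by two hex digits is rejected
	// with a non-nil error (a url.EscapeError).
	_, _, err = decodeBoth("100%zz")
	fmt.Println("malformed input error:", err)
}
```

In both calls %20 becomes a space; only the handling of '+' differs, which matches the encodeQueryComponent check in the listing.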
After some time spent on this homework, I am now confident about implementing the gocrawel project by myself.
Time: 2024-12-28 08:34:27