whatweb.rb 未完待续


#!/usr/bin/env ruby                #表示ruby的执行环境
=begin # ruby中用=begin来表示注释的开始

.$$$ $. .$$$ $.
$$$$ $$. .$$$ $$$ .$$$$$$. .$$$$$$$$$$. $$$$ $$. .$$$$$$$. .$$$$$$.
$ $$ $$$ $ $$ $$$ $ $$$$$$. $$$$$ $$$$$$ $ $$ $$$ $ $$ $$ $ $$$$$$.
$ `$ $$$ $ `$ $$$ $ `$ $$$ $$‘ $ `$ `$$ $ `$ $$$ $ `$ $ `$ $$$‘
$. $ $$$ $. $$$$$$ $. $$$$$$ `$ $. $ :‘ $. $ $$$ $. $$$$ $. $$$$$.
$::$ . $$$ $::$ $$$ $::$ $$$ $::$ $::$ . $$$ $::$ $::$ $$$$
$;;$ $$$ $$$ $;;$ $$$ $;;$ $$$ $;;$ $;;$ $$$ $$$ $;;$ $;;$ $$$$
$$$$$$ $$$$$ $$$$ $$$ $$$$ $$$ $$$$ $$$$$$ $$$$$ $$$$$$$$$ $$$$$$$$$‘

WhatWeb - Next generation web scanner.
Author: Andrew Horton aka urbanadventurer

Homepage: http://www.morningstarsecurity.com/research/whatweb

Copyright 2009-2012 Andrew Horton <andrew at morningstarsecurity dot com>

This file is part of WhatWeb.

WhatWeb is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.

WhatWeb is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with WhatWeb. If not, see <http://www.gnu.org/licenses/>.
=end # ruby中用=end 表示注释的结束

#require ‘profile‘
require ‘getoptlong‘
require ‘pp‘
require ‘net/http‘
require ‘open-uri‘
require ‘cgi‘
require ‘thread‘
require ‘tempfile‘
require ‘rbconfig‘ # detect environment, e.g. windows or linux
require ‘resolv‘
require ‘resolv-replace‘ # asynchronous DNS

### ruby 1.9 changes
if RUBY_VERSION =~ /^1\.9/ || RUBY_VERSION =~ /^2\.0/ #ruby 中使用=~ 运算符根据正则表达式测试一个字符串,这里用正则判断ruby的版本
require ‘digest/md5‘
else
require ‘md5‘
require ‘net/https‘
end

### gem detection & loading
def gem_available?(gemname)
# gem_available_new_rubygems?(gemname) or gem_available_old_rubygems?(gemname)
if defined?(Gem::Specification) and defined?(Gem::Specification.find_by_name)
gem_available_new_rubygems?(gemname)
else
gem_available_old_rubygems?(gemname)
end
end

# Needed by Ruby 1.8.7 (2010-08-16 patchlevel 302) used as stable in Debian in Aug2012.
def gem_available_old_rubygems?(gemname)
Gem.available?(gemname)
end

def gem_available_new_rubygems?(gemname)
begin
true if Gem::Specification.find_by_name(gemname)
rescue LoadError
false
end
end

gems = %w|json mongo rchardet | #### gems = ["json", "mongo", "rchardet"]

gems.each do |thisgem| #### 遍历gems,加载json,mongo,rchardet
begin
require ‘rubygems‘ # rubygems is optional
if gem_available?(thisgem) #
require thisgem
else
end
rescue LoadError
# that failed.. no big deal
raise if $WWDEBUG==true
end
end

## set up load paths - must be before loading lib/ files ####设置加载路径,必须在加载 lib 下的文件之前设置
# add the directory of the file currently being executed to the load path
$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless
$:.include?(File.dirname(__FILE__)) || $LOAD_PATH.include?(File.expand_path(File.dirname(__FILE__)))
$LOAD_PATH << "/usr/share/whatweb/"

# if __FILE__ is a symlink then follow *every* symlink ####如果文件是链接形式存在,要找到文件的真实路径
if File.symlink?(__FILE__)
require ‘pathname‘
$LOAD_PATH << File.dirname( Pathname.new(__FILE__).realpath )
end

require ‘lib/target.rb‘
require ‘lib/plugins.rb‘
require ‘lib/output.rb‘
require ‘lib/colour.rb‘
require ‘lib/tld.rb‘
require ‘lib/extend-http.rb‘
require ‘lib/version_class.rb‘

#### 加载插件目录
# look through LOAD_PATH for the following plugin directories. Could be in same dir as whatweb or /usr/share/whatweb, etc
PLUGIN_DIRS=[ "plugins", "my-plugins"].map {|x| $LOAD_PATH.map {|y| y+"/"+x if File.exists?(y+"/"+x) } }.flatten.compact

# nothing says pro-developer like using global variables
$VERSION="0.4.8-dev" #### 程序版本
$WWDEBUG = false # raise exceptions in plugins, etc #### 是否开启调试模式
$verbose=0 # $VERBOSE is reserved in ruby #### 冗长模式,在ruby中是预留的
$USE_EXAMPLE_URLS=false #### 使用测试URL
$use_colour="auto" #### 使用颜色
$USER_AGENT="WhatWeb/#{$VERSION}" #### HTTP 中的User-Agent
$MAX_THREADS=25 #### 最大线程数为25
$AGGRESSION=1 #### 攻击性
$FOLLOW_REDIRECT="always" #### 跟踪重定向
$MAX_REDIRECTS=10 #### 重定向深度
$USE_PROXY=false #### 使用代理
$PROXY_HOST=nil
$PROXY_PORT=8080
$PROXY_USER=nil
$PROXY_PASS=nil
$URL_PREFIX=""
$URL_SUFFIX=""
$URL_PATTERN=nil
$NO_THREADS=false
$HTTP_OPEN_TIMEOUT=15
$HTTP_READ_TIMEOUT=30
$WAIT=nil #### 延时
$OUTPUT_ERRORS=nil
$QUIET=false
$CUSTOM_HEADERS={}
$BASIC_AUTH_USER=nil
$BASIC_AUTH_PASS=nil
$PLUGIN_TIMES=Hash.new(0) #### 创建hash表
$NO_ERRORS=false

### matching

# fuzzy matching ftw
def make_tag_pattern(b)
# remove stuff between script and /script
# don‘t bother with !--, --> or noscript and /noscript
inscript=false;

b.scan(/<([^\s>]*)/).flatten.map {|x| x.downcase!; r=nil;
r=x if inscript==false
inscript=true if x=="script"
(inscript=false; r=x) if x=="/script"
r
}.compact.join(",")
end

def decode_html_entities(s)
t = s.dup
html_entities = { "&quot;"=>‘"‘, "&apos;"=>"‘", "&amp;"=>"&", "&lt;"=>"<", "&gt;"=>">" }
html_entities.each_pair {|from,to| t.gsub!(from,to) }
t
end

def certainty_to_words(p)
case p
when 0..49
"maybe"
when 50..99
"probably"
when 100
"certain"
end
end

# some plugins want a random string in URLs
def randstr
rand(36**8).to_s(36)
end

def match_ghdb(ghdb, body, meta, status, base_uri)
# this could be made faster by creating code to eval once for each plugin

pp "match_ghdb",ghdb if $verbose > 2

# take a GHDB string and turn it into code to be evaluated
matches=[] # fill with true or false. succeeds if all true
s = ghdb

# does it contain intitle?
if s =~ /intitle:/i
# extract either the next word or the following words enclosed in "s, it can‘t possibly be both
intitle = (s.scan(/intitle:"([^"]*)"/i) + s.scan(/intitle:([^"]\w+)/i)).to_s
matches << ((body =~ /<title>[^<]*#{Regexp.escape(intitle)}[^<]*<\/title>/i).nil? ? false : true)
# strip out the intitle: part
s=s.gsub(/intitle:"([^"]*)"/i,‘‘).gsub(/intitle:([^"]\w+)/i,‘‘)
end

if s =~ /filetype:/i
filetype = (s.scan(/filetype:"([^"]*)"/i) + s.scan(/filetype:([^"]\w+)/i)).to_s
# lame method: check if the URL ends in the filetype
unless base_uri.nil?
unless base_uri.path.split("?")[0].nil?
matches << ((base_uri.path.split("?")[0] =~ /#{Regexp.escape(filetype)}$/i).nil? ? false : true)
end
end
s=s.gsub(/filetype:"([^"]*)"/i,‘‘).gsub(/filetype:([^"]\w+)/i,‘‘)
end

if s =~ /inurl:/i
inurl = (s.scan(/inurl:"([^"]*)"/i) + s.scan(/inurl:([^"]\w+)/i)).flatten
# can occur multiple times.
inurl.each {|x| matches << ((base_uri.to_s =~ /#{Regexp.escape(x)}/i).nil? ? false : true) }
# strip out the inurl: part
s=s.gsub(/inurl:"([^"]*)"/i,‘‘).gsub(/inurl:([^"]\w+)/i,‘‘)
end

# split the remaining words except those enclosed in quotes, remove the quotes and sort them

remaining_words = s.scan(/([^ "]+)|("[^"]+")/i).flatten.compact.each {|w| w.delete!(‘"‘) }.sort.uniq

pp "Remaining GHDB words", remaining_words if $verbose > 2

remaining_words.each do |w|
# does it start with a - ?
if w[0..0] == ‘-‘
# reverse true/false if it begins with a -
matches << ((body =~ /#{Regexp.escape(w[1..-1])}/i).nil? ? true : false)
else
w = w[1..-1] if w[0..0] == ‘+‘ # if it starts with +, ignore the 1st char
matches << ((body =~ /#{Regexp.escape(w)}/i).nil? ? false : true)
end
end

pp matches if $verbose > 2

# if all matcbhes are true, then true
if matches.uniq == [true]
true
else
false
end
end

### targets

def make_target_list(cmdline_args, inputfile=nil,pluginlist = nil)
url_list = cmdline_args

# read each line as a url, skipping lines that begin with a #
if !inputfile.nil? and File.exists?(inputfile)
pp "loading input file: #{inputfile}" if $verbose > 2
url_list += File.open(inputfile).readlines.each {|line| line.strip! }.delete_if {|line| line =~ /^#.*/ }.each {|line| line.delete!("\n") }
end

# add example urls to url_list if required. plugins must be loaded already
if $USE_EXAMPLE_URLS
url_list += pluginlist.map {|name,plugin| plugin.examples unless plugin.examples.nil? }.compact.flatten
end

genrange=url_list.map do |x|
range=nil
if x =~ /^[0-9\.\-*\/]+$/ and not x =~ /^[\d\.]+$/
# check for nmap
error "Target ranges require nmap to be in the path" if `which nmap` == ""
range=`nmap -n -sL #{x} 2>&1 | egrep -o "([0-9]{1,3}\\.){3}[0-9]{1,3}"`
range=range.split("\n")
end
range
end.compact.flatten

url_list= url_list.select {|x| not x =~ /^[0-9\.\-*\/]+$/ or x =~ /^[\d\.]+$/ }
url_list += genrange unless genrange.empty?

#make urls friendlier, test if it‘s a file, if test for not assume it‘s http://
# http, https, ftp, etc
url_list=url_list.map do |x|
if File.exists?(x)
x
else
# use url pattern
if $URL_PATTERN
x = $URL_PATTERN.gsub(‘%insert%‘,x)
end
# add prefix & suffix
x=$URL_PREFIX + x + $URL_SUFFIX

if x =~ (/^[a-z]+:\/\//)
x
else
x.sub(/^/,"http://")
end
end
end

url_list=url_list.flatten #.sort.uniq
end

def next_target
t=nil

puts "Target List:" + $targets.inspect if $verbose > 2
#
while $recent_targets.include?(t) or t.nil?
t=$targets.shift
puts "# t at the end of the $targets list" if $verbose > 2
# t at the end of the $targets list
if t.nil?
puts "t is nil" if $verbose > 2
if $targets.empty?
if Thread.list.size > 1
if $verbose > 2
puts "Thread list size: #{Thread.list.size}"
Thread.list.each do |thread|
puts "Thread: #{thread.inspect} is #{thread.status}"
end
end
#sleep 1
Thread.pass
else
puts "breaking now" if $verbose > 2
break
end
end
end
end

puts "Recent Targets:" + $recent_targets.join(",") if $verbose > 2

$recent_targets.push t
$recent_targets.pop if $recent_targets.size > 100 # we dont need to care about mroe than 100

t
end

# backwards compatible convenience method for plugins to use
def open_target(url)
newt = Target.new(url)
newt.open
[newt.status,newt.uri,newt.ip,newt.body,newt.headers,newt.raw_headers]
end

### output

def error(s)
return if $NO_ERRORS
if defined?($semaphore)
# We want the output mutex locked.
# Has our current thread already locked the Mutex?
begin
$semaphore.lock
rescue ThreadError
# we‘re already locked. This was expected.
end
end
if ($use_colour=="auto") or ($use_colour=="always")
STDERR.puts red(s)
else
STDERR.puts s
end
STDERR.flush
unless $OUTPUT_ERRORS.nil?
$OUTPUT_ERRORS.out(s)
end
$semaphore.unlock if defined?($semaphore)
end

# takes a string and returns an array of lines. used by plugin_info
def word_wrap(s,width=10)
ret=[]
line=""
s.split.map {|x|
word=x
if line.size + x.size + 1 <= width
line += x + " "
else
if word.size > width
ret << line
line = ""
w=word.clone
while w.size > width
ret << w[0..(width-1)]
w=w[width.to_i..-1]
end
ret << w unless w.size == 0
else
ret << line
line=x + " "
end
end
}
ret << line
ret
end

### core

def run_plugins(target)
results=[]
$plugins_to_use.each do |name,plugin|
begin
while plugin.locked?
#sleep 0.1
puts "Waiting for plugin:#{name} to unlock" if $verbose > 2
Thread.pass
end
plugin.lock

plugin.init(target)

# eXecute the plugin
#start_time = Time.now
result=plugin.x
#end_time = Time.now
#$PLUGIN_TIMES[name] += end_time - start_time

plugin.unlock

rescue Exception => err
error("ERROR: Plugin #{name} failed for #{target.to_s}. #{err}")
plugin.unlock
raise if $WWDEBUG==true
end
results << [name, result] unless result.nil? or result.empty?
end
results
end

def usage()
puts"
.$$$ $. .$$$ $.
$$$$ $$. .$$$ $$$ .$$$$$$. .$$$$$$$$$$. $$$$ $$. .$$$$$$$. .$$$$$$.
$ $$ $$$ $ $$ $$$ $ $$$$$$. $$$$$ $$$$$$ $ $$ $$$ $ $$ $$ $ $$$$$$.
$ `$ $$$ $ `$ $$$ $ `$ $$$ $$‘ $ `$ `$$ $ `$ $$$ $ `$ $ `$ $$$‘
$. $ $$$ $. $$$$$$ $. $$$$$$ `$ $. $ :‘ $. $ $$$ $. $$$$ $. $$$$$.
$::$ . $$$ $::$ $$$ $::$ $$$ $::$ $::$ . $$$ $::$ $::$ $$$$
$;;$ $$$ $$$ $;;$ $$$ $;;$ $$$ $;;$ $;;$ $$$ $$$ $;;$ $;;$ $$$$
$$$$$$ $$$$$ $$$$ $$$ $$$$ $$$ $$$$ $$$$$$ $$$$$ $$$$$$$$$ $$$$$$$$$‘

"

puts "WhatWeb - Next generation web scanner version #{$VERSION}.\nDeveloped by Andrew Horton aka urbanadventurer and Brendan Coles"
puts "Homepage: http://www.morningstarsecurity.com/research/whatweb"
puts
puts "Usage: whatweb [options] <URLs>"
puts "
TARGET SELECTION:
<URLs>\t\tEnter URLs, filenames or nmap-format IP ranges.
\t\t\tUse /dev/stdin to pipe HTML directly
--input-file=FILE, -i\tIdentify URLs found in FILE. You can pipe
\t\t\thostnames or URLs directly with -i /dev/stdin

TARGET MODIFICATION:
--url-prefix\t\tAdd a prefix to target URLs
--url-suffix\t\tAdd a suffix to target URLs
--url-pattern\t\tInsert the targets into a URL. Requires --input-file,
\t\t\teg. www.example.com/%insert%/robots.txt

AGGRESSION:
The aggression level controls the trade-off between speed/stealth and
reliability.
--aggression, -a=LEVEL Set the aggression level. Default: 1
Aggression levels are:
1. Stealthy\tMakes one HTTP request per target. Also follows redirects.
2. Unused
3. Aggressive\tCan make a handful of HTTP requests per target. This triggers
\t\taggressive plugins for targets only when those plugins are
\t\tidentified with a level 1 request first.
4. Heavy\tMakes a lot of HTTP requests per target. Aggressive tests from
\t\tall plugins are used for all URLs.

HTTP OPTIONS:
--user-agent, -U=AGENT Identify as AGENT instead of WhatWeb/#{$VERSION}.
--header, -H\t\tAdd an HTTP header. eg \"Foo:Bar\". Specifying a default
\t\t\theader will replace it. Specifying an empty value, eg.
\t\t\t\"User-Agent:\" will remove the header.
--follow-redirect=WHEN Control when to follow redirects. WHEN may be `never‘,
\t\t\t`http-only‘, `meta-only‘, `same-site‘, `same-domain‘
\t\t\tor `always‘. Default: #{$FOLLOW_REDIRECT}
--max-redirects=NUM\tMaximum number of contiguous redirects. Default: 10

AUTHENTICATION:
--user, -u=<user:password> HTTP basic authentication
--cookie, -c=COOKIES\tProvide cookies, e.g. ‘name=value; name2=value2 ‘

PROXY:
--proxy\t\t<hostname[:port]> Set proxy hostname and port
\t\t\tDefault: #{$PROXY_PORT}
--proxy-user\t\t<username:password> Set proxy user and password

PLUGINS:
--list-plugins, -l\tList all plugins
--plugins, -p=LIST\tSelect plugins. LIST is a comma delimited set of
\t\t\tselected plugins. Default is all.
\t\t\tEach element can be a directory, file or plugin name and
\t\t\tcan optionally have a modifier, eg. + or -
\t\t\tExamples: +/tmp/moo.rb,+/tmp/foo.rb
\t\t\ttitle,md5,+./plugins-disabled/
\t\t\t./plugins-disabled,-md5
\t\t\t-p + is a shortcut for -p +plugins-disabled
--info-plugins, -I=PLUGINS\tDisplay detailed information for plugins.
\t\t\tOptionally search with keywords in a comma delimited
\t\t\tlist.
--grep, -g=STRING\tSearch for STRING in HTTP responses. Reports with a
\t\t\tplugin named Grep
--custom-plugin=DEFINITION\tDefine a custom plugin named Custom-Plugin,
\t\t\tExamples: \":text=>‘powered by abc‘\"
\t\t\t\":version=>/powered[ ]?by ab[0-9]/\"
\t\t\t\":ghdb=>‘intitle:abc \\\"powered by abc\\\"‘\"
\t\t\t\":md5=>‘8666257030b94d3bdb46e05945f60b42‘\"
\t\t\t\"{:text=>‘powered by abc‘},{:regexp=>/abc [ ]?1/i}\"
--dorks=PLUGIN\tList google dorks for the selected plugin
--example-urls, -e=PLUGIN Update the target list with example URLs from
\t\t\tthe selected plugins.

OUTPUT:
--verbose, -v\t\tVerbose output includes plugin descriptions. Use twice
\t\t\tfor debugging.
--colour,--color=WHEN\tcontrol whether colour is used. WHEN may be `never‘,
\t\t\t`always‘, or `auto‘
--quiet, -q\t\tDo not display brief logging to STDOUT
--no-errors\t\tSuppress error messages

LOGGING:
--log-brief=FILE\tLog brief, one-line output
--log-verbose=FILE\tLog verbose output
--log-xml=FILE\tLog XML format
--log-json=FILE\tLog JSON format
"+
# --log-sql=FILE\tLog SQL INSERT statements
# --log-sql-create=FILE\tCreate SQL database tables
" --log-json-verbose=FILE Log JSON Verbose format
--log-magictree=FILE\tLog MagicTree XML format
--log-object=FILE\tLog Ruby object inspection format
--log-mongo-database\tName of the MongoDB database
--log-mongo-collection Name of the MongoDB collection. Default: whatweb
--log-mongo-host\tMongoDB hostname or IP address. Default: 0.0.0.0
--log-mongo-username\tMongoDB username. Default: nil
--log-mongo-password\tMongoDB password. Default: nil
--log-errors=FILE\tLog errors

PERFORMANCE & STABILITY:
--max-threads, -t\tNumber of simultaneous threads. Default: #{$MAX_THREADS}.
--open-timeout\tTime in seconds. Default: #{$HTTP_OPEN_TIMEOUT}
--read-timeout\tTime in seconds. Default: #{$HTTP_READ_TIMEOUT}
--wait=SECONDS\tWait SECONDS between connections
\t\t\tThis is useful when using a single thread.

HELP & MISCELLANEOUS:
--help, -h\t\tThis help
--debug\t\tRaise errors in plugins
--version\t\tDisplay version information. (WhatWeb #{$VERSION})

EXAMPLE USAGE:
* Scan example.com
whatweb example.com
* Scan reddit.com slashdot.org with verbose plugin descriptions
whatweb -v reddit.com slashdot.org
* An aggressive scan of mashable.com detects the exact version of Wordpress
whatweb -a 3 mashable.com
* Scan the local network quickly with 255 threads and suppress errors
whatweb --no-errors -t 255 192.168.0.0/24\n"

suggestions=""
suggestions << "To enable JSON logging install the json gem.\n" unless gem_available?(‘json‘)
suggestions << "To enable MongoDB logging install the mongo gem.\n" unless gem_available?(‘mongo‘)
suggestions << "To enable character set detection and MongoDB logging install the rchardet gem.\n" unless gem_available?(‘rchardet‘)

unless suggestions.empty?
print "\nOPTIONAL DEPENDENCIES\n--------------------------------------------------------------------------------\n" + suggestions + "\n"
end
if RUBY_VERSION =~ /^1\.9/
puts "WARNING: Ruby 1.9 support is experimental. For stable usage use Ruby 1.8 instead. Please report bugs at https://github.com/urbanadventurer/WhatWeb/issue\n\n"
end
puts

end

if ARGV.size==0 # faster usage info
usage
exit
end

plugin_selection=nil
use_custom_plugin=false
use_custom_grep_plugin=false
input_file=nil
output_list = []
mongo={}
mongo[:use_mongo_log]=false

opts = GetoptLong.new(
[ ‘--help‘, ‘-h‘, GetoptLong::NO_ARGUMENT ],
[ ‘-v‘,‘--verbose‘, GetoptLong::NO_ARGUMENT ],
[ ‘-l‘,‘--list-plugins‘, GetoptLong::NO_ARGUMENT ],
[ ‘-p‘,‘--plugins‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-I‘,‘--info-plugins‘, GetoptLong::OPTIONAL_ARGUMENT ],
[ ‘--dorks‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-e‘,‘--example-urls‘, GetoptLong::NO_ARGUMENT ],
[ ‘--colour‘,‘--color‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-object‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-brief‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-xml‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-json‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-json-verbose‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-magictree‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-verbose‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-mongo-collection‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-mongo-host‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-mongo-database‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-mongo-username‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-mongo-password‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-sql‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-sql-create‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--log-errors‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--no-errors‘, GetoptLong::NO_ARGUMENT ],
[ ‘-i‘,‘--input-file‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-U‘,‘--user-agent‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-a‘,‘--aggression‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-t‘,‘--max-threads‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--follow-redirect‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--max-redirects‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--proxy‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--proxy-user‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--url-prefix‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--url-suffix‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--url-pattern‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--custom-plugin‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘-g‘,‘--grep‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--open-timeout‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--read-timeout‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--header‘,‘-H‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--cookie‘,‘-c‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--user‘,‘-u‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--wait‘, GetoptLong::REQUIRED_ARGUMENT ],
[ ‘--debug‘, GetoptLong::NO_ARGUMENT ],
[ ‘--version‘, GetoptLong::NO_ARGUMENT ],
[ ‘-q‘,‘--quiet‘, GetoptLong::NO_ARGUMENT]
)

begin
opts.each do |opt, arg|
case opt
when ‘-i‘,‘--input-file‘
input_file=arg
when ‘-l‘,‘--list-plugins‘
PluginSupport.load_plugins
PluginSupport.plugin_list
exit
when ‘-p‘,‘--plugins‘
plugin_selection=arg
when ‘-I‘,‘--info-plugins‘
PluginSupport.load_plugins
PluginSupport.plugin_info(arg.split(","))
exit
when ‘--dorks‘
PluginSupport.load_plugins
PluginSupport.plugin_dorks(arg)
exit
when ‘-e‘,‘--example-urls‘
$USE_EXAMPLE_URLS=true

when ‘--color‘,‘--colour‘
case arg
when ‘auto‘
$use_colour="auto"
when ‘always‘
$use_colour="always"
when ‘never‘
$use_colour=false
else
raise("--colour argument not recognized")
end
when ‘--log-object‘
output_list << OutputObject.new(arg)
when ‘--log-brief‘
output_list << OutputBrief.new(arg)
when ‘--log-xml‘
output_list << OutputXML.new(arg)
when ‘--log-magictree‘
output_list << OutputMagicTreeXML.new(arg)
when ‘--log-verbose‘
output_list << OutputVerbose.new(arg)
when ‘--log-sql‘
output_list << OutputSQL.new(arg)
when ‘--log-sql-create‘
PluginSupport.load_plugins("+")
# delete the file if it already exists
begin
File.delete(arg)
rescue
end
OutputSQL.new(arg).create_tables
puts "SQL CREATE statements written to #{arg}"
exit
when ‘--log-json‘
if defined?(JSON)
output_list << OutputJSON.new(arg)
else
raise("Sorry. The JSON gem is required for JSON output")
end
when ‘--log-json-verbose‘
if defined?(JSON)
output_list << OutputJSONVerbose.new(arg)
else
raise("Sorry. The JSON gem is required for JSONVerbose output")
end
when ‘--log-mongo-collection‘
if defined?(Mongo) and defined?(CharDet)
mongo[:collection]=arg
mongo[:use_mongo_log]=true
else
raise("Sorry. The mongo and rchardet gems are required for Mongo output")
end

when ‘--log-mongo-host‘
if defined?(Mongo) and defined?(CharDet)
mongo[:host]=arg
mongo[:use_mongo_log]=true
else
raise("Sorry. The mongo and rchardet gems are required for Mongo output")
end

when ‘--log-mongo-database‘
if defined?(Mongo) and defined?(CharDet)
mongo[:database]=arg
mongo[:use_mongo_log]=true
else
raise("Sorry. The mongo and rchardet gems are required for Mongo output")
end
when ‘--log-mongo-username‘
if defined?(Mongo) and defined?(CharDet)
mongo[:username]=arg
mongo[:use_mongo_log]=true
else
raise("Sorry. The mongo and rchardet gems are required for Mongo output")
end
when ‘--log-mongo-password‘
if defined?(Mongo) and defined?(CharDet)
mongo[:password]=arg
mongo[:use_mongo_log]=true
else
raise("Sorry. The mongo and rchardet gems are required for Mongo output")
end
when ‘--log-errors‘
$OUTPUT_ERRORS = OutputErrors.new(arg)
when ‘--no-errors‘
$NO_ERRORS = true
when ‘-U‘,‘--user-agent‘
$USER_AGENT=arg
when ‘-t‘,‘--max-threads‘
$MAX_THREADS=arg.to_i
when ‘-a‘,‘--aggression‘
$AGGRESSION=arg.to_i
when ‘--proxy‘
$USE_PROXY=true
$PROXY_HOST = arg.to_s.split(":")[0]
$PROXY_PORT = arg.to_s.split(":")[1].to_i if arg.to_s.include?(":")
when ‘--proxy-user‘
$PROXY_USER=arg.to_s.split(":")[0]
$PROXY_PASS=arg.to_s.scan(/^[^:]*:(.+)/).to_s if arg =~ /^[^:]*:(.+)/
when ‘-q‘,‘--quiet‘
$QUIET=true
when ‘--url-prefix‘
$URL_PREFIX=arg
when ‘--url-suffix‘
$URL_SUFFIX=arg
when ‘--url-pattern‘
$URL_PATTERN=arg
when ‘--custom-plugin‘
use_custom_plugin=true if PluginSupport.custom_plugin(arg)
when ‘--grep‘,‘-g‘
use_custom_grep_plugin=true if PluginSupport.custom_plugin(arg,"grep")
when ‘--follow-redirect‘
if ["never","http-only","meta-only","same-site","same-domain","always"].include?(arg.downcase)
$FOLLOW_REDIRECT=arg.downcase
else
raise("Invalid --follow-redirect parameter.")
end
when ‘--max-redirects‘
$MAX_REDIRECTS=arg.to_i
when ‘--open-timeout‘
$HTTP_OPEN_TIMEOUT=arg.to_i
when ‘--read-timeout‘
$HTTP_READ_TIMEOUT=arg.to_i
when ‘--wait‘
$WAIT = arg.to_i
when ‘-H‘,‘--header‘
begin
x=arg.scan(/([^:]+):(.*)/).flatten
raise if x.empty?
$CUSTOM_HEADERS[x.first]=x.last
rescue
raise("Invalid --header parameter.")
end
when ‘-c‘,‘--cookie‘
begin
raise if arg.empty?
$CUSTOM_HEADERS["Cookie"]=arg
rescue
raise("Cookie require a parameter, e.g. name=value; name2=value2")
end
when ‘-u‘,‘--user‘
$BASIC_AUTH_USER=arg.split(":").first
$BASIC_AUTH_PASS=arg.to_s.scan(/^[^:]*:(.+)/).to_s if arg =~ /^[^:]*:(.+)/
when ‘--debug‘
$WWDEBUG = true
when ‘-h‘,‘--help‘
usage
exit
when ‘-v‘,‘--verbose‘
$verbose=$verbose+1
when ‘--version‘
puts "WhatWeb version #{$VERSION} ( http://www.morningstarsecurity.com/research/whatweb/ )"
exit
end
end
rescue Errno::EPIPE
exit
rescue StandardError, GetoptLong::Error => err
puts
usage
puts err
exit
end

# sanity check # Disable colours in Windows environments
if RbConfig::CONFIG[‘host_os‘] =~ /mswin|mingw/
$use_colour = false
end

# example URLs are only allowed with selected plugins, not all plugins
if $USE_EXAMPLE_URLS and not plugin_selection
error "You must select plugins with -p or --plugins to include examples in the target list"
exit 1
end

### PLUGINS
plugin_selection += ",+Custom-Plugin" if use_custom_plugin and plugin_selection
plugin_selection += ",+Grep" if use_custom_grep_plugin and plugin_selection
$plugins_to_use = PluginSupport.load_plugins(plugin_selection)
# load all the plugins

# sanity check # no plugins?
if $plugins_to_use.size == 0
error "No plugins selected, exiting."
exit 1
end

# optimise plugins
PluginSupport.precompile_regular_expressions

### OUTPUT
output_list << OutputBrief.new unless $QUIET # by default output brief
output_list << OutputObject.new() if $verbose > 1 # full output if -vv
output_list << OutputVerbose.new() if $verbose > 0 # full output if -v

## output dependencies
if mongo[:use_mongo_log]
if $plugins_to_use.map { |a,b| a }.include?("Charset")
output_list << OutputMongo.new(mongo)
else
error("MongoDB logging requires the Charset plugin to be activated. The Charset plugin is the slowest whatweb plugin, it not included by default, and resides in the plugins-disabled folder. Use ./whatweb -p +./plugins-disabled/Charset.rb to enable it.")
exit
end
end

## Headers
$CUSTOM_HEADERS["User-Agent"]=$USER_AGENT unless $CUSTOM_HEADERS["User-Agent"]
$CUSTOM_HEADERS.delete_if {|k,v| v=="" }

### TARGETS
# clean up urls, add example urls if needed
$targets=make_target_list(ARGV, input_file, $plugins_to_use)
$recent_targets=[]

# fail & show usage if no targets.
if $targets.size <1
error "No targets selected"
usage
exit 1
end

$semaphore=Mutex.new
Thread.abort_on_exception = true if $WWDEBUG

while t = next_target
Thread.new(t) do |thistarget|
begin
target = Target.new(thistarget) # we set the target within the thread
rescue => err
error(err)
next
end

puts Thread.current.to_s + " started for " + target.to_s if $verbose>1
sleep $WAIT unless $WAIT.nil? # wait

# follow redirects
no_redirects =false
num_redirects = 0
while no_redirects == false do
no_redirects=true if target.is_file?
# if we redirect 10 times we give up
if num_redirects == $MAX_REDIRECTS
error("ERROR Too many redirects: #{target.to_s}")
no_redirects=true
next
end

begin
target.open
rescue => err
error("ERROR Opening target: #{target.to_s} - #{err}")
no_redirects = true # without this we can get stuck in a loop
raise if $WWDEBUG
next
end

if target.is_url? and target.status.nil?
# assume all HTTP sites return a status
no_redirects=true
next
end

results = run_plugins(target)

# reporting
# multiple output plugins simultaneously, some stdout, some files
output_list.each do |o|
begin
o.out(target.to_s, target.status, results)
rescue => err
#srsly, logging failed
error("ERROR Logging failed: #{target.to_s} - #{err}")
raise if $WWDEBUG==true
end
end

# REDIRECTION
unless no_redirects
begin
if newtarget = target.get_redirection_target
num_redirects+=1
target=Target.new(newtarget)
else
no_redirects=true
end
rescue => err
error("ERROR Redirection broken: #{target.to_s} - #{err}")
no_redirects=true
raise if $WWDEBUG==true
end
end
end # while no_redirects
end # Thread.new

while Thread.list.size>($MAX_THREADS+1)
puts "Thread list full, passing control" if $verbose>1
#sleep 0.5
Thread.pass
end
end # targets.each

# close output logs
output_list.each {|o|
o.close
}

# shutdown plugins
Plugin.registered_plugins.map {|name,plugin| plugin.shutdown }

#pp $PLUGIN_TIMES.sort_by {|x,y|y }

whatweb.rb 未完待续,码迷,mamicode.com

时间: 2024-10-01 02:55:37

whatweb.rb 未完待续的相关文章

把握linux内核设计思想系列(未完待续......)

[版权声明:尊重原创,转载请保留出处:blog.csdn.net/shallnet,文章仅供学习交流,请勿用于商业用途] 把握linux内核设计思想(一):系统调用 把握linux内核设计思想(二):硬中断及中断处理 把握linux内核设计思想(三):下半部机制之软中断 把握linux内核设计思想(四):下半部机制之tasklet 把握linux内核设计思想(五):下半部机制之工作队列及几种机制的选择 把握linux内核设计思想(六):内核时钟中断 把握linux内核设计思想(七):内核定时器和

[译]App Framework 2.1 (1)之 Quickstart (未完待续)

最近有移动App项目,选择了 Hybrid 的框架Cordova  和  App Framework 框架开发. 本来应该从配置循序渐进开始写的,但由于上班时间太忙,这段时间抽不出空来,只能根据心情和兴趣,想到哪写到哪,前面的部分以后慢慢补上. App Framework 前生是是叫 jqMobi 注意大家不要和 jQuery Mobile 混淆了,它们是两个不同的框架,一开始我还真混淆了0.01秒. 这里我先翻译一下Quickstart 部分,一是自己工作上用的上,二是也想顺便练练英文,最关键

数据结构与算法之--高级排序:shell排序和快速排序【未完待续】

高级排序比简单排序要快的多,简单排序的时间复杂度是O(N^2),希尔(shell)排序的是O(N*(logN)^2),而快速排序是O(N*logN). 说明:下面以int数组的从小到大排序为例. 希尔(shell)排序 希尔排序是基于插入排序的,首先回顾一下插入排序,假设插入是从左向右执行的,待插入元素的左边是有序的,且假如待插入元素比左边的都小,就需要挪动左边的所有元素,如下图所示: ==> 图1和图2:插入右边的temp柱需要outer标记位左边的五个柱子都向右挪动 如图3所示,相比插入排序

git个人使用总结 —— idea命令行、撤销commit (未完待续)

近期在使用git,最开始在idea界面操作,后来要求用命令行.刚开始还不是很习惯,感觉很麻烦,用了几天后感觉爽极了! 其实git的命令也不是很多,熟悉一段时间就差不多能顺利使用了.使用过程中遇到了各种各样的问题,有些小问题就在这里集中总结一下. 1.idea命令行.git安装后就自带终端git bash,使用起来很方便.但是用idea开发,开发后还要在相应文件夹下打开git bash很麻烦.其实idea也带有终端terminal,在最下方可以找到,在这里就可以执行命令.但是如果是默认方式安装的g

Unity3D快捷键 未完待续

Unity3D 点选Object+F Object在当前视角居中 CTRL+1/2 Scene/Game视图的切换 MonoDevelop CTRL+K  删除光标所在行的该行后面的代码 CTRL + ALT +C  注释/不注释该行 CTRL+ DOWN  像鼠标滚轮一样向下拖 CTRL + UP 像鼠标滚轮一样向上拖 CTRL + F  查找该脚本 CTRL + SHIFT + F 查找全部脚本 CTRL + H 替换代码 CTRL + SHIFT +W 关掉所有脚本 Unity3D快捷键

模板区域[未完待续](会定期的更新哦(有时间就更了))

写这个博客目的就是为了记录下学过的模板方便我这焫鷄复习吧//dalao们绕道 近期学的: (1)来自机房学长jjh大神教的求1~n的所有最小素因数和加上本焫鷄的批注 #include<iostream> #include<cstdio> #include<cstring> #include<algorithm> #include<cmath>//求1~n的最小质因数 using namespace std; const int MAXN=1e6+

NOIP2016 那些我所追求的 [未完待续]

人生第一场正式OI [Day -1] 2016-11-17 期中考试无心插柳柳成荫,考了全市第2班里第1(还不是因为只复习了不到两天考试),马上请了一个周的假准备NOIP(数学生物还是回去上课的) 灰哥跟我一块,tlq考吃了没请假 正好下个周老班出去学习了不害怕 星期4所有人都请假了,漫无目的地复习了一天题,参考题解补了一场模拟赛 晚上灰哥因为住宿直接回家了,还让我给XXX送纸条 SD NOIP的群好多人直播,我们就直播了个国际象棋(竟然有人说八皇后,我只升变了两个兵称为皇后),然而竟然默认开启

[daily][optimize] 去吃面 (python类型转换函数引申的性能优化)(未完待续)

前天,20161012,到望京面试.第四个职位,终于进了二面.好么,结果人力安排完了面试时间竟然没有通知我,也没有收到短信邀请.如果没有短信邀请门口的保安大哥是不让我进去大厦的.然后,我在11号接到了面试官直接打来的电话,问我为啥还没到,我说没人通知我我不知道呀.结果我就直接被他邀请去以访客的身份参加面试了.不知道人力的姑娘是不是认识我,且和我有仇,终于可以报复了... 然后,我终于如约到了,面试官带着我去前台登记.前台的妹子更萌...认为我是面试官,面试官是才是来面试的.我气质真的那么合吗?

React v16-alpha 源码简读【未完待续】

一.物料准备 1.克隆react源码, github 地址:https://github.com/facebook/react.git 2.安装gulp 3.在react源码根目录下: $npm install $gulp default (建议使用node 6.0+) gulp将文件处理在根目录下的build文件夹中,打开build查看react的源码,结构清晰,引用路径明了 二.从生成 virtual dom 开始 react 生成一个组件有多种写法: es 5下:var Cp=React.