一 GoAccess简介
GoAccess是一款日志分析工具,可以用来分析Apache,IIS,Nginx的日志,或者一些其他web服务的日志。其特点是安装简单,使用方便,分析速度快
二 GoAccess的安装
(1)下载:
[[email protected] src]# wget http://sourceforge.net/projects/goaccess/files/0.7.1/goaccess-0.7.1.tar.gz
(2)安装依赖的库文件:
[[email protected] src]# yum install glib2 glib2-devel GeoIP-devel ncurses-devel
(3)安装GoAccess:
[[email protected] src]# tar zxvf goaccess-0.7.1.tar.gz [[email protected] src]# [[email protected] src]# cd goaccess-0.7.1 [[email protected] goaccess-0.7.1]# ./configure -enalbe-geoip -enable-utf8 [[email protected] goaccess-0.7.1]# make && make install
三 使用GoAccess统计Nginx日志并生成HTML文件
GoAccess统计日志最基本的用法:
goaccess [ -b ][ -s ][ -e IP_ADDRESS][ - a ] <-f log_file >
参数说明:
- -f 日志文件名
- -b 开启流量统计,如果希望加快分析速度不推荐使用该参数
- -s 开启HTTP响应代码统计
- -a 开启用户代理统计
- -e 开启指定IP地址统计,默认禁用
如:
goaccess –f access.log
使用GoAccess统计Nginx日志并生成HTML文件:
[[email protected] logs]# grep "31/May/2016" access.log > 0531.log [[email protected] logs]# goaccess -f 0531.log -a -p goaccess.conf >> 0531.html
最后生成的HTML文件就像这样的:
附:goaccess.conf配置文件:
###################################### # Time Format Options (required) ###################################### # # The hour (24-hour clock) [00,23]; leading zeros are permitted but not required. # The minute [00,59]; leading zeros are permitted but not required. # The seconds [00,60]; leading zeros are permitted but not required. # See `man strftime` for more details # # The following time format works with any of the # Apache/NGINX‘s log formats below. # time-format %H:%M:%S # # Google Cloud Storage or # The time in microseconds since the Unix epoch. # #time-format %f # Squid native log format # #time-format %s ###################################### # Date Format Options (required) ###################################### # # The date-format variable followed by a space, specifies # the log format date containing any combination of regular # characters and special format specifiers. They all begin with a # percentage (%) sign. See `man strftime` # # The following date format works with any of the # Apache/NGINX‘s log formats below. # date-format %d/%b/%Y # # AWS | Amazon CloudFront (Download Distribution) # AWS | Elastic Load Balancing # W3C (IIS) # #date-format %Y-%m-%d # # Google Cloud Storage or # The time in microseconds since the Unix epoch. # #date-format %f # Squid native log format # #date-format %s ###################################### # Log Format Options (required) ###################################### # # The log-format variable followed by a space or \t for # tab-delimited, specifies the log format string. # # NOTE: If the time/date is a timestamp in seconds or microseconds # %x must be used instead of %d & %t to represent the date & time. # NCSA Combined Log Format # log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u" # NCSA Combined Log Format with Virtual Host # #log-format %v:%^ %h %^[%d:%t %^] "%r" %s %b "%R" "%u" # Common Log Format (CLF) # #log-format %h %^[%d:%t %^] "%r" %s %b # Common Log Format (CLF) with Virtual Host # #log-format %v:%^ %h %^[%d:%t %^] "%r" %s %b # W3C # #log-format %d %t %h %^ %^ %^ %^ %r %^ %s %b %^ %^ %u %R # AWS | Amazon CloudFront (Download Distribution) # #log-format %d\t%t\t%^\t%b\t%h\t%m\t%^\t%r\t%s\t%R\t%u\t%^ # AWS | Elastic Load Balancing # #log-format %dT%t.%^ %^ %h:%^ %^ %T %^ %^ %^ %s %^ %b "%r" "%u" # Google Cloud Storage # #log-format "%x","%h",%^,%^,"%m","%U","%s",%^,"%b","%D",%^,"%R","%u" # Squid native log format # #log-format %^ %^ %^ %v %^: %x.%^ %~%L %h %^/%s %b %m %U # Virtualmin Log Format with Virtual Host # #log-format %h %^ %v %^[%d:%t %^] "%r" %s %b "%R" "%u" ###################################### # UI Options ###################################### # Prompt log/date configuration window on program start. # config-dialog false # Choose among color schemes # 1 : Default grey scheme # 2 : Green scheme # color-scheme 1 # Color highlight active panel. # hl-header true # Set HTML report page title and header. # #html-report-title My Awesome Web Stats # Turn off colored output. This is the default output on # terminals that do not support colors. # true : for no color output # false : use color-scheme # no-color false # Don‘t write column names in the terminal output. By default, it displays # column names for each available metric in every panel. # no-column-names false # Disable progress metrics. # no-progress false # Enable mouse support on main dashboard. # with-mouse false # Disable summary metrics on the CSV output.. # no-csv-summary false # Custom colors for the terminal output # Tailor GoAccess to suit your own tastes. # # Color Syntax: # DEFINITION space/tab colorFG#:colorBG# [[attributes,] PANEL] # # FG# = foreground color number [-1...255] (-1 = default terminal color) # BG# = background color number [-1...255] (-1 = default terminal color) # # Optionally: # # It is possible to apply color attributes, such as: # bold,underline,normal,reverse,blink. # Multiple attributes are comma separated # # If desired, it is possible to apply custom colors per panel, that is, a # metric in the REQUESTS panel can be of color A, while the same metric in the # BROWSERS panel can be of color B. # # The following is a 256 color scheme (Monokai palette) # #color COLOR_MTRC_HITS color197:color-1 #color COLOR_MTRC_VISITORS color148:color-1 #color COLOR_MTRC_DATA color7:color-1 #color COLOR_MTRC_BW color81:color-1 #color COLOR_MTRC_AVGTS color247:color-1 #color COLOR_MTRC_CUMTS color95:color-1 #color COLOR_MTRC_MAXTS color186:color-1 #color COLOR_MTRC_PROT color141:color-1 #color COLOR_MTRC_MTHD color81:color-1 #color COLOR_MTRC_PERC color186:color-1 #color COLOR_MTRC_PERC color186:color-1 VISITORS #color COLOR_MTRC_PERC color186:color-1 OS #color COLOR_MTRC_PERC color186:color-1 BROWSERS #color COLOR_MTRC_PERC color186:color-1 VISIT_TIMES #color COLOR_MTRC_PERC_MAX color208:color-1 #color COLOR_MTRC_PERC_MAX color208:color-1 VISITORS #color COLOR_MTRC_PERC_MAX color208:color-1 OS #color COLOR_MTRC_PERC_MAX color208:color-1 BROWSERS #color COLOR_MTRC_PERC_MAX color208:color-1 VISIT_TIMES #color COLOR_PANEL_COLS color242:color-1 #color COLOR_BARS color186:color-1 #color COLOR_ERROR color231:color197 #color COLOR_SELECTED color0:color215 #color COLOR_PANEL_ACTIVE color7:color240 #color COLOR_PANEL_HEADER color7:color237 #color COLOR_PANEL_DESC color242:color-1 #color COLOR_OVERALL_LBLS color251:color-1 #color COLOR_OVERALL_VALS color148:color-1 #color COLOR_OVERALL_PATH color186:color-1 #color COLOR_ACTIVE_LABEL color7:color237 #color COLOR_BG color7:color-1 #color COLOR_DEFAULT color7:color-1 #color COLOR_PROGRESS color7:color141 ###################################### # File Options ###################################### # Specify the path to the input log file. If set, it will take # priority over -f from the command line. # #log-file /var/log/apache2/access.log # Send all debug messages to the specified file. # #debug-file debug.log # Specify a custom configuration file to use. If set, it will take # priority over the global configuration file (if any). # #config-file <filename> # Log invalid requests to the specified file. # #invalid-requests <filename> # Do not load the global configuration file. # #no-global-config false ###################################### # Parse Options ###################################### # Include static files that contain a query string in the static files # panel. # e.g., /fonts/fontawesome-webfont.woff?v=4.0.3 # all-static-files false # Consider the following extensions as static files # The actual ‘.‘ is required and extensions are case sensitive # static-file .css static-file .CSS static-file .dae static-file .DAE static-file .eot static-file .EOT static-file .gif static-file .GIF static-file .ico static-file .ICO static-file .jpeg static-file .JPEG static-file .jpg static-file .JPG static-file .js static-file .JS static-file .map static-file .MAP static-file .mp3 static-file .MP3 static-file .pdf static-file .PDF static-file .png static-file .PNG static-file .svg static-file .SVG static-file .swf static-file .SWF static-file .ttf static-file .TTF static-file .txt static-file .TXT static-file .woff static-file .WOFF # Exclude an IPv4 or IPv6 from being counted. # Ranges can be included as well using a dash in between # the IPs (start-end). # #exclude-ip 127.0.0.1 #exclude-ip 192.168.0.1-192.168.0.100 #exclude-ip ::1 #exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808 # Enable a list of user-agents by host. For faster parsing, do not # enable this flag. # agent-list false # Include HTTP request method if found. This will create a # request key containing the request method + the actual request. # http-method true # Include HTTP request protocol if found. This will create a # request key containing the request protocol + the actual request. # http-protocol true # Ignore request‘s query string. # i.e., www.google.com/page.htm?query => www.google.com/page.htm # # Note: Removing the query string can greatly decrease memory # consumption, especially on timestamped requests. # no-query-string false # Disable IP resolver on terminal output. # no-term-resolver false # Write output to stdout given one of the following formats: # csv : A comma-separated values (CSV) # json : JSON (JavaScript Object Notation) # html : HTML report # #output-format json # Display real OS names. e.g, Windows XP, Snow Leopard. # real-os true # Enable IP resolver on HTML|JSON output. # with-output-resolver false # Treat non-standard status code 444 as 404. # 444-as-404 false # Add 4xx client errors to the unique visitors count. # 4xx-to-unique-count false # Decode double-encoded values. # double-decode false # Ignore parsing and displaying one or multiple status code(s) # #ignore-status 400 #ignore-status 502 # Ignore crawlers from being counted. # This will ignore robots listed under browsers.c # Note that it will count them towards the total # number of requests, but excluded from any of the panels. # ignore-crawlers false # Ignore parsing and displaying the given panel. # #ignore-panel VISITORS #ignore-panel REQUESTS #ignore-panel REQUESTS_STATIC #ignore-panel NOT_FOUND #ignore-panel HOSTS #ignore-panel OS #ignore-panel BROWSERS #ignore-panel VISIT_TIMES #ignore-panel VIRTUAL_HOSTS ignore-panel REFERRERS #ignore-panel REFERRING_SITES ignore-panel KEYPHRASES #ignore-panel GEO_LOCATION #ignore-panel STATUS_CODES # Ignore referers from being counted. # This supports wild cards. For instance, # ‘*‘ matches 0 or more characters (including spaces) # ‘?‘ matches exactly one character # #ignore-referer *.domain.com #ignore-referer ww?.domain.* # Sort panel on initial load. # Sort options are separated by comma. # Options are in the form: PANEL,METRIC,ORDER # # Available metrics: # BY_HITS - Sort by hits # BY_VISITORS - Sort by unique visitors # BY_DATA - Sort by data # BY_BW - Sort by bandwidth # BY_AVGTS - Sort by average time served # BY_CUMTS - Sort by cumulative time served # BY_MAXTS - Sort by maximum time served # BY_PROT - Sort by http protocol # BY_MTHD - Sort by http method # Available orders: # ASC # DESC # #sort-panel VISITORS,BY_DATA,ASC #sort-panel REQUESTS,BY_HITS,ASC #sort-panel REQUESTS_STATIC,BY_HITS,ASC #sort-panel NOT_FOUND,BY_HITS,ASC #sort-panel HOSTS,BY_HITS,ASC #sort-panel OS,BY_HITS,ASC #sort-panel BROWSERS,BY_HITS,ASC #sort-panel VISIT_TIMES,BY_DATA,DESC #sort-panel VIRTUAL_HOSTS,BY_HITS,ASC #sort-panel REFERRERS,BY_HITS,ASC #sort-panel REFERRING_SITES,BY_HITS,ASC #sort-panel KEYPHRASES,BY_HITS,ASC #sort-panel GEO_LOCATION,BY_HITS,ASC #sort-panel STATUS_CODES,BY_HITS,ASC ###################################### # GeoIP Options # Only if configured with --enable-geoip ###################################### # Standard GeoIP database for less memory usage. # #std-geoip false # Specify path to GeoIP database file. i.e., GeoLiteCity.dat # .dat file needs to be downloaded from maxmind.com. # # For IPv4 City database: # wget -N http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz # gunzip GeoLiteCity.dat.gz # # For IPv6 City database: # wget -N http://geolite.maxmind.com/download/geoip/database/GeoLiteCityv6-beta/GeoLiteCityv6.dat.gz # gunzip GeoLiteCityv6.dat.gz # # For IPv6 Country database: # wget -N http://geolite.maxmind.com/download/geoip/database/GeoIPv6.dat.gz # gunzip GeoIPv6.dat.gz # # Note: `geoip-city-data` is an alias of `geoip-database` # #geoip-database /usr/local/share/GeoIP/GeoLiteCity.dat ###################################### # Tokyo Cabinet Options # Only if configured with --enable-tcb=btree ###################################### # GoAccess has the ability to process logs incrementally through the on-disk # B+Tree database. # # It works in the following way: # - A data set must be persisted first with --keep-db-files, then the same data # set can be loaded with --load-from-disk. # - If new data is passed (piped or through a log file), it will append it to # the original data set. # - To preserve the data at all times, --keep-db-files must be used. # - If --load-from-disk is used without --keep-db-files, database files will be # deleted upon closing the program. # On-disk B+ Tree # Persist parsed data into disk. This should be set to # the first dataset prior to use `load-from-disk`. # Setting it to false will delete all database files # when exiting the program. #keep-db-files true # On-disk B+ Tree # Load previously stored data from disk. # Database files need to exist. See `keep-db-files`. #load-from-disk false # On-disk B+ Tree # Path where the on-disk database files are stored. # The default value is the /tmp directory. # #db-path /tmp # On-disk B+ Tree # Set the size in bytes of the extra mapped memory. # The default value is 0. # #xmmap 0 # On-disk B+ Tree # Max number of leaf nodes to be cached. # Specifies the maximum number of leaf nodes to be cached. # If it is not more than 0, the default value is specified. # The default value is 1024. # #cache-lcnum 1024 # On-disk B+ Tree # Specifies the maximum number of non-leaf nodes to be cached. # If it is not more than 0, the default value is specified. # The default value is 512. # #cache-ncnum 512 # On-disk B+ Tree # Specifies the number of members in each leaf page. # If it is not more than 0, the default value is specified. # The default value is 128. # #tune-lmemb 128 # On-disk B+ Tree # Specifies the number of members in each non-leaf page. # If it is not more than 0, the default value is specified. # The default value is 256. # #tune-nmemb 256 # On-disk B+ Tree # Specifies the number of elements of the bucket array. # If it is not more than 0, the default value is specified. # The default value is 32749. # Suggested size of the bucket array is about from 1 to 4 # times of the number of all pages to be stored. # #tune-bnum 32749 # On-disk B+ Tree # Specifies that each page is compressed with ZLIB|BZ2 encoding. # Disabled by default. # #compression zlib
参考文章:
时间: 2024-12-24 03:40:41