Question: Logstash problem with grok filters parsing apache-access and nginx-access at the same time? #707

Closed
KenChimp opened this issue Nov 15, 2013 · 1 comment

Comments

@KenChimp

I'm relatively new to Logstash/Elasticsearch/Redis with the Kibana UI.
Using:
Logstash-1.1.13
Elasticsearch-0.90.3
Redis-2.6.16
Kibana 3, milestone 3

Shipping logs to the central logs host using rsyslog on CentOS 6.4.
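
For reference, the rsyslog side of the shipping is just a forwarding rule on each client, along these lines (hostname illustrative; @@ is TCP, a single @ is UDP; the catch-all selector is an assumption):

# /etc/rsyslog.conf on each web server
*.* @@logs.example.com:5140
# or, over UDP:
# *.* @logs.example.com:5140
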
Configs (modified for privacy):
logstash shipper.conf:
input {
  tcp {
    port => 5140
    type => syslog
  }
  udp {
    port => 5140
    type => syslog
  }
}

filter {
  grok {
    type => "syslog"
    pattern => [ "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{@source_host}" ]
  }
  syslog_pri {
    type => "syslog"
  }
  date {
    type => "syslog"
    match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
  mutate {
    type => "syslog"
    exclude_tags => "_grokparsefailure"
    replace => [ "@source_host", "%{syslog_hostname}" ]
    replace => [ "@message", "%{syslog_message}" ]
  }
  mutate {
    type => "syslog"
    remove => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
  }
}
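
For reference, the first grok pattern above targets RFC 3164-style syslog lines as rsyslog delivers them on port 5140. A hypothetical example (host, pid, and payload invented for illustration):

<38>Nov 15 09:12:01 web01 nginx-access[2245]: 10.0.0.5 [15/Nov/2013:09:12:01 -0500] ...

From a line like that, grok fills syslog_program with nginx-access and syslog_message with everything after the colon, which is what the per-program filters below match against.
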
filter {
  grok {
    type => "syslog"
    match => [ "syslog_program", "apache-access" ]
    pattern => "%{COMBINEDAPACHELOG}"
  }
}
filter {
  grok {
    type => "syslog"
    match => [ "syslog_program", "nginx-access" ]
    pattern => [ "%{IP:client_ip} \[%{HTTPDATE:time}\] %{HOST:domain} \"(?:%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}|%{DATA:unparsedrq})\" %{NUMBER:response} %{NUMBER:bytes} (?:%{NUMBER:bytes}|-) \"%{QUOTEDSTRING:httpreferrer}\" \"%{QUOTEDSTRING:httpuseragent}\" \"(?<gzip_ratio>([0-9,. ]+?)|-)\" (?<upstream_response_time>([0-9,. ]+?)|-) (%{BASE16FLOAT:request_time}|-) \"(?<upstream_content_type>([\w\W]+?)|-)\"" ]
    add_field => [ "nginx_response", "%{NUMBER:response}" ]
  }
}

output {
  redis { host => "10.6.1.76" data_type => "list" key => "logstash" }
}

logstash indexer.conf:
input {
  redis {
    host => "10.6.1.76"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
    format => "json_event"
  }
}

output {
  elasticsearch { host => "10.6.1.76" cluster => "Monkey_elasticsearch" }
}

The Kibana 3 UI runs under apache2 on the central logs host, and the Logstash web backend is started with:

/usr/bin/java -jar logstash-1.1.13-flatjar.jar web --backend elasticsearch://10.6.1.76/Monkey_elasticsearch

Problem Description:
I can receive apache access log data just fine if I change the nginx-access pattern match to %{COMBINEDAPACHELOG}, but then my nginx access logs are not parsed properly and I can't easily get response codes, etc., in Kibana for the nginx access log data.
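
For illustration, the custom nginx pattern above assumes an extended log_format along these lines (a reconstruction from the pattern's fields, not my actual nginx.conf; the format name is invented):

# nginx.conf (hypothetical, reconstructed from the grok fields)
log_format extended '$remote_addr [$time_local] $host "$request" '
                    '$status $bytes_sent $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$gzip_ratio" $upstream_response_time '
                    '$request_time "$upstream_http_content_type"';

%{COMBINEDAPACHELOG} only covers the fields up through the user-agent, so the trailing gzip ratio and upstream timing fields are exactly what the stock pattern can't capture.
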

If I set up a pattern match for nginx access logs in addition to the apache access log pattern (exactly as in my shipper.conf above), I no longer see ANY apache access logs, and the nginx access logs are still not parsed in a way that lets me easily see/retrieve the fields, even when I configure fields explicitly (as I did with nginx_response in my shipper.conf above).

The problem is not syntactic, as I get no errors when starting the Logstash processes with the configurations above.

Any idea what I'm doing wrong?

@KenChimp
Author

I have resolved this issue.
The first step was to learn how to write new field entries for pattern matching in grok, and then use the Grok Debugger (http://grokdebug.herokuapp.com/) to correct any logic or syntax problems.
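
To test, paste a raw log line plus a candidate pattern into the debugger. A worked example (the line and its values are invented; note that in the debugger quotes are written plainly, while inside the .conf file's double-quoted pattern string they must be escaped as \"):

Sample line:
203.0.113.7 [15/Nov/2013:09:12:01 -0500] "www.example.com" "GET /index.html HTTP/1.1" 200 512 490 -

Pattern:
%{IP:client_ip} \[%{HTTPDATE:time}\] "%{HOST:domain}" "(?:%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}|%{DATA:unparsedrq})" %{NUMBER:response} (%{NUMBER:body_bytes}|-) (%{NUMBER:bytes}|-) %{DATA:unparsedrq}

Once every capture highlights correctly, the same pattern (with quotes escaped) goes into the grok filter.
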

I ended up with this for my logstash shipper.conf:
shipper.conf:
input {
  tcp {
    port => 5140
    type => syslog
  }
  udp {
    port => 5140
    type => syslog
  }
}

filter {
  grok {
    break_on_match => "false"
    type => "syslog"
    pattern => [ "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{@source_host}" ]
  }
  syslog_pri {
    type => "syslog"
  }
  date {
    type => "syslog"
    match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
  mutate {
    type => "syslog"
    exclude_tags => "_grokparsefailure"
    replace => [ "@source_host", "%{syslog_hostname}" ]
    replace => [ "@message", "%{syslog_message}" ]
  }
  mutate {
    type => "syslog"
    remove => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
  }
}
filter {
  grok {
    break_on_match => "false"
    type => "syslog"
    match => [ "syslog_program", "apache-access", "syslog_program", "apache-error" ]
    pattern => "%{COMBINEDAPACHELOG}"
  }
}
filter {
  grok {
    break_on_match => "false"
    type => "syslog"
    match => [ "syslog_program", "nginx-access", "syslog_program", "nginx-error" ]
    pattern => [ "%{IP:client_ip} \[%{HTTPDATE:time}\] \"%{HOST:domain}\" \"(?:%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}|%{DATA:unparsedrq})\" %{NUMBER:response} (%{NUMBER:body_bytes}|-) (%{NUMBER:bytes}|-) %{DATA:unparsedrq}" ]
  }
}
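
As an aside on break_on_match: when one grok filter carries several patterns, the default is to stop at the first pattern that matches; setting it to false makes grok try every pattern in the list. A contrived sketch (second pattern shortened for illustration):

grok {
  break_on_match => "false"
  pattern => [ "%{COMBINEDAPACHELOG}", "%{IP:client_ip} \[%{HTTPDATE:time}\] %{GREEDYDATA:rest}" ]
}

Here a line that already matched %{COMBINEDAPACHELOG} would still be tested against the second pattern instead of grok returning after the first hit.
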

output {
  redis { host => "10.6.1.76" data_type => "list" key => "logstash" }
}

I also added a line to rsyslog.conf to load the omelasticsearch module, which I had somehow neglected to do before.
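
For completeness, that amounts to one module-load line plus an output action (legacy $ModLoad syntax shown; the server address is from my setup and the index name is illustrative):

$ModLoad omelasticsearch
*.* action(type="omelasticsearch" server="10.6.1.76" searchIndex="logstash-index")

On newer rsyslog versions the equivalent load line is module(load="omelasticsearch").
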
