Logstash Configuration in Detail
Logstash Configuration File Structure
Logstash's main configuration files live in the $LOGSTASH_HOME/config/ directory (or /etc/logstash/ for package installs) and include:
- logstash.yml: main settings file
- pipelines.yml: multi-pipeline definitions
- jvm.options: JVM flags
- log4j2.properties: logging configuration
- startup.options: startup/service options
logstash.yml in Detail
1. Basic Settings
Node settings:
# Node name
node.name: logstash-node-1
# Path settings
path.data: /var/lib/logstash
path.logs: /var/log/logstash
path.config: /etc/logstash/conf.d/*.conf
path.plugins: /usr/share/logstash/plugins
# Automatic config reloading
config.reload.automatic: true
config.reload.interval: 3s
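Before relying on automatic reloading, it is worth validating pipeline syntax from the command line. A minimal sketch, assuming a package install with the binary under /usr/share/logstash and pipelines under /etc/logstash/conf.d:

```bash
# Check the pipeline configuration for syntax errors and exit
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d --config.test_and_exit

# Run in the foreground with reloading enabled, overriding logstash.yml
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d \
  --config.reload.automatic --config.reload.interval 5s
```

Note that pipelines using non-reloadable inputs such as stdin cannot be hot-reloaded and must be restarted.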
Network settings:
# HTTP API settings
http.host: "0.0.0.0"
http.port: 9600
# Module settings (note: "beats" is not a Logstash module; the bundled
# modules such as netflow are deprecated in recent releases in favor of
# Filebeat/Elastic Agent)
modules:
  - name: netflow
    var.input.udp.port: 2055
    var.elasticsearch.hosts: "localhost:9200"
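Once the HTTP API is bound, a quick request confirms Logstash is up and which version is running:

```bash
# Returns the node id, host name, version, and HTTP address
curl -s "http://localhost:9600/?pretty"
```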
2. Performance Settings
Pipeline settings:
# Number of pipeline worker threads
pipeline.workers: 4
# Batch size
pipeline.batch.size: 125
# Batch delay in milliseconds
pipeline.batch.delay: 50
# Queue type; memory is the default. An in-memory queue has no byte-size
# or checkpoint settings (those apply only to persistent queues); its
# capacity is bounded by pipeline.workers * pipeline.batch.size events
queue.type: memory
Persistent queue settings:
# Enable the persistent queue
queue.type: persisted
# Maximum queue size on disk
queue.max_bytes: 1gb
# Checkpoint thresholds (event counts) and interval (milliseconds)
queue.checkpoint.acks: 1024
queue.checkpoint.writes: 1024
queue.checkpoint.interval: 1000
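The monitoring API exposes per-pipeline queue state, which is handy for confirming that the persistent queue is active and how much of queue.max_bytes it currently uses. A sketch, assuming the default API port and jq installed:

```bash
# Show queue type, event count, and on-disk size for every pipeline
curl -s "http://localhost:9600/_node/stats/pipelines?pretty" | jq '.pipelines[].queue'
```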
3. Security Settings
Centralized management over TLS:
# Enable X-Pack centralized pipeline management; Logstash fetches its
# pipelines from Elasticsearch over HTTPS using the CA below
xpack.management.enabled: true
xpack.management.elasticsearch.hosts: ["https://localhost:9200"]
xpack.management.elasticsearch.ssl.certificate_authority: "/etc/logstash/elasticsearch-ca.pem"
xpack.management.elasticsearch.ssl.verification_mode: certificate
# IDs of the centrally managed pipelines to load
xpack.management.pipeline.id: ["main"]
Authentication:
# Elasticsearch credentials (password resolved from the keystore or environment)
xpack.management.elasticsearch.username: logstash_internal
xpack.management.elasticsearch.password: "${LOGSTASH_INTERNAL_PASSWORD}"
pipelines.yml in Detail
1. Multiple Pipelines
# pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 4
  pipeline.batch.size: 125
  queue.type: memory
- pipeline.id: nginx
  path.config: "/etc/logstash/nginx/*.conf"
  pipeline.workers: 2
  pipeline.batch.size: 100
  queue.type: persisted
  queue.max_bytes: 512mb
- pipeline.id: apache
  path.config: "/etc/logstash/apache/*.conf"
  pipeline.workers: 2
  pipeline.batch.size: 100
  queue.type: memory
2. Per-Pipeline Parameters
- pipeline.id: web-logs
  pipeline.workers: 3
  pipeline.batch.size: 150
  pipeline.batch.delay: 50
  queue.type: persisted
  queue.max_bytes: 1gb
  # a pipeline takes either path.config or an inline config.string, not both
  config.string: |
    input {
      beats { port => 5044 }
    }
    filter {
      grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "weblogs-%{+YYYY.MM.dd}"
      }
    }
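After editing pipelines.yml, the node API shows which pipelines actually loaded, which catches indentation and glob mistakes early. A quick check, assuming jq is available:

```bash
# List the IDs of all running pipelines
curl -s "http://localhost:9600/_node/pipelines" | jq '.pipelines | keys'
```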
Pipeline Configuration Files in Detail
1. Input Configuration
File input:
input {
  file {
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
    # /dev/null disables sincedb persistence, so the file is re-read from
    # the beginning on every restart; use a real path in production
    sincedb_path => "/dev/null"
    codec => "json"
    stat_interval => 5
    discover_interval => 15
  }
}
Beats input:
input {
  beats {
    port => 5044
    host => "0.0.0.0"
    # set ssl => true and provide the certificate/key pair to enable TLS
    ssl => false
    # ssl_certificate => "/etc/logstash/beats.crt"
    # ssl_key => "/etc/logstash/beats.key"
  }
}
Kafka input:
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["log-topic"]
    group_id => "logstash-group"
    consumer_threads => 4
    decorate_events => true
    codec => "json"
  }
}
2. Filter Configuration
Grok filter:
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
    tag_on_failure => ["_grokparsefailure"]
  }
}
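Grok patterns are easiest to iterate on with a throwaway stdin pipeline (or Kibana's Grok Debugger). A minimal sketch using the -e flag; paste a sample log line and inspect the parsed fields:

```bash
/usr/share/logstash/bin/logstash -e '
  input { stdin {} }
  filter { grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } }
  output { stdout { codec => rubydebug } }'
```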
Date filter:
filter {
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
    timezone => "Asia/Shanghai"
  }
}
Mutate filter:
filter {
  mutate {
    rename => { "old_field" => "new_field" }
    remove_field => [ "unnecessary_field" ]
    convert => { "response" => "integer" }
    split => { "tags" => "," }
    strip => [ "message" ]
    uppercase => [ "method" ]
    lowercase => [ "user_agent" ]
  }
}
GeoIP filter:
filter {
  geoip {
    source => "client_ip"
    target => "geoip"
    database => "/etc/logstash/GeoLite2-City.mmdb"
    fields => ["city_name", "country_name", "region_name", "location"]
  }
}
3. Output Configuration
Elasticsearch output:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # note: document_type was removed with mapping types in Elasticsearch 7+
    template => "/etc/logstash/templates/logstash-template.json"
    template_name => "logstash"
    template_overwrite => true
    manage_template => true
    action => "index"
  }
}
File output:
output {
  file {
    path => "/var/log/exported/%{+YYYY-MM-dd}.log"
    codec => line { format => "%{message}" }
    flush_interval => 10
  }
}
Kafka output:
output {
  kafka {
    bootstrap_servers => "localhost:9092"
    topic_id => "processed-logs"
    codec => json
    compression_type => "snappy"
    required_acks => 1
  }
}
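To confirm events are actually reaching the topic, consume a few messages with the console consumer bundled with Kafka. A sketch, assuming Kafka's bin directory is on the PATH:

```bash
# Read from the start of the topic and stop after five messages
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic processed-logs --from-beginning --max-messages 5
```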
jvm.options in Detail
1. Basic JVM Settings
# Heap size (set -Xms and -Xmx to the same value)
-Xms1g
-Xmx1g
# Use the G1 garbage collector
-XX:+UseG1GC
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30
# GC logging
-Xlog:gc*,gc+age=trace,safepoint:file=gc.log:utctime,pid,tags:filecount=32,filesize=64m
2. System Properties
# Prefer IPv4 over IPv6 for network resolution
-Djava.net.preferIPv4Stack=true
# Allow Netty reflective access to direct buffers
-Dio.netty.tryReflectionSetAccessible=true
# Disable Log4j message lookups (Log4Shell mitigation)
-Dlog4j2.formatMsgNoLookups=true
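After editing jvm.options, restart Logstash and verify that the heap settings took effect via the JVM stats endpoint:

```bash
# heap_max_in_bytes here should match the -Xmx value
curl -s "http://localhost:9600/_node/stats/jvm?pretty"
```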
log4j2.properties in Detail
1. Log Level Settings
# Root log level
rootLogger.level = info
# Per-logger levels
logger.pipeline.name = logstash.pipeline
logger.pipeline.level = debug
logger.instrument.name = logstash.instrument
logger.instrument.level = info
2. Appender Settings
# Console appender
appender.console.type = Console
appender.console.name = plain_console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %m%n
# Rolling file appender
appender.rolling.type = RollingFile
appender.rolling.name = plain_rolling
appender.rolling.fileName = ${sys:ls.logs}/logstash-plain.log
appender.rolling.filePattern = ${sys:ls.logs}/logstash-plain-%d{yyyy-MM-dd}-%i.log.gz
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling.policies.time.interval = 1
appender.rolling.policies.time.modulate = true
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 100MB
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.max = 30
Advanced Configuration
1. Conditionals
input {
  beats {
    port => 5044
  }
}
filter {
  if [type] == "nginx" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [type] == "apache" {
    grok {
      match => { "message" => "%{COMMONAPACHELOG}" }
    }
  } else {
    drop { }
  }
}
output {
  if [type] == "nginx" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "nginx-%{+YYYY.MM.dd}"
    }
  } else if [type] == "apache" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "apache-%{+YYYY.MM.dd}"
    }
  }
}
2. Environment Variables
input {
  beats {
    port => "${BEATS_PORT:5044}"
  }
}
filter {
  mutate {
    add_field => {
      "environment" => "${ENVIRONMENT:production}"
    }
  }
}
output {
  elasticsearch {
    hosts => ["${ELASTICSEARCH_HOST:localhost:9200}"]
    user => "${ELASTICSEARCH_USER:}"
    password => "${ELASTICSEARCH_PASSWORD:}"
  }
}
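These variables are read from Logstash's environment at startup. A sketch of supplying them for a foreground run; for a systemd-managed service they would instead go into the unit's Environment= entries or an EnvironmentFile:

```bash
# Override the defaults above for a staging run
export BEATS_PORT=5045
export ENVIRONMENT=staging
export ELASTICSEARCH_HOST="es-staging:9200"
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d
```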
3. Index Template
{
  "index_patterns": ["logstash-*"],
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "blocks": {
      "read_only_allow_delete": "false"
    }
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "text"
      },
      "host": {
        "type": "keyword"
      }
    }
  }
}
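If manage_template is disabled, the template can be installed by hand. A sketch using the legacy _template endpoint, which matches the format above (newer clusters use _index_template with a slightly different body):

```bash
# Install the template referenced by the elasticsearch output
curl -X PUT "http://localhost:9200/_template/logstash" \
  -H "Content-Type: application/json" \
  -d @/etc/logstash/templates/logstash-template.json
```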
Performance Tuning
1. Batch Processing
# logstash.yml
pipeline.batch.size: 250
pipeline.batch.delay: 50
pipeline.workers: 4
# pipeline.conf
input {
  beats {
    port => 5044
    client_inactivity_timeout => 3600
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # the old flush_size / idle_flush_time options are obsolete; bulk
    # request sizing follows pipeline.batch.size and pipeline.batch.delay
    template_overwrite => false
  }
}
2. Queue Tuning
# Persistent queue settings
queue.type: persisted
queue.max_bytes: 2gb
queue.checkpoint.acks: 1024
queue.checkpoint.writes: 1024
queue.checkpoint.interval: 1000
3. Memory Tuning
# jvm.options
-Xms2g
-Xmx2g
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
Security Configuration
1. SSL/TLS
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
    ssl_verify_mode => "force_peer"
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    ssl => true
    ssl_certificate_verification => true
    cacert => "/etc/logstash/certs/elasticsearch-ca.crt"
  }
}
2. Authentication
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "logstash_internal"
    password => "${LOGSTASH_INTERNAL_PASSWORD}"
  }
}
Monitoring and Debugging
1. Monitoring API
# Node stats
curl -X GET "localhost:9600/_node/stats?pretty"
# Pipeline info
curl -X GET "localhost:9600/_node/pipelines?pretty"
# Hot threads
curl -X GET "localhost:9600/_node/hot_threads?pretty"
2. Debug Outputs
output {
  # Print events to stdout in a human-readable form
  stdout {
    codec => rubydebug
  }
  # Mirror events to a file for offline inspection
  file {
    path => "/tmp/logstash-debug.log"
    codec => json_lines
  }
}
3. Adjusting Log Levels
# log4j2.properties
logger.pipeline.level = debug
logger.instrument.level = debug
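Log levels can also be changed at runtime through the logging API, with no edit to log4j2.properties and no restart. A sketch that enables debug logging for the elasticsearch output plugin:

```bash
curl -X PUT "http://localhost:9600/_node/logging?pretty" \
  -H "Content-Type: application/json" \
  -d '{ "logger.logstash.outputs.elasticsearch" : "DEBUG" }'
```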
Common Configuration Problems
1. Performance Problems
# Problem: slow event throughput
# Fix: increase the batch size and worker count
pipeline.batch.size: 500
pipeline.workers: 8
# Problem: out-of-memory errors
# Fix: raise the JVM heap and cap the queue size
-Xms4g
-Xmx4g
queue.max_bytes: 1gb
2. Connection Problems
# Problem: cannot connect to Elasticsearch
# Fix: verify the connection settings and add retry headroom
output {
  elasticsearch {
    hosts => ["http://elasticsearch-host:9200"]
    retry_max_interval => 60
    retry_on_conflict => 5
  }
}
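Connectivity problems are usually faster to diagnose from the shell than from Logstash's own logs. A sketch, assuming basic auth; adjust the scheme and credentials to match your cluster:

```bash
# Confirm the cluster answers and report its health
curl -u logstash_internal:"$LOGSTASH_INTERNAL_PASSWORD" \
  "http://elasticsearch-host:9200/_cluster/health?pretty"
```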
3. Parsing Problems
# Problem: grok fails to parse messages
# Fix: tag failures and handle them explicitly
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    tag_on_failure => ["_grokparsefailure"]
  }
  if "_grokparsefailure" in [tags] {
    mutate {
      add_field => { "parse_error" => "true" }
    }
  }
}
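Once failures are tagged, their volume is easy to track in Elasticsearch. A sketch using the _count API against a hypothetical logstash-* index pattern:

```bash
# Count documents that failed grok parsing
curl -s "http://localhost:9200/logstash-*/_count?q=tags:_grokparsefailure&pretty"
```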
Summary
Logstash configuration spans several layers, from basic pipeline definitions to performance tuning and security hardening. Configured deliberately, these components combine into an efficient, reliable data-processing pipeline. In practice, adjust the settings to match your data sources, processing requirements, and throughput targets.