Building a real-time log analysis system with Flume + Kafka + Storm


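This article covers only the integration of Flume and Kafka; for integrating Kafka with Storm, see other blog posts.
1. Installing and using Flume
Download the Flume package: https://www.apache.org/dyn/closer.cgi/flume/1.5.2/apache-flume-1.5.2-bin.tar.gz
Extract it (create /opt/flume first if it does not exist, since tar -C needs an existing directory):
$ tar -xzvf apache-flume-1.5.2-bin.tar.gz -C /opt/flume
Flume's configuration files live in the conf directory and its executables in the bin directory.
1) Configure Flume
Go into the conf directory, copy flume-conf.properties.template, and give the copy a name of your choice:
$ cp flume-conf.properties.template flume.conf
Edit flume.conf. Here we use a file_roll sink to receive data from the channel and write it to local files, a memory channel as the channel, and an exec source as the source. The configuration file is as follows: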

    agent.sources = seqGenSrc
    agent.channels = memoryChannel
    agent.sinks = loggerSink

    # For each one of the sources, the type is defined
    agent.sources.seqGenSrc.type = exec
    agent.sources.seqGenSrc.command = tail -F /data/mongodata/mongo.log
    #agent.sources.seqGenSrc.bind = 172.168.49.130

    # The channel can be defined as follows.
    agent.sources.seqGenSrc.channels = memoryChannel

    # Each sink's type must be defined
    agent.sinks.loggerSink.type = file_roll
    agent.sinks.loggerSink.sink.directory = /data/flume

    # Specify the channel the sink should use
    agent.sinks.loggerSink.channel = memoryChannel

    # Each channel's type is defined.
    agent.channels.memoryChannel.type = memory

    # Other config values specific to each type of channel (sink or source)
    # can be defined as well.
    # In this case, it specifies the capacity of the memory channel.
    agent.channels.memoryChannel.capacity = 1000
    # The channel name here must match the declared channel (memoryChannel).
    agent.channels.memoryChannel.transactionCapacity = 100
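A channel setting whose name does not match a declared channel (for example `memory4log` when only `memoryChannel` is declared) does not configure the intended channel and is easy to overlook. The following is a minimal sketch of a sanity check, not part of Flume itself: the class name `FlumeConfCheck` and the method `undeclaredChannelKeys` are hypothetical, and it relies only on `java.util.Properties` to read the file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

public class FlumeConfCheck {

    /**
     * Returns the keys of the form <agent>.channels.<name>.<setting>
     * whose <name> is not listed in the <agent>.channels declaration.
     */
    public static List<String> undeclaredChannelKeys(Properties props, String agent) {
        String prefix = agent + ".channels";
        String declaredValue = props.getProperty(prefix, "").trim();
        Set<String> declared = new HashSet<>(Arrays.asList(declaredValue.split("\\s+")));

        List<String> bad = new ArrayList<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix + ".")) {
                // First path segment after "<agent>.channels." is the channel name.
                String name = key.substring(prefix.length() + 1).split("\\.", 2)[0];
                if (!declared.contains(name)) {
                    bad.add(key);
                }
            }
        }
        Collections.sort(bad);
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample; in practice you would load ../conf/flume.conf instead.
        String conf =
              "agent.channels = memoryChannel\n"
            + "agent.channels.memoryChannel.type = memory\n"
            + "agent.channels.memoryChannel.capacity = 1000\n"
            + "agent.channels.memory4log.transactionCapacity = 100\n";

        Properties props = new Properties();
        props.load(new StringReader(conf));

        // Prints: [agent.channels.memory4log.transactionCapacity]
        System.out.println(undeclaredChannelKeys(props, "agent"));
    }
}
```

The check only compares channel names against the `agent.channels` line; it does not validate sources, sinks, or setting names.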

2) Run the Flume agent
Switch to the bin directory and run the following command:
$ ./flume-ng agent --conf ../conf -f ../conf/flume.conf -n agent -Dflume.root.logger=INFO,console
The generated log files should then appear under the /data/flume directory.

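2. Integrating with Kafka
Flume 1.5.2 does not ship with a Kafka sink, so you have to develop one yourself.
You can use the Kafka sink in Flume 1.6 as a reference, but pay attention to the Kafka version you are using, because some Kafka APIs are not compatible across versions.
Only the core code is shown here: the body of process().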

    Sink.Status status = Status.READY;

    Channel ch = getChannel();
    Transaction transaction = null;
    Event event = null;
    String eventTopic = null;
    String eventKey = null;

    try {
        transaction = ch.getTransaction();
        transaction.begin();
        messageList.clear();

        if (type.equals("sync")) {
            // Sync mode: take a single event and send it immediately.
            event = ch.take();

            if (event != null) {
                byte[] tempBody = event.getBody();
                String eventBody = new String(tempBody, "UTF-8");
                Map<String, String> headers = event.getHeaders();

                // A topic in the event header overrides the configured default.
                if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
                    eventTopic = topic;
                }

                eventKey = headers.get(KEY_HDR);

                if (logger.isDebugEnabled()) {
                    logger.debug("{Event} " + eventTopic + " : " + eventKey + " : "
                        + eventBody);
                }

                ProducerData<String, Message> data =
                    new ProducerData<String, Message>(eventTopic, new Message(tempBody));

                long startTime = System.nanoTime();
                logger.debug(eventTopic + "++++" + eventBody);
                producer.send(data);
                long endTime = System.nanoTime();
            }
        } else {
            // Batch mode: take up to batchSize events, then publish them together.
            long processedEvents = 0;
            for (; processedEvents < batchSize; processedEvents += 1) {
                event = ch.take();

                if (event == null) {
                    // Channel is empty: stop and send what we have.
                    break;
                }

                byte[] tempBody = event.getBody();
                String eventBody = new String(tempBody, "UTF-8");
                Map<String, String> headers = event.getHeaders();

                if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
                    eventTopic = topic;
                }

                eventKey = headers.get(KEY_HDR);

                if (logger.isDebugEnabled()) {
                    logger.debug("{Event} " + eventTopic + " : " + eventKey + " : "
                        + eventBody);
                    logger.debug("event #{}", processedEvents);
                }

                // Create a message and add it to the buffer.
                // NOTE: this branch uses ProducerData<String, String>, while the sync
                // branch uses ProducerData<String, Message>. The declarations of
                // producer and messageList are not shown; their type parameters
                // must match whichever form you keep.
                ProducerData<String, String> data =
                    new ProducerData<String, String>(eventTopic, eventBody);
                messageList.add(data);
            }

            // Publish the batch and commit.
            if (processedEvents > 0) {
                long startTime = System.nanoTime();
                // Assumption: the original listing has no send call between the two
                // timestamps; sending the buffered list here follows the pattern of
                // the Flume 1.6 Kafka sink.
                producer.send(messageList);
                long endTime = System.nanoTime();
            }
        }

        transaction.commit();
    } catch (Exception ex) {
        String errorMsg = "Failed to publish events";
        logger.error(errorMsg, ex);
        status = Status.BACKOFF;
        if (transaction != null) {
            try {
                // Roll back so the events stay in the channel for a retry.
                transaction.rollback();
            } catch (Exception e) {
                logger.error("Transaction rollback failed", e);
                throw Throwables.propagate(e);
            }
        }
        throw new EventDeliveryException(errorMsg, ex);
    } finally {
        if (transaction != null) {
            transaction.close();
        }
    }

    return status;
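Two pieces of logic in process() can be tried out without Flume or Kafka on the classpath: choosing the topic (an event header overrides the configured default) and draining at most batchSize events, stopping early when the channel returns null. The following is a minimal sketch under stated assumptions: the class `TopicRouting`, the header value `"topic"` for `TOPIC_HDR`, and the use of a plain `Queue` in place of a Flume channel are all hypothetical stand-ins, not the real sink code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class TopicRouting {

    // Hypothetical header name; the real value of TOPIC_HDR is not shown in the article.
    public static final String TOPIC_HDR = "topic";

    /** Returns the topic from the event headers, or the default if none is set. */
    public static String resolveTopic(Map<String, String> headers, String defaultTopic) {
        String fromHeader = headers.get(TOPIC_HDR);
        return fromHeader != null ? fromHeader : defaultTopic;
    }

    /**
     * Takes up to batchSize items from the queue. Queue.poll() returns null
     * when the queue is empty, which plays the role of ch.take() returning null.
     */
    public static <T> List<T> drainBatch(Queue<T> channel, int batchSize) {
        List<T> batch = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            T event = channel.poll();
            if (event == null) {
                break;
            }
            batch.add(event);
        }
        return batch;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        System.out.println(resolveTopic(headers, "testFlume1")); // prints "testFlume1"

        headers.put(TOPIC_HDR, "audit");
        System.out.println(resolveTopic(headers, "testFlume1")); // prints "audit"

        Queue<String> channel = new ArrayDeque<>(Arrays.asList("a", "b", "c"));
        System.out.println(drainBatch(channel, 2)); // prints "[a, b]"
        System.out.println(drainBatch(channel, 2)); // prints "[c]"
    }
}
```

In the real sink, the drained events are only removed from the channel for good once transaction.commit() succeeds; this sketch leaves the transaction handling out.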

Next, edit the Flume configuration file and change the sink section to use the Kafka sink, for example:

    producer.sinks.r.type = org.apache.flume.sink.kafka.KafkaSink
    producer.sinks.r.brokerList = bigdata-node00:9092
    producer.sinks.r.requiredAcks = 1
    producer.sinks.r.batchSize = 100
    #producer.sinks.r.kafka.producer.type=async
    #producer.sinks.r.kafka.customer.encoding=UTF-8
    producer.sinks.r.topic = testFlume1

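type must be set to the fully qualified class name of the Kafka sink.
The remaining parameters are Kafka settings; the most important ones are brokerList and topic.

Now restart Flume, and the log entries should show up under the corresponding Kafka topic.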
