When using Flume to store logs in HDFS, the following error appeared at startup:
2017-06-16 08:58:32,634 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:146)] Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$CompressionType
    at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:235)
    at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
    at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.SequenceFile$CompressionType
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more
To be honest, this error message is not very friendly; it is hard to tell where the problem actually is. At first glance I kept assuming something was wrong with the Java classpath. Later I came across an article online, 安装Flume遇到的问题及解决 ("Problems encountered installing Flume and how to solve them"), which said the cause is missing Hadoop-related jars, and that makes sense. The fix is simple: unpack a Hadoop binary distribution on the Flume host and export the Hadoop environment variables; Flume will then find the dependency jars automatically through those variables.
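Before installing anything, a quick check can confirm this diagnosis: if the hadoop command is not on the PATH and the usual Hadoop environment variables are unset, flume-ng has no way to locate the Hadoop jars. A minimal check:

$ command -v hadoop || echo "hadoop command not found on PATH"
$ echo "HADOOP_PREFIX=${HADOOP_PREFIX:-<unset>} HADOOP_HOME=${HADOOP_HOME:-<unset>}"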
Since my Flume here is a standalone installation and is not deployed on a Hadoop machine, I need to install Hadoop on the Flume host.
Download the Hadoop binary tarball yourself from a domestic mirror or the official site.
$ tar xvf hadoop-2.8.0.tar.gz -C /usr/local/
$ ln -sv /usr/local/hadoop-2.8.0/ /usr/local/hadoop
$ useradd hadoop
$ passwd hadoop
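As a sanity check, the class from the stack trace, org.apache.hadoop.io.SequenceFile$CompressionType, ships in the hadoop-common jar. With the default layout of the 2.8.0 binary tarball it should show up like this (use jar tf instead of unzip -l if you prefer; adjust the path if your layout differs):

$ unzip -l /usr/local/hadoop/share/hadoop/common/hadoop-common-2.8.0.jar | grep 'SequenceFile\$CompressionType'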
Then comes the most important step: exporting the Hadoop environment variables. Edit the profile script /etc/profile.d/hadoop.sh and define variables along the following lines to set up Hadoop's runtime environment.
#!/bin/bash
#
export HADOOP_PREFIX="/usr/local/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
After saving the file, reload the profile so the variables take effect in the current shell:
$ source /etc/profile |
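Before retrying Flume it is worth verifying that the new environment is actually in effect; hadoop classpath prints roughly the set of classpath entries the flume-ng launcher will pick up:

$ hadoop version
$ hadoop classpath | tr ':' '\n' | head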
Run flume-ng again and it should work.
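For reference, a typical invocation looks like the following; the agent name a1 and the configuration file path are placeholders, substitute your own:

$ flume-ng agent --name a1 --conf /usr/local/flume/conf \
    --conf-file /usr/local/flume/conf/hdfs-sink.conf \
    -Dflume.root.logger=INFO,console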
Additionally, once flume-ng was running normally, writing to HDFS failed with the following error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/":hadoop:supergroup:drwxr-xr-x.
This message is quite clear: there is no write permission (because the user currently running flume-ng is not the hadoop user). The fix is equally simple: switch to the hadoop user and run flume-ng from there.
$ chown hadoop.hadoop -R /usr/local/flume/
$ su - hadoop
$ flume-ng .....
Alternatively, configure HDFS to allow all users to write files, which is most likely not enabled by default. Add the following property to hdfs-site.xml (in Hadoop 2.x the property is officially named dfs.permissions.enabled, with dfs.permissions kept as a deprecated alias; either way this disables permission checking for the whole cluster):
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
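Note that a change to hdfs-site.xml only takes effect after the NameNode is restarted; with the start/stop scripts shipped in the same tarball, something like:

$ /usr/local/hadoop/sbin/stop-dfs.sh
$ /usr/local/hadoop/sbin/start-dfs.sh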
That's all.