Hey everyone,
I learned today about a cool ETL/data-pipeline/make-your-life-easier tool that was recently released by the NSA (not kidding) as a way to manage the flow of data into and out of systems: Apache NiFi. To me, that functionality seems to match PERFECTLY with what people like to do with Hadoop. This guide will just set up NiFi, not do anything with it (that'll come later!).
Things you’ll need:
- Maven > 3.1
And to use with Hadoop, obviously you’ll need:
- HDP > 2.1
You don’t even need root access!
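Before building, it's worth confirming your Maven is new enough. A small sketch of a version check, assuming a POSIX shell with GNU `sort -V`; the `version` value is hard-coded here for illustration (in real use you'd pull it from `mvn -v`):

```shell
# Compare the installed Maven version against the 3.1 minimum.
required="3.1.0"
# In real use, grab the version from Maven itself:
#   version=$(mvn -v | head -1 | awk '{print $3}')
# Hard-coded here for illustration:
version="3.3.9"
# sort -V orders version strings; if the required version sorts first,
# the installed version is at least as new.
lowest=$(printf '%s\n%s\n' "$required" "$version" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "Maven $version is new enough (>= $required)"
else
  echo "Maven $version is too old; need >= $required"
fi
```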
Here's how to get it running on your HDP 2.3 cluster:
First we need to actually get the source code:
wget https://github.com/apache/nifi/archive/master.zip
unzip master.zip
cd nifi-master
export MAVEN_OPTS="-Xms1024m -Xmx3076m -XX:MaxPermSize=256m"
mvn -T 2.0C clean install

(Takes about 8 minutes to run all the tests.)
After that’s done,
cd nifi-assembly/target
tar -zxvf nifi-0.3.1-SNAPSHOT-bin.tar.gz
cd nifi-0.3.1-SNAPSHOT
vi conf/nifi.properties

On line 106 ( :106 in vim), set:
nifi.web.http.port=9000

Then install NiFi as a service and start it:

bin/nifi.sh install
service nifi start

or, if you're not root:

bin/nifi.sh start
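If you'd rather script the port change than edit it in vim, a `sed` one-liner does the same job. A sketch, demonstrated against a throwaway copy of the file so it's safe to try anywhere; in real use you'd point it at `conf/nifi.properties` inside the NiFi directory:

```shell
# Scripted alternative to the vi edit: rewrite the HTTP port line.
# Work on a stand-in copy in a temp dir for this demonstration.
tmp=$(mktemp -d)
printf 'nifi.web.http.port=8080\n' > "$tmp/nifi.properties"
# Replace whatever port is set with 9000 (GNU sed in-place edit).
sed -i 's/^nifi\.web\.http\.port=.*/nifi.web.http.port=9000/' "$tmp/nifi.properties"
grep '^nifi.web.http.port=' "$tmp/nifi.properties"
```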
Then navigate to http://localhost:9000/nifi
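NiFi takes a little while to come up after `nifi.sh start`, so a quick poll can save you some refreshing. A sketch, assuming `curl` is installed; adjust the host and port to match the `nifi.web.http.port` you configured:

```shell
# Poll the NiFi UI a few times and report whether it answered.
url="http://localhost:9000/nifi"
status="down"
for i in 1 2 3 4 5; do
  # -s silent, -f fail on HTTP errors, short timeout per attempt.
  if curl -sf -o /dev/null --max-time 2 "$url"; then
    status="up"
    break
  fi
  sleep 1
done
echo "NiFi at $url is $status"
```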
There you have it! With any luck, you now have NiFi installed!
Thanks for the instructions. I would like to install NiFi on our HDP cluster. I went through the Hortonworks documentation on installing HDF on an existing HDP cluster, but I could not find an HDF repository for Ubuntu 16, so I thought I might be able to use your instructions to install NiFi on the cluster.
I wonder, if I use your instructions, how can I add NiFi to the current list of Ambari services for monitoring?
Thanks,
Sam