I have been looking through the information available on Apache's site. I believe some of it shall prove very useful in setting up the project, and it also helps me with the required tech document.
Firstly, a server needs to be set up.
Since Hadoop has not been properly tested on Windows, it is not supported as a production platform - so the server shall run Linux, which doubles as a development and production platform.
Secondly, the server needs to have the appropriate software installed (a quick way of checking these is sketched just after this list):
Java 1.6+
ssh
the latest stable Hadoop release.
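As a sanity check once those are in place, something along these lines (run as the Hadoop user on the Linux box; installing the packages themselves varies by distribution, so I have left that out, and the X.Y.Z in the tarball name is just a placeholder for whichever stable release I end up downloading) should confirm the prerequisites:

    # check the installed Java version (needs to be 1.6 or later)
    java -version

    # Hadoop's start-up scripts use ssh to launch the daemons, so the
    # Apache instructions recommend passphraseless ssh to localhost
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh localhost

    # unpack the stable Hadoop release somewhere convenient
    tar -xzf hadoop-X.Y.Z.tar.gz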
I need to find out, or decide, whether I am going to set up a single server to run in 'pseudo-distributed' mode (one machine running each of the Hadoop daemons as a separate process, so it splits the data up and behaves as if it were a number of separate servers in a cluster), or whether I want to assemble, or find a way to operate, an actual networked cluster.
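If I do go down the pseudo-distributed route, the single-node setup guide on the Apache site boils it down to a couple of small XML config files - roughly the following (property names as given in the Hadoop docs; the port number and replication factor of 1 are just the usual single-node defaults):

conf/core-site.xml:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

conf/hdfs-site.xml:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

Setting the replication factor to 1 is what lets a single machine stand in for a whole cluster - there is only one copy of each data block, rather than the multiple copies a real cluster would keep.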
There are installation instructions on the Apache website. Once the server is set up, MapReduce needs to be configured and its daemons started (it ships as part of the Hadoop release rather than as a separate install); the jobs that run on it are written in Java.
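To get a feel for what that Java looks like, below is a rough sketch of the word-count example from the Apache MapReduce tutorial, trimmed down - so treat it as illustrative rather than something I have compiled and run yet:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every word in the input line
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sums the counts emitted for each word
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count"); // Job.getInstance(conf, ...) in newer releases
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

The job then gets packaged into a jar and submitted with the hadoop command, e.g. hadoop jar wordcount.jar WordCount <input dir> <output dir>.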