Tuesday, September 13, 2011

Go on Hadoop

It's easy to use Go (or any other language) with Hadoop streaming. Here's a little "word count" example.

System Setup:
  • Hadoop running locally (Cloudera cdh3u0)
  • A copy of hadoop-streaming-0.20.2-cdh3u0.jar in the local directory
  • A copy of "Alice In Wonderland" under /user/miki/alice.txt on HDFS

mapper.go

reducer.go

run-job.sh

After the job has run, you can view the output and check the most common words:

hadoop fs -cat /user/miki/words-out/part-00000 | sort -k 2 -n -r | head
the	1686
and	869
to	799
a	672
of	606
I	545
it	540
she	509
said	456
in	414

34 comments:

  1. I've been working on a small library to make Hadoop Streaming code easier to write in Go. It handles the un/marshaling and the line aggregation; you just need to write the Mapper and the Reducer. It's on GitHub at https://github.com/dgryski/dmrgo .
