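What ZooKeeper, a Hadoop Subproject, Can Do
2008-08-20 16:09 Category: Hadoop, Relative
I am glad to see that ZooKeeper, contributed by Yahoo, has moved from SourceForge to Apache and become a Hadoop subproject. So what is ZooKeeper? It is an open-source implementation of Google's Chubby: a highly available and reliable coordination system. ZooKeeper can be used for leader election, for maintaining configuration information, and so on. In a distributed environment we often need to elect a single master instance, store some configuration information, or make sure that file writes stay consistent. ZooKeeper makes the following three guarantees about watches: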
- Watches are ordered with respect to other events, other watches, and asynchronous replies. The ZooKeeper client libraries ensure that everything is dispatched in order.
- A client will see a watch event for a znode it is watching before seeing the new data that corresponds to that znode.
- The order of watch events from ZooKeeper corresponds to the order of the updates as seen by the ZooKeeper service.
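The second guarantee can be illustrated with a toy model. The sketch below is not the ZooKeeper API: `ToyZnodeStore`, its `get`/`set` methods, and the per-client `inbox` queue are hypothetical names invented for this illustration. It assumes that every message for a client travels through one FIFO queue and that a watch fires only once, which is enough to show why the watch event arrives before the new data.

```python
from collections import deque


class ToyZnodeStore:
    """In-memory stand-in for a ZooKeeper server (hypothetical, not the real API)."""

    def __init__(self):
        self.data = {}      # path -> current value
        self.watchers = {}  # path -> client inboxes holding a one-shot watch

    def get(self, path, inbox, watch=False):
        # Register the watch (if requested), then queue the read reply.
        if watch:
            self.watchers.setdefault(path, []).append(inbox)
        inbox.append(("reply", path, self.data.get(path)))

    def set(self, path, value):
        # Apply the update, then notify and clear every watch on this path.
        self.data[path] = value
        for inbox in self.watchers.pop(path, []):
            inbox.append(("watch_event", path, None))


store = ToyZnodeStore()
store.set("/config", "v1")

inbox = deque()                          # single ordered channel for one client
store.get("/config", inbox, watch=True)  # read v1 and leave a watch
store.set("/config", "v2")               # fires the watch
store.get("/config", inbox)              # read again, now sees v2
store.set("/config", "v3")               # no watch is set any more, so no event

messages = list(inbox)
```

In `messages`, the `watch_event` entry sits between the reply carrying `v1` and the reply carrying `v2`, so the client learns about the change before it sees the new value. A real deployment would use a ZooKeeper client library rather than a model like this one.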
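The Functional Programming Paradigm - MapReduce
2008-08-07 16:48 Category: Hadoop, MapReduce
A month ago someone asked me what functional programming is. I was familiar with some of its concepts, and I had read the first few chapters of The Little Schemer, which I had asked someone to buy for me in Canada six months earlier, yet that day I simply could not say what functional programming really is. For programmers used to procedural programming, functional programming is unfamiliar territory, and concepts such as closures, continuations, and currying are a nightmare.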
Without understanding functional programming, you can't invent MapReduce, the algorithm that makes Google so massively scalable. The terms Map and Reduce come from Lisp and functional programming. MapReduce is, in retrospect, obvious to anyone who remembers from their 6.001-equivalent programming class that purely functional programs have no side effects and are thus trivially parallelizable. The very fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up trying to get basic search features to work, while Google has moved on to the next problem: building Skynet^H^H^H^H^H^H the world's largest massively parallel supercomputer. I don't think Microsoft completely understands just how far behind they are on that wave.
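The passage above is quoted from Joel Spolsky's blog, and it explains clearly that the functional programming model was the inspiration for MapReduce.
MapReduce takes its name from two core operations of the functional programming model: Map and Reduce. Anyone familiar with Functional Programming (FP) will feel at home with these two words, because both terms come from Lisp and functional programming. Map transforms a collection of data one-to-one into another collection, with the mapping rule given by a function. Reduce folds a collection of data into a combined result, with the folding rule also given by a function. Map is the step that splits the data apart, and Reduce is the step that merges the split data back together. Take Hadoop's wordcount example: applying Map to [one,word,one,dream] yields [{one,1}, {word,1}, {one,1}, {dream,1}], and applying Reduce to [{one,1}, {word,1}, {one,1}, {dream,1}] yields the result set [{one,2}, {word,1}, {dream,1}].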
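The wordcount walk-through above can be sketched in a few lines of Python using the built-in `map` and `functools.reduce`. This is a minimal single-process illustration of the two functional operations, not Hadoop code: the helper names `map_word` and `reduce_counts` are made up for this example, and Hadoop itself groups the pairs by key between the two phases and runs them across many machines.

```python
from functools import reduce


def map_word(word):
    """Map rule: turn one word into a (word, 1) pair."""
    return (word, 1)


def reduce_counts(counts, pair):
    """Reduce rule: fold one (word, n) pair into the running totals.

    Returns a new dict instead of mutating `counts`, so the function
    has no side effects.
    """
    word, n = pair
    return {**counts, word: counts.get(word, 0) + n}


words = ["one", "word", "one", "dream"]

pairs = list(map(map_word, words))         # one pair per input word
totals = reduce(reduce_counts, pairs, {})  # merge the pairs into totals

print(pairs)
print(totals)
```

Because `map_word` looks only at its own argument, each word could be mapped on a different machine without any coordination, which is the property the quotation above calls trivially parallelizable.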