Prajwal Tuladhar’s Blog
 
programming, life and some random thoughts

Jan 14 2010

One crucial difference between MapReduce and SQL query

Published at 8:34 pm under Hadoop

MapReduce is a linearly scalable programming model. The programmer writes two functions, a map function and a reduce function, each of which defines a mapping from one set of key-value pairs to another. These functions are oblivious to the size of the data or the cluster that they are operating on, so they can be used unchanged for a small dataset and for a massive one. More importantly, if you double the size of the input data, a job will run twice as slow. But if you also double the size of the cluster, a job will run as fast as the original one. This is not generally true of SQL queries. – Excerpt from Hadoop: The Definitive Guide
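To make the "oblivious to the size of the data" point concrete, here is a minimal word-count sketch in plain Python. It is not Hadoop code: run_job is a hypothetical single-process stand-in for the framework, and num_partitions only imitates how input would be split across a cluster. The names map_fn, reduce_fn and run_job are my own, chosen for illustration.

```python
from collections import defaultdict


def map_fn(key, line):
    """Map one (line_number, line) pair to zero or more (word, 1) pairs."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce one (word, [1, 1, ...]) group to a single (word, total) pair."""
    yield word, sum(counts)


def run_job(records, mapper, reducer, num_partitions=1):
    """Hypothetical local driver: split, map, group by key, then reduce.

    num_partitions stands in for cluster size. The mapper and reducer
    never see it, which is the property the excerpt describes.
    """
    # Split the input into round-robin partitions, as a framework might.
    partitions = [records[i::num_partitions] for i in range(num_partitions)]

    # Map phase plus shuffle: collect every emitted value under its key.
    groups = defaultdict(list)
    for partition in partitions:
        for key, value in partition:
            for out_key, out_value in mapper(key, value):
                groups[out_key].append(out_value)

    # Reduce phase: one reducer call per distinct key.
    result = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, groups[key]):
            result[out_key] = out_value
    return result


small = [(0, "the quick brown fox"), (1, "the lazy dog")]
large = small * 1000  # same functions, 1000 times the data

small_counts = run_job(small, map_fn, reduce_fn)
large_counts = run_job(large, map_fn, reduce_fn, num_partitions=4)
```

The same map_fn and reduce_fn handle both inputs unchanged; only the driver's partition count differs. This toy runs sequentially, so it does not show the speedup itself, just that the user-written functions need no changes as data or "cluster" size grows.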

