Tagged Questions

The advantage of MapReduce is that it allows for distributed processing of the map and reduction operations. Provided each mapping operation is independent of the others, all maps can be performed in parallel, though in practice this is limited by the data source and/or the number of CPUs near that ...
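A minimal sketch of that idea, assuming a word-count task: each map call sees only its own chunk, so the calls can run side by side, and a reduce step folds the partial results together. The names `map_chunk`, `merge_counts`, and `word_count` are hypothetical, and a thread pool stands in for a real cluster purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    """Map step: count the words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(chunks: list[str], max_workers: int = 4) -> Counter:
    # Each map call touches only its own chunk, so the calls can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    # Fold the partial results together; Counter() is the empty starting value.
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    chunks = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(chunks))
```

In a real deployment the chunks would be input splits read from distributed storage, and the degree of parallelism would be bounded by how many workers sit near the data rather than by `max_workers`.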


0
votes
1 answer
10 views

When is data written to the output file in a MapReduce architecture? How can I use processed reducer output data in the reducer?

I am using Hadoop version 1.0.0. After processing each reducer input key, I collect the output, but it is not written to the actual output file. I am trying to use processed intermediate output for ...
0
votes
0 answers
22 views

Image Analysis in Hadoop

I am trying to write a Hadoop MapReduce program that achieves the following: The overall system will search a collection of images already available on HDFS, and select which one ...
1
vote
1 answer
18 views

MongoDB unique value aggregation via map reduce

I see plenty of questions on SO about aggregation in MongoDB, however, I have not found a complete solution to mine yet. Here's an example of my data: { "fruits" : { "apple" : "red", ...
-2
votes
0 answers
12 views

Run MatrixMultiply program on Hadoop

Hi, I am in the process of learning MapReduce and Hadoop. I want to run the example of matrix multiplication presented here along with its code: http://www.norstad.org/matrix-multiply/index.html I ...
0
votes
0 answers
18 views

How can I emit a list of values from a mapper or reducer?

I have a file that contains some geophysical data (seismic data), and I am reading these files from the local FS and storing them as Hadoop sequence files in HDFS. Now I want to write a MapReduce job ...
0
votes
1 answer
17 views

Number of records reducing after mongo mapreduce

This is my mapreduce code: DBCollection mongoCollection = MongoDAO.getCollection(); String map = "function() {" + "for (index in this.positions.positionList) {" + ...
0
votes
0 answers
6 views

Hadoop Pig Cassandra get_range_slices error

I am using Cassandra 1.0.9 and the latest Pig and Hadoop to execute MapReduce tasks. Just a simple task scripted in Pig to extract 2 columns from the Cassandra database. Seems to work and then it ...
2
votes
1 answer
111 views

reduce a list of functions to a boolean

I'm looking for a way to reduce this list to a boolean. Here is the original: let ones = [1;1;1;1] let twos = [2;2;2;2] let bad = [1;2;3] let isAllOnes = List.forall (fun op -> op = 1) let ...
2
votes
2 answers
61 views

couchdb - retrieve documents for which an id is not included in an array property

I have the following document structure in a couchdb database: { 'type': 'Message', 'subject': 'Cheap V1@gr@', 'body': 'not spam', 'metadata': { 'participation': { 'read': [1,2,3] ...
0
votes
0 answers
37 views

Unreasonable Error in Hadoop

I am using System.loadLibrary("native1"); where libnative1.so depends on other .so files. I have added each such .so to the Distributed Cache. Interestingly, if I do not add any one of the .so ...
0
votes
0 answers
9 views

Output to two MySQL tables from the same reducer

I am using Hadoop MapReduce to read from and write to MySQL. I have a situation where I have to output data to two different tables from the same reducer. Is it possible?
0
votes
0 answers
23 views

Output file of size zero

I am running a Hadoop MapReduce streaming job (a mapper-only job). In some cases my job writes to stdout, whereupon an output file with non-zero size is created. In some cases my job does not write ...
0
votes
1 answer
42 views

What's the best way to store and update a Set in HBase?

So here's the situation: I've created a SetWritable class, basically a wrapper for java.util.Set that implements the Writable interface. I have an HBase table with one column family and one column, ...
2
votes
1 answer
52 views

Map reduce in RavenDb over 2 collections with child collection

I have 2 different object types stored in RavenDb, which are a parent/child type relationship, like this in JSON: Account/1 { "Name": "Acc1", } Items/1 { "Account": "Account/1", ...
1
vote
0 answers
44 views

Parallelizing a serial algorithm

Hi folks, I am working on porting a text mining/natural language application from single-core to a MapReduce-style system. One of the steps involves a while loop similar to this: ...
