Hazelcast - Map Reduce & Aggregations
MapReduce is a computation model for processing large volumes of data across multiple machines, i.e., in a distributed environment. It involves mapping the data into key-value pairs and then reducing them, i.e., grouping the pairs by key and performing an operation on the values of each group.
Since Hazelcast is designed with a distributed environment in mind, the MapReduce model fits it naturally.
Let’s see how to do it with an example.
Suppose we have data about cars (brand and car number) and the owner of each car.
Honda-9235, John
Hyundai-235, Alice
Honda-935, Bob
Mercedes-235, Janice
Honda-925, Catnis
Hyundai-1925, Jane
And now, we have to figure out the number of cars for each brand, i.e., Hyundai, Honda, etc.
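Before bringing in a cluster, it may help to see the same counting problem on a single machine. The following is a minimal sketch in plain Java with no Hazelcast dependency. The class name `BrandCountSketch` and the methods `brandOf` and `countByBrand` are hypothetical names chosen for illustration, and the sketch assumes the brand is the part of the key before the first hyphen.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class BrandCountSketch {

    // Map step: derive the brand from a key such as "Honda-9235".
    // Assumption: the brand is everything before the first hyphen.
    public static String brandOf(String carKey) {
        return carKey.split("-")[0];
    }

    // Group the keys by brand and count each group (the reduce step).
    // A TreeMap is used only to get a predictable, sorted print order.
    public static Map<String, Long> countByBrand(List<String> carKeys) {
        return carKeys.stream()
                .collect(Collectors.groupingBy(
                        BrandCountSketch::brandOf,
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> carKeys = Arrays.asList(
                "Honda-9235", "Hyundai-235", "Honda-935",
                "Mercedes-235", "Honda-925", "Hyundai-1925");
        System.out.println(countByBrand(carKeys));
    }
}
```

With the sample keys this prints `{Honda=3, Hyundai=2, Mercedes=1}`. MapReduce splits the same two steps so that the mapping and the counting can run on whichever machines hold the data.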
Example
Let's try to find that out using MapReduce −
package com.example.demo;

import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ICompletableFuture;
import com.hazelcast.core.IMap;
import com.hazelcast.mapreduce.Context;
import com.hazelcast.mapreduce.Job;
import com.hazelcast.mapreduce.JobTracker;
import com.hazelcast.mapreduce.KeyValueSource;
import com.hazelcast.mapreduce.Mapper;
import com.hazelcast.mapreduce.Reducer;
import com.hazelcast.mapreduce.ReducerFactory;

// Note: the com.hazelcast.mapreduce package belongs to Hazelcast 3.x
// (deprecated in 3.8 and removed in 4.0).
public class MapReduce {
   public static void main(String[] args) throws ExecutionException,
   InterruptedException {
      try {
         // create two Hazelcast instances (members) in this JVM
         HazelcastInstance hzMember = Hazelcast.newHazelcastInstance();
         Hazelcast.newHazelcastInstance();

         // populate a distributed map with the sample data
         IMap<String, String> vehicleOwnerMap = hzMember.getMap("vehicleOwnerMap");
         vehicleOwnerMap.put("Honda-9235", "John");
         vehicleOwnerMap.put("Hyundai-235", "Alice");
         vehicleOwnerMap.put("Honda-935", "Bob");
         vehicleOwnerMap.put("Mercedes-235", "Janice");
         vehicleOwnerMap.put("Honda-925", "Catnis");
         vehicleOwnerMap.put("Hyundai-1925", "Jane");

         // wrap the map as the input source of the job
         KeyValueSource<String, String> kvs = KeyValueSource.fromMap(vehicleOwnerMap);
         JobTracker tracker = hzMember.getJobTracker("vehicleBrandJob");
         Job<String, String> job = tracker.newJob(kvs);

         // submit the job and block until the result is available
         ICompletableFuture<Map<String, Integer>> myMapReduceFuture =
            job.mapper(new BrandMapper())
            .reducer(new BrandReducerFactory()).submit();
         Map<String, Integer> result = myMapReduceFuture.get();
         System.out.println("Final output: " + result);
      } finally {
         Hazelcast.shutdownAll();
      }
   }
   // emits (brand, 1) for every entry, e.g. "Honda-9235" -> ("Honda", 1)
   private static class BrandMapper implements Mapper<String, String, String, Integer> {
      @Override
      public void map(String key, String value, Context<String, Integer> context) {
         context.emit(key.split("-", 0)[0], 1);
      }
   }
   // creates one reducer per emitted key (brand)
   private static class BrandReducerFactory implements ReducerFactory<String, Integer, Integer> {
      @Override
      public Reducer<Integer, Integer> newReducer(String key) {
         return new BrandReducer();
      }
   }
   // sums the values emitted for a single brand
   private static class BrandReducer extends Reducer<Integer, Integer> {
      private AtomicInteger count = new AtomicInteger(0);
      @Override
      public void reduce(Integer value) {
         count.addAndGet(value);
      }
      @Override
      public Integer finalizeReduce() {
         return count.get();
      }
   }
}
Let’s try to understand this code −
We create Hazelcast members. In the example, we start two members in the same JVM, but a cluster can have any number of members.
We create a map using dummy data and create a KeyValueSource out of it.
We create a MapReduce job and ask it to use the KeyValueSource as its input data.
We then submit the job to the cluster and wait for completion.
The mapper derives a new key by extracting the brand from the original key, sets the value to 1, and emits that key-value pair to the reducer.
The reducer simply sums the values it receives. Since one reducer is created per key, the result is a count for each brand name.
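The mapper and reducer roles described above can be imitated in a single JVM to make the flow easier to follow. The sketch below is not Hazelcast code: `MapReduceLifecycleSketch`, `CountReducer`, `map` and `run` are hypothetical stand-ins for the mapper, the per-key reducer and the job submission. It only illustrates the idea that each emitted key gets its own reducer instance, which receives values one at a time and reports a final result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class MapReduceLifecycleSketch {

    // Hypothetical stand-in for a reducer: accumulates the values of one key.
    public static class CountReducer {
        private int count = 0;

        public void reduce(int value) {
            count += value;
        }

        public int finalizeReduce() {
            return count;
        }
    }

    // Hypothetical mapper: emits (brand, 1) for each car key.
    public static void map(String carKey, BiConsumer<String, Integer> emit) {
        emit.accept(carKey.split("-", 0)[0], 1);
    }

    // Runs the map phase, routes each emitted pair to the reducer of its key,
    // then collects the final value from every reducer.
    public static Map<String, Integer> run(List<String> carKeys) {
        Map<String, CountReducer> reducers = new TreeMap<>();
        for (String carKey : carKeys) {
            map(carKey, (brand, one) ->
                    reducers.computeIfAbsent(brand, k -> new CountReducer()).reduce(one));
        }
        Map<String, Integer> result = new TreeMap<>();
        reducers.forEach((brand, reducer) -> result.put(brand, reducer.finalizeReduce()));
        return result;
    }

    public static void main(String[] args) {
        List<String> carKeys = Arrays.asList(
                "Honda-9235", "Hyundai-235", "Honda-935",
                "Mercedes-235", "Honda-925", "Hyundai-1925");
        System.out.println("Final output: " + run(carKeys));
    }
}
```

In a real cluster the emitted pairs would travel between members and the reducers would run on different nodes; here everything happens in one loop, which is the main simplification of this sketch.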
Output
The output of the code −
Final output: {Mercedes=1, Hyundai=2, Honda=3}