Pig Tutorial 11 – Pig Example to Implement Custom Filter Functions

Filter functions are eval functions that return a Boolean value. They can be used anywhere a Boolean expression is appropriate, including the FILTER operator or a bincond expression.

Input Data


Here we want to keep only the records whose sub_element_id is a port (that is, it contains the string "port") and filter out all the others.

raw_data = LOAD 'src/main/resources/service_mapping.txt' USING PigStorage(',') AS (element_id:chararray, sub_element_id:chararray, service_id:chararray, update_date:chararray);

raw_data_filter = FILTER raw_data BY com.bt.haas.pig.CustomFilter(sub_element_id);

dump raw_data_filter;
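To make the expected effect of the script above concrete, here is a minimal plain-Java sketch of the same rule with no Pig dependency. It is only an illustration, not the UDF itself: the class name PortFilterSketch, its helper methods, and the three sample rows are all hypothetical and invented for this example. It assumes the input is comma-separated in the column order given in the LOAD statement and that "is a port" means the sub_element_id contains the string "port".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the filtering rule; no Pig classes are used.
class PortFilterSketch {

    // Keep a value only if it is non-null and contains the substring "port".
    static boolean isPort(String subElementId) {
        return subElementId != null && subElementId.contains("port");
    }

    // Splits each comma-separated row into (element_id, sub_element_id,
    // service_id, update_date) and keeps rows whose second field is a port.
    static List<String> keepPortRows(List<String> rows) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",", -1);
            if (fields.length > 1 && isPort(fields[1])) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows, invented for this sketch.
        List<String> rows = Arrays.asList(
            "e1,port-1,s1,2015-01-01",
            "e2,card-7,s2,2015-01-02",
            "e3,port-9,s3,2015-01-03");
        for (String row : keepPortRows(rows)) {
            System.out.println(row);
        }
    }
}
```

Running main on these made-up rows prints the e1 and e3 lines and drops the e2 line, which is the behaviour the FILTER statement is meant to produce on the real data.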

Custom Filter Code

package com.bt.haas.pig;

import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

public class CustomFilter extends FilterFunc {

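@Override
public Boolean exec(Tuple input) throws IOException {
// Treat an empty tuple or a null field as "not a port" instead of failing.
if (input == null || input.size() == 0 || input.get(0) == null) {
return false;
}

// The first field of the tuple is the sub_element_id passed in from the script.
String str = (String) input.get(0);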
int index = str.indexOf("port");
if (index >= 0) {
return true;
} else {
return false;