Filter functions are eval functions that return a Boolean value. They can be used anywhere a Boolean expression is appropriate, including in the FILTER operator or in a bincond expression.
Input Data
ABCDDI,ABCDDI./shelf=0/port=21,OGEA02699269,08/03/2012
ABCDJC,ABCDJC./shelf=0/port=31,OGEA05357149,06/18/2016
ABCDEG,ABCDEG./shelf=0/svlan=15,OGEA16722054,07/24/2015
Here we want to keep only the records whose sub_element_id (the second field) refers to a port, and drop the rest, such as the svlan record in the third row.
raw_data = LOAD 'src/main/resources/service_mapping.txt' USING PigStorage(',')
    AS (element_id:chararray, sub_element_id:chararray, service_id:chararray, update_date:chararray);
-- Reference the field directly; raw_data.sub_element_id is a scalar projection,
-- which is only valid when the relation has a single row.
raw_data_filter = FILTER raw_data BY com.bt.haas.pig.CustomFilter(sub_element_id);
DUMP raw_data_filter;
Custom Filter Code
package com.bt.haas.pig;

import java.io.IOException;

import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

// Keeps a record only when its first argument (sub_element_id) contains "port".
public class CustomFilter extends FilterFunc {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        // Drop records with a missing argument instead of throwing a NullPointerException.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return false;
        }
        String str = (String) input.get(0);
        return str.contains("port");
    }
}
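To check the predicate without a Pig runtime, the same "contains port" test can be sketched in plain Java and run against the three sample rows. The class and method names here (PortFilterSketch, isPort, keepPortRows) are hypothetical and not part of the original project; this is only an illustration of what the filter is expected to keep, not a replacement for the UDF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch: mirrors the CustomFilter predicate without any Pig dependency.
public class PortFilterSketch {

    // Same test the custom filter applies: true when the value contains "port".
    static boolean isPort(String subElementId) {
        return subElementId != null && subElementId.contains("port");
    }

    // Returns the rows whose second comma-separated field (sub_element_id) passes isPort.
    static List<String> keepPortRows(List<String> rows) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            if (fields.length > 1 && isPort(fields[1])) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The three sample rows from the input data above.
        List<String> rows = Arrays.asList(
            "ABCDDI,ABCDDI./shelf=0/port=21,OGEA02699269,08/03/2012",
            "ABCDJC,ABCDJC./shelf=0/port=31,OGEA05357149,06/18/2016",
            "ABCDEG,ABCDEG./shelf=0/svlan=15,OGEA16722054,07/24/2015");
        // Prints the two port rows; the svlan row is dropped.
        for (String row : keepPortRows(rows)) {
            System.out.println(row);
        }
    }
}
```

With the sample input, only the ABCDDI and ABCDJC rows pass, which is what the Pig script's DUMP should show once the UDF is on the classpath.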