Filter functions are eval functions that return a Boolean value. They can be used anywhere a Boolean expression is appropriate, including the FILTER operator and the bincond expression.
Input Data
ABCDDI,ABCDDI./shelf=0/port=21,OGEA02699269,08/03/2012
ABCDJC,ABCDJC./shelf=0/port=31,OGEA05357149,06/18/2016
ABCDEG,ABCDEG./shelf=0/svlan=15,OGEA16722054,07/24/2015
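Here we want to keep only the records whose sub_element_id refers to a port, and drop every record whose sub_element_id does not contain "port" (the third input line, which has svlan=15, is one such record).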
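-- The jar that contains CustomFilter must be available to Pig before this point, for example through a REGISTER statement.
raw_data = LOAD 'src/main/resources/service_mapping.txt' USING PigStorage(',')
    AS (element_id:chararray, sub_element_id:chararray, service_id:chararray, update_date:chararray);
-- Inside FILTER, refer to the field by name only; raw_data.sub_element_id would be read as a scalar projection and fail.
raw_data_filter = FILTER raw_data BY com.bt.haas.pig.CustomFilter(sub_element_id);
DUMP raw_data_filter;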
Custom Filter Code
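package com.bt.haas.pig;

import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

// Keeps only the tuples whose first field contains the substring "port".
public class CustomFilter extends FilterFunc {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        // Treat a missing tuple or a null field as "not a port" instead of throwing.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return false;
        }
        String subElementId = (String) input.get(0);
        return subElementId.contains("port");
    }
}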
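To check the filtering logic without a Pig installation, the substring test used in CustomFilter can be tried in plain Java against the three sample records. This is only a rough sketch: the class PortFilterSketch and the methods isPort and keepPorts are hypothetical names introduced for illustration, the null check is an added assumption, and the naive comma split stands in for PigStorage(',').

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone harness; it does not depend on Pig.
class PortFilterSketch {

    // Same substring test as CustomFilter; null handling is an assumption.
    static boolean isPort(String subElementId) {
        return subElementId != null && subElementId.contains("port");
    }

    // Splits each comma-separated record and keeps those whose
    // second field (sub_element_id) passes the test.
    static List<String> keepPorts(List<String> records) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            String[] fields = record.split(",");
            if (fields.length > 1 && isPort(fields[1])) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
            "ABCDDI,ABCDDI./shelf=0/port=21,OGEA02699269,08/03/2012",
            "ABCDJC,ABCDJC./shelf=0/port=31,OGEA05357149,06/18/2016",
            "ABCDEG,ABCDEG./shelf=0/svlan=15,OGEA16722054,07/24/2015");
        // Prints the two port records; the svlan record is dropped.
        for (String record : keepPorts(records)) {
            System.out.println(record);
        }
    }
}
```

Keeping the test in a small static method like this makes it easy to unit test the rule separately from the Pig wrapper class.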