I want to be able to run a standard diff on two large files. I have something that works, but it is not as quick as diff on the command line.

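-- Load each file as a single-column relation of lines.
A = LOAD 'A' AS (line);
B = LOAD 'B' AS (line);
-- A full outer join keeps lines that appear on only one side.
JOINED = JOIN A BY line FULL OUTER, B BY line;
DIFF = FILTER JOINED BY A::line IS NULL OR B::line IS NULL;
-- Emit the unmatched line, labelled 'REMOVED' if it is only in B and 'ADDED' if it is only in A.
DIFF2 = FOREACH DIFF GENERATE (A::line IS NULL ? B::line : A::line), (A::line IS NULL ? 'REMOVED' : 'ADDED');
STORE DIFF2 INTO 'diff';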
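
To make clear what the script is computing, here is a minimal Python sketch of the same logic for small in-memory inputs. The function name pig_style_diff is made up for illustration. It keeps the same labels as the script, and like the script it ignores line order, so it is a set-style difference and not a positional diff. The output order here is fixed only for readability; the Pig relation has no guaranteed order.

```python
def pig_style_diff(a_lines, b_lines):
    """Mimic the full-outer-join diff: keep lines found on only one side."""
    a_set = set(a_lines)
    b_set = set(b_lines)
    result = []
    # Lines present only in A: the script labels these 'ADDED'.
    for line in a_lines:
        if line not in b_set:
            result.append((line, 'ADDED'))
    # Lines present only in B: the script labels these 'REMOVED'.
    for line in b_lines:
        if line not in a_set:
            result.append((line, 'REMOVED'))
    return result


if __name__ == "__main__":
    for line, status in pig_style_diff(["x", "y"], ["y", "z"]):
        print(line, status)
```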

Does anybody have a better way to do this?

Thanks, Wealthy