I have created a VPC, an internet gateway, and 3 subnets (one for each cluster and one for an EC2 instance). I am trying to run a MapReduce word count on a set of text files stored in an S3 folder I created. First, I log in via the SSH web browser login, using the FoxyProxy extension in Chrome. Once logged in, I run the following script in the Pig script editor:
A = LOAD 's3a://wordcount' USING TextLoader() AS (words:chararray);
B = FOREACH A GENERATE FLATTEN(TOKENIZE(*));
C = GROUP B BY $0;
D = FOREACH C GENERATE group, COUNT(B);
STORE D INTO 's3a://wordcount';
Once I execute this script, it throws an error about halfway to completion: Job killed. (oozie:Launcher:T=pig:W=Batch job for que…
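To be clear about the result I expect, here is a rough local sketch in Python of what the Pig script is meant to compute (split each line into words, group identical words, count each group). It is only an illustration, not what runs on the cluster: the function name `word_count` and the sample lines are hypothetical, and splitting on whitespace only approximates Pig's TOKENIZE.

```python
from collections import Counter


def word_count(lines):
    """Count word occurrences across lines (tokenize, group, count)."""
    counts = Counter()
    for line in lines:
        # Assumption: whitespace splitting stands in for Pig's TOKENIZE,
        # which also treats a few punctuation characters as delimiters.
        counts.update(line.split())
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample input standing in for the text files in S3.
    sample = ["the quick brown fox", "the lazy dog"]
    for word, n in sorted(word_count(sample).items()):
        print(word, n)
```

Each output row pairs a word with its count, which is the shape I expect relation D to have.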
submitted by /u/aspiringearthling