I'm using Hive with two tables that look (more or less) like this:

-TABLE1 defined as [(Variables : string),(Value1 : int),(Value2 : int)]

with the field "Variables" looking like "x0,x1,x2,x3,...,xn"

-TABLE2 defined as [(Value1Sum : int),(Value2Sum : int),(X1 : string),(X4 : string),(X17 : string)]

I "convert" table1 to table2 using the query :

INSERT OVERWRITE TABLE table2
    SELECT sum(v1), sum(v2), x1, x4, x17
        FROM (SELECT
                Value1 as v1,
                Value2 as v2,
                split(Variables, ",")[1] as x1,
                split(Variables, ",")[4] as x4,
                split(Variables, ",")[17] as x17 
              FROM Table1) tmp
        GROUP BY tmp.x1, tmp.x4, tmp.x17

Does Hive call the split function 3 times?

Is there a way to make it more elegant?

Is there a way to make it more generic?

Sincerely, CC

Yes, it will call split each time. You can make it a little more elegant:

Why not define Variables as an array column to begin with? Then you can access its elements directly:

select Variables[1] from table1

I am assuming you are using an external table, so it can be done like so:

create external table table1(variables array<string>, a int, b int)
ROW FORMAT DELIMITED
    COLLECTION ITEMS TERMINATED BY ','
LOCATION 'hdfs://somewhere'
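
With the table defined that way, the original query could probably be reduced to something like the sketch below. It assumes the column names from the create statement above (variables, a, b) and that table2 keeps the layout from the question. It also assumes the fields in the underlying files are separated by something other than a comma (Hive's default field delimiter is Ctrl-A), otherwise the commas inside the list would clash with the field separator.

```sql
-- Sketch only: assumes table1 was created as shown above, so that
-- variables is already an array<string> and no split() call is needed.
INSERT OVERWRITE TABLE table2
    SELECT sum(a),          -- Value1Sum
           sum(b),          -- Value2Sum
           variables[1],    -- X1 (array indexes start at 0, so this is x1)
           variables[4],    -- X4
           variables[17]    -- X17
        FROM table1
        GROUP BY variables[1], variables[4], variables[17];
```

The subquery is no longer needed because the array elements can be referenced directly in both the SELECT and the GROUP BY.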