I've got a group of .csv files that I want to process. It might be far simpler to process them with SQL queries. I wonder if there's a way to load a .csv file and use SQL to query it from a scripting language like Python or Ruby. Loading it with something like ActiveRecord would be awesome.

However, I'd rather not have to run a database server somewhere just to run my script. I shouldn't need any additional installations beyond the scripting language and some modules.

My real question is: which language and which modules should I use for this task? I looked around and couldn't find anything that suits my needs. Is it even possible?

There's sqlite3, included in Python. With it you can create a database (in memory), add rows to it, and perform SQL queries.
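A minimal sketch of that stdlib-only approach (the table name `foo` and the sample data are made up for illustration; in practice you'd read from your real file with `open('foo.csv')`):

```python
import csv
import io
import sqlite3

# Sample CSV contents; in practice you would use open('foo.csv')
data = io.StringIO("name,age,nickname\nAfila Tun,32,afilatun\nFoo Bar,33,baz\n")

conn = sqlite3.connect(':memory:')  # in-memory database, nothing to install or run
reader = csv.reader(data)
header = next(reader)  # assume the first line is the header

# Create a table whose columns match the CSV header (SQLite allows untyped columns)
conn.execute('CREATE TABLE foo (%s)' % ', '.join(header))
conn.executemany('INSERT INTO foo VALUES (%s)' % ', '.join('?' * len(header)), reader)

# Plain SQL queries now work against the CSV data
for row in conn.execute("SELECT name, nickname FROM foo WHERE age = '32'"):
    print(row)
```

Note that everything is loaded as text, so comparisons like `age = '32'` are string comparisons unless you declare column types.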

If you want nice ActiveRecord-like functionality you need to add an external ORM, like sqlalchemy. That's a separate download though.

Quick example using sqlalchemy:

from sqlalchemy import create_engine, Column, String, Integer, MetaData, Table
from sqlalchemy.orm import mapper, create_session
import csv
CSV_FILE = 'foo.csv'
engine = create_engine('sqlite://') # memory-only database

table = None
metadata = MetaData(bind=engine)
with open(CSV_FILE) as f:
    # assume first line is header
    cf = csv.DictReader(f, delimiter=',')
    for row in cf:
        if table is None:
            # create the table
            table = Table('foo', metadata,
                Column('id', Integer, primary_key=True),
                *(Column(rowname, String()) for rowname in row.keys()))
            table.create()
        # insert data into the table
        table.insert().execute(row)

class CsvTable(object): pass
mapper(CsvTable, table)
session = create_session(bind=engine, autocommit=False, autoflush=True)

You can now query the database, filtering by any field, etc.

Suppose you run the code above on this csv:

name,age,nickname
Afila Tun,32,afilatun
Foo Bar,33,baz

It will build and populate an in-memory table with the fields name, age, nickname. You can then query the table:

for r in session.query(CsvTable).filter(CsvTable.age == '32'):
    print(r.name, r.age, r.nickname)

This will internally create and run a SELECT query and return the matching rows.

An additional benefit of using sqlalchemy is that, if you decide to use another, more powerful database in the future, you can do so practically without altering the code.
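Concretely, the engine URL is usually the only thing that has to change; a sketch (the PostgreSQL connection details below are hypothetical, and that backend would require its own driver):

```python
from sqlalchemy import create_engine

# engine = create_engine('sqlite://')                         # in-memory SQLite
engine = create_engine('sqlite:///foo.db')                    # file-based SQLite
# engine = create_engine('postgresql://user:pass@host/mydb')  # hypothetical PostgreSQL server
```

The rest of the script (table definitions, mapper, session, queries) stays the same.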

Use a DB-in-a-library like SQLite. There are Python and Ruby versions.

Load your CSV into a table; there may be modules/libraries to help you there too. Then SQL away.
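As a sketch of that approach in Python's built-in sqlite3 (the `people` table and the CSV contents below are made up; "SQL away" then includes aggregates, grouping, ordering and so on):

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents; in practice: open('people.csv')
data = io.StringIO(
    "name,age,city\n"
    "Alice,30,Lisbon\n"
    "Bob,25,Lisbon\n"
    "Carol,41,Porto\n"
)

conn = sqlite3.connect(':memory:')
reader = csv.reader(data)
header = next(reader)

# Build the table from the header row, then bulk-insert the data rows
conn.execute('CREATE TABLE people (%s)' % ', '.join(header))
conn.executemany('INSERT INTO people VALUES (?, ?, ?)', reader)

# "SQL away": for example, count rows per city
for city, n in conn.execute(
        'SELECT city, COUNT(*) FROM people GROUP BY city ORDER BY city'):
    print(city, n)
```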

Have you looked at Perl with Text::CSV and DBI? There are many modules on CPAN to do exactly this. Here's an example (from HERE):

use strict;
use warnings;
use DBI;

# Connect to the database, (the directory containing our csv file(s))

my $dbh = DBI->connect("DBI:CSV:f_dir=.;csv_eol=\n;");

# Associate our csv file with the table name 'prospects'

$dbh->{'csv_tables'}->{'prospects'} = { 'file' => 'prospects.csv'};

# Output the name and contact field from each row

my $sth = $dbh->prepare("SELECT * FROM prospects WHERE name LIKE 'G%'");
$sth->execute();
while (my $row = $sth->fetchrow_hashref) {
    print("name = ", $row->{'Name'}, "  contact = ", $row->{'Contact'}, "\n");
}

Output:

name = Glenhuntly Pharmacy  contact = Paul
name = Gilmour's Shoes  contact = Ringo

Just type perldoc DBI and perldoc Text::CSV at the command prompt for more information.

CSV files aren't databases: they have no indices, and any SQL simulation you imposed upon them would amount to little more than scanning through the whole file over and over again.