I have to read a 200MB space-separated file line-by-line and collect its contents into an array.

Each time I run the script, Perl throws an "out of memory" error, but I don't understand why!

Any tips, please?

#!/usr/bin/perl -w
use strict;
use warnings;

open my $fh, "<", "../cnai_all.csd" or die "Cannot open file: $!";
my @parse = ();

while (<$fh>) {
    my @words = split /\s+/, $_;
    push @parse, \@words;
}
close $fh;

print scalar @parse;

The cnai file looks like this: it consists of 11,000 rows with 4,200 space-separated values per line.


The code above is just a stripped-down sample.
The final script will store all values in a hash and write them to a database later.

But first I have to solve this memory problem!

That would be because... you are out of memory!

You are not simply storing 200MB of data. You are creating a new array data structure for every line, plus its associated overhead, as well as creating lots of separate string objects for every word, plus their associated overhead.

Edit: To illustrate the kind of overhead we are talking about here, each and every value (and that includes strings) has the following overhead:

/* start with 2 sv-head building blocks */
#define _SV_HEAD(ptrtype) \
    ptrtype sv_any;     /* pointer to body */   \
    U32     sv_refcnt;  /* how many references to us */ \
    U32     sv_flags    /* what we are */

#define _SV_HEAD_UNION \
    union {             \
    char*   svu_pv;     /* pointer to malloced string */    \
    IV      svu_iv;         \
    UV      svu_uv;         \
    SV*     svu_rv;     /* pointer to another SV */     \
    SV**    svu_array;      \
    HE**    svu_hash;       \
    GP* svu_gp;         \
    }   sv_u

struct STRUCT_SV {      /* struct sv { */

So that's a minimum of 4 32-bit values per Perl object.
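You can measure this overhead yourself with the Devel::Size module from CPAN (not part of core Perl, so you may need to install it first); exact numbers will vary by platform and Perl build:

```perl
use strict;
use warnings;
use Devel::Size qw(total_size);   # CPAN module, may need installing

# A 10-character string occupies far more than 10 bytes once the
# SV head and body are counted.
my $str = "x" x 10;
print "string: ", total_size($str), " bytes\n";

# An array of 4200 small strings carries per-element SV overhead
# plus the AV structure itself -- much more than the raw data.
my @row = ("1.0") x 4200;
print "row:    ", total_size(\@row), " bytes\n";
```

Multiply the second number by your 11,000 rows and you can see how a 200MB file balloons well past 200MB in memory.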

Usually this means you are out of memory for Perl, but it's possible you are not out of system memory. First of all, there are ways you can find out a lot about perl's memory usage in the perl debug guts doc -- though you may find yourself recompiling perl to do so. (Also note the warning in that doc about perl's hunger for memory...)

However, on many operating systems it's possible for memory limits to be set per-process or per-user. If, for instance, you're using Linux (or another POSIX system), you may need to alter your ulimits. Type 'ulimit -a' and look at your memory sizes; it's possible your 'max memory size' is below the memory in your machine -- or you have a limited data seg size. You can then reset it with the appropriate option, e.g. ulimit -d 1048576 for a 1GB data seg size limit.
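For example, in a typical Linux shell (bash shown here; the available limit names vary by shell and OS):

```shell
# Show all current per-process limits; look at 'max memory size'
# and 'data seg size' in particular.
ulimit -a

# Raise the data segment soft limit to 1GB (the value is in kilobytes).
ulimit -d 1048576

# Verify the new limit took effect.
ulimit -d
```

Note that ulimit changes only apply to the current shell and its children, so run your Perl script from the same shell session.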

Of course, there's the other option: process the file line-by-line, if your situation allows it. (The example code above could be rewritten in such a way.)
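A sketch of that line-by-line approach, assuming you can handle each row independently (the do_something_with handler is hypothetical, standing in for the database write the original post mentions):

```perl
#!/usr/bin/perl
use strict;
use warnings;

open my $fh, "<", "../cnai_all.csd" or die "Cannot open file: $!";

while (my $line = <$fh>) {
    # Split only the current line; @words goes out of scope at the end
    # of each iteration, so memory use stays at roughly one row's worth
    # instead of the whole file's.
    my @words = split /\s+/, $line;

    # Handle the row here (e.g. insert it into the database) rather
    # than accumulating everything in @parse first.
    # do_something_with(@words);   # hypothetical handler
}

close $fh;
```

This way only one row of the 11,000 is ever held in Perl data structures at a time.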