So I'm building an RSS reader that will parse a feed, filter it, and then download the matching items. Assume the files being downloaded are legal torrent files.

Now I need to keep track of the files I have already downloaded, so that they aren't downloaded again.

I already have it working with SQLite (create the database if it doesn't exist, insert a row if a select statement returns nothing), but the resulting jar file is 2.5MB+ (because of the SQLite libs).

I'm thinking that if I use a text file instead, I could cut the jar file down to a couple of hundred kilobytes.

I could keep a list of the names of the downloaded files, one per line, then read the entire file into memory, search whether a given file is in it, etc.

The couple of questions that occur to me now:

  • Say 10 files are downloaded each day: would the text file method end up taking too many resources?
  • Overall, which is faster?

Anyway, what do you all think? I could use some advice here, as I'm still new to programming and doing this as a hobby :)

If you only need to keep track of a few pieces of information (like the name of the file), you can certainly use a simple text file.

If you use a BufferedReader to read it, you should get good performance.
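As a rough sketch of the text file approach, something like the following could work. The class name `DownloadLog` and its methods are made up for illustration, and it assumes file names never contain line breaks. At 10 downloads a day the file grows by roughly 3,650 lines a year, so loading it all into a `HashSet` at startup should not be a concern.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: remembers downloaded file names in a text file, one name per line. */
class DownloadLog {
    private final Path logFile;
    private final Set<String> seen = new HashSet<>();

    /** Loads every previously recorded name into memory (the file may not exist yet). */
    DownloadLog(Path logFile) throws IOException {
        this.logFile = logFile;
        if (Files.exists(logFile)) {
            try (BufferedReader reader = Files.newBufferedReader(logFile, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.isEmpty()) {
                        seen.add(line);
                    }
                }
            }
        }
    }

    /** Constant-time lookup against the in-memory set. */
    boolean alreadyDownloaded(String name) {
        return seen.contains(name);
    }

    /** Appends the name to the file; returns false if it was already recorded. */
    boolean markDownloaded(String name) throws IOException {
        if (!seen.add(name)) {
            return false;
        }
        try (BufferedWriter writer = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            writer.write(name);
            writer.newLine();
        }
        return true;
    }
}
```

You would create one `DownloadLog` at startup, call `alreadyDownloaded(name)` for each feed item that passes the filter, and call `markDownloaded(name)` once the download has finished.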

In theory a DB (either relational or NoSQL) is better. But if the distribution size is crucial for you, using the file system may be the better fit.

The only issue here is the performance of data access (for reads or writes). Consider the following approach: don't use a single file. Use a directory that contains several files instead. Each file name contains a key (or keys) that gives access to specific data, much like a key in a map. That way you'll be able to access the data relatively easily and quickly.
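A minimal sketch of that directory idea, assuming one small marker file per downloaded item. The class name `DownloadMarkerDir` and the `.done` suffix are invented for this example. Hashing the name with SHA-256 is my own choice, not part of the suggestion above; it just keeps the key safe to use as a file name on any file system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: one marker file per downloaded item, with the file name acting as the key. */
class DownloadMarkerDir {
    private final Path dir;

    DownloadMarkerDir(Path dir) throws IOException {
        // Creates the directory if needed; does nothing if it already exists.
        this.dir = Files.createDirectories(dir);
    }

    /** Maps an arbitrary item name to a marker path using the hex SHA-256 of the name. */
    private Path markerFor(String name) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest(name.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return dir.resolve(hex + ".done");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** A lookup is a single existence check; nothing is loaded into memory. */
    boolean alreadyDownloaded(String name) {
        return Files.exists(markerFor(name));
    }

    /** Creates the marker file; returns false if it already existed. */
    boolean markDownloaded(String name) throws IOException {
        try {
            // The original name is stored inside the marker so the entry stays readable.
            Files.write(markerFor(name), name.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }
}
```

Compared with a single text file, each lookup touches only one small file, at the cost of ending up with many tiny files in one directory.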

You may also want to take a look at XStream. It has a Map implementation that works as described above: it stores entries on disk, each entry in a separate file.