Hi, I'm writing a web crawler in Python to extract news articles from news websites like nytimes.com. I'd like to understand what would be a good database to use with this project.

Thanks in advance!

This could be an excellent project for a document database like CouchDB, MongoDB, or SimpleDB.

MongoDB has a hosted solution: http://mongohq.com. There's also a Python binding (PyMongo).

SimpleDB is a good choice if you're hosting this on Amazon Web Services.

CouchDB is an open source project from the Apache Foundation.

Personally, I really like PostgreSQL -- but other open source DBs such as MySQL (or, if you have a reasonably small amount of data -- a few GB at most -- the SQLite that comes with Python) will be fine too.

I think the database itself will probably be one of the simpler aspects of a web crawler like this.

If you expect heavy read or write load on the database (for instance, if you plan to run many spiders simultaneously), then you'll want to lean toward MySQL; otherwise something like SQLite will most likely do you just fine.
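To show how little setup the SQLite route needs, here is a minimal sketch using the sqlite3 module that ships with Python (the table and column names are illustrative assumptions):

```python
# Minimal single-table article store using Python's built-in sqlite3.
# Table and column names are illustrative, not a prescribed schema.
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a real crawler

conn.execute("""
    CREATE TABLE IF NOT EXISTS articles (
        url   TEXT PRIMARY KEY,
        title TEXT,
        body  TEXT
    )
""")

def save_article(conn, url, title, body):
    # INSERT OR REPLACE keeps re-crawled pages from creating duplicates
    conn.execute(
        "INSERT OR REPLACE INTO articles (url, title, body) VALUES (?, ?, ?)",
        (url, title, body),
    )
    conn.commit()

save_article(conn, "https://example.com/a", "Example title", "Example body")
count = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
```

Since SQLite locks the whole database file on writes, this fits a single spider well but is exactly where many concurrent spiders would push you toward MySQL or PostgreSQL.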

You can take a look at Firebird.

The Firebird Python driver is developed by the core team.