I'm working on a Java-based backup client that scans the file system for files and populates a SQLite database with the directories and file names it finds to back up. Would it make sense to use Neo4j instead of SQLite? Would it be more performant and simpler for this application? I thought that because a file system is really a tree (or a graph, if you consider symbolic links), a graph database might be appropriate. The SQLite schema defines only two tables: one for directories (full path and other info) and one for files (name only, with a foreign key to the containing directory in the directory table), so it's fairly simple.

The application must index millions of files, so the solution must be fast.
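To make the layout concrete, here is a rough in-memory sketch of the two-table design described above, filled by a file system walk. It is not the actual client code: the class and field names (`BackupIndexSketch`, `DirectoryRow`, `FileRow`, `directoryId`) are made up for illustration, and plain lists stand in for the SQLite tables.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BackupIndexSketch {
    // Stands in for the directory table: one row per directory (hypothetical names).
    static final class DirectoryRow {
        final long id;
        final String fullPath;
        DirectoryRow(long id, String fullPath) { this.id = id; this.fullPath = fullPath; }
    }

    // Stands in for the file table: name only, plus a "foreign key" to its directory.
    static final class FileRow {
        final String name;
        final long directoryId;
        FileRow(String name, long directoryId) { this.name = name; this.directoryId = directoryId; }
    }

    final List<DirectoryRow> directories = new ArrayList<>();
    final List<FileRow> files = new ArrayList<>();

    // Walks the tree under root; symbolic links are not followed (walkFileTree default).
    void scan(Path root) throws IOException {
        Deque<Long> openDirs = new ArrayDeque<>(); // ids of directories currently being visited
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                long id = directories.size() + 1; // simple sequential id
                directories.add(new DirectoryRow(id, dir.toAbsolutePath().toString()));
                openDirs.push(id);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // The innermost open directory is the one that contains this file.
                files.add(new FileRow(file.getFileName().toString(), openDirs.peek()));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) throw exc;
                openDirs.pop();
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        BackupIndexSketch index = new BackupIndexSketch();
        index.scan(Path.of(args.length > 0 ? args[0] : "."));
        System.out.println(index.directories.size() + " directories, " + index.files.size() + " files");
    }
}
```

In the real client each `add` would presumably become a batched `INSERT`, which probably matters more for throughput at this scale than the choice of database.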

As long as you can carry out the DB operations essentially using string matching on the stored file system paths, using a relational database makes sense. As soon as the data model gets more complex and you can no longer do your queries with string matching but have to traverse a graph, a graph database will make that much simpler.
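As a rough illustration of what "string matching on the stored paths" means, the sketch below finds everything under a directory with a prefix range query over a sorted set, which is roughly what an indexed `LIKE '/home/alice/%'` does in a relational database. The class name `PathPrefixQuery`, the method `under`, and the sample paths are made up, and it assumes `/` as the separator and that no path contains the character `\uffff`.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

public class PathPrefixQuery {
    // Returns every stored path that lies below the given directory.
    static SortedSet<String> under(NavigableSet<String> paths, String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        // Every string that starts with prefix sorts in [prefix, prefix + '\uffff').
        return paths.subSet(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        NavigableSet<String> paths = new TreeSet<>(List.of(
                "/home/alice/a.txt",
                "/home/alice/docs/b.txt",
                "/home/alicia/c.txt",
                "/home/bob/d.txt"));
        // Prints the two paths under /home/alice; /home/alicia is not matched.
        System.out.println(under(paths, "/home/alice"));
    }
}
```

A query like this never needs to follow relationships between entries. Once you need something like "resolve every symbolic link and find all files reachable from this directory", prefix matching is no longer enough, and that is where a graph traversal starts to pay off.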

As I understand it, one of the earliest uses of Neo4j was to do exactly this, as part of the content management system that Neo4j originated from.

Lucene, the indexing backend for Neo4j, will let you build any indexes you need.

You should read up on that, and you might ask the Neo4j developers directly.