Suppose I have a large amount of heterogeneous JSON documents (i.e. named key-value mappings) along with a hierarchy of classes (i.e. named sets) that these documents are attached to. I need to set up a data structure that will allow:
1. CRUD operations on JSON documents.
2. Retrieving JSON documents by ID very quickly.
3. Retrieving all JSON documents that are attached to a particular class very quickly.
4. Editing the class hierarchy: adding/removing classes, rearranging them.
I initially came up with the idea of storing the JSON documents in a document-oriented database (like CouchDB or MongoDB) and storing the class hierarchy in an RDF store (like 4store). 1, 2 and 4 are then solved naturally, and 3 is solved by maintaining a list of attached document IDs for each class in the store.
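The split described above can be sketched in memory as follows. This is only an illustration of the data structure, not of CouchDB, MongoDB, or 4store: the `Catalog` class and all of its method names are hypothetical, and plain dictionaries stand in for both stores.

```python
class Catalog:
    """In-memory stand-in for a document store plus a class hierarchy (hypothetical)."""

    def __init__(self):
        self.docs = {}         # doc ID -> JSON-like dict (the "document database" part)
        self.parent = {}       # class name -> parent class name, None for a root class
        self.members = {}      # class name -> set of attached document IDs
        self.doc_classes = {}  # doc ID -> set of classes the document is attached to

    def add_class(self, name, parent=None):
        self.parent[name] = parent
        self.members.setdefault(name, set())

    def move_class(self, name, new_parent):
        # Rearranging the hierarchy only touches the parent link.
        self.parent[name] = new_parent

    def put(self, doc_id, doc, classes=()):
        # Create or replace a document and refresh its class attachments.
        self.delete(doc_id)
        self.docs[doc_id] = doc
        self.doc_classes[doc_id] = set(classes)
        for c in classes:
            self.members.setdefault(c, set()).add(doc_id)

    def get(self, doc_id):
        # Retrieval by ID is a single dictionary lookup.
        return self.docs.get(doc_id)

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)
        for c in self.doc_classes.pop(doc_id, set()):
            self.members[c].discard(doc_id)

    def in_class(self, name):
        # Retrieval by class reads the maintained list of attached IDs.
        return set(self.members.get(name, set()))


catalog = Catalog()
catalog.add_class("animal")
catalog.add_class("dog", parent="animal")
catalog.put("d1", {"name": "Rex"}, classes=["dog"])
catalog.put("d2", {"name": "Generic"}, classes=["animal"])
```

The cost of this layout is that every write has to keep the per-class ID lists in step with the documents, which is the bookkeeping the question refers to.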
But then I figured that the RDF store could actually do the document-oriented part of retrieving JSON documents by ID as well. At first glance this seems true, but I am still worried about 2 and 3. Is there an RDF store that is able to retrieve documents (nodes) at the speed at which document-oriented databases serve documents? How quickly will it serve 3-like queries? I have heard a bit about RDF stores being slow, the reification problem, etc.
Is there an RDF store that is also as convenient for casual retrieval of objects by ID as CouchDB, for instance? What is the difference between using a document-oriented database and an RDF store for storing, retrieving and editing JSON-like objects?
You originally asked this about graph databases (like Neo4j), so let me add a few notes.
- Graph databases use integrated indexing for nodes (and relationships), so the fast initial lookup of the root nodes of the documents is done through that (external or in-graph indexes).
- Additional in-graph indexes for paths (actually trees toward the root) can be modelled more cleanly than a plain key-value lookup.
- If you model your documents as trees of nodes with properties, you can do any simple or complex CRUD operations (structural ones too).
- Finding all documents of a "type" or "class" can again be done with an index (index root nodes by type) or with in-graph category nodes.
- Put those "type" or "class" category nodes into a hierarchy (or graph), which can then be edited using the usual graph database API.
- Traversing the graph can be done using traversers or the integrated graph query language (e.g. Cypher for Neo4j).
- Loading hierarchical data can be done either with custom importers or with a more general sub-graph importer (e.g. GEOFF).
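
The model in these notes can be roughly illustrated with plain Python dictionaries. This is not Neo4j's API: the `Graph` class, the helper functions, and the relationship names `IS_A` and `SUBCLASS_OF` are all hypothetical, and nested lists are simply kept as property values.

```python
from collections import defaultdict, deque


class Graph:
    """Tiny in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.props = {}               # node ID -> properties
        self.out = defaultdict(list)  # node ID -> [(relationship type, target ID)]
        self.inc = defaultdict(list)  # node ID -> [(relationship type, source ID)]
        self.index = {}               # lookup index: key -> node ID
        self._next = 0

    def node(self, props):
        nid = self._next
        self._next += 1
        self.props[nid] = props
        return nid

    def rel(self, src, rel_type, dst):
        self.out[src].append((rel_type, dst))
        self.inc[dst].append((rel_type, src))


def add_class(g, name, parent=None):
    """Create a category node and link it into the class hierarchy."""
    nid = g.node({"name": name})
    g.index[("class", name)] = nid
    if parent is not None:
        g.rel(nid, "SUBCLASS_OF", g.index[("class", parent)])
    return nid


def add_document(g, doc_id, doc, category):
    """Store a JSON-like dict as a tree of nodes and attach its root to a category."""
    def build(value):
        scalars = {k: v for k, v in value.items() if not isinstance(v, dict)}
        nid = g.node(scalars)
        for k, v in value.items():
            if isinstance(v, dict):
                g.rel(nid, k, build(v))  # a nested object becomes a child node
        return nid

    root = build(doc)
    g.index[("doc", doc_id)] = root      # index the root node for lookup by ID
    g.rel(root, "IS_A", g.index[("class", category)])
    return root


def documents_in(g, class_name):
    """Walk from a category node through its subclasses, collecting document roots."""
    found, seen = [], set()
    queue = deque([g.index[("class", class_name)]])
    while queue:
        c = queue.popleft()
        if c in seen:
            continue
        seen.add(c)
        for rel_type, src in g.inc[c]:
            if rel_type == "SUBCLASS_OF":
                queue.append(src)
            elif rel_type == "IS_A":
                found.append(src)
    return found


g = Graph()
add_class(g, "animal")
add_class(g, "dog", parent="animal")
rex = add_document(g, "d1", {"name": "Rex", "owner": {"name": "Ann"}}, "dog")
```

In a real graph database the index lookup and the traversal would be handled by the database's own indexes and query language rather than by hand-written loops like these.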