First, a caveat: I am fairly new to the concepts behind document databases, so this may be a completely obvious question.

I need to design a system that keeps a detailed hierarchical catalog of the parts that make up highly complex products. It will describe the physical components that make up each part - and every part may itself be a component of other parts as well as of the final product. For example:

Widget
 |- Sprocket
 |  |- A4 nut
 |  |- B15 screw
 |  |- Sprocket Backshell
 |- Flange
 |  |- A4 washer
 |  |- Flange Housing
 |- Widget Assembly

Widget in this example could later be included as a part under another product.

Each product could have hundreds of thousands of parts, and each type of part has different, unrelated attributes that must be maintained. These attributes may include the relationships between related parts.

There is a poorly designed version of this system in place at the moment, in SQL Server, as a single flat table with about 120 columns and several million records. About 85% of the fields in this table are null. My job is to replace it with something that is maintainable, efficient, and less error-prone.

Building this properly in a relational database means normalization - in this case, having one table for each type of part. I suspect this is something of a problem, as there are hundreds of part types, and new part types with their own specific attributes are added regularly.

I would like to use RavenDB for this, as a document database would suit the dynamic nature of the parts, but I'm not sure whether it would be a good fit, or how I'd implement the system in terms of documents. The products are the main objects, but because of their size I can't afford to store a product as a single document.

Is RavenDB a good fit for this concept? Are there any pointers on how best to represent it in documents?

Erik, whether a document database is a good fit here, and what kind of structure would work best, depends on what kind of operations you want to perform on this data.

You can certainly use RavenDB to persist your parts and products, but to decide whether it's the right choice, you first have to answer the following questions:

  • What kind of queries do you need?
  • Which is more performance-critical for your system, reads or writes?
  • Can you accept eventual consistency?
  • Can you take advantage of features like map/reduce indexes?
  • etc.

This looks like a good fit for a relational database:

Part
  - PartId
  - Name
  - ...more fields..

PartRelations
  - ParentPartId (PK)
  - ChildPartId (PK)

With this schema, you can easily build massive hierarchies of parts, storing each part's information only once and then just storing the relationship of parent part to child part.
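
As a rough sketch of how this schema could be exercised, the following uses Python's built-in sqlite3 module instead of SQL Server, purely so the example is self-contained. The table and column names follow the schema above. The helper names (`build_db`, `explode`), the numeric ids, and the sample rows taken from the Widget tree in the question are illustrative assumptions. A recursive common table expression walks the hierarchy down from a chosen root part.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding the Widget example from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Part (
            PartId INTEGER PRIMARY KEY,
            Name   TEXT NOT NULL
        );
        CREATE TABLE PartRelations (
            ParentPartId INTEGER NOT NULL REFERENCES Part(PartId),
            ChildPartId  INTEGER NOT NULL REFERENCES Part(PartId),
            PRIMARY KEY (ParentPartId, ChildPartId)
        );
    """)
    # Ids are arbitrary; each part is stored exactly once.
    parts = [
        (1, "Widget"), (2, "Sprocket"), (3, "A4 nut"), (4, "B15 screw"),
        (5, "Sprocket Backshell"), (6, "Flange"), (7, "A4 washer"),
        (8, "Flange Housing"), (9, "Widget Assembly"),
    ]
    # (parent, child) pairs describing the tree.
    relations = [(1, 2), (2, 3), (2, 4), (2, 5), (1, 6), (6, 7), (6, 8), (1, 9)]
    conn.executemany("INSERT INTO Part VALUES (?, ?)", parts)
    conn.executemany("INSERT INTO PartRelations VALUES (?, ?)", relations)
    return conn


def explode(conn, root_id):
    """Return (depth, name) rows for root_id and everything beneath it.

    Assumes the relations contain no cycles; a cycle would make the
    recursive query run without terminating.
    """
    rows = conn.execute("""
        WITH RECURSIVE tree(PartId, Depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT r.ChildPartId, tree.Depth + 1
            FROM PartRelations AS r
            JOIN tree ON r.ParentPartId = tree.PartId
        )
        SELECT tree.Depth, p.Name
        FROM tree
        JOIN Part AS p ON p.PartId = tree.PartId
        ORDER BY tree.Depth, p.Name
    """, (root_id,))
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_db()
    for depth, name in explode(conn, 1):
        print("  " * depth + name)
```

Because Widget is just another row in Part, placing it under a larger product later only requires one more row in PartRelations.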

You could further add an attribute model to this, to allow your parts to have various different attributes:

PartAttribute
   - PartId
   - Attribute
   - Value

Attribute could be further broken out, if you wanted, into another table, Attribute:

Attribute
   - AttributeId
   - AttributeName
   - ..datatype?.. (for pretty formatting logic or other..)

Then in the PartAttribute table, you could use AttributeId instead of Attribute. In practice, I have found that simply using a string, rather than an attribute table that lists them out, is a little easier to code against and just as easy to maintain.
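
To make the string-based variant concrete, here is a minimal sketch, again using Python's sqlite3 module as a stand-in for SQL Server. The PartAttribute columns follow the schema above. The composite primary key, the helper names (`set_attribute`, `get_attributes`), and the attribute names `material` and `length_mm` are assumptions made for illustration. Values are stored as text, so any type handling is left to the application.

```python
import sqlite3


def build_db():
    """Create an in-memory database with Part and PartAttribute tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Part (
            PartId INTEGER PRIMARY KEY,
            Name   TEXT NOT NULL
        );
        CREATE TABLE PartAttribute (
            PartId    INTEGER NOT NULL REFERENCES Part(PartId),
            Attribute TEXT NOT NULL,
            Value     TEXT,
            -- Assumption: at most one value per attribute per part.
            PRIMARY KEY (PartId, Attribute)
        );
    """)
    return conn


def set_attribute(conn, part_id, attribute, value):
    """Insert an attribute for a part, overwriting any existing value."""
    conn.execute(
        "INSERT OR REPLACE INTO PartAttribute (PartId, Attribute, Value) "
        "VALUES (?, ?, ?)",
        (part_id, attribute, str(value)),
    )


def get_attributes(conn, part_id):
    """Return all attributes of a part as a {name: value} dict of strings."""
    rows = conn.execute(
        "SELECT Attribute, Value FROM PartAttribute WHERE PartId = ?",
        (part_id,),
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = build_db()
    conn.execute("INSERT INTO Part VALUES (1, 'B15 screw')")
    set_attribute(conn, 1, "material", "steel")
    set_attribute(conn, 1, "length_mm", 15)
    print(get_attributes(conn, 1))
```

A new part type with its own attributes then needs no schema change, only new rows, which is the property the flat 120-column table lacks.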

You could use a document database, but since the parts are so heavily related, I'm not sure it's a good fit. I'm sure some people would disagree with me, though.