Most of the triple stores I have read about are said to be scalable to around 0.5 billion triples.

I'm interested to know whether people think there is a theoretical reason why they have to have an upper limit, and whether you know of any particular ways to make them more scalable.

I'm curious to know whether existing triple stores do things like this:

  • Represent URIs with integers
  • Store the integers in order
  • Search over the integers rather than the URIs, which I would imagine should be much faster (since you can do things like a binary search, etc.); see the sketch below

    Ideas ...
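
To make the first two bullets concrete, here is a minimal sketch of what I have in mind (all names are invented for illustration, not taken from any particular store):

    import bisect

    class Dictionary:
        """Assign a stable integer ID to every URI seen."""
        def __init__(self):
            self.uri_to_id = {}
            self.id_to_uri = []

        def encode(self, uri):
            if uri not in self.uri_to_id:
                self.uri_to_id[uri] = len(self.id_to_uri)
                self.id_to_uri.append(uri)
            return self.uri_to_id[uri]

    class TripleIndex:
        """Triples stored as sorted (s, p, o) integer tuples -- an 'SPO' index."""
        def __init__(self, dictionary):
            self.d = dictionary
            self.spo = []  # kept sorted, so lookups can binary-search

        def add(self, s, p, o):
            bisect.insort(self.spo,
                          (self.d.encode(s), self.d.encode(p), self.d.encode(o)))

        def by_subject(self, s):
            """Yield all triples with the given subject, located by binary search."""
            sid = self.d.uri_to_id.get(s)
            if sid is None:
                return
            i = bisect.bisect_left(self.spo, (sid, -1, -1))
            while i < len(self.spo) and self.spo[i][0] == sid:
                _, p, o = self.spo[i]
                yield (s, self.d.id_to_uri[p], self.d.id_to_uri[o])
                i += 1

    store = TripleIndex(Dictionary())
    store.add("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
              "http://example.org/bob")
    print(list(store.by_subject("http://example.org/alice")))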

  • Just getting to 500 million, a triple store has to do all of that and more. I've spent several years working on a triple store implementation, and I can tell you that breaking 1 billion triples is not as simple as it may seem.

    However, many RDF queries are 2nd or 3rd order (and higher orders are far from unheard of). This means that you are not only querying a set of entities, but simultaneously the data about that set of entities, the data about the entities' schemas, and the data describing the schema language used to describe those schemas.
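
    To make "higher order" concrete, here is a small illustration (using rdflib; the vocabulary is invented): a single query that simultaneously touches the instance data, the schema describing it, and the schema language itself.

        from rdflib import Graph, Namespace
        from rdflib.namespace import RDF, RDFS

        EX = Namespace("http://example.org/")
        g = Graph()
        g.add((EX.alice, RDF.type, EX.Employee))          # data
        g.add((EX.Employee, RDFS.subClassOf, EX.Person))  # metadata (schema)

        # One query over data *and* metadata: what is alice, and what
        # is that class, in turn?
        results = g.query("""
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?cls ?super WHERE {
                <http://example.org/alice> a ?cls .
                OPTIONAL { ?cls rdfs:subClassOf ?super . }
            }
        """)
        for cls, sup in results:
            print(cls, sup)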

    All of this without any of the constraints available to a relational database that would allow it to make assumptions about the shape of this data/metadata/metametadata/etc.
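
    To sketch what that means in practice (all names invented): with a single generic (s, p, o) relation, every additional query pattern is another self-join, and the store cannot assume, the way a relational planner can, that "employees" live in a table of known shape.

        triples = [
            ("alice", "worksFor", "acme"),
            ("alice", "name",     "Alice"),
            ("acme",  "name",     "Acme Corp."),
        ]

        def match(pattern, bindings):
            """Yield extensions of `bindings` for one (s, p, o) pattern;
            terms starting with '?' are variables."""
            for triple in triples:
                b = dict(bindings)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        if b.setdefault(term, value) != value:
                            break  # variable already bound to a different value
                    elif term != value:
                        break      # constant term does not match
                else:
                    yield b

        # Two patterns => one self-join over the same generic relation;
        # each further pattern adds another join.
        for b1 in match(("?person", "worksFor", "?org"), {}):
            for b2 in match(("?org", "name", "?orgName"), b1):
                print(b2["?person"], "works for", b2["?orgName"])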

    There are ways of getting beyond 500 million, but they are far from trivial, and the low-hanging fruit (i.e. the approaches you've mentioned) was required just to get to where we are now.

    That being said, the flexibility provided by an RDF store, combined with the denotational semantics available via its interpretation in Description Logics, makes it all worthwhile.