For a language-learning web application, do you know of data structures and an underlying database schema/layout that allow efficient storage, processing, and querying of sentences, verbs, nouns, etc. for various natural languages? For instance, I need to store each verb only once and link sentences to that verb object.

I came across concrete syntax trees, and I'm thinking of using an abstract Node class and deriving a Noun class from it, etc. Would a syntax tree structure be too limited?

I understand this is a fairly broad question, and I'm not asking you to do my 'homework', but if you could point me to any resources you know of that might help me get started, that would be greatly appreciated.



Your example looks pretty solid when it comes to manipulating natural-language sentences.
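
To make the idea from the question concrete, here is a minimal sketch of a Node hierarchy in which every word node points at a shared dictionary entry, so each verb exists only once. All names here (Lexeme, Lexicon, Word, sentences_using) are made up for illustration, and the in-memory dictionary merely stands in for what would presumably be a table with a unique key in a real database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexeme:
    """One dictionary entry, e.g. the English verb 'run' (hypothetical name)."""
    language: str
    pos: str    # part of speech: "verb", "noun", ...
    lemma: str  # dictionary form of the word


class Lexicon:
    """Hands out exactly one Lexeme object per (language, pos, lemma)."""

    def __init__(self):
        self._entries = {}

    def get(self, language, pos, lemma):
        key = (language, pos, lemma)
        if key not in self._entries:
            self._entries[key] = Lexeme(language, pos, lemma)
        return self._entries[key]


@dataclass
class Node:
    """Base class for every node in the syntax tree."""
    children: list = field(default_factory=list)

    def leaves(self):
        """Yield the childless nodes below (or at) this node, left to right."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


@dataclass
class Word(Node):
    """Leaf node: the surface form as written, plus a link to its Lexeme."""
    lexeme: Lexeme = None
    surface: str = ""


class Noun(Word):
    pass


class Verb(Word):
    pass


class Sentence(Node):
    pass


def sentences_using(lexeme, sentences):
    """Return the sentences that contain a word linked to the given lexeme."""
    return [
        s for s in sentences
        if any(getattr(leaf, "lexeme", None) is lexeme for leaf in s.leaves())
    ]


lexicon = Lexicon()
run = lexicon.get("en", "verb", "run")
dog = lexicon.get("en", "noun", "dog")
cat = lexicon.get("en", "noun", "cat")

s1 = Sentence(children=[Noun(lexeme=dog, surface="Dogs"),
                        Verb(lexeme=run, surface="run")])
s2 = Sentence(children=[Noun(lexeme=cat, surface="Cats"),
                        Verb(lexeme=lexicon.get("en", "verb", "run"), surface="ran")])
s3 = Sentence(children=[Noun(lexeme=cat, surface="Cats"),
                        Verb(lexeme=lexicon.get("en", "verb", "sleep"), surface="sleep")])

matches = sentences_using(run, [s1, s2, s3])
```

Both "run" and "ran" link to the same Lexeme object, so finding every sentence that uses a verb is a matter of following that link rather than comparing strings. In a relational schema the same shape would roughly be a lexeme table with a unique constraint and a word-occurrence table referencing it, but that mapping is my assumption, not something I have tested.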

As for other options: for text search/storage, you could have a look at a Patricia tree. There is an implementation of it in Java on Google Code.
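
To give a rough feel for why such a tree suits word storage, here is a small sketch of a compressed (radix) trie, which is the general idea behind a Patricia tree. It is not the Java implementation mentioned above, and the class and method names (RadixTree, insert, starts_with) are my own. Words that share a prefix share the nodes for that prefix, and a prefix lookup only walks the matching branch.

```python
def common_len(a, b):
    """Length of the longest common prefix of strings a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


class RadixNode:
    def __init__(self):
        self.edges = {}       # first character -> (edge label, child node)
        self.is_word = False  # True if a stored word ends at this node


class RadixTree:
    def __init__(self):
        self.root = RadixNode()

    def insert(self, word):
        node = self.root
        while word:
            edge = node.edges.get(word[0])
            if edge is None:
                # No edge starts with this character: hang the rest on a new leaf.
                leaf = RadixNode()
                leaf.is_word = True
                node.edges[word[0]] = (word, leaf)
                return
            label, child = edge
            common = common_len(label, word)
            if common < len(label):
                # The word leaves the edge part-way: split the edge in two.
                middle = RadixNode()
                middle.edges[label[common]] = (label[common:], child)
                node.edges[word[0]] = (label[:common], middle)
                child = middle
            node = child
            word = word[common:]
        node.is_word = True

    def starts_with(self, prefix):
        """Return all stored words beginning with prefix, sorted."""
        node, path, rest = self.root, "", prefix
        while rest:
            edge = node.edges.get(rest[0])
            if edge is None:
                return []
            label, child = edge
            common = common_len(label, rest)
            if common < len(rest) and common < len(label):
                return []  # prefix and edge label diverge mid-edge
            path += label
            node = child
            rest = rest[common:]
        return sorted(self._collect(node, path))

    def _collect(self, node, path):
        if node.is_word:
            yield path
        for label, child in node.edges.values():
            yield from self._collect(child, path + label)


tree = RadixTree()
for w in ["walk", "walked", "walking", "wall", "talk"]:
    tree.insert(w)

walk_forms = tree.starts_with("walk")
```

Here `walk_forms` holds "walk", "walked" and "walking", which is the kind of lookup you would want for finding inflected forms that share a stem. A real Patricia implementation typically works on bits rather than characters and is more compact than this sketch.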

Also, have you thought about using one of the existing solutions, such as Hunspell, Lucene, or Sphinx?