I have to extract data from an incoming message that may be in almost any format. Which data gets extracted and stored also depends on the format, i.e. format A could extract fields X, Y, Z, but format B could extract fields A, B, C. I should also be able to look up Message B by searching for field C inside the message.
At the moment I am configuring and storing an extraction strategy (XSLT) and executing it at runtime when its related format is encountered, but I am storing the extracted data in an Oracle database as an XmlType column. Oracle appears to have pretty poor development/support for XmlType because it requires a legacy jar that forces you to use a pretty old DOM DocumentBuilderFactory impl (looks like Java 1.4 code), which collides with Spring 3, and does not play very nicely with Hibernate. The XML queries are slow and non-intuitive too.
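For readers who want the per-format setup spelled out, here is a minimal sketch of the idea. It is not the XSLT setup described above: it swaps in plain ElementTree paths from the Python standard library, and the registry name EXTRACTORS, the format names, the field names and the paths are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: format name -> {field name: ElementTree path}.
# Format names, field names and paths are illustrative assumptions.
EXTRACTORS = {
    "A": {"x": "./header/x", "y": "./body/y", "z": "./body/z"},
    "B": {"a": "./a", "b": "./b", "c": "./meta/c"},
}


def extract(format_name, message_xml):
    """Return the fields configured for this format as a plain dict."""
    root = ET.fromstring(message_xml)
    paths = EXTRACTORS[format_name]  # raises KeyError for an unknown format
    # findtext returns None when the path matches nothing
    return {field: root.findtext(path) for field, path in paths.items()}


message_b = "<msg><a>1</a><b>2</b><meta><c>needle</c></meta></msg>"
fields = extract("B", message_b)
```

The extracted dict is what would then be serialized and stored; the open question is where to store it so that a field like c stays searchable.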
I am concluding that Oracle with XmlType is not a good way to store the extracted data, so my real question is, what's the best way to store serialized/queryable data?
One alternative that you haven't listed is using an XML database. (Note that Oracle is one of the ten or so XML database products.)
(Obviously, a blob type will not allow querying "inside" the persisted XML objects unless you read each blob instance into memory and perform the querying there, e.g. using XSLT.)
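To make the cost of the blob approach concrete, here is a rough sketch of what "read each blob into memory and query there" amounts to. It uses ElementTree instead of XSLT for brevity, and the rows list is a stand-in for whatever a real blob query would return.

```python
import xml.etree.ElementTree as ET

# Stand-in for rows fetched from a blob column: (id, serialized XML).
rows = [
    (1, b"<test><name>Peter</name><info>good cook</info></test>"),
    (2, b"<test><name>Paul</name><info>good baker</info></test>"),
]


def find_ids_by_name(blob_rows, wanted):
    """Full scan: parse every blob and compare one node value."""
    matches = []
    for row_id, blob in blob_rows:
        root = ET.fromstring(blob)  # every document is reparsed on every query
        if root.findtext("name") == wanted:
            matches.append(row_id)
    return matches


peter_ids = find_ids_by_name(rows, "Peter")
```

Every lookup parses every stored document, so the work grows with the size of the table rather than with the number of matches.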
I've had good results storing complex XML objects in PostgreSQL. With its functional index features, you can even create indexes on node values from the stored XML documents, and use those indexes to do very fast searches using index scans without needing to reparse the XML.
However, this will only work if you know your query patterns; arbitrary XPath queries will still be slow.
Example (untested, may still contain syntax errors):
Create a simple table (the column is typed xml so that xpath() can be applied to it directly):
create table test123 ( id serial primary key, myxml xml );
Now let's assume you have XML documents like:
<test> <name>Peter</name> <info>Peter is a <i>very</i> good cook</info> </test>
Now create a functional index on the text of the name node:
create index idx_test123_name on test123 (((xpath('/test/name/text()', myxml))[1]::text));
Now you can do fast XML searches, as long as the WHERE clause uses the same expression as the index:
SELECT myxml FROM test123 WHERE (xpath('/test/name/text()', myxml))[1]::text = 'Peter';
Also consider creating the index with text_pattern_ops, so that you can have fast prefix searches like:
SELECT myxml FROM test123 WHERE (xpath('/test/name/text()', myxml))[1]::text like 'Pe%';
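If it helps to see why such an index avoids reparsing, here is a small sketch of the same idea outside the database. It is only an analogy and not how PostgreSQL implements its indexes: the class NameIndex is hypothetical, it extracts the name node once at write time, and it keeps the values sorted so that both exact and prefix lookups start with a binary search.

```python
import bisect
import xml.etree.ElementTree as ET


class NameIndex:
    """Sorted (name, doc_id) pairs, a toy stand-in for an index on one node value."""

    def __init__(self):
        self._entries = []  # kept sorted by (name, doc_id)

    def add(self, doc_id, xml_text):
        # Parse once at write time; lookups never touch the XML again.
        name = ET.fromstring(xml_text).findtext("name") or ""
        bisect.insort(self._entries, (name, doc_id))

    def equals(self, name):
        return [doc_id for n, doc_id in self._range(name) if n == name]

    def starts_with(self, prefix):
        return [doc_id for _, doc_id in self._range(prefix)]

    def _range(self, prefix):
        # Binary search to the first candidate, then walk while the prefix matches.
        start = bisect.bisect_left(self._entries, (prefix,))
        for i in range(start, len(self._entries)):
            entry = self._entries[i]
            if not entry[0].startswith(prefix):
                break
            yield entry


index = NameIndex()
index.add(1, "<test><name>Peter</name></test>")
index.add(2, "<test><name>Paul</name></test>")
index.add(3, "<test><name>Penny</name></test>")
exact = index.equals("Peter")
prefixed = index.starts_with("Pe")
```

This also shows the limitation mentioned above: the index only answers questions about the one node it was built for, so any other path still needs a full scan.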