I have an application that retrieves many large log files over the LAN.
Currently I store the log files in PostgreSQL, in a table with a TEXT column. I don't plan to search this column at all, because a separate external process retrieves all the files nightly and scans them for sensitive patterns.
The column could therefore just as well be a BLOB or a CLOB, but my real question is this: the database has its own compression system, but could I improve on that compression manually, as with common compressor utilities? And most importantly, if I manually pre-compress the big file and then insert it into the table as binary data, is that pointless because the database already applies its own internal compression?
I don't know who would compress the data more effectively, you or the database; it depends on the algorithm used, and so on. But what is certain is that if you compress it yourself, asking the database to compress it again is a waste of CPU. Once data is compressed, each further attempt to compress it yields less gain, until eventually you end up consuming more space than you save.
The internal compression used by PostgreSQL is designed to err on the side of speed, particularly for decompression. So if you don't really need that speed, you should be able to achieve higher compression ratios by compressing the data in your application.
Note also that when the database does the compression, the data travels between the database and the application server in uncompressed form, which may be a problem depending on your network.
As others have mentioned, if you do this, be sure to switch off the built-in compression (in PostgreSQL, setting the column's TOAST storage to `EXTERNAL`), or you will be wasting cycles.
The question you need to ask yourself is: do you need more compression than the database provides, and can you spare the CPU cycles for it on your application server? The only way to find out what compression you can get on your data is to try it. Unless there is a substantial gain, don't bother with it.
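Trying it out can be as simple as running a representative log file through a compressor at a few levels and comparing the sizes. A sketch (the sample bytes are a stand-in; substitute something like `open("app.log", "rb").read()`):

```python
import zlib

# Placeholder sample; replace with the contents of a real log file.
sample = b"GET /api/v1/items 200 OK user=alice latency=23ms\n" * 5_000

for level in (1, 6, 9):
    compressed = zlib.compress(sample, level)
    ratio = len(compressed) / len(sample)
    print(f"level {level}: {len(compressed)} bytes ({ratio:.1%} of original)")
```

Higher levels trade CPU time for size; whether the extra ratio is worth it depends entirely on your data and your application server's spare capacity.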
My guess is that, since you don't need any search or query capability here, you could reduce disk usage by zipping the file and then storing the binary data directly in the database.
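A minimal sketch of that approach, assuming the application compresses with `gzip` before the INSERT and decompresses after the SELECT; the actual database round-trip is only shown as a comment, and the table and column names there are hypothetical:

```python
import gzip

# Placeholder for the real log file contents.
log_text = b"May 12 10:15:01 host sshd[231]: session opened for user root\n" * 2_000

payload = gzip.compress(log_text)   # store these bytes in a BYTEA column
# e.g. cur.execute("INSERT INTO logs (data) VALUES (%s)", (payload,))

restored = gzip.decompress(payload)  # after fetching the BYTEA value back
assert restored == log_text
print(f"{len(payload)} bytes stored instead of {len(log_text)}")
```

Since the nightly scanner reads the files anyway, it would simply decompress each value before pattern-matching.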