I process a lot of text/data that I exchange between Python, R, and sometimes Matlab.
My go-to is the flat text file, but I also use SQLite occasionally to store the data and access it from each program (not Matlab yet, though). I don't use GROUP BY, AVG, etc. in SQL as much as I do these operations in R, so I don't necessarily require the database operations.
For applications like this, which require exchanging data among programs to make use of the libraries available in each language, is there a good rule of thumb on which data exchange format/method to use (even XML, NetCDF, or HDF5)?
I know that between Python -> R there is rpy or rpy2, but I am asking this question in a more general sense. I use many computers, not all of which have rpy2, and I also use a few other pieces of scientific analysis software that need access to the data at various times (the stages of processing and analysis are also separated).
If all of the languages support SQLite, use it. The power of SQL might not be useful to you right now, but it probably will be at some point, and it saves you from having to rewrite things later when you decide you want to query your data in more complicated ways.
SQLite will also probably be substantially faster if you only want to access certain pieces of data in your datastore, since doing that with a flat text file is challenging without reading the whole file in (though it's not impossible).
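To illustrate the point about fetching only certain pieces of data, here is a minimal sketch using Python's standard `sqlite3` module. The table name `measurements`, its columns, and the sample rows are made up for the example; the same database file could then be opened from R or any other tool with an SQLite driver.

```python
import sqlite3


def build_store(path=":memory:"):
    """Create a small example table (hypothetical schema and data)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements ("
        "site TEXT, day INTEGER, value REAL)"
    )
    # An index lets SQLite find matching rows without scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_site ON measurements (site)")
    rows = [("A", 1, 0.5), ("A", 2, 0.7), ("B", 1, 1.2), ("C", 1, 3.4)]
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def values_for_site(conn, site):
    """Fetch only the rows for one site, ordered by day."""
    cur = conn.execute(
        "SELECT day, value FROM measurements WHERE site = ? ORDER BY day",
        (site,),
    )
    return cur.fetchall()


conn = build_store()
print(values_for_site(conn, "A"))
```

Passing a file name instead of `":memory:"` would give a single database file that the other programs can read.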
A flat text file (e.g. in CSV format) would be the most portable solution. Almost every program/library can work with this format: R and Python have good CSV support, and if your data set is not too large you can even import the CSV into Excel for smaller tasks.
However, text files are unwieldy for larger data sets, since you have to read them completely for almost all operations (depending on the structure of your data).
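As a rough sketch of the flat-file route, this is a CSV round trip with Python's standard `csv` module. The column names and rows are invented for the example, and an in-memory buffer stands in for a real file so the snippet runs as is.

```python
import csv
import io


def write_rows(handle, header, rows):
    """Write a header line followed by the data rows as CSV."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def read_rows(handle):
    """Read every row back as a dict keyed by the header names."""
    return [dict(row) for row in csv.DictReader(handle)]


# io.StringIO stands in for a file opened with open(path, newline="").
buffer = io.StringIO()
write_rows(buffer, ["site", "day", "value"], [("A", 1, 0.5), ("B", 1, 1.2)])
buffer.seek(0)

records = read_rows(buffer)
# Selecting a subset still means walking through all the rows.
site_a = [r for r in records if r["site"] == "A"]
print(site_a)
```

Note that CSV carries no type information, so every value comes back as a string and has to be converted on the reading side.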
SQLite allows you to filter the data very easily (even without much SQL expertise) and, as you already mentioned, can do some computation on its own (AVG, SUM, ...). Using the SQLiteManager browser plug-in, you can work with the DB on every computer without any installation/configuration trouble and thereby easily manage your data (import/export, filter).
So I would recommend using SQLite for larger data sets that need a lot of filtering to extract the data you need. For smaller data sets, or if there is no need to select subsets of the data, a flat (CSV) text file should be fine.
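For the filtering and built-in computation (AVG, SUM) mentioned above, a minimal sketch with Python's `sqlite3` module could look like this. The `measurements` table, its columns, and the `max_day` parameter are hypothetical and only serve the example.

```python
import sqlite3


def summarize(conn, max_day):
    """Return (site, average, total) for rows with day <= max_day."""
    cur = conn.execute(
        "SELECT site, AVG(value), SUM(value) FROM measurements "
        "WHERE day <= ? GROUP BY site ORDER BY site",
        (max_day,),
    )
    return cur.fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, day INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [("A", 1, 1.0), ("A", 2, 3.0), ("B", 1, 2.5), ("B", 5, 10.0)],
)
conn.commit()

summary = summarize(conn, max_day=2)
print(summary)
```

The filter and the aggregation both run inside SQLite, so only the small summary is handed back to the calling program.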