I'm currently working on a simple revision system that lets me store multiple versions of a single file, which works fine so far.

The table structure is as follows (irrelevant columns removed for brevity):

file_id     file_revision     file_parent      file_name
1           1                 0                foo.jpg
2           2                 1                foorevised.jpg
3           3                 1                anotherrevision.jpg


  • file_id is the primary key, which auto-increments
  • file_revision stores the revision number, defaulting to 1 for the first revision
  • file_parent is the top-level parent of the revision, defaulting to 0 for the original
  • file_name is the file name

The issue:

  • Ideally using a single query, I want to retrieve all files...
  • ...but only the most recent revision of each file...
  • ...and when only one revision is stored (the original), that one should be retrieved.

Any pointers are greatly appreciated. Thanks in advance.

The most efficient approach for retrieval would be to add a column like is_latest, which you would have to populate in advance, then select * from table where file_id=1 and is_latest=true when you want to get the latest version of file 1. Obviously this makes updating the table more complicated, though.
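To make the is_latest idea a bit more concrete, here is a minimal sketch in Python, using the standard sqlite3 module as a stand-in for MySQL. The table name files, the integer is_latest flag, and the helpers add_file and add_revision are assumptions for illustration, not something from the question. What it tries to show is the extra bookkeeping on write: the flag on the old latest row has to be cleared in the same transaction that inserts the new revision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        file_id       INTEGER PRIMARY KEY AUTOINCREMENT,
        file_revision INTEGER NOT NULL DEFAULT 1,
        file_parent   INTEGER NOT NULL DEFAULT 0,
        file_name     TEXT    NOT NULL,
        is_latest     INTEGER NOT NULL DEFAULT 1  -- assumed extra column
    )
""")


def add_file(conn, name):
    """Insert an original file (revision 1, no parent); it starts as the latest."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO files (file_revision, file_parent, file_name, is_latest) "
            "VALUES (1, 0, ?, 1)",
            (name,),
        )
    return cur.lastrowid


def add_revision(conn, parent_id, name):
    """Add a revision of the original `parent_id` and move the is_latest flag to it."""
    (max_rev,) = conn.execute(
        "SELECT MAX(file_revision) FROM files WHERE file_id = ? OR file_parent = ?",
        (parent_id, parent_id),
    ).fetchone()
    if max_rev is None:
        raise ValueError(f"no file with id {parent_id}")
    with conn:
        # Clear the flag on whichever row of this file was the latest so far.
        conn.execute(
            "UPDATE files SET is_latest = 0 "
            "WHERE (file_id = ? OR file_parent = ?) AND is_latest = 1",
            (parent_id, parent_id),
        )
        cur = conn.execute(
            "INSERT INTO files (file_revision, file_parent, file_name, is_latest) "
            "VALUES (?, ?, ?, 1)",
            (max_rev + 1, parent_id, name),
        )
    return cur.lastrowid


foo = add_file(conn, "foo.jpg")
add_revision(conn, foo, "foorevised.jpg")
add_revision(conn, foo, "anotherrevision.jpg")
add_file(conn, "bar.jpg")  # a file with only its original revision

# Retrieval is now a plain filter, with no subquery.
latest = conn.execute(
    "SELECT file_id, file_revision, file_name FROM files "
    "WHERE is_latest = 1 ORDER BY file_id"
).fetchall()
print(latest)
```

The read side becomes trivial, which is the appeal; the cost is that every code path that writes a revision has to go through something like add_revision so the flag never gets out of sync.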

Another way to do it would be to keep the latest versions of the files in one table and historical versions in another. If you mostly want to select all files that are the latest version, select * from table where is_latest=true could well amount to a full table scan even if is_latest is indexed. If the latest rows were all in one table, the database could read them out in sequential IO without having to either 1) do a lot of seeks through the table to find just the records it needs or 2) scan the whole table, discarding large amounts of data for the old records along the way.
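A rough sketch of the two-table layout, again in Python with sqlite3 standing in for MySQL. The table names files_current and files_history and the helper save_revision are hypothetical, and this drops file_parent because each file keeps a single file_id across revisions, which is a change from the schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per file: always the latest revision.
    CREATE TABLE files_current (
        file_id       INTEGER PRIMARY KEY AUTOINCREMENT,
        file_revision INTEGER NOT NULL DEFAULT 1,
        file_name     TEXT    NOT NULL
    );
    -- Every superseded revision ends up here.
    CREATE TABLE files_history (
        file_id       INTEGER NOT NULL,
        file_revision INTEGER NOT NULL,
        file_name     TEXT    NOT NULL,
        PRIMARY KEY (file_id, file_revision)
    );
""")


def add_file(conn, name):
    """Create a new file at revision 1."""
    with conn:
        cur = conn.execute("INSERT INTO files_current (file_name) VALUES (?)", (name,))
    return cur.lastrowid


def save_revision(conn, file_id, new_name):
    """Archive the current row, then overwrite it with the new revision."""
    row = conn.execute(
        "SELECT file_revision, file_name FROM files_current WHERE file_id = ?",
        (file_id,),
    ).fetchone()
    if row is None:
        raise ValueError(f"no file with id {file_id}")
    old_revision, old_name = row
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO files_history (file_id, file_revision, file_name) VALUES (?, ?, ?)",
            (file_id, old_revision, old_name),
        )
        conn.execute(
            "UPDATE files_current SET file_revision = ?, file_name = ? WHERE file_id = ?",
            (old_revision + 1, new_name, file_id),
        )


foo = add_file(conn, "foo.jpg")
save_revision(conn, foo, "foorevised.jpg")
save_revision(conn, foo, "anotherrevision.jpg")
add_file(conn, "bar.jpg")

# "All latest files" is a scan of the small table only.
current = conn.execute(
    "SELECT file_id, file_revision, file_name FROM files_current ORDER BY file_id"
).fetchall()
history = conn.execute(
    "SELECT file_id, file_revision, file_name FROM files_history "
    "ORDER BY file_id, file_revision"
).fetchall()
print(current)
print(history)
```

Whether this is worth it depends on how often you read old revisions; queries that need both current and historical rows would have to combine the two tables.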

Assuming you don't want to change the existing table design, what you want to do is called selecting the groupwise maximum; see this article for several different ways to do it in MySQL.
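For the table in the question, one possible groupwise-maximum query looks like the sketch below. It assumes the table is called files, that originals always have file_parent = 0, and that file_revision is unique among an original and its revisions. The bar.jpg row is made up to cover the "only one revision" case. It is run here through Python's sqlite3 module to keep the example self-contained; the SQL itself is plain enough that it should carry over to MySQL, but treat that as an assumption to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        file_id       INTEGER PRIMARY KEY AUTOINCREMENT,
        file_revision INTEGER NOT NULL DEFAULT 1,
        file_parent   INTEGER NOT NULL DEFAULT 0,
        file_name     TEXT    NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO files (file_id, file_revision, file_parent, file_name) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, "foo.jpg"),
        (2, 2, 1, "foorevised.jpg"),
        (3, 3, 1, "anotherrevision.jpg"),
        (4, 1, 0, "bar.jpg"),  # hypothetical file with no later revisions
    ],
)

# Originals have file_parent = 0, so the "group" a row belongs to is its own
# file_id for originals and file_parent for revisions. The derived table finds
# the highest revision per group; the outer query fetches the matching rows.
LATEST_SQL = """
    SELECT f.file_id, f.file_revision, f.file_name
    FROM files AS f
    JOIN (
        SELECT CASE WHEN file_parent = 0 THEN file_id ELSE file_parent END AS root_id,
               MAX(file_revision) AS max_revision
        FROM files
        GROUP BY CASE WHEN file_parent = 0 THEN file_id ELSE file_parent END
    ) AS latest
      ON latest.root_id = CASE WHEN f.file_parent = 0 THEN f.file_id ELSE f.file_parent END
     AND latest.max_revision = f.file_revision
    ORDER BY f.file_id
"""

latest_rows = conn.execute(LATEST_SQL).fetchall()
print(latest_rows)
```

A file with only its original row forms a group of one, so that row is its own maximum and is returned without any special handling.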

I would have to agree with nick on adding another column to the table. We had a similar problem on a project in the past, and I was really glad that we decided to just add the extra column.

It makes the programming simpler, and it makes the query faster by eliminating the subqueries.