There exists a simple interface to tag a specific question

(e.g. entry has 1..many tags and each tag entry has a foriegn key pointer back to the entry table)

1.    What is the current production version of the jdk? (Tags: jdk6 jdk-6 jdk java)
2.    In what version was java.util.spi package introduced? (Tags: jdk-6, jdk7, jdk5)
3.    Which version of java is going to be released soon? (Tags: jdk-6, jdk7, jdk8)

We wish to merge all tags named as "jdk-6" to jdk6. How can we accomplish this inside a system that is approaching production but consists of helpful data.

In [1] jdk-6 must be removed, since jdk6 has already been present. In [2,3] jdk-6 must be re-named as "jdk6".

What type of scripts should i migrate this data inside a effective fashion.


create table entry (id, question, ...)
create table entry_tag (id, entry_id, tag)

I'd perform the following:

  1. Update the "bad" tags using the doozy (UPDATE TagTable SET Tag = 'jdk6' WHERE tag = 'jdk-6')

  2. Take away the duplicate tags (where entry_id and Tag are identical) . Just how you need to do this can rely on whether you've got a separate unique key up for grabs or otherwise, but a fast google provides you with a number of techniques that actually work under different conditions.

  3. Presuming you've got a TagsList table using the listing of all available tags, remove jdk-6 from this (DELETE FROM TagsList WHERE Tag = 'jdk-6').

I'd first produce a new table with a listing of entry IDs which contain either from the tags 'jdk-6' or 'jdk6'.

I Quickly would remove all tag records for that tags 'jdk6' and 'jdk-6'.

After which I'd add them in while using table produced in the beginning.

/* Step 1 - Delete where both tags exist */
delete from et1
    from entry_tag et1
        inner join entry_tag et2
            on et1.entry_id = et2.entry_id
                and et2.tag = 'jdk6'
    where et1.tag = 'jdk-6'

/* Step 2 - Update remaining tags */
update entry_tag
    set tag = 'jdk6'
    where tag = 'jdk-6'