Background: We're moving an internet site located on the us dot internet based custom Content management systems to Wordpress.

The Issue: This content within the various posts consists of links with other content within the Content management systems. These links happen to be place in by hand and retain the entire URL beginning from http. In the end have moved all of the publish contents to Wordpress utilizing a php script, the hyperlinks inside the content still indicate that old links. Because the URL structure has transformed there doesn't appear to become a programmatic method of changing the hyperlinks.

Illustration of that old url:

Illustration of the brand new url:

Request: I want applying for grants exactly how should we handle this without requiring to alter all links by hand.

Thanks ahead of time.

I can not think about an excellent method of doing this but this is a thought. You are able to operate a command line script to loop total the web pages then loop total the hyperlinks and show the consumer the initial link and also the "recommended" link. The recommended link may be the new format most abundant in common category title with choice to switch to the other category names.

If you won't want to write the script you are able to alternatively make use of a text editor like notepad++ or vim/gvim. In notepad++ you would employ the replace 'search mode' as 'regular expression' with vim you would employ the confirm flag from the substitute command (:%s/foo/bar/gc).

I am doing such like right now, moving an enormous static html store to on operate on django (it's painful and bloody).

Our solution is not anything particularly elegant. Throughout migration of every page, we're observing that old url, then your new url, and add these to a redirect database. Once we have migrated our content towards the new after sales and url structure, we are managing a script which will identify all links within the document with one of these xpath selectors:


Next we pull-up the redirects from your redirect table and replace the hyperlinks using the regexes below.

#escape special characters to avoid problems with the regex
link = link.replace('#', r'\#')
link = link.replace('.', r'\.')
link = link.replace('/', r'\/')
link = link.replace(':', r'\:')

#compile a regex, using the source link, and replace all existing links
repl_regex = r'href\s{0,}\=[\s\"\']{0,}(%s)[\s\"\']{0,}'%link
markup = re.sub(repl_regex, 'href="%s"'%dst_url, markup)

#repeat for images
repl_regex = r'src\s{0,}\=[\s\"\']{0,}(%s)[\s\"\']{0,}'%link
markup = re.sub(repl_regex, 'src="%s"'%dst_url, markup)

#Let me know if you have any questions, the above is written in python
#and it sounds like you're using php and a .net language.

Now although this technique is most likely more work than you would like, and can require a bit more upfront preparation, it's two advantages:

1) By evaluating every link inside a document to some redirect table, you'll have the ability to easier identify missing pages / missing redirects

2) Search engine optimization. Rather than making the googlebot recrawl your whole site, simply provide 301 redirects against your redirect table

Tell me for those who have any queries.

If you're able to create a mapping between your number within the URL along with a category title, then its achievable. You search and replace all files having a regex to locate Web addresses from the form and also you replace all of them with the brand new URL.