I work for a college, and last year we finally broke away from our static HTML site of several thousand pages and moved to a Drupal site. This obviously entailed massive amounts of data entry.

What if you are already using a content management system and are switching to a different one that better suits your needs? How do you minimize the mountain of data entry during this kind of major change? Are there tools designed for this, or some guidelines you should follow?

The Migrate module for Drupal is a big help. The Economist.com data migration to Drupal gives you an overview of the process.

The video of the "Migration: not just for the birds" presentation at DrupalCon DC 2009 is probably slightly out of date, but it also provides a good introduction.

  • Expect to both pre-process and post-process your data by hand, no matter what. Accept early on that your data is going to be in a worse state than you think it is: fields will be misused; record-to-record references (foreign keys) might not be implemented properly, or at all; content will probably need weeding and will occasionally turn out to be just bad or wrong.

  • Check your database encoding. Older databases often won't be in a Unicode encoding, and they get grumpy when you have to export data dumps and import them elsewhere. Even so, assume there will be some crazy nonprintable characters in your data: programs like Word seem to somehow inject them everywhere, and I have seen... codepoints... you people wouldn't believe. Consider sweeping your data (or even a database dump) for these characters before you start. Decide whether to junk them or, in the case of e.g. Word "smart" punctuation characters, try to convert them.

  • It's hard to create explicit data structures from implicit ones. If your incoming data has a separate date field, you can map that to a date field. If it has a date buried in a big lump of HTML, even if that date is in a tag with an id attribute, simple scripting won't work. You could use offline scripting with BeautifulSoup or (if your HTML is a bit better) the faster lxml to pre-process your data set, extract those implicit fields, and save them in an explicit format. Consider creating an intermediate database where these revisions will go.

  • The Migrate module is great, but to get really good data fidelity and play more clever tricks you will need to know about its hook system (hooks are Drupal's terminology for functions following a particular naming scheme) and the basics of writing a module to put these hooks in (a module is broadly just a PHP file where all the function names begin with the same text, the name of the module file).

  • All imported content should be flagged for at least a cursory check. You can do this by importing it with status=0, i.e. unpublished, and then creating a view with the Views module to go through the content and open it in other tabs for checking. Views Bulk Operations gives you a set of checkboxes next to your view items, so you can approve many nodes at once.

  • Expect to run and re-run and re-run the import, fixing something new each time. Check ten or twenty items early on. If there are any problems, check ten or twenty more. Fix and repeat the import.

  • Gauge how long a single import run is going to take. Be pessimistic: we had an import that we expected to take ten hours hit an exponential slowdown when we brought in the full data set. Until we finally fixed some slow queries, it was projected to take two days.

  • If in doubt, or if you think the technicalities of the above are going to be harder than the work itself, then just hire temps to do the data entry. You will still need decent quality controls, early on in their work. Drupal developers are also available for hire: try your country's relevant IRC channel, or post a message in a relevant groups.drupal.org group. They are more expensive than temps but they usually write better PHP...! Consider hiring a company too: this is a shameless plug, as I work for one, but sometimes it is best to get experts in for these jobs.

  • Good imports are always hard, harder than you expect. Don't let it get you down!
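
The offline pre-processing advice above (sweeping out Word punctuation and nonprintable characters, then pulling an implicit date out of a lump of HTML) could be sketched roughly as follows. This is only an illustration under several assumptions: it uses Python's standard-library html.parser as a stand-in for BeautifulSoup or lxml, the id "pubdate" and the helper names sweep, DateExtractor and extract_record are hypothetical, and it assumes dates appear as YYYY-MM-DD. Real data will need its own rules.

```python
import re
import unicodedata
from html.parser import HTMLParser

# Word "smart" punctuation mapped to plain ASCII (an illustrative, partial table).
SMART_PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
    "\u00a0": " ",                  # non-breaking space
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def sweep(text):
    """Convert smart punctuation, then junk remaining control/format characters."""
    for smart, plain in SMART_PUNCTUATION.items():
        text = text.replace(smart, plain)
    # Unicode categories starting with "C" cover control, format, unassigned, etc.
    return "".join(
        ch for ch in text
        if ch in "\n\r\t" or not unicodedata.category(ch).startswith("C")
    )


class DateExtractor(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_record(raw_html, date_id="pubdate"):
    """Return an explicit record: the swept body plus a date field (or None)."""
    clean = sweep(raw_html)
    parser = DateExtractor(date_id)
    parser.feed(clean)
    parser.close()
    # Assumption: dates are written as YYYY-MM-DD somewhere inside the element.
    match = re.search(r"\d{4}-\d{2}-\d{2}", "".join(parser.parts))
    return {"date": match.group(0) if match else None, "body": clean}


sample = '<div><span id="pubdate">Posted 2009-03-05</span><p>\u201cHello\u201d\u200b world</p></div>'
record = extract_record(sample)
print(record["date"])  # 2009-03-05
```

Records like this could then be written to the intermediate database, so the import itself only ever maps explicit fields to explicit fields. Records where "date" comes back as None are good candidates for the manual checks described above.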