"issues with content migration" by deane barker at content workshops 2012
"Issues with Content Migration" by Deane Barker at Content Strategy Workshops 2012 in Portland, OR, October 9, 2012.
Issues in Content Migration
Deane Barker, Blend Interactive
They’re painful.
[The End]
Blend Interactive
• Based in Sioux Falls, South Dakota
• Specialize in large-scale content management implementations and migrations
• EPiServer
• eZ publish
• TerminalFour
Definition: The one-time movement of content from one publishing platform to a different publishing platform.
“Migration” vs. “Implementation”
Editorial Process vs. Technical Process
The Four Phases
1. Inventory
2. Mapping
3. Transfer
4. QA
Phase #1: Inventory
• What content is moving?
• What content can we get rid of?
• How can it be grouped?
• What content requires special handling?
• What content requires changes?
• How volatile is the content?
Don’t move bad content.
This is the time for spring-cleaning.
Start your inventory as early as possible.
Before you start development.
Even before you pick a new platform.
Be prepared for this process to get highly politicized.
Keep your inventory systematic and organized.
Have a central point of focus and record-keeping.
Inventory Outputs
• List of content that will migrate, divided into logical groups
• List of content that will require special handling
• List of content that will require changes, along with scope
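The three outputs above can all be derived from one central inventory record per piece of content. The sketch below is only an illustration: the `InventoryItem` fields, the group names, and the sample URLs are assumptions made for this example, not part of the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """One row of a hypothetical central inventory."""
    url: str                    # address of the content on the old platform
    group: str                  # logical group, e.g. "products" or "press"
    migrate: bool = True        # False means the content is being retired
    special_handling: str = ""  # note on why this item needs special care
    changes: str = ""           # scope of the required changes, if any


def inventory_outputs(items):
    """Derive the three inventory outputs from a flat list of items."""
    migrating = [item for item in items if item.migrate]

    # Output 1: migrating content divided into logical groups.
    by_group = defaultdict(list)
    for item in migrating:
        by_group[item.group].append(item.url)

    # Output 2: migrating content that requires special handling.
    special = [item.url for item in migrating if item.special_handling]

    # Output 3: migrating content that requires changes, with scope.
    changing = {item.url: item.changes for item in migrating if item.changes}

    return dict(by_group), special, changing


items = [
    InventoryItem("/products/a.aspx", "products"),
    InventoryItem("/products/b.aspx", "products", changes="rewrite intro"),
    InventoryItem("/press/2009-launch.aspx", "press", migrate=False),
    InventoryItem("/about/history.aspx", "about", special_handling="embedded video"),
]
by_group, special, changing = inventory_outputs(items)
```

Keeping retired content in the same list, flagged rather than deleted, leaves a record of what was dropped and why.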
Phase #2: Mapping
• How is content going to “fit” and work in the new platform?
• What changes will be required to rich text content?
• How is the overall structure of the content going to transfer?
What HTML is templated and what HTML is embedded?
Content has different levels of “geography”
Some content is very specifically placed, while other content is automatically organized.
[Diagram: site tree with Home, Products (Product A, Product B), About (History), and Press Release]
Highly-geographical content is much harder to migrate.
You have to migrate both the content and the placement.
[Diagram: site tree with Home, Products (Product A, Product B), and About (History)]
Stub Mapping
[Diagram: "Existing" site tree (Home, Products, Product A, Product B, About, History) alongside the "New" site tree]
Mapping Outputs
• An understanding of where all content is going in the new platform and why
• Page stub structure
Phase #3: Transfer
• How are the actual bytes moving from one system to another?
• Key Questions
  • Repository or publication extraction
  • Embedded URL resolution
  • Markup transformation
  • Automated vs. manual migration
Migrating out of a CMS is a lot easier than the alternative.
A CMS enforces at least some consistency.
Are you going to extract from the repository level or the publication level?
Repository vs. Publication Extraction
[Diagram labels: Repository, HTML, Processing]
How will URLs change on the new platform?
How interlinked is your content?
How are you going to keep all those links valid?
Embedded URL Resolution
• If you have embedded URLs, they are now broken.
• How do you “re-connect” these URLs to the correct content?
• Usually performed as some kind of batch job.
• You rarely get 100% accuracy.
• Prepare to catch the remainder in QA.
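A minimal sketch of such a batch job, assuming you already have a mapping from old URLs to new ones. The function name, the regular expression, and the sample URLs are all hypothetical, and matching attributes with a regular expression is a simplification; a real job would more likely use an HTML parser.

```python
import re

# Matches href="..." and src="..." attributes written with double quotes.
# This is a simplification for the sketch, not a full HTML parser.
LINK_PATTERN = re.compile(r'(href|src)="([^"]+)"')


def resolve_embedded_urls(html, url_map):
    """Rewrite embedded old URLs to new ones; report what could not be resolved."""
    unresolved = []

    def replace(match):
        attribute, old_url = match.group(1), match.group(2)
        if old_url in url_map:
            return f'{attribute}="{url_map[old_url]}"'
        if old_url.startswith("/"):
            # Internal link with no known target: leave it and flag it for QA.
            unresolved.append(old_url)
        return match.group(0)

    return LINK_PATTERN.sub(replace, html), unresolved


url_map = {"/old/a.aspx": "/products/a/"}
body = (
    '<a href="/old/a.aspx">A</a> '
    '<a href="/old/missing.aspx">M</a> '
    '<a href="https://example.com/">X</a>'
)
new_body, unresolved = resolve_embedded_urls(body, url_map)
```

The `unresolved` list is the remainder that the batch job could not re-connect; it is the natural hand-off to QA.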
Embedded URLs
Always store the old URL for a migrated page of content.
Once migrated, use the old URL to do a lookup in your 404 handler.
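The lookup can be sketched independently of any particular CMS or web framework. Everything named here (`normalize`, `build_old_url_index`, `handle_not_found`, the sample paths) is an assumption for illustration; a real handler would plug into whatever 404 hook the new platform provides.

```python
def normalize(path):
    """Reduce a requested path to a canonical lookup key."""
    # Drop the query string and any trailing slash, and ignore case.
    return path.split("?", 1)[0].rstrip("/").lower() or "/"


def build_old_url_index(migrated_pages):
    """Index (old_url, new_url) pairs stored during migration by old URL."""
    return {normalize(old_url): new_url for old_url, new_url in migrated_pages}


def handle_not_found(requested_path, old_url_index):
    """Return (status, headers): redirect if the path is a known old URL."""
    new_url = old_url_index.get(normalize(requested_path))
    if new_url is not None:
        # Permanent redirect so browsers and search engines update the link.
        return 301, {"Location": new_url}
    return 404, {}


index = build_old_url_index([("/About/History.aspx", "/about/history/")])
redirect = handle_not_found("/about/history.aspx?ref=1", index)
missing = handle_not_found("/no/such/page", index)
```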
If you can preserve binary file URLs, do so. Your new CMS will likely make this easier.
Content Transformation
Common Transformations
What is the actual mechanism of movement?
Copy-and-paste?
Automated?
When Copy-and-Paste Works
• When you don’t have a lot of content
• When you have access to cheap labor
• When your content is highly geographic
• When you cannot automate transformation
• When you have enough resources for sufficient QA
When Automated Migration Works
• When you have large volumes of content
• When your content is not highly geographic
• When you have sufficient technology and/or development resources
You don’t have to use the same method for your entire project.
Automated Migration Tools
• Great answer to the Transfer phase
• Less of an answer to everything else
• They still have to be configured and tested
Transfer Output
• Content ready for QA
• Outputs from this phase will likely be segmented
Phase #4: QA
• How much content is going to be reviewed for compliance?
  • All of it?
  • A representative sample?
• Who has the authority to clear individual content, and the site as a whole, for release?
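If you review a representative sample rather than everything, one option is to sample each logical group from the inventory separately, so that small groups are not skipped. This is a sketch under that assumption; the function name, the 25% fraction, and the generated URLs are illustrative only.

```python
import math
import random


def sample_for_review(by_group, fraction, seed=0):
    """Pick a random sample from every group, at least one item per group."""
    rng = random.Random(seed)  # fixed seed so the review list is repeatable
    sample = {}
    for group, urls in by_group.items():
        if not urls:
            sample[group] = []
            continue
        count = min(len(urls), max(1, math.ceil(len(urls) * fraction)))
        sample[group] = rng.sample(urls, count)
    return sample


by_group = {
    "products": [f"/products/{n}/" for n in range(20)],
    "press": [f"/press/{n}/" for n in range(3)],
}
review_list = sample_for_review(by_group, fraction=0.25)
```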
The Dreaded Content Freeze
• Once you start migrating from A to B, content changes on A need to stop
• Length of the freeze window depends on the volatility of the content
Types of QA
• Technical QA
  • Did this content transfer well?
  • Does it look broken?
  • Does it comply with the style guide?
• Editorial QA
  • Is this content valid and correct?
  • Were any errors introduced during transfer?
Ideally, track the QA process inside the CMS itself.
During QA, reporting is key.
You should have access to a daily number showing the percentage of content cleared.
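The daily number is simple arithmetic once each page carries a QA status. The sketch below assumes a hypothetical status value of "cleared" and a plain dictionary of page statuses; in practice the statuses would come from wherever QA is tracked, ideally the CMS itself.

```python
def percent_cleared(statuses):
    """Percentage of pages whose QA status is 'cleared', to one decimal place."""
    if not statuses:
        return 0.0
    cleared = sum(1 for status in statuses.values() if status == "cleared")
    return round(100 * cleared / len(statuses), 1)


statuses = {
    "/products/a/": "cleared",
    "/products/b/": "pending",
    "/about/history/": "cleared",
    "/about/": "failed",
}
daily_number = percent_cleared(statuses)  # 2 of 4 pages cleared
```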
The Four Phases
1. Inventory
2. Mapping
3. Transfer
4. QA