Open Source Repository Upgrades:
Top Advice from Practitioners
Erin Tripp, Business Development Manager
Wednesday, March 28
@eeohalloran on Twitter
This work is licensed under a Creative Commons Attribution 2.0 Generic License.
Slides http://bit.ly/2FJoVS5
Notes http://bit.ly/2FDvWIp
Q&A via Zoom
“To migrate content is no
small or easy task.
Migrations require
significant time and
funding.”
(Gilbert & Mobley, 2013)
Open Source Repository Upgrades
Image Credit: Audrey Sage
Migration/Upgrade Survey Project
The advice falls into four
categories:
1. Planning
2. Metadata Normalization
3. Migration
4. Verification
Planning and Metadata
Normalization are the most complex
Image Credit: liftarn
Migration Advice
Planning
1. Stakeholder Communication and
Setting Expectations
2. Build Your Team, Testing
Software and Tools
3. Refine Requirements, Estimate
Work, Define Scope
Image Credit: lmproulx
Stakeholder Communication and Setting Expectations
● Migrations are inevitable. Prepare stakeholders to move data every 3-5 years
● For stakeholders, a migration is about changes to a service, trust, credibility
● Pay attention to the political climate and possible blockers
● Talk to others who are migrating (especially the same path)
● Be prepared to make mistakes and learn from them
● A migration is a big deal, especially if this is your first one or there are many
new components
Build Your Team, Testing Software & Tools
● Review and test software and tools you’ll use
“As part of the analysis, we set up sandboxes of all of the alternatives for the
campuses to try. We have our IR task force that worked on a survey of
requirements [...]. We used that as a base and compared it to the experiences people
recorded from the sandboxes.”
● Test moving your data with a representative 10% subset
“Give people flexibility to start working on it. [...] A lot of our planning hasn't
always been as useful as we'd like. We didn't know enough about it to plan. [...]
giving people time to look at the product is good to do before planning.”
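One way to draw a representative 10% subset is sketched below. This is an illustration, not a method described by the survey respondents. It assumes each record is a dictionary with a "collection" field (a hypothetical name) and samples within each collection so that small collections still appear in the test run.

```python
import random
from collections import defaultdict


def representative_subset(records, key, percent=10, seed=0):
    """Pick about `percent` of the records from every group named by `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    rng = random.Random(seed)  # fixed seed so the same subset can be drawn again
    subset = []
    for group_key in sorted(groups):
        members = groups[group_key]
        # Integer ceiling, with a floor of one record per group.
        take = max(1, (len(members) * percent + 99) // 100)
        subset.extend(rng.sample(members, take))
    return subset


# Hypothetical inventory: 200 articles, 50 theses, 3 datasets.
inventory = (
    [{"id": f"a{i}", "collection": "articles"} for i in range(200)]
    + [{"id": f"t{i}", "collection": "theses"} for i in range(50)]
    + [{"id": f"d{i}", "collection": "datasets"} for i in range(3)]
)
sample = representative_subset(inventory, key="collection")
```

Sampling per collection rather than across the whole inventory is a design choice: a plain random 10% could skip the three datasets entirely, and unusual content types are often where migration problems surface.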
Refine Requirements, Estimate Work, Define Scope
“Determining the difference between something valid and something "we've always
done it that way" took extended review.”
“I would have had more tech coordination at a high level with more discussion to
decide scope of what features should be included in version 1. It could have impacted
migration pace.”
“Have a shared commitment in library and IT staff to only make customizations
judiciously. They will impact migrations/upgrades.”
“Take time to be thoughtful in the way you build and deploy environments to ease
future upgrades/migrations.”
Metadata Normalization
Metadata will not be perfect
Enrichment or normalization projects
can be done at any time, even without a
migration requirement.
Normalizing metadata before migrating or
moving data is highly recommended
Practitioners cited it as one of the
biggest challenges and one of the
biggest advantages
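As a minimal sketch of what a pre-migration normalization pass might look like, the example below cleans up whitespace in titles and rewrites dates as ISO 8601. The field names ("title", "date") and the list of accepted date formats are assumptions for illustration; a real repository will have its own schema and its own irregularities. Values that match no known format are set aside for human review instead of being guessed at.

```python
from datetime import datetime

# Assumed input formats; extend this list to match your own data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%Y")


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no known format matches."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # Keep year-only values as a bare year; do not invent a month and day.
        return text if fmt == "%Y" else parsed.date().isoformat()
    return None


def normalize_records(records):
    """Split records into cleaned ones and ones that need human review."""
    cleaned, needs_review = [], []
    for record in records:
        date = normalize_date(record.get("date", ""))
        if date is None:
            needs_review.append(record)
            continue
        title = " ".join(record.get("title", "").split())  # collapse whitespace
        cleaned.append({**record, "title": title, "date": date})
    return cleaned, needs_review
```

The review list is the useful by-product here: it gives a concrete size for the enrichment work, which helps when deciding what to fix before the migration and what to defer to a post-migration project.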
Metadata Normalization
“I was surprised how long data modeling and normalization took. I would have
started the project there instead of focusing so much time on what system seemed
best.”
“[...] give yourself (at least) double the amount of time you think you'll need,
because you will find plenty of metadata enrichment mini-projects along the way,
which is not a bad thing.”
“I would try to do a better job of managing the scope of the metadata enrichment. I
had a big opportunity to edit and polish. But there's a line between polishing as
much as possible and what should be done as a post-migration project.”
Migration
● When moving data, plan for
iteration
● Be willing to change a decision,
method, or process if you need to
● Be agile
● Effective planning doesn’t mean
migration will go perfectly
Image Credit: sixsixfive
Migration
“Develop a minimum viable product as soon as possible. Our project charter didn't
help us understand the minimum requirements of our stakeholders until well into our
development and implementation phase.”
Verification
“The verification may be difficult. You might not be able to have bit-level
reproducibility. You want to check checksums and then you actually want to
use the data. You need expertise to parse the verification. It's not always a
technical process. Often it's based on collective knowledge. Collective knowledge
is a difficult thing to preserve because everyone knows it and no one writes it
down.”
Verification
● Assign a metadata specialist to do spot checking during the migration
● Write scripts for post-migration data validation, e.g.:
○ Checksums
○ Number of objects
○ Resolve identifiers
● Assign data managers to manually review subsets of content
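The three scripted checks above can be sketched as follows. This is a minimal illustration under stated assumptions, not a tool used by the survey respondents. It assumes you can export a manifest from each system as a dictionary mapping identifier to checksum, and that you supply your own function (here the hypothetical `resolves`) that reports whether an identifier resolves in the new system. `sha256_of_file` shows one way to build the checksums with the standard library.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large objects do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_migration(source, target, resolves):
    """Compare two manifests (identifier -> checksum) and test identifier resolution.

    `resolves` is a caller-supplied function: identifier -> bool.
    """
    shared = source.keys() & target.keys()
    report = {
        "count_matches": len(source) == len(target),
        "missing": sorted(source.keys() - target.keys()),
        "unexpected": sorted(target.keys() - source.keys()),
        "checksum_mismatch": sorted(i for i in shared if source[i] != target[i]),
        "unresolved": sorted(i for i in target if not resolves(i)),
    }
    problems = ("missing", "unexpected", "checksum_mismatch", "unresolved")
    report["ok"] = report["count_matches"] and not any(report[k] for k in problems)
    return report
```

A report like this only narrows down where to look. A checksum mismatch can be legitimate if the migration re-encodes files, so the lists it produces are best handed to the people doing the manual review.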
Reflect & Share
● Document the process
● Review what went well and what
didn’t
● Adapt plans for next migration
● Share knowledge so others may
benefit from it
Image Credit: Minduka
Reflect & Share
“A challenge to this is that some people will keep specialized knowledge to
themselves to give them an advantage. It's such a competitive landscape.
People are not rewarded for sharing knowledge. It's a barrier when trying to
interpret data from the distant past.”
Discussion
● Do you have questions?
● Do the survey results align with your experience?
● What do you want to know more about?
● How can we help each other?
Thank You for Coming!
Erin Tripp | etripp@duraspace.org | @eeohalloran

3.28.18 "Open Source Repository Upgrades: Top Advice from Practitioners" Presentation Slides
