Thoughts on a data migration

When I arrived at Wright State University Libraries at the end of January 2013, as a new librarian, I had no idea that one task would dominate so much of my time here. Shortly after my arrival we were alerted to the fact that our DSpace-based Institutional Repository would no longer be supported. Since I was originally going to be the primary librarian uploading content to CORE (our DSpace IR), I was chosen to be the “migration guy.”

The migration impacted the next year of my life. I was suddenly in charge of moving 5,000 items from our DSpace platform to our new Digital Commons repository. We began talking about the migration in February and March of 2013, and the last nail was driven into the migration’s coffin in December of that year.

I presented on my experience at the 2013 DCGLUG (Digital Commons Great Lakes Users’ Group) Conference, to OHIODIG (Ohio Digitization Interest Group) in January, and most recently at the Society of Ohio Archivists (SOA) Annual Conference. The last two presentations were given as part of a group. I wanted to use this post to collect what I learned from these presentations:

NUMBER 1: It’s a learning process.

I didn’t know how to perform a website audit. I didn’t know how the handles (persistent URLs) were maintained. I didn’t really know much about our collections. That being said, I investigated how others prepared a website audit (they counted links and pages). I researched the Handle System and ultimately found out that maintaining it would incur a separate expense as well as cost us time. I also learned about the Wright Brothers, a variety of oral histories from the area, and more.
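For anyone curious what "counting links" can look like in practice, here is a minimal sketch using only Python's standard library. It is not what I ran during the migration; the class name `LinkCounter`, the function `count_links`, and the sample page are all made up for illustration, and a real audit would also need to fetch each page and follow the links it finds.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects the href of every <a> tag found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def count_links(html_text):
    """Return the number of links (anchors with an href) in html_text."""
    parser = LinkCounter()
    parser.feed(html_text)
    return len(parser.links)


# Hypothetical page content, not taken from our repository.
sample_page = (
    '<ul><li><a href="/item/1">One</a></li>'
    '<li><a href="/item/2">Two</a></li></ul>'
)
print(count_links(sample_page))
```

Running a count like this per page and writing the totals into a spreadsheet gives a baseline to compare against once the content is in the new system.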

NUMBER 2: It is a learning process, but be practical about how you use your time.

In my research for ways to perform an audit I found that some people used web-crawler software; however, I found it unreliable, and it was difficult to get consistent results from it. So I abandoned it. I manually clicked through every page listing the communities, series, and all the items attached to records.

We were offered a Perl application from Asbury Theological Seminary. I lacked the familiarity to use it properly; thus, I abandoned that too.

I had to decide whether to spend extra time attempting to understand these programs, with an uncertain final outcome, or to get to work performing tedious tasks but accomplishing something. I chose to start the project.

NUMBER 3: The Most Important Thing about Migrating Your Content…It takes time.

The process will take longer than you estimate. This was a truth echoed by my peers as well. No matter the planning and best efforts, there will always be unforeseen anomalies that need their own special solution. The differences in how the two systems worked often required me to re-evaluate my planning.

In the end I learned a lot about our collections, DSpace, Digital Commons, and most importantly how to plan and carry out a migration between systems. I’m sure that if I ever have to migrate again, there will be new issues and concerns; however, I have a solid plan of attack and am confident about how to proceed. I’ll just hope I don’t have to perform another migration for a long time…
