We are continuing with our series of blog posts about institutions sharing their migration stories. Today we are joined by the team at the University Library of Ludwig-Maximilians-Universität München (LMU Munich). They are sharing their story of migrating content from their pilot project titled “VerbaAlpina”.
Learning how community members tackle migrations within their institutions gives us the information we need to shed light on pathways for others looking to do the same. Their experiences and lessons learned give a clearer sense of the scope of the work, and sharing these stories helps all members of the community.
Here, Jaime Penagos answers our questions about their migration.
Tell us your name and job title:
Jaime Penagos, Developer / SysAdmin (Team: Science-related services)
What institution do you work at?
Can you share a few details about the repository you migrated:
The size of the repository on disk is around 50GB, consisting of approx. 450,000 text-based files. https://discover.ub.uni-muenchen.de/
The files contained in the repository represent the data from our pilot project “VerbaAlpina”. Each dataset consists of 2 files: a metadata file in “rdUB”, a format derived from DataCite <https://github.com/UB-LMU/rdUB>, and a research file containing the data as a text file in CSV format.
What version of Fedora were you previously running?
What other integrations or other software were you using?
Our environment around Fedora includes a Solr Index and Project Blacklight as GUI to showcase the collections. The integration and synchronization of the 3 systems is done with Apache Camel.
Tell us about some of the challenges or roadblocks you encountered:
One of the biggest challenges we stumbled upon during the first ingest and during data modeling was related to the limitations of RDF and Linked Data in Fedora for describing all the properties we needed to create for our use case. That is the main reason why we decided to rely less on RDF properties in this first approach and to model all the metadata in the rdUB (https://github.com/UB-LMU/rdUB) file by extending the DataCite schema.
Challenges for the upcoming steps include extending our workflow to use features from OCFL and to accommodate new projects with very different use cases and scenarios.
Are there any general suggestions on improvements the Fedora team could make?
- Were there things missing in the migration toolkit?
We are not sure whether it is already included, but we would like a tool that checks, on demand, the checksums of every object or of single objects in the repository. Otherwise, all the tools and communication channels offered by the community have been very helpful when needed.
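To illustrate the kind of on-demand checksum check described above, here is a minimal sketch in Python. It is not a Fedora tool and not the LMU team's method. It assumes the objects are stored on disk in OCFL layout, and it reads each object's `inventory.json` directly, using the `digestAlgorithm` and `manifest` fields defined by the OCFL specification. The function name `verify_ocfl_object` is hypothetical.

```python
import hashlib
import json
from pathlib import Path


def verify_ocfl_object(object_root):
    """Recompute the digests listed in an OCFL object's inventory.json.

    Hypothetical helper: returns a list of (content_path, problem) tuples.
    An empty list means every file in the manifest exists and matches.
    """
    root = Path(object_root)
    inventory = json.loads((root / "inventory.json").read_text(encoding="utf-8"))
    algorithm = inventory["digestAlgorithm"]  # for example "sha512"
    problems = []

    # The manifest maps each digest to the content paths that carry it,
    # relative to the object root.
    for expected, paths in inventory["manifest"].items():
        for rel_path in paths:
            file_path = root / rel_path
            if not file_path.is_file():
                problems.append((rel_path, "missing"))
                continue
            hasher = hashlib.new(algorithm)
            with file_path.open("rb") as handle:
                # Read in 1 MiB chunks so large files are not loaded at once.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    hasher.update(chunk)
            # OCFL treats hex digests as case-insensitive.
            if hasher.hexdigest() != expected.lower():
                problems.append((rel_path, "digest mismatch"))
    return problems
```

Calling `verify_ocfl_object` on a single object directory checks one object. Looping over all object directories under the storage root would cover the whole repository.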
- Improvements the Fedora team could have made to make the process easier?
Once we decided to test and include Fedora as a solution for our needs at University Library LMU, the process was very efficient and quick.
What would you like other institutions to know about your experience?
One of the main reasons why we decided to upgrade to Fedora 6 was the native support of OCFL, to prepare our data for data preservation. The upcoming projects are going to be built and prepared around Fedora, and we won’t need any additional conversions or preparations for it.
How can community members reach out to you if they have questions?
General contact: email@example.com
Developer contact: firstname.lastname@example.org
A thank you to Jaime and his team for taking the time to share this with us. We were fortunate to hear from them during our summer Open House, and you can watch that conversation here if you missed it: https://youtu.be/6nJbuRmvcuc
If you or your institution has completed a Fedora migration recently and you are willing to share your story, please reach out to us at email@example.com. We are looking for migration stories of all kinds, not just migrations to Fedora 6. We would love to continue gathering your user stories to help us better serve the needs of our community.