Many university and national libraries are exploring the best way to support researchers with text and data mining (TDM). That’s why, on 5 July 2017, OpenMinTeD and FutureTDM organised a workshop about text and data mining at the LIBER conference in Patras. Four speakers guided 16 participants through the various aspects of TDM.
How I Learned to Stop Worrying and Love Copyright
Anna Vernon of JISC kicked off by talking about the TDM opportunities and challenges for libraries. Something that inevitably comes up with TDM is intellectual property rights, namely copyright and sui generis rights including database rights. Although the UK has a copyright exception for non-commercial purposes, obstacles remain. For example: publishers use technical measures to cut off ‘unusual traffic’, which can include legal TDM activities; workflows, communications and contract negotiations are far from optimal; and researchers need more training on the policies and legal aspects. Anna also noted that “Universities are uniquely placed to support the uptake of TDM.” Libraries could support this by creating a “‘safe space’ for TDM, for example a bank of IP addresses that is secured for this purpose.”
The FutureTDM guidelines
After this, Kiera McNeice of the British Library took over and presented the outcomes of the FutureTDM project. The project is reaching its final stage and has delivered guidelines for TDM practitioners. One of its findings was that universities have the potential to contribute significantly to the uptake of text and data mining, but that a coordinated approach is still missing. Most interviewed stakeholders felt that the fragmentation of resources and knowledge is a significant barrier to moving things forward. Kiera mentioned that libraries can serve as a hub for TDM by bringing different stakeholders together to raise awareness and exchange experiences. The FutureTDM presentation is available on SlideShare.
OpenMinTeD: a platform for TDM
From the policy level we moved to the infrastructural level. Natalia Manola of Athena Research & Innovation presented the OpenMinTeD project and how it will be useful for libraries. She described the complex and fragmented landscape, which includes text mining researchers, content providers, end users and computing infrastructures. Natalia also explained how the OpenMinTeD platform, which is expected to be released in beta this summer, can serve as a hub for TDM, combining content/corpora, services and tools. The OpenMinTeD presentation is available on SlideShare.
TDM: Have a go
After having talked about copyright, policy and infrastructures, it was time to find out what TDM actually looks like in practice. Matteo Cancellieri of the Open University and CORE took all participants through a manual, showing what you can do with various out-of-the-box TDM tools.
Responses from participants
The initial responses to the workshop were positive. Several attendees were particularly enthusiastic about the policy and legal aspects, one even calling it a “good overview to TDM from the legal perspective”. Other participants preferred the technical side and would have liked to hear more about its practical aspects. One of the attendees described the workshop as follows: “Interesting. Very interesting. But a bit difficult to understand how it could be transferred from coding to an added value service.” Others said they would like to know more about “how to do TDM for non-computing people” and that a user-friendly interface like the OpenMinTeD platform (and training!) would be very welcome and interesting for libraries.