Accepted Tutorials

(1) Tutorial 1: The Science of Science

Reporters:
* Dashun Wang, Northwestern University (dashun.wang@kellogg.northwestern.edu)
* Lu Liu, Pennsylvania State University (lpl5107@ist.psu.edu)

Abstract:
The rapid development of digital libraries and the proliferation of scholarly big data have created an unprecedented opportunity to explore scientific production and reward at scale. The science of science is an emerging field fueled by this recent data explosion. It uses and develops tools from computer science, information science, and network science to mine scholarly big data, offering systematic and quantitative lessons about creativity and broadly exploring opportunities for innovation. In this tutorial, we aim to provide an overview of the science of science, including its rich historical context, state-of-the-art technologies and exciting discoveries, and promising future applications, especially for the JCDL community. The tutorial will cover the main topics in this field, including scientific careers, scientific collaborations, and scientific knowledge, from the perspective of participants in the digital library domain, and will be specifically geared toward the audience that JCDL attracts.

(2) Tutorial 2: Preparing Code and Data for Computational Reproducibility

Reporters:
* William Ingram, Virginia Polytechnic Institute and State University (waingram@vt.edu)
* Edward Fox, Virginia Polytechnic Institute and State University (fox@vt.edu)

Abstract:
Computational analyses are playing an increasingly central role in research and are a feature of many advanced digital libraries. Journals, sponsors, and researchers, including those in the digital library field, are calling for published research to include associated data and code. However, many involved in research have not received training in best practices and tools for building systems (e.g., using containers) and implementing methods that facilitate sharing code and data. This tutorial aims to address this gap in training while also providing those who support researchers with curated best-practice guidance and tools.
This tutorial differs from other reproducibility events in its practical, step-by-step design. It consists of hands-on exercises to prepare research code and data for computationally reproducible publication. Although the tutorial starts with some brief introductory information about computational reproducibility, the bulk of the tutorial is guided work with data and code. The basic best practices for publishing code and data are covered with curated resources. Examples will be drawn from the digital library and information retrieval domains. Participants move through preparing research for reuse, organization, documentation, automation, and submitting their code and data to share. Tools to support reproducibility will be introduced, but all lessons will be platform agnostic.

(3) Tutorial 3: Introduction to Digital Libraries

Reporters:
* Edward Fox, Virginia Polytechnic Institute and State University (fox@vt.edu)
* William Ingram, Virginia Polytechnic Institute and State University (waingram@vt.edu)

Abstract:
This tutorial is a thorough and deep introduction to the Digital Libraries (DL) field. It covers key concepts and terminology, as well as services, systems, technologies, methods, standards, projects, issues, and practices. It introduces and builds upon a firm theoretical foundation (starting with the ‘5S’ set of intuitive aspects: Streams, Structures, Spaces, Scenarios, Societies), giving careful definitions and explanations of all the key parts of a ‘minimal digital library’, and expanding from that basis to cover key DL issues. Illustrations come from a set of case studies, including several current projects that involve webpages, tweets, and social networks. Attendees will be introduced to four Morgan and Claypool books, published 2012-2014, that elaborate on 5S. Complementing the coverage of 5S will be an overview of key aspects of the DELOS Reference Model and DL.org activities. Further, new material will be added on building digital libraries using container and cloud services, on developing a digital library for electronic theses and dissertations, and on methods for integrating UX and DL design approaches.

(4) Tutorial 4: Writing about Data Science Research

Reporters:
* Kevin Bretonnel Cohen, University of Colorado School of Medicine (kevin.cohen@gmail.com)
* Daniela Gifu, University of Iasi & Romanian Academy - Iasi Branch (daniela.gifu73@gmail.com)

Abstract:
We propose a tutorial on writing about research in data science. The focus will be on writing conference papers and journal articles. The target audience is graduate students, post-doctoral fellows, and early-career faculty. The teaching methodology will include lectures and a heavy hands-on component.

(5) Tutorial 5: Building Digital Library Collections with Greenstone3

Reporters:
* David Bainbridge, University of Waikato (davidb@cs.waikato.ac.nz)

Abstract:
This tutorial is designed for those who want an introduction to building a digital library using an open source software program. The tutorial will focus on the Greenstone digital library software. In particular, participants will work with the Greenstone Librarian Interface, a flexible graphical user interface designed for developing and managing digital library collections. Attendees do not require programming expertise; however, they should be familiar with HTML and the Web, and be aware of representation standards such as Unicode, Dublin Core, and XML. The Greenstone software has a pedigree of more than two decades, with over 1 million downloads from SourceForge. The premier version of the software has, for many years, been Greenstone2. This tutorial will introduce users to Greenstone3, a redesign and reimplementation of the original software that takes better advantage of standards and web technologies developed since Greenstone was first implemented. Written in Java, Greenstone3 is more modular in design, which increases its flexibility and extensibility. Emphasis in the tutorial is placed on where Greenstone3 goes beyond what Greenstone2 can do. Through hands-on practical exercises, participants will, for example, build collections in which geo-tagged metadata embedded in photos is automatically extracted and used to provide a map-based view of the collection in the digital library.