Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability

PhD Thesis in Digital Humanities, completed as part of the Graduate School of Social Sciences’ (G3S) doctoral programme. It was successfully defended on 18 November 2024 (slides).

This page will host a lightweight HTML version of my thesis, optimised for easy access and readability. The PDF version (e-dissertation) is available on the University of Basel’s repository: https://doi.org/10.5451/unibas-ep96807.

Page in construction (please be patient ⌛)


Abstract

Digital technologies have fundamentally transformed how Cultural Heritage (CH) collections are accessed and engaged with. Linked Open Usable Data (LOUD) specifications, including the International Image Interoperability Framework (IIIF) Presentation API 3.0, Linked Art, and the W3C Web Annotation Data Model, have emerged as web standards to facilitate the description and dissemination of these valuable resources. Despite the widespread adoption of IIIF, implementing LOUD specifications, particularly in combination, remains challenging. This is especially evident in the development and assessment of infrastructures, or sites of assemblage, that support these standards.

This research is guided by two perspectives: community practices and semantic interoperability. The first perspective assesses how organizations, individuals, and apparatuses engage with and contribute to the consensus-making processes surrounding LOUD. By examining these practices, the social fabrics of the LOUD ecosystem can be better understood. The second perspective focuses on making data meaningful to machines in a standardized, interoperable manner that promotes the exchange of well-formed information. This research is grounded in the SNSF-funded project, Participatory Knowledge Practices in Analogue and Digital Image Archives (PIA) (2021–2025), which aims to develop a citizen science platform for three photographic collections from the Cultural Anthropology Switzerland (CAS) archives. Actor-Network Theory (ANT) forms the theoretical foundation, aiming to describe the collaborative structures of the LOUD ecosystem and emphasize the role of non-human actors.

Beyond its implementation within the PIA project, this research includes an analysis of the social dynamics within the IIIF and Linked Art communities and an investigation of Yale’s Collections Discovery platform, LUX. The research identifies socio-technical requirements for developing specifications aligned with LOUD principles. It also examines how the implementation of LOUD standards in PIA highlights their potential benefits and limitations in facilitating data reuse and broader participation. Additionally, it explores Yale University’s large-scale deployment of LOUD standards, emphasizing the importance of ensuring consistency between Linked Art and IIIF resources within the LUX platform for the CH domain.

The core methodology of this thesis is an actor- and practice-centered inquiry, focusing on a detailed examination of specific cosmologies within LOUD-driven communities, PIA, and LUX. This micro-perspective approach provides rich empirical evidence to unravel the intricate web of cultural processes and constellations in these contexts.

Key empirical findings indicate that LOUD enhances the discoverability and integration of data in CH, requiring community-driven consensus on model interoperability. However, significant challenges include engaging marginalized groups, sustaining long-term participation, and balancing technological and social factors. Strategic use of technology and the capture of digital materiality are critical, but LOUD also poses challenges related to resource investment, data consistency, and the broader implementation of complex patterns.

LOUD should lead efforts to improve the accessibility and usability of CH data. The community-driven methodologies of IIIF and Linked Art inherently foster collaboration and transparency, making these standards essential tools in evolving data management practices. Even for institutions and projects that do not adopt these specifications, the socio-technical practices of LOUD offer vital insights into effective digital stewardship and strategies for community engagement.

Keywords: Actor-Network Theory; Community of Practice; Cultural Anthropology Switzerland; Cultural Heritage; Digital Infrastructure; International Image Interoperability Framework; Knowledge Practices; Linked Art; Linked Data; LUX; Participatory Archives; Photographic Archives; Semantic Interoperability; Web Annotation Data Model

Table of Contents

  1. Introduction
  2. Context
  3. Interlinking Cultural Heritage Data
  4. Exploring Relationships through an Actor-Network Theory Lens
  5. Research Scope and Methodology
  6. The Social Fabrics of IIIF and Linked Art
  7. PIA as a Laboratory
  8. Yale’s LUX and LOUD Consistency
  9. Discussion
  10. Conclusion

1. Introduction

Since its inception in 2011, the IIIF has revolutionised[1] the accessibility of image-based resources. Initially driven by the needs of manuscript scholars, IIIF focused on two-dimensional images, but has since expanded to encompass a wide range of image-based resources, including audiovisual materials and, in the near future, 3D images. Similarly, Linked Art, formally established in 2017, initially concentrated on art museum objects but has since broadened its scope to model a variety of CH entities, leveraging CIDOC-CRM, a renowned ontology in the museum and DH space. Both initiatives aim to break down silos: IIIF focuses on improving the presentation of digital objects, Linked Art on their semantic description, and both enhance their dissemination. Together, they make CH data more accessible (through IIIF) and more meaningful to machines (through Linked Art). These efforts have primarily benefited the CH domain.

A key commonality is that the main APIs these communities create align with the LOUD design principles, either by design or as demonstrated empirically through use cases. These principles enable software developers to build compliant tools and services without needing to fully understand RDF, a framework for representing information on the web. Nor do developers need to grasp all LOD principles, which promote the interlinking of data from diverse datasets, for instance through KOS such as thesauri. WADM, a W3C standard, is also recognised as a LOUD specification. It provides a framework for creating interoperable annotations on web resources, facilitating the linking and sharing of data across different platforms and applications. The LOUD design principles are the right abstraction for the audience, few barriers to entry, comprehensibility by introspection, documentation with working examples, and few exceptions in favour of many consistent patterns. Additionally, both IIIF and Linked Art are driven by vibrant communities, mainly comprising GLAM and higher education institutions.
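As a minimal sketch of what "comprehensibility by introspection" can look like in practice, the following Python snippet builds a WADM annotation as a plain dictionary and serialises it as JSON-LD. The `example.org` identifiers, the comment text and the helper name `make_annotation` are hypothetical and serve only as illustration; the `@context` URL and the `type`, `motivation`, `body` and `target` keys follow the W3C Web Annotation Data Model.

```python
import json


def make_annotation(anno_id: str, target: str, comment: str) -> dict:
    """Build a minimal W3C Web Annotation as a JSON-LD document."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        # The body carries the comment itself as embedded plain text.
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target is the IRI of the resource being annotated.
        "target": target,
    }


# Hypothetical identifiers, used purely for illustration.
annotation = make_annotation(
    anno_id="https://example.org/annotations/1",
    target="https://example.org/images/photo-1.jpg",
    comment="Hikers reading a timetable.",
)

# The result is ordinary JSON: no RDF tooling is needed to read or write it.
print(json.dumps(annotation, indent=2))
```

A developer can inspect the keys and infer what the document means, while the `@context` still lets RDF-aware software interpret the same file as Linked Data.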

While the standards and principles discussed have broad applications, it is important to clarify the scope of this dissertation. This work does not focus on KGs by assessing triplestores – databases specifically designed to store and retrieve triples, which are the fundamental data structures in RDF. Similarly, it does not deal with evaluating SPARQL engines, which are specifically designed to query KGs. Additionally, this dissertation does not address the intersection of ML and IIIF, or the ontological reasoning of Linked Art.

Instead, this dissertation concentrates on LOUD, the consistency of its standards, design principles and the vibrant communities behind it. It examines JSON-LD serialisation efforts and the crucial intersection required to establish robust semantic interoperability baselines between presentation and semantic layers. It also presents real-world use case implementations, both on a small scale in a laboratory and flexible space within the PIA research project, and on a large scale at Yale, exemplified by the LUX platform that provides access to (meta)data from YUL, YCBA, YUAG, and YPM.

The focus is therefore on digital infrastructures capable of delivering JSON-LD files from the above specifications, which are primarily, though not exclusively, CH resources. It is more about the different actors – both human and non-human – that create and maintain these interconnected systems and the dynamic interactions that sustain them. The deployment of various LOUD specifications addresses the need for semantic interoperability between CH resources and disparate datasets by establishing a standardised approach to representing and linking data, ensuring that information can be seamlessly shared and understood across different platforms and contexts.
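To give a rough idea of what such a JSON-LD file may look like on the semantic side, here is a hedged sketch of a minimal Linked Art record for an object that points to a IIIF Manifest. The identifiers, the label and both function names are hypothetical. The `subject_of` / `access_point` / `conforms_to` pattern reflects my reading of Linked Art's guidance on referencing IIIF resources and is an assumption to check against the current Linked Art documentation.

```python
import json

LINKED_ART_CONTEXT = "https://linked.art/ns/v1/linked-art.json"


def make_object_record(object_id: str, label: str, manifest_url: str) -> dict:
    """Build a minimal Linked Art record that references a IIIF Manifest."""
    return {
        "@context": LINKED_ART_CONTEXT,
        "id": object_id,
        "type": "HumanMadeObject",
        "_label": label,
        "identified_by": [{"type": "Name", "content": label}],
        # Assumed pattern: a DigitalObject about the item, whose access
        # point is the Manifest and which declares the API it conforms to.
        "subject_of": [
            {
                "type": "DigitalObject",
                "_label": "IIIF Manifest",
                "access_point": [{"id": manifest_url, "type": "DigitalObject"}],
                "conforms_to": [
                    {
                        "id": "http://iiif.io/api/presentation",
                        "type": "InformationObject",
                    }
                ],
            }
        ],
    }


def manifest_urls(record: dict) -> list:
    """Collect the access points of all digital objects about the item."""
    return [
        point["id"]
        for digital_object in record.get("subject_of", [])
        for point in digital_object.get("access_point", [])
    ]


# Hypothetical identifiers, used purely for illustration.
record = make_object_record(
    object_id="https://example.org/object/1",
    label="Photograph of hikers",
    manifest_url="https://example.org/iiif/1/manifest",
)
print(json.dumps(record, indent=2))
print(manifest_urls(record))
```

The second function hints at the kind of consistency check that becomes possible once the semantic and presentation layers reference each other explicitly.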

This dissertation seeks to carve out a distinct niche by addressing an often-overlooked aspect of IIIF and Linked Art. IIIF is sometimes perceived and studied merely as a service or an appendix, with the content it delivers taking precedence. However, this PhD thesis positions IIIF as a first-class citizen worthy of in-depth study. Similarly, Linked Art, despite its potential and its relatively recent establishment, has been the subject of very few scholarly papers. This gap underscores the significance of LOUD in this context. Furthermore, this thesis elevates Linked Art to a position of primary importance, recognising its significance and advocating for its thorough examination. To thoroughly study LOUD and its adherence to design principles, it is essential to immerse ourselves actively in both communities – an approach I have embraced for years. The thesis also emphasises the importance of participatory efforts and collaboration between research projects, which typically have shorter lifespans, and memory institutions, which need to implement technical standards as a lingua franca. In doing so, it reveals the mediating role of LOUD in advancing the heritage sphere. To truly understand IIIF, Linked Art, and to a lesser extent WADM, it is crucial to examine the social fabrics and consensus decision-making of each community. Among these considerations are how the specifications can be implemented pragmatically, and how the standards can support the implementation and maintenance of more extensive semantic interoperability efforts.

The significance of this research lies in highlighting the commitment and diligence of the individuals and organisations that make up both the IIIF and Linked Art communities. It aims to demonstrate that community-driven practices, such as those exemplified by IIIF and Linked Art, have a potential that goes beyond the mere sharing of digital objects and their associated metadata. The more people who embrace these approaches and implement the associated specifications, the more society as a whole will benefit. Furthermore, this research illustrates that IIIF is no longer limited to two-dimensional images, that Linked Art is not restricted to artworks, and that WADM is a simple, content-agnostic standard that can be easily integrated into a range of systems. This adaptability is a strength of LOUD standards, which are designed to be simple yet effective. LOUD can serve a variety of purposes, primarily rooted in CH, but with the potential to extend its benefits to other sectors. The true beauty of LOUD lies in its ability to foster networking opportunities and transparent socio-technical practices, demonstrating its value beyond mere technical implementation.

By emphasising these aspects, this dissertation highlights the wider impact of LOUD in promoting semantic interoperability and enhancing collaborative efforts within the heritage field and beyond. In addition, the implementation of standards through PIA underlines the potential for similar participatory or citizen science projects, while the LUX initiative serves as an illustrative example of robust infrastructure and cross-unit engagement. These examples demonstrate the practical applications and far-reaching implications of adopting LOUD standards in different contexts.

This dissertation is structured across ten chapters. Chapters 1 to 5 build on one another and lay the foundation of the study, establishing the context, theoretical framework, and methodological approaches. After this foundational section, Chapters 6, 7, and 8 present empirical studies that, while interconnected, can be read independently if desired. These chapters offer detailed insights into specific aspects of the research and can be appreciated on their own or as part of the broader narrative.

The thesis continues with Chapter 2, which extends this introduction by providing more information about the research setting, specifically PIA. Chapter 3 follows with an extensive literature review, offering a comprehensive overview of methods to interlink CH data. Next, Chapter 4 presents the theoretical framework, conceptualised as a toolbox and firmly rooted in ANT, guiding the analysis and discussion throughout the dissertation. Following this, Chapter 5 details the research scope and methodology, explaining the approaches and methods employed in the study.

Moving on to the empirical work, Chapter 6 sheds light on the social fabrics of IIIF and Linked Art, exploring the communities and practices that underpin these initiatives. Chapter 7 then examines the implementation of LOUD standards within PIA, highlighting the practical aspects and challenges encountered. This is followed by Chapter 8, which focuses on the LUX initiative at Yale, examining the underlying governance and interdepartmental ownership of the Yale Collections Discovery platform. The discussion of findings is presented in Chapter 9, where the results from the empirical chapters are synthesised and analysed in relation to the theoretical framework. Finally, Chapter 10 concludes the thesis, summarising the key insights and contributions of the research while outlining potential directions for future study.

2. Context

In this chapter, I will set the stage for my PhD thesis by providing important background information. First, in Section 2.1, I will explain why I chose the title for my thesis. This will give you an understanding of the main focus and the direction of my research. Next, in Section 2.2, I will describe the PIA research project, which is central to my work. This section will cover the project’s goals, significance, and overall framework. In Section 2.3, I will detail my specific contributions to the PIA project. I will emphasise how my work fits into the larger project and its importance to my thesis. Finally, in Section 2.4, I will talk about my active participation in the IIIF and Linked Art communities. This section will highlight how my involvement in these communities has influenced my research and its broader implications.

2.1 PhD Title

I chose the title ‘Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability’ because it encapsulates the essence of my research focus, although other titles would have been possible.

During the initial stages of my research, I explored multiple working titles to capture the diverse facets of my interests and objectives. Although I was quite sure about having ‘Linked Open Usable Data’ in the title after the third iteration, I was unsure of what should follow and whether a subtitle was needed at all. Throughout this progression, the underlying theme of my research remained steadfast: to delve into the transformative potential of LOUD for CH.

I also opted to retain ‘Cultural Heritage’ in the title of my thesis. While other terms hold their appeal, my choice reflects a broader narrative that acknowledges the crucial role of CHIs and spotlights the multifaceted nature of heritage preservation, encapsulating both its digital facets and the essential contribution of individuals and institutions in curating, interpreting, and making heritage accessible.

As for the subtitle, while I do explore CoP as defined by Lave & Wenger (1991) and Wenger (2011) through investigating the social fabrics of the IIIF and Linked Art communities, my main interest lies in the broader application of LOUD for describing and interlinking CH resources. Thus, I decided to opt for the more generic ‘Community Practices’ as the first axis or perspective.

For the second perspective, I wanted to see how semantic interoperability can be achieved through standards adhering to the LOUD design principles, as they seem to be key enablers for seamless collaboration and knowledge exchange among practitioners. There was a time in my research when I envisaged decoupling ‘Community Practices’ and ‘Semantic Interoperability’, perceiving them as two distinct dimensions. However, what really captivates me is the unification of these factors to facilitate collective reasoning for both humans and machines.

In summary, this title reflects my enthusiasm for using web-based and community-driven technologies to transform the way we understand, share and value CH.

2.2 The PIA Research Project

I undertook my doctoral studies within the scope of the PIA research project financed by the SNSF under their Sinergia funding scheme from February 2021 to January 2025[2]. The project aimed to analyse the interplay of participants, epistemological orders and the graphical representation of information and knowledge in relation to three photographic collections from CAS. It sought to bring together the world of data and things in an interdisciplinary manner, exploring the phases of the analogue and digital archive from a cultural anthropological, technical and design research perspective (Felsing et al., 2023, p. 42). As part of this endeavour, interfaces were developed to enable the collaborative indexing and use of photographic archival records (Chiquet et al., 2023, p. 110). Subsection 2.2.1 discusses the interdisciplinary components in more detail and briefly introduces the people involved in the project, Subsection 2.2.2 presents the photographic collections that formed the overarching narrative of the research, and Subsection 2.2.3 outlines the vision that we put together.

The project, divided into three interdisciplinary teams, was led by the University of Basel through the Institute for Cultural Anthropology and European Ethnology[3] (Team A) and the DHLab[4] in collaboration with the DBIS group (Team B) as well as by the HKB[5], an art school and department of the Bern University of Applied Sciences (Team C) (Felsing et al., 2023, p. 43). Table 2.1 lists the people who contributed to the project, broken down by the three teams and their particular perspectives.

Table 2.1: PIA Team Core Members
Perspective | People
A) Anthropological | Prof. Dr. Walter Leimgruber, Team Leader and Dissertation Supervisor
A) Anthropological | Dr. Nicole Peduzzi, Photographic Restoration and Digitisation Supervisor
A) Anthropological | Regula Anklin, Conservation and Restoration Specialist (project partner at Anklin & Assen)
A) Anthropological | Murielle Cornut, PhD Candidate in Cultural Anthropology
A) Anthropological | Birgit Huber, PhD Candidate in Cultural Anthropology
A) Anthropological | Fabienne Lüthi, PhD Candidate in Cultural Anthropology
B) Technical | Prof. Dr. Peter Fornaro, Team Leader and Dissertation Supervisor
B) Technical | Prof. Dr. Heiko Schuldt, Dissertation Supervisor (project partner at the University of Basel)
B) Technical | Dr. Vera Chiquet, Postdoctoral Researcher
B) Technical | Adrian Demleitner, Software Developer (2021-2023)
B) Technical | Fabian Frei, Software Developer (2023-2025)
B) Technical | Christoph Rohrer, Software Developer (2023-2025)
B) Technical | Julien A. Raemy, PhD Candidate in Digital Humanities
B) Technical | Florian Spiess, PhD Candidate in Computer Science
C) Communicative | Dr. Ulrike Felsing, Team Leader and Dissertation Supervisor
C) Communicative | Prof. Dr. Tobias Hodel, Dissertation Supervisor (project partner at the University of Bern)
C) Communicative | Daniel Schoeneck, Research Fellow
C) Communicative | Lukas Zimmer, Designer (project partner at A/Z&T)
C) Communicative | Max Frischknecht, PhD Candidate in Digital Humanities

2.2.2 Photographic Collections/Archives as Anchors

CAS has historically been engaged in active collaborations that bridge academic research and the public sphere, primarily through traditional analogue methods. The PIA project was created to explore the complexities inherent in both analogue and digital approaches, and to encourage and investigate these collaborations between academia and the wider public. As such, PIA represents a paradigm shift among the projects associated with or supported by CAS, integrating digital tools to explore multiple facets of participation and engagement. It explores new intersections where scholarly work intertwines with the active involvement of citizens.

PIA drew on three collections: one focusing on scientific cartography and titled Atlas der Schweizerischen Volkskunde (ASV), a second from the estate of the photojournalist Ernst Brunner (1901–1979), and a third consisting of vernacular photography that was owned by the Kreis Family (1860–1970).

SGV_05 ASV consists of 292 maps and 1000 pages of commentary published from 1950 to 1995; an example of such a map is shown in Figure 2.1. The collection stems from an extensive survey of the Swiss population commissioned by CAS in the 1930s and 1940s on many issues pertaining, for instance, to everyday life, local laws, superstitions, celebrations or labour (Weiss, 1940). The contents were compiled by researchers and by people who were described as [6]. Questions were asked about everyday habits, community rights, work, trade, superstitions, and many other topics (Schmoll, 2009b, 2009a). This collection offers a snapshot of everyday life in Switzerland right before the beginning of a modernisation process that fundamentally changed lifestyles in all areas during the postwar period. A digitised version of the ASV would not only allow the results of that time to be enriched with further findings (Felsing & Frischknecht, 2021), but would also make transparent how knowledge was generated in cartographic form through a complex process involving different types of media and actors. The restoration, digitisation, cataloguing and indexing efforts all took place throughout PIA under the supervision of Birgit Huber, who based her doctoral research extensively on this particular collection (see Huber, 2023).

Figure 2.1: Map from the SGV_05 Collection Relating to Question 93 Showing Walks and Excursions at Pentecost. ASV. CAS. CC BY-NC 4.0

SGV_10 Kreis Family comprises approximately 20,000 loose photographic objects, a quarter of which are organised and kept in 93 photo albums, as illustrated by Figure 2.2. The collection comes from a wealthy Basel-based family and spans from the 1850s to the 1980s. This private collection was acquired by CAS in 1991. The collection, which originally arrived in banana cases and was enigmatic due to the lack of clear organisation or accompanying information from the family, posed significant challenges. Despite these initial hurdles, CAS undertook meticulous efforts to catalogue and preserve its contents (Felsing & Cornut, 2024, p. 42). The pictures were taken by studio photographers as well as by family members themselves. The Kreis Family collection represents a typical example of urban bourgeois culture and gives a comprehensive insight into the development of private photography over the course of a century (Pagenstecher, 2009). The photographic materials and formats are very diverse, ranging from prints to negatives, small, medium or large format photographs, black and white or colour. The collection also encompasses many photographic techniques, from the one-off daguerreotypes and ferrotypes, to the glass-based negatives that could be reproduced en masse, to the modern paper prints. While some of the albums and loose images were restored and digitised during the 2014 project, much of this work was completed during PIA and overseen by Murielle Cornut, whose doctoral investigation was centred on the study of photo albums (see Cornut, 2023).

Figure 2.2: A photo Album Page from the SGV_10 Collection, Bearing the Following Inscription: Botanische Excursion ins Wallis, Pfingster 1928. SGV_10A_00031_015. Kreis Family. CAS. CC BY-NC 4.0

SGV_12 Ernst Brunner is a donation of about 48,000 negatives and 20,000 prints to the CAS archives from Ernst Brunner, a self-taught photojournalist who lived from 1901 to 1979 and who documented a wide range of folkloristic themes, mainly in the 1930s and 1940s, as shown by Figure 2.3. He is one of the most important photographers of the era and one of the most outstanding visual chroniclers of Swiss society (Pfrunder, 1995). His photographs show rural lifestyles, but also urban motifs. In his late work, he led the documentation and research on farmhouses in a specific Swiss district, a project initiated by CAS. Before Ernst Brunner became an independent photojournalist in the mid-1930s, he worked as a carpenter, influenced by the ideas of the Bauhaus and Neues Bauen movements. This can also be seen in the aesthetics and formal language of his photography. While all the black and white negatives were digitised and recorded between 2014 and 2018, the digitisation of the prints, which are a selection made by Ernst Brunner himself, was conducted at the end of the PIA research project. The latter was supervised by Fabienne Lüthi, whose PhD was about organisational systems and knowledge practices in the Ernst Brunner Collection.

Figure 2.3: Picture from the SGV_12 Collection Showing Walkers Looking at the Train Timetable. [Wanderer studieren den Fahrplan in der Bahnhofhalle]. Lucerne, 1938. Ernst Brunner. SGV_12N_00716. CAS. CC BY-NC 4.0

Whereas each of the PhD Candidates in Cultural Anthropology was assigned a particular collection, whose content was to varying degrees part of their subject of study, this was not the case for the PhD Candidates in DH, including myself, and in Computer Science. Put differently, we had relative leeway in terms of what interested us in each or all of these three photographic collections. In my case, I briefly explain my contribution to the project in Section 2.3 and return to it in Chapter 7 as part of the empirical portion of my thesis, which focuses on the deployment of LOUD specifications using the three CAS photographic collections.

Florian Spiess focused on the use of VR through vitrivr, a multimedia retrieval system developed by the DBIS research group at the Department of Mathematics and Computer Science (Spiess et al., 2024; Spiess & Schuldt, 2022; Spiess & Stauffiger, 2023). His work included experiments with PIA-related collections, such as the creation of virtual galleries clustered according to content-based similarity (see Peterhans et al., 2022). In the case of Max Frischknecht, his doctoral research centred on generative design[7], a methodology to visualise dynamic cultural archives. He mostly worked on the ASV collection and on a mapping tool which is a cartographic visualisation designed to explore the CAS photographic archives (see Frischknecht, 2022; Huber & Frischknecht, 2024).

It should also be mentioned that we not only used the three collections of the CAS photographic archives within the project, but also held most of our formal and informal meetings within the photographic archives themselves, at the Spalenvorstadt premises in the old Gewerbemuseum. Later meetings took place either at the premises on Allschwilerstrasse, though less frequently, or at Rheinsprung, where the Institute for Cultural Anthropology and European Ethnology is located. This meant that there was a strong and sometimes blurred entanglement between those involved in the archives and the PIA core team members.

2.2.3 Project Vision

Between December 2021 and March 2022, we worked together to develop and finalise a vision for the project[8]. It includes seven key priorities, or pillars, which were meant to strengthen the interdisciplinary perspectives of PIA. Although ambitious, these elements were of paramount importance to us and served as a guiding blueprint for all PIA activities. What follows is a modified version of the vision[9] taken from Cornut et al. (2023, p. 4).

  1. Accessibility by developing open interfaces and offering the possibility of expanding the archive and turning it into an instrument of current research that collects and evaluates knowledge with the participation of other users (Citizen Science).

  2. Heterogeneity by making visible where, why and under what circumstances the objects were created, how they were handled and what path they have taken to get to and in the archive. We work on visualisations that take into account the heterogeneous character of archival materials and make their respective biographies visible.

  3. Materiality by conveying the material properties of the objects: they have front and back sides, inscriptions, traces, development errors, they are transparent, multi-layered or fabric-covered. They tell of their origin, use, and peculiarities. We want to make this knowledge accessible and understandable in digital form. To this end, we also consider the necessary infrastructure involved in the creation as part of their narrative: the restoration, the relocation, the indexing, the storage devices, the research tools, the display medium, as well as the process of repro-photography.

  4. Interoperability, a crucial component, by supporting digital means that allow different stakeholders to freely access and interact with the project’s data. Both humans and machines can use, contribute to, correct and annotate the existing data in an open and interoperable manner, thus encouraging exchange and the creation of new knowledge. To do this, we use web-based standards that are widely adopted in the cultural heritage field.

  5. Affinities by leveraging data models and pattern recognition which can uncover semantic relationships between entities that were previously incomplete or difficult for users to access. Using specific interfaces and visualisations, we make it possible to explore digital assets and discover forms of relationships and similarities between images.

  6. AI that facilitates automated searches for simple image attributes such as colour, shapes, and localisation of image components. It should also become possible to recognise texts and object types for extracting metadata.

  7. Bias Management by taking into account that associated metadata was human-made[10] and thus is never objective. Collections and their metadata reflect biases or focus narrowly on selected areas and perceptions. Machines working on the basis of such data automatically reproduce the implicit biases in decision-making due to so-called biased algorithms. Therefore, understanding the data used for training and the algorithms applied for decision making is crucial to ensure the integrity of the application of these technologies in archives. We take ethical issues into account when using AI and visualisations, because the higher the awareness of a possible bias, the faster it can be detected or brought up for consideration with users.

As my thesis is notably concerned with semantic interoperability, the Interoperability and Affinities pillars are of particular importance to it, although I recognise the importance of all seven. Each of these resonated with me and my fellow PhD Candidates. As we immersed ourselves in the vision of the PIA research project, it became a unifying thread that brought us together in our research ambitions. We found that all these priorities spoke to us at different points and provided a strong basis for communication and practice in the development of processes, prototypes or interfaces.

2.3 Contribution to PIA and its Relevance to the Thesis

To develop a participatory platform, an open and sustainable technological foundation for facilitating the reuse of CH resources was needed (Raemy, 2021). Throughout the PIA project, I was mainly involved in the extension of the data infrastructure, the uptake of IIIF, and the design of the data model, leveraging Linked Art and WADM (Raemy, 2024). As a member of Team B, I undertook this PhD as a bridge between the different teams. I mostly participated in discussions with the three doctoral candidates from Team A to further develop and agree on the CAS data model, and with the software developers from my team to discuss the impact of the data model on our evolving, yet transitory, infrastructure. I also helped to implement the APIs adhering to the LOUD design principles.
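To make the role of WADM in such a data model more concrete, the following Python sketch builds a minimal W3C Web Annotation that attaches a plain-text comment to a resource. It is only an illustration under stated assumptions: the function name, the example.org IRIs and the comment text are hypothetical and do not reflect the actual PIA data model or infrastructure.

```python
import json


def build_comment_annotation(annotation_id, target_iri, comment, language="en"):
    """Return a minimal W3C Web Annotation (as a dict) commenting on a target resource.

    All identifiers passed in are hypothetical placeholders chosen for illustration.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
            "language": language,
        },
        # The target is the IRI of the resource being commented on.
        "target": target_iri,
    }


if __name__ == "__main__":
    annotation = build_comment_annotation(
        "https://example.org/annotations/1",   # hypothetical annotation IRI
        "https://example.org/objects/123",     # hypothetical object IRI
        "This photograph may show a regional festival.",
    )
    print(json.dumps(annotation, indent=2))
```

Serialised as JSON-LD, such an annotation can be exchanged between systems that support the Web Annotation Data Model, which is one way a comment on an object can remain usable outside the platform where it was written.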

It was necessary to redesign the data model within the context of a database migration, from Salsah to the DSP, that happened between November 2021 and March 2024. This updated version, based on the Knora Base Ontology[11], corresponded to the needs of the CAS archives and to some extent to those of PIA, in particular to enable the PhD Candidates in Cultural Anthropology to make more precise assertions, whether in terms of descriptive metadata, or in the ability to link one object to another or to provide comments on these objects in several narrative forms.

Moreover, an assessment of the appropriate technical standards for improved usability of the objects by both humans and machines was carried out, as a basis for extending the capabilities provided by DaSCH, such as helping the software developers to implement SIPI[12], a C++ image server compatible with the IIIF Image API and build services that create IIIF Presentation API 3.0 resources.
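As an illustration of what compatibility with the IIIF Image API implies for a client, the sketch below assembles request URLs following the pattern defined by the specification: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The base URL and the identifier are hypothetical placeholders, not actual DaSCH or SIPI endpoints, and the helper names are my own.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    The identifier is percent-encoded (including any slashes) so that it
    occupies a single path segment, as the Image API expects.
    """
    encoded_id = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base_url, identifier):
    """Build the URL of the image information document (info.json)."""
    return f"{base_url.rstrip('/')}/{quote(identifier, safe='')}/info.json"


if __name__ == "__main__":
    base = "https://iiif.example.org/"   # hypothetical image server
    image_id = "SGV_12/photo 1.jp2"      # hypothetical identifier
    print(iiif_image_url(base, image_id))
    print(iiif_image_url(base, image_id, region="0,0,512,512", size="256,"))
    print(iiif_info_url(base, image_id))
```

Because the request parameters are part of the URL itself, any compliant image server can be queried in the same way, which is what makes the images reusable across viewers and institutions.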

While the theoretical framework of the thesis extends across the scope of PIA, the empirical part focuses on a specific set of findings derived from the research project outlined in , under the title . In this chapter, I discuss the data model and its refinement as well as the generation of custom IIIF Manifests during the specific digitisation, cataloguing and indexing efforts that took place throughout the project for the three CAS collections (SGV_05, SGV_10 and SGV_12) under investigation, the implementation of LOUD standards, and the overall design of the technological underpinnings.
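To give a rough idea of what generating an IIIF Manifest involves, the following Python sketch assembles a minimal IIIF Presentation API 3.0 Manifest with a single Canvas painted by one image. It is a simplified illustration and not the code used in PIA: the function name, the example.org IRIs, the label, the pixel dimensions and the declared "level2" service profile are all assumptions made for the example.

```python
import json


def build_manifest(manifest_id, label, image_service, width, height, language="en"):
    """Return a minimal IIIF Presentation API 3.0 Manifest as a dict.

    image_service is the base IRI of an IIIF Image API 3.0 service
    (hypothetical here); width and height are the image dimensions in pixels.
    """
    canvas_id = f"{manifest_id}/canvas/1"
    page_id = f"{canvas_id}/page/1"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {language: [label]},
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{
                "id": page_id,
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{page_id}/annotation/1",
                    "type": "Annotation",
                    # "painting" means the image is the content of the Canvas.
                    "motivation": "painting",
                    "body": {
                        "id": f"{image_service}/full/max/0/default.jpg",
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": width,
                        "height": height,
                        "service": [{
                            "id": image_service,
                            "type": "ImageService3",
                            "profile": "level2",  # assumed compliance level
                        }],
                    },
                    "target": canvas_id,
                }],
            }],
        }],
    }


if __name__ == "__main__":
    manifest = build_manifest(
        "https://example.org/iiif/12345/manifest",  # hypothetical Manifest IRI
        "Example photograph",
        "https://iiif.example.org/12345.jp2",       # hypothetical image service
        width=4000,
        height=3000,
    )
    print(json.dumps(manifest, indent=2))
```

A real Manifest would normally carry further properties such as metadata, rights and required statements, which are omitted here to keep the structure of Manifest, Canvas, AnnotationPage and Annotation visible.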

2.4 Involvement within the IIIF and Linked Art communities

I must acknowledge the invaluable role that my involvement within the IIIF and Linked Art communities has played in shaping my journey as a trained information specialist and an aspiring DH practitioner. Being an active participant in both communities has not only broadened my understanding of the latest developments in the field but has also profoundly influenced the trajectory of this dissertation.

I have been involved within the IIIF community since October 2016 and the Working Groups Meeting that happened in The Hague[13]. This significant journey was, in fact, initiated by a recommendation from my first supervisor, Peter Fornaro, during my time as an undergraduate doing an internship at the DHLab. Little did I know that this recommendation would lead me to carry out a PhD and to look at IIIF not only as a set of community-driven standards but also as an object of study. Engaging with the IIIF community exposed me to cutting-edge advances in image interoperability and standards, and fostered a deeper appreciation for the importance of digital representations of cultural heritage. Through collaborative discussions with experts from diverse backgrounds, I gained new perspectives on the potential of technology to advance humanities research and preserve our collective cultural memory.

Similarly, my involvement in the Linked Art community introduced me to the opportunities offered by LOUD and its transformative impact on research discourse. Exposure to Linked Data methodologies and the CIDOC-CRM has significantly influenced the way I have structured and interpreted the data in this dissertation, thereby enriching its scholarly breadth and rigour. I started to be actively involved in Linked Art at the beginning of my PhD in 2021, but I was already a by 2020, driven by the efforts of Rob Sanderson, my third supervisor. By mid-2023, I had become a member of the Editorial Board.

The individuals I have met and the knowledge shared in these vibrant communities have deeply informed my approach as a scholar. The invaluable connections and collaborations I have made have expanded my network of fellow researchers, educators, and experts, leading to fruitful discussions that have significantly shaped the research questions addressed in this thesis. The events and workshops organised by these communities have also provided immersive learning experiences, giving me first-hand insights into the tools, technologies and methodologies used in the context of describing and disseminating CH data. The dynamic ecosystem of these communities has served as an inspiring backdrop, fostering innovative thinking and encouraging a more holistic approach to my research.

3. Interlinking Cultural Heritage Data

Interlinking CH data is an important aspect of publishing heritage collections over the web, in particular by using LOD technologies to make assertions more easily readable and meaningful to machines (Marcondes, 2021). Due to the complexity of CH data and their intrinsic inter-relationships, it is necessary to define their nature and introduce controlled vocabularies and ontologies that can be integrated with existing web standards and interoperable with relevant platforms (Bruseker et al., 2017; Hyvönen, 2020).

Efforts to interlink CH data have brought about significant advancements, but challenges remain. One such challenge is finding a balance between completeness and precision of expression to ensure that CH data remain accessible and usable to a wider audience. Addressing this challenge, the Linked Open Usable Data (LOUD) design principles and the specifications that adhere to them, such as the IIIF Presentation API 3.0 and Linked Art, offer a promising approach (Raemy et al., 2023). By focusing on usability aspects from the perspective of software developers and data scientists involved in designing visualisation tools and data aggregation approaches, LOUD strives to enhance the overall user experience (Sanderson, 2019).

Finding this equilibrium becomes crucial as CH data continues to grow in complexity and size, necessitating the seamless integration of native web technologies. The LOUD concept cultivates an environment that encourages the formation of vibrant CoP and the seamless integration of native web technologies, wherein an essential principle is the availability of comprehensive documentation supplemented with practical examples (Raemy, 2022). Moreover, the emphasis on leveraging widely adopted technologies enhances the interoperability of data and promotes its wider dissemination. With LOUD principles guiding the linking of CH data, the resulting web of knowledge becomes more than just a machine-readable resource; it transforms into a user-centric ecosystem where both accessibility of Linked Data and usability intersect to enable scholars and a wider audience to engage in the exploration and appreciation of CH (Newbury, 2018). Finally, by fostering a collaborative, knowledge-sharing mindset, LOUD empowers software developers to implement data in a robust way, drawing insights from shared experiences (see Page et al., 2020).

In this chapter, which serves as the literature review of the PhD thesis, I build on this brief introduction by dividing the insights into seven sections in order to provide an overview of the key concepts related to interlinking data in the CH domain. The literature review primarily encompasses works published up until December 2023, providing a comprehensive snapshot of the field’s current state and its evolution. Section 3.1 discusses what makes CH data stand out and Section 3.2 is about CH metadata standards, while Section 3.3 explores the technological trends, scientific movements and guiding principles that have shaped the field. Section 3.4 provides an overview of the web as an open platform, which is essential to understanding the current landscape of interlinking CH data. Section 3.5 focuses on LOUD, while Section 3.6 looks at characterising the community practices and semantic interoperability dimensions for CH. Finally, in Section 3.7, I summarise key elements from each section, give some initial thoughts on each with respect to LOUD, and then conclude the chapter with some considerations on why we as a society need to care about CH data.

3.1 What Makes Cultural Heritage Data Stand Out?

Here, I aim to establish the indirect territory of my study, as I am situated on a distinct plane that focuses on web technologies and standards — as well as software and services that enable them — as the subjects of investigation. However, it is crucial to acknowledge that LOUD specifications owe their existence to the available data that have served as case studies. Thus, their significance can be best understood through the lens of data and I recognise here the pivotal role played by CH practitioners — encompassing individuals from research and memory institutions — who have had a significant impact on specifying a series of web-based standards and who have helped to move forward the discovery of CH data and beyond, in particular those belonging to the public domain, in an open manner.

In Subsection 3.1.1, I provide an introduction to CH as recognised by UNESCO. I explore the tangible, intangible, and natural dimensions of CH, laying the foundation for further discussions on its representation and preservation, notably by giving a first definition of CH data. Next, in 3.1.2, I look at the challenges of representation and embodiment of CH data, examining the difficulties in describing and preserving its materiality or embodied aspects. Thirdly, in 3.1.3, I discuss what I call ‘Collectives and Apparatuses’, where I highlight the significance of collective efforts, communities, and the interplay of technologies, and how actors, in terms of collaborative actions and apparatuses, play a pivotal role in CH.

3.1.1 Cultural Heritage

The legacy of CH encompasses physical artefacts and intangible aspects inherited from past generations, reflecting the history and traditions of societies. Meanwhile, CH constantly evolves due to complex historical processes, necessitating preservation and protection efforts to prevent its loss over time (Loulanski, 2006). The dynamic nature of CH demands collaborative actions, including documentation and the use of a range of technologies.

The concept of CH is also characterised by perpetual evolution, mirroring the historical processes that shape societies over time. Social, political, economic, and technological shifts invariably influence the definition and perception of CH, prompting continuous reinterpretations and reevaluations of its significance. Over the years, the enthusiasm for the protection of cultural property has enriched the term with new shades of meaning. As societies undergo transformations, new layers of meaning and relevance are superimposed on existing CH, perpetually enriching its essence. As articulated by (Ferrazzi, 2021, p. 765):

‘Cultural heritage’, as an abstract legacy or as a merge of tangible and intangible values, is able to encompass the totality of culture(s); in so, assuming a symbolic value that brings a clear break with all other terminologies. In conclusion, ‘cultural heritage’ as a legal term has demonstrated more than any others to be a real ensemble of historical stratification and cultural diversity.

The advent of globalisation and rapid advancements in technology have further accelerated the evolution of CH. Increased interconnectedness and cross-cultural interactions have led to the fusion of traditions and the emergence of novel cultural expressions. Moreover, the digital era has facilitated the dissemination of CH resources on a global scale, transcending geographical barriers and preserving cultural knowledge for future generations (Portalés et al., 2018).

Thus, the intriguing nature of CH resources can be attributed to their multifaceted and diverse characteristics. The conservation and promotion of these resources demand a nuanced comprehension of the various types of heritage resources, culminating in effective preservation and promotion strategies that can account for their heterogeneity (Windhager et al., 2019).

According to (UNESCO Institute for Statistics, 2009), CH includes tangible and intangible heritage. Tangible CH refers to physical objects such as artworks, artefacts, monuments, and buildings, while intangible CH comprises practices, knowledge, folklore and traditions that hold cultural significance (Munjeri, 2004). The concept of heritage has evolved through a process of extension to include objects that were not traditionally considered part of the heritage. The criteria for selecting heritage have also changed, taking into account cultural value, identity, and the ability of the object to evoke memory. This shift has led to the recognition and protection of intangible CH, challenging a Eurocentric perspective and embracing cultural diversity as a valuable asset for humanity (Vecco, 2010).

Conservation guidelines have broadened the concept of heritage to include not only individual buildings and sites but also groups of buildings, historical areas, towns, environments, social factors, and intangible heritage (Ahmad, 2006). In 2014, another instance of UNESCO defined CH in an even more comprehensive manner, taking into account natural heritage:

Cultural heritage is, in its broadest sense, both a product and a process, which provides societies with a wealth of resources that are inherited from the past, created in the present and bestowed for the benefit of future generations. Most importantly, it includes not only tangible, but also natural and intangible heritage. (UNESCO. Culture for Development Indicators, 2014, p. 130)

In thinking about the concept of CH, I find this last definition particularly resonant. This broader perspective is motivated by my interest in LOUD specifications as a research area, particularly because of their notable data agnosticism, and because it resonates with the subdivision of CH proposed by (Hyvönen, 2012) [pp. 1-3] as well. These specifications have the adaptability to process and use different types of data, transcending the boundaries of specific domains or disciplines. Although grounded in concrete CH cases, their potential to extend to any type of data, including those from STEM, is a compelling prospect that warrants further exploration, a point that I will return to later.

The following sub-subsections aim to briefly discuss tangible, intangible, and natural heritage, as well as providing a definition of CH data which can serve as a foundational reference for this thesis.

3.1.1.1 Tangible Heritage

Tangible CH encompasses physical artefacts and sites of immense cultural significance that are passed through generations in a society (Vecco, 2010). These objects are tangible manifestations of human creativity, representing artistic creations, architectural achievements, archaeological remains as well as collections held by CHIs.

One aspect of tangible CH is artistic creations such as paintings, sculptures and traditional handicrafts. These artefacts embody cultural values and artistic expressions and serve as essential reflections of a society’s collective ethos. For example, artworks such as ‘Irises’ by Vincent van Gogh[14] and Alberto Giacometti’s ‘L’Homme qui Marche I’[15] are revered works of art that have deep cultural significance in Europe and all over the world.

The built heritage, including monuments, temples and historic buildings, is another important component of the tangible CH. These architectural marvels not only represent past civilisations, but also convey the social values and aspirations of their time. The Taj Mahal, an exemplary white marble structure in India, stands as a poignant testament to Mughal architecture. Closer to where I write this dissertation one can mention the Abbey of St Gall, a convent from the century which is inscribed on the UNESCO World Heritage List. In the context of urban heritage, conventional definitions of built heritage often focus narrowly on the architectural and historical value of individual buildings and monuments, which are well protected by existing legislation. However, the challenge is to preserve urban fragments - areas within towns and cities that may not qualify as designated conservation areas, but are of significant cultural and morphological importance (Tweed & Sutherland, 2007). For instance, (Rautenberg, 1998) proposes two categories of built CH: heritage by designation and heritage by appropriation. Heritage by designation involves experts conferring heritage status on sites, buildings, and cultural objects through a top-down approach, often without public participation. This method can be predictable and uncontroversial, but can be criticised for being elitist and neglecting unconventional heritage. On the other hand, heritage by appropriation emphasises community and public involvement in identifying and preserving cultural expressions, leading to a more inclusive and dynamic understanding of heritage.

Archaeological sites are also an integral part of the tangible CH, offering invaluable insights into past societies and ways of life. As of May 2024, UNESCO's long list of World Heritage Sites includes 1,199 cultural and natural sites in 168 different States Parties, including 48 sites in transboundary regions[16]. Sites such as Machu Picchu, an impressive Inca citadel in the Peruvian Andes, bear witness to the architectural achievements and cultural practices of ancient civilisations. Although archaeological sites are invaluable, they face significant threats such as looting, destruction, exploitation, and extreme weather phenomena (Bowman, 2008; Micle, 2014). To safeguard them, conservation efforts must be case-specific and include documentation and assessment of experiences gained (Aslan, 1997).

The preservation of tangible CH extends beyond physical objects to include libraries, archives and museums that house collections of books, manuscripts, historical documents and artefacts.

Incidentally, the term “cultural property” is also employed as a related concept to tangible CH, encompassing both movable and immovable properties as opposed to less tangible cultural expressions (Ahmad, 2006). Cultural property is protected by a number of international conventions and national laws. For instance, the Blue Shield[17], an international organisation established in 1996 by four non-governmental organisations[18], aims to protect and preserve heritage in times of armed conflict and natural disasters (Van der Auwera, 2013). Its mission was revised in 2016:

The Blue Shield is committed to the protection of the world’s cultural property, and is concerned with the protection of cultural and natural heritage, tangible and intangible, in the event of armed conflict, natural- or human-made disaster. (Blue Shield, 2016 art. 2.1)

Overall, tangible CH is a testament to human ingenuity and cultural diversity, and serves as a bridge between the past and the present. Its preservation is a collective responsibility, ensuring that the legacy of past generations endures and the wealth of cultural diversity continues to enrich the fabric of society.

3.1.1.2 Intangible Heritage

The concept of intangible heritage emerged in the 1970s and was coined at the UNESCO Mexico Conference in 1982 (Leimgruber, 2010) with the aim of protecting cultural expressions that were previously excluded from preservation efforts (Hertz et al., 2018). UNESCO's previous focus had been on material objects, primarily from wealthier regions of the global North, leaving the intangible cultural heritage of the South overlooked. Attempts to protect intangible heritage through legal measures like copyright and patents were ineffective due to the collective nature of these cultural expressions and the anonymity of creators. The UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage acknowledges that intangible CH is essential for cultural diversity and sustainable development.

Below is the definition given by the Convention for the Safeguarding of the Intangible Cultural Heritage:

‘The Intangible Cultural Heritage’ means the practices, representations, expressions, knowledge, skills – as well as the instruments, objects, artefacts and cultural spaces associated therewith – that communities, groups and, in some cases, individuals recognize as part of their cultural heritage. This intangible cultural heritage, transmitted from generation to generation, is constantly recreated by communities and groups in response to their environment, their interaction with nature and their history, and provides them with a sense of identity and continuity, thus promoting respect for cultural diversity and human creativity. (UNESCO, 2022)

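According to UNESCO, intangible CH can be manifested in the following domains: oral traditions and expressions, including language as a vehicle of the intangible cultural heritage; performing arts; social practices, rituals and festive events; knowledge and practices concerning nature and the universe; and traditional craftsmanship.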

Overall, intangible CH is a multifaceted concept that encompasses both traditional practices inherited from the past and contemporary expressions in which diverse cultural groups actively participate (Leimgruber, 2008; Munjeri, 2004). It includes inclusive elements shared by different communities, whether they are neighbouring villages, distant cities around the world, or practices adapted by migrant populations in new regions. These expressions have been passed down from generation to generation, evolving in response to their environment, and play a crucial role in shaping our collective identity and continuity. Intangible CH promotes social cohesion, strengthens a sense of belonging and responsibility, and enables individuals to connect with different communities and society at large.

Central to the nature of intangible CH is its representation within communities. Its value goes beyond mere exclusivity or exceptional importance; rather, it thrives on its association with the people who preserve and transmit their knowledge of traditions, skills and customs to others within the community and across generations. The recognition and preservation of intangible CH depends on the communities, groups or individuals directly involved in its creation, maintenance and transmission. Without their recognition, no external entity can decide on their behalf whether a particular practice or expression constitutes their heritage. The community-based approach ensures that intangible CH remains authentic and deeply rooted in the living fabric of society, protected by those who care for and perpetuate it.

In Switzerland, the Winegrowers’ Festival in Vevey (La Fête des Vignerons), a centuries-old event celebrating the world of wine making (Vinck, 2019), and the Carnival of Basel (Basler Fasnacht) (Chiquet, 2023) are examples of traditions that are listed among UNESCO's intangible CH.

(In)tangibility is not always a straightforward concept and can indeed be blurred, i.e. it goes beyond the mere idea of materialisation. Many artefacts and elements of CH possess both tangible and intangible qualities that intertwine and complement each other, making the distinction less clear-cut.

For instance, this Male Face Mask, also known as ‘Zamble’, from the Guro people in the Ivory Coast and held at the Art Institute of Chicago[19], holds dual significance as both tangible and intangible CH. As a tangible object, the mask is a physical artefact made from wood, pigment, fabric, and various adornments that combines animal and human features and represents the Guro people’s artistic skills. On the other hand, as an intangible cultural object, the Zamble mask carries profound spiritual and cultural meaning. It plays a significant role in commemorating the deceased during a man’s second funeral. These second funerals are organised months or even years after the actual burial as a way to honour and remember the departed (see Haxaire, 2009). Thus, the preservation and appreciation of both the tangible and intangible aspects of the mask are essential to its cultural relevance.

Another example of the blurred line between tangible and intangible heritage is emphasised by (De Muynke et al., 2022) in recreating reported perceptions of the acoustics of Notre-Dame de Paris through a collaboration between sciences of acoustics and anthropology. The authors highlight the heritage value of how people subjectively perceive sound in a space, particularly in places of worship where sound and music are integral to the religious experience. The authors advocate integrating the study of both material and non-material aspects to understand the changing sonic environments of heritage buildings (De Muynke et al., 2022, pp. 1-2). (Katz, 2023) articulates that ‘acoustics is an intangible product of a tangible building’. This integrated perspective could lead to a more holistic understanding of the dynamics between physical spaces and the perceptual and experiential dimensions attached to them.

3.1.1.3 Natural Heritage

Natural heritage, encompassing geological formations, biodiversity, and ecosystems of cultural, scientific, and aesthetic value, shares a significant overlap with CH. Many natural sites hold spiritual and symbolic importance for communities, becoming repositories of cultural memory and identity (Lowenthal, 2005). Traditional ecological knowledge developed by various cultures also underscores the interconnectedness of cultural and natural heritage, as indigenous communities have accumulated wisdom on sustainable resource use and ecological balance (Azzopardi et al., 2023). Moreover, the conservation and sustainable management of natural heritage is often intertwined with efforts to protect CH, fostering a collective commitment to preserve these entangled legacies for future generations.

The link between natural heritage and CH goes beyond their shared values; spatial overlaps further accentuate their interdependence. Natural sites may have cultural significance, while CH sites may be situated within natural landscapes. For example, a national park may include archaeological sites or culturally revered landscapes, thus intertwining the cultural and natural dimensions. This spatial intermingling highlights the inextricable relationship between human societies and the natural environment, as cultural practices and beliefs become intertwined with the landscapes they inhabit. In this way, the preservation of both natural and cultural heritage becomes essential not only for their intrinsic worth but also for sustaining the narrative of our shared human and environmental history.

Additionally, the distinction between nature and culture is not merely subjective and dependent on human appreciation (Vandenhende & Van Hoorick, 2017). Rather, it is a concept intrinsically linked with the overarching framework of modernism, a perspective that has been critically examined and deconstructed by the influential sociologist and philosopher Bruno Latour, who argued that ‘we have never been modern’ (Latour, 1993). Latour’s deconstruction of the modernist perspective extends to the recognition that ‘the proliferation of hybrids has saturated the constitutional framework of the moderns’ (Latour, 1993, p. 51). This assertion underscores the fundamental challenge posed by hybrid entities – those that blur the boundaries between nature and culture – to the traditional categories upon which modernist thinking has been predicated. In essence, the concept of hybrids disrupts the neat divisions between the natural and social worlds that have been a hallmark of modernist discourse and provides us with an opportunity to situate ourselves as ‘amodern’ as opposed to postmodern (Latour, 1990).

In addition to Latour’s critique of the modernist distinction between nature and culture, the concept of the ‘parasite’, as expounded by Michel Serres, one of the thinkers who significantly influenced Latour’s intellectual development (Berressem, 2015), offers a valuable lens through which to examine the intricacies of interconnectedness and interdependence within our world. In Serres’s view, everything is enmeshed in a complex web of relationships that negates the existence of self-contained entities. Rather than seeing discrete and isolated entities, Serres invites us to see everything as an integral part of a larger system in which each component is inextricably dependent on the others (Serres, 2014). Together, these complementary perspectives invite us to reevaluate our understanding of the intricate tapestry of existence, emphasising the complexities of our relationship with the world.

Thus, the appreciation of nature and culture is not mutually exclusive, but rather forms a continuous and evolving relationship. The modern perspective has historically separated these realms, treating them as distinct and disconnected. However, a more inclusive approach dissolves this artificial boundary and recognises the interconnectedness of nature and culture (Haraway, 2008, 2016). This paradigm shift challenges the traditional modern understanding and invites a more holistic view in which natural and cultural heritage are mutually constructed within a complex network of relationships.

Recognition of this relationship is essential in the context of heritage conservation and understanding. The dynamic interplay between nature and culture is recognised, and the acknowledgement of their coexistence promotes a more holistic approach to heritage conservation, where cultural practices, traditions and ecological systems are seen as interdependent aspects of the wider heritage tapestry. This recognition encourages us to see heritage sites not as isolated entities, but as part of a larger web of interconnectedness, and urges us to conserve and value both cultural and natural heritage with a shared responsibility. Adopting this interconnected perspective enables us to appreciate the profound connections between human societies and the natural world, and inspires a collective commitment to safeguarding these precious legacies for future generations.

3.1.1.4 Cultural Heritage Data

As I embark on the exploration of CH data, it is first necessary to establish a basic understanding of data in this context. At its core, data represents more than mere numbers and facts; it constitutes a collection of discrete or continuous values that are assembled for reference or in-depth analysis. In essence, data are the rich tapestry upon which the narratives of CH are woven, making its comprehension a critical prerequisite for our expedition into this domain.

Luciano Floridi, a prominent philosopher in the field of information and digital ethics, provides a thorough perspective on the term ‘data’ and offers valuable insights into its fundamental nature in his PI. He perceives ‘data at its most basic level as the absence of uniformity, whether in the real world or in some symbolic system. Only once such data have some recognisable structure and are given some meaning can they be considered information’ (Floridi, 2010). This initial definition sets the stage for a deeper exploration of Floridi’s understanding of data, as he further focuses on its transformative journey into a more meaningful and structured form, which we will explore next.

Building upon Floridi’s foundational concept of data as the absence of uniformity, his subsequent definition provides a more comprehensive perspective. In a previous work, (Floridi, 2005) [p. 357] argues that ‘data are definable as constraining affordances, exploitable by a system as input of adequate queries that correctly semanticise them to produce information as output’. This definition highlights the dynamic role of data, not only as raw entities awaiting structure and meaning but also as elements imbued with the potential to constrain and guide systems towards the generation of meaningful information.

Transitioning from Floridi’s concept of data, we progress to the view that data can be notably seen as interpretable texts within the DH perspective. According to (Owens, 2011) there are four main perspectives on how Humanists can engage with data:

These considerations highlight the multifaceted nature of data within the field of DH. It is in this complex landscape that we recognise that data transcend their traditional role as passive entities. As (Rodighiero, 2021) [p. 26, citing (Akrich et al., 2006)] suggests, ‘there is no doubt that data are full-fledged actors that take part in the social network the actor-network theory describes, in which both human and non-human intertwine and overlap’. This notion – rooted in and borrowed from STS – reinforces the idea that data, as active and dynamic entities, play a significant role in shaping the interactions between human and non-human actors in any digital sphere.

From these angles, I can look at the characteristics of CH data. (Bruseker et al., 2017) [p. 94] articulate that ‘data coming from the cultural heritage community comes in many shapes and sizes. Born from different disciplines, techniques, traditions, positions, and technologies, the data generated by the many different specializations that fall under this rubric come in an impressive array of forms’.

In exploring CH data, it is important to recognise the inherent diversity stemming from diverse disciplines, techniques, and traditions. (Bruseker et al., 2017) [p. 94] aptly emphasise this, highlighting the extensive array of forms in which data manifests. This heterogeneity raises fundamental questions about the unity and identity of CH data — a crucial aspect deserving acknowledgement within this context. As the authors astutely ponder:

It could be a natural problem to pose from the beginning: if the data of this community indeed presents itself in such a state of heterogeneity, does it not beg the question if there is truly an identity and unity to cultural heritage data in the first place? It could be argued that Cultural Heritage, as a term, offers a fairly useful means to describe the fuzzy and approximate togetherness of a wide array of disciplines and traditions that concern themselves with the human past.

Expanding on these insights, CH data refer to digital or data-driven affordances of CH[20], embodying a rich and varied compilation of insights originating from a variety of disciplines, techniques, traditions, positions and technologies. It encompasses both tangible and intangible aspects of a society’s culture as well as natural heritage. These data, derived from a wide range of disciplines, offer a latent capacity to support the generation of knowledge relating to historical time periods, geospatial areas, as well as current and past human and non-human activities. They are collected, curated and maintained by various entities such as libraries, archives, museums, higher education institutions, non-governmental organisations, indigenous communities and local groups as well as by the wider public.

Building further on the mosaic of CH data, three primary dimensions come to the fore: heterogeneity, knowledge latency, and custodianship.

Taken together, these dimensions contribute to a comprehensive understanding of the nuanced fabric of CH data. They reveal the diversity of forms and origins, the temporal aspects and the responsible stewardship that are crucial to the sustainability of such data.

By shifting our focus to the sphere of humanities data, we broaden our scope to extend beyond the peculiarities of CH data. Drawing parallels between these areas allows us to grasp the interconnectedness of our heritage.

CH data usually refers to information about cultural artefacts, sites, and practices that hold historical or cultural significance. Humanities data encompasses information about human culture, history, and society, including literature, philosophy, art, and language (Tasovac et al., 2020). Both often involve ethical considerations, such as ownership, access, and preservation, and require a comprehensive understanding of their various meanings and values (Ioannides & Davies, 2019). Moreover, (Schöch, 2013) explains that data in the humanities, such as text and visual elements, have unique qualities. While these analogue forms could be considered data, they lack the ability to be analysed computationally as they are non-discrete. The semiotic nature of language, text and art introduces dimensions tied to meaning and context, making the term ‘data’ problematic. Critics question its use because it conflicts with humanistic principles such as contextual interpretation and the subjective position of the scholar.

(Schöch, 2013) further distinguishes two core types of data in the humanities: smart and big data. The former, exemplified by digital editions, tends to be small in volume and carefully curated, but harder to scale. The latter describes voluminous and varied data and loosely relies on the three Vs: volume, velocity and variety (see 3.3.1.2). Yet, big data in the humanities differs significantly from other fields as it rarely requires rapid real-time analysis, is less focused on handling massive volumes, and instead deals with diverse, unstructured data sources. (Schöch, 2013) concludes by arguing that ‘I believe the most interesting challenge for the next years when it comes to dealing with data in the humanities will be to actually transgress this opposition of smart and big data. What we need is bigger smart data or smarter big data, and to create and use it, we need to make use of new methods’.

Data processing offers great potential for humanities research as (Owens, 2011) argues: ‘In the end, the kinds of questions humanists ask about texts and artifacts are just as relevant to ask of data. While the new and exciting prospects of processing data offer humanists a range of exciting possibilities for research, humanistic approaches to the textual and artifactual qualities of data also have a considerable amount to offer to the interpretation of data’.

While the term ‘data’ in the context of the humanities may raise questions due to its semiotic and contextual complexities, it serves as a foundation for understanding both CH data and broader humanities data. The data originating from CH and the humanities are inherently intertwined, as they often share a similar nature and purpose for scholars. This strong interconnection leads to a collaborative relationship between the GLAM sector and the humanities or DH. Scholars in the humanities frequently rely on digitised cultural artefacts, historical records, linguistic resources, and literary works provided by GLAM institutions to gain valuable insights into human history, culture, and traditions. The digitisation efforts and research collaborations between these entities play a pivotal role in preserving CH data and advancing our understanding of diverse societies, fostering a deeper appreciation of our shared human heritage. CH data and humanities data are distinct from other scientific data due to their qualitative and subjective nature, which requires different methods of analysis than quantitative scientific data. They include archival and special collections, rare books, manuscripts, photographs, recordings, artefacts, and other primary sources that reflect the cultural beliefs, identity, and memory of a people (Izu, 2022; see Sabharwal, 2015).

In summary, while CH data and humanities data share some commonalities, they differ in terms of scope and subject matter. CH data focuses specifically on the preservation and documentation of physical artefacts and intangible attributes, while humanities data encompasses a broader range of disciplines within the humanities (Münster et al., 2019). However, it is important to note that the distinction between CH data and humanities data can be blurred, as (meta)data should ideally be co-created and integrated across both domains.

3.1.2 Representation and Embodiment of Cultural Heritage Data

Digital representation of CH data, while preserving their context and complexity, remains a significant challenge. Such representations of CH data, sometimes referred to as digital surrogates or digital twins (Conway, 2015; Semeraro et al., 2021; Shao & Kibira, 2018), can potentially lead to a loss of context and a reduction in the richness of the CH represented. For instance, a digital image of a cultural artefact may not capture its materiality, such as its texture, weight, and feel, which are essential aspects of the artefact’s cultural significance (Force & Smith, 2021). Furthermore, digital representations may also exclude vital social, cultural, and historical contexts surrounding the object, which are crucial to understanding its full cultural value (Cameron, 2007).

This subsection is structured around two key dimensions. Firstly, it explores materiality, highlighting how digital representations may fail to capture important aspects that are integral to understanding the significance of CH resources. Secondly, it navigates the convergence and divergence between digitised CH and digital heritage.

3.1.2.1 Materiality

Briefly, materiality refers to the physical qualities of an object or artefact, such as its colour, texture, and composition. In built heritage, the emphasis of materiality lies primarily on architecture, its associated techniques and the range of materials used in the construction or renovation of a building. More specifically, materiality acts as a pivotal factor in the transformation of disparate fragments of material culture into heritage, providing a vital link to the intangible facets of heritage. It contributes significantly to an individual’s social position and ability to navigate specific social milieus, thereby determining their ability to transmit cultural knowledge and values to future generations. The transformative potential of materiality in this regard underscores its fundamental role in perpetuating heritage and the transmission of cultural legacies (Carman, 2009). The physical attributes of objects, including texture, colour and shape, can evoke different emotions and associations, shaping people’s perceptions and memories of past events. Beyond retrospective influences, the potential of materiality extends to the creation of new memories and meanings, as exemplified by the use of materials such as glass in contemporary art. In such cases, materials evoke not only their inherent properties but also symbolic connotations, adding new layers of meaning and memory to the artistic narrative (Fiorentino & Chinni, 2023).

(Edwards & Hart, 2004) [p. 3] argue that materiality is not just concerned with physical objects in a positivist sense, but also involves complex and fluid relationships between people, images, and things. This relationship is influenced by social, cultural, and historical contexts, and plays a crucial role in shaping our perceptions and experiences of the world. Moreover, materiality is central to giving meaning to non-human entities (Haraway, 2003; see Latour, 1996; Star & Griesemer, 1989), which emphasises the role of both humans and non-humans in shaping social and cultural phenomena. For CH data, diversity is at its core, as it allows for the exploration of different ways of knowing, experiencing, and expressing the world. Therefore, it is important to approach materiality not as a static and fixed concept, but as a dynamic and evolving phenomenon that is shaped by multiple forces (Müller, 2018, pp. 62–63). When discussing materiality, there is also its negation, i.e. the notion of space or emptiness, such as how people interact with it through built heritage, which is regarded as a primordial medium of material culture, as expounded by (Guillem et al., 2023) [p. 2]:

The most intuitive and foundational definition of architecture is the built thing, that is the architecture qua building or built work. Human beings continuously interact with the built materiality through the non-materiality of space. Space as emptiness is formed and defined by the materiality that affects its existence. That relation between fullness and emptiness is what makes possible architecture as lived and experienced space.

Materiality also offers a means of challenging dominant narratives and power structures, particularly the Western-centric perspective on CH. It gives greater recognition to the importance of intangible CH, which often takes a back seat to tangible objects in dominant narratives (Lenzerini, 2011). By highlighting the materiality of marginalised or forgotten elements, individuals can reclaim their heritage and challenge dominant narratives that marginalise certain groups, contributing to a more inclusive and accurate representation of CH.

The primary focus in terms of digitisation is also on preserving material-based knowledge, often overlooking the dynamic and living nature of intangibility. (Hou et al., 2022) stress the crucial role of computational heritage and advances in information technologies in preserving and improving access to intangible CH. Effectively documenting the ephemeral aspects of intangible heritage and communicating the knowledge that is deeply linked to individuals are pressing challenges. Recent initiatives seek to capture the dynamic facets of cultural practices, using visualisation, augmentation, participation and immersive experiences to enhance experiential narratives. There is a strong call for a strategic re-evaluation of the intangible CH digitisation process, emphasising the human body as a vessel for traditions and memories. One example is traditional Southern Chinese martial arts, which have been passed down colloquially from generation to generation and require a methodological approach to capture such embodied knowledge (see Adamou et al., 2023; Hou & Kenderdine, 2024).

Even in cases where considerable efforts have been devoted to the digitisation of physical objects such as medieval manuscripts and rare books over the past few decades (Nielsen, 2008), a lingering concern persists regarding the authentic encounter with the original artefact, despite its enhanced accessibility through digital surrogates (Lit, 2020). Material attributes present a persistent challenge to achieving full replication. Techniques such as RTI, 3D digitisation, or VR and AR offer better experiential immersion and address certain materiality concerns more effectively than two-dimensional representations. Nevertheless, replicating the multifaceted sensory experience associated with the original object, including the palpable emotions and spatial sensation, remains an ongoing endeavour, and one that may never be fully feasible (see Endres, 2019).

3.1.2.2 Digitised Cultural Heritage and Digital Heritage

The concepts of digitised CH and digital heritage intersect through the use of digital technology for the preservation, access, and dissemination of CH resources. Digitised CH focuses on converting physical artefacts into digital forms, ensuring their long-term preservation and accessibility through digital means. Conversely, digital heritage includes a broader range of digital tools and resources ‘to preserve, research and communicate cultural heritage’ ((Münster et al., 2021) p. 2, citing (Georgopoulos, 2018)).

Digitised CH acts as a critical bridge, facilitating a transition from traditional or analogue GLAM practices to a digital environment. This shift is pivotal in unlocking the potential of digitised CH. The value of digitised CH extends beyond scholarly pursuits, despite the majority of digitisation efforts being driven by research funding. It thus becomes evident that the creative reuse and data-driven innovation stemming from digitised CH necessitate substantial and sustained investment in the GLAM sector. This investment is fundamental, especially amidst reduced funding due to years of austerity. (Terras et al., 2021) underscore this need, shedding light on the delicate balance required with commercial outcomes. They emphasise that leveraging CH datasets offers vast opportunities for technological innovation and economic benefits, urging professionals from various domains to collaborate and experiment in a low-risk environment.

Digital heritage[21] encompasses a wide range of human knowledge and expression in cultural, educational, scientific and various other domains. In today’s rapidly evolving technological landscape, an increasing amount of this knowledge is either digitally created or in the process of being converted from analogue to digital formats (He et al., 2017). These digital resources cover a wide range, including text, multimedia, software and more, and require deliberate and strategic management to ensure their long-term preservation. This valuable heritage is spread across the globe and expressed in multiple languages (UNESCO, 2009).

In summary, digitised CH not only forges the path to digital heritage but also embodies an ever-evolving cultural landscape. Recognising the transformative potency of digital heritage is essential to enriching our understanding of, and engagement with, our cultural roots. Both concepts are intimately embedded in CH and play a vital role as conduits.

3.1.3 Collectives and Apparatuses

The collaborative efforts of collectives and the operation of various apparatuses play a fundamental part in shaping the preservation, interpretation and dissemination of cultural artefacts and practices. This subsection is concerned with the central contributions of human and non-human actors engaged in cooperative action and the modus operandi of various apparatuses, such as building (digital) infrastructures. Some of these considerations are drawn from STS, which is more fully captured in the theoretical framework of the thesis.

Bruno Latour’s concept of the importance of collectives and apparatuses (see Latour, 2022, p. 15) can be extrapolated to CHIs. Every institution’s or project’s ultimate success hinges on the collaboration and support of individuals, as well as the tools, systems and technologies they use. Indeed, paralleling CHIs with wider contexts suggests that collective efforts and apparatuses play a critical role in shaping the effectiveness of any institution. This highlights the importance of recognising the influence of both human and non-human entities in institutional functioning and underlines the need for a more comprehensive understanding of the dynamics involved therein.

ANT can be a useful lens to analyse the creation, use, and dissemination of CH data. ANT posits that actors are not independent entities but are instead part of a network that consists of both human and non-human entities. According to ANT, every actor, be it a person or a technology, is a node in the network and contributes to the overall functioning of the network (Callon, 2001; Latour, 2005). When we apply this framework to CHIs, we can identify the different actors involved in the creation, use, and dissemination of CH data. These actors can include individuals, such as curators, conservators, and historians, as well as non-human entities, such as databases, digitisation equipment, and software. Moreover, this approach can help us understand the interactions between these actors and how they shape the overall functioning of CHIs. For instance, digitisation equipment can enable the creation of high-quality digital images of artefacts, which can then be disseminated globally through online platforms. Examining Notre-Dame de Paris, one can discern the keystones at the summit of its arches as indispensable actors within its architectural narrative. These keystones, imbued with historical narratives and a non-human facet, played a central role in the (digital) rescue and subsequent restoration efforts following the tragic roof fire in April 2019. (Guillem et al., 2023)’s study further elucidates this restoration journey, emphasising how the keystones, with their individual narratives and structural significance, contributed to the (digital) reassembly.

Building on this perspective, we can explore the importance of community involvement in the preservation and management of CH data, thereby increasing the potential for sustainable practices and inclusive engagement.

Local communities have an integral part to play in the management and preservation of CH data, especially in the digital age where resources are often scarce for GLAM institutions. Community involvement has several benefits, including increased engagement and participation, access to local knowledge and expertise, and more sustainable and inclusive management and preservation practices (Ridge et al., 2021). For instance, geophysical technologies such as ground-penetrating radar have been used with great success in identifying and evaluating the depth, extent, and composition of CH resources for research and management purposes, easing tensions when working with sensitive ancestral places (Nelson, 2021). Collaborative environments can also help with CH information sharing and communication tasks because of the way in which they provide a visual context to users, making it easier to find and relate CH content (Respaldiza Hidalgo et al., 2011).

As the analysis of (Brown et al., 2023) [pp. 6-7] shows, a prominent illustration of exemplary community practice can be found in the sphere of community museums in Latin America: Museos Comunitarios de América[22]. The authors highlight the role of community engagement and leadership in the creation and operation of these museums. Such engagement ensures that these museums are not imposed from outside, but rather emerge organically as museums of the community, resonating with its unique CH and identity. This approach is consistent with the ethos of ‘telling a story’, building a future, which embodies a deep commitment to community empowerment and cultural preservation. This community-centric approach amplifies the museum’s resonance with the community’s lived experiences and historical narratives.

At the same time, institutions can also benefit from collaborating with peer communities like IIIF to promote greater access to their collections. IIIF provides a set of open standards for delivering high-quality digital objects online at scale, which can help memory and academic institutions share their collections with each other and with the wider public (Snydman et al., 2015; Weinthal & Childress, 2019). By adopting IIIF standards, organisations can make their collections more discoverable and accessible to researchers, developers, and other CH professionals (Padfield et al., 2022). Involvement in communities such as IIIF also helps to mitigate costs as they develop shared or adaptable resources and services (Raemy, 2017).
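To make the idea of shared, open standards more concrete, the following minimal sketch shows how a request URL for an image could be assembled following the path pattern of the IIIF Image API 3.0 (identifier, region, size, rotation, quality and format). The base URL `https://example.org/iiif` and the identifier `photo/0001` are hypothetical placeholders, and the helper function names are illustrative assumptions rather than part of any IIIF library.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an image request URL using the IIIF Image API path pattern:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier is percent-encoded so that characters such as "/"
    # are not mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """Assemble the URL of the image information document (info.json)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and identifier, used for illustration only.
full_image = iiif_image_url("https://example.org/iiif", "photo/0001")
thumbnail = iiif_image_url("https://example.org/iiif", "photo/0001", size="600,")
info = iiif_info_url("https://example.org/iiif", "photo/0001")

print(full_image)
print(thumbnail)
print(info)
```

Because every compliant server understands the same URL pattern, a viewer or script written for one institution's images can in principle be pointed at another's without modification, which is one way in which shared resources help to mitigate costs.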

Participation of communities in the management and preservation of CH resources is essential to ensure that CH is protected and accessible for future generations. By involving and participating in communities, GLAMs can tap into local as well as peer knowledge and expertise, making management and preservation practices more sustainable and inclusive. This approach also increases engagement and participation, ensuring that CH is valued and appreciated by the wider community. Thus, memory institutions need to collaborate closely with communities to ensure that CH data, and their underlying infrastructures and services, are being effectively curated (Delmas-Glass & Sanderson, 2020).

Closely related to this context, (Star, 1999) points out the often unacknowledged role of infrastructure within society. She argues that infrastructures are necessary but often invisible and taken for granted:

People commonly envision infrastructure as a system of substrates – railroad lines, pipes and plumbing, electrical power plants, and wires. It is by definition invisible, part of the background for other kinds of work. It is ready-to-hand. This image holds up well enough for many purposes – turn on the faucet for a drink of water and you use a vast infrastructure of plumbing and water regulation without usually thinking much about it. [(Star, 1999) p. 380]

(Star, 1999) [pp. 381-382, citing (Star & Ruhleder, 1994)] identifies nine dimensions to define infrastructure. They provide a comprehensive framework to comprehend the nuanced nature of infrastructure and its pervasive impact on diverse societal facets. The following dimensions are vital for analysing the often imperceptible, yet deeply embedded structures that constitute the foundational framework of both daily life and broader societal operations[23]:

An appreciation of these dimensions is crucial to the analysis of the network of infrastructural systems that underpin contemporary society, and is necessary for the analysis of any digital infrastructure that manages CH data.

Digital infrastructures – also known as e-infrastructures or cyberinfrastructures – are forms of infrastructure that are essential for the functioning of today’s society (see Jackson et al., 2007; Ribes & Lee, 2010). These kinds of infrastructure need to be understood as socio-technical systems, showcasing the interplay between technological components (such as hardware, software, and networks) and the social and organisational contexts in which they operate (Star & Ruhleder, 1994). According to (Fresa, 2013) [p. 33], digital CH infrastructures should be able to serve the research needs of humanities scholars as well as having dedicated services for education, learning, and general public access. In terms of requirements, (Fresa, 2013) [pp. 36-39] identifies three different layers of services: for content providers, for managing and adding value to the content, and for the research communities. For the latter, several sub-services tailored to research communities are listed. These encompass long-term preservation, PIDs[24], interoperability and aggregation, advanced search, data resource set-up, user authentication and access control, as well as rights management.

Overall, (digital) infrastructures are imperative apparatuses in preserving and sharing CH data. First, they support preservation by archiving digital artefacts and their metadata, protecting them from deterioration and loss. Secondly, these infrastructures facilitate accessibility, allowing a global audience to explore and appreciate cultural heritage online. Finally, they encourage interpretation and engagement, promoting cross-cultural understanding and knowledge sharing.

Moreover, infrastructure is a fundamental component that demands extensive investment, particularly in the creation of streamlined integration layers capable of interacting seamlessly with different systems. This can be exemplified by institutions such as the Rijksmuseum[25], where a well-constructed infrastructure allows for efficient integration and interaction with various technological and organisational systems (Dijkshoorn, 2023). This investment serves as the foundation for an institution’s functionality, allowing for the smooth flow of data, the coordination of processes and the optimal use of resources. In a similar vein, (Canning et al., 2022) argue that the often invisible structures of metadata, particularly in Linked Data ontologies, play a crucial role in shaping the interpretation of data. These structures, while not immediately apparent, are imbued with value judgements and ideological implications, extending the impact of metadata beyond mere technicalities to encompass diverse and intersectional perspectives. This multidimensional ontological approach addresses the complexity and diversity of data sources, paralleling the need for sophisticated infrastructures in institutions like the Rijksmuseum. It underscores the importance of integrating intersectional feminist principles in information systems, reflecting a commitment to diverse ways of knowing and nuanced storytelling.

Furthermore, as all (meta)data requires storage, it raises an important concern in terms of the entrenched power dynamics governing knowledge representation within information systems, as pointed out by Canning. This perspective, initially centred around museum objects, holds broader implications for all CH resources (see Simandiraki-Grimshaw, 2023). Canning strongly advocates for the essential adaptation of databases to embrace a diverse array of epistemological approaches by introducing new types of affordances. Databases, despite their role in information preservation, wield significant influence that can inadvertently stifle diverse modes of knowledge interpretation and ‘can constrain ways of knowing’. Furthermore, she compellingly argues that modifications to databases extend beyond technical adjustments; they are inextricably linked to shifts in institutional power dynamics and the enduring, often inequitable, power relations governing the world of museums – or any CHIs – and their curation.

In understanding the interplay of collectives and apparatuses, it is clear that key actors, including individuals, institutions, local and global communities, as well as the sophisticated fabric of (digital) infrastructures and their components, are deeply entangled and interconnected. These entities, both human and non-human, collectively shape and navigate the rich networks of human interactions and technologies that underpin the foundations of contemporary society.

3.2 Cultural Heritage Metadata

This subsection offers insights into the importance of metadata in CH, underlining its role in enhancing the understanding and accessibility of cultural artefacts. It is structured into four[26] essential parts. I start with an introductory segment in 3.2.1, then I explore the types and functions of metadata in 3.2.2, thirdly in 3.2.3, I outline some of the most important CH metadata standards, and finally in 3.2.4, I explore the use of KOS, such as generic classification systems and controlled vocabularies.

3.2.1 Data about Data

For curating CH resources, metadata[27], ‘data about data’, is probably one of the key concepts that need to be introduced here. Metadata permeate our digital and physical landscapes, playing a vital role in organising, describing and managing a vast array of information. Rather than being confined to a specific domain, they are ubiquitous and pervade many aspects of our everyday lives (Riley, 2017, pp. 2–3). From websites and databases to social media platforms and online marketplaces, metadata add meaning to data, enabling users to understand their context, relevance and provenance. As an example, Figure 3.1 shows the metadata of a book[28].

Figure 3.1: Snapshot from the Swisscovery Platform Showing the Bibliographic Record of (Zeng & Qin, 2022)

Metadata are central to the management and preservation of CH data, providing essential information to ensure that data can be properly organised, discovered and retrieved. For example, they facilitate the understanding and interpretation of data, enabling scholars and the public to access and use them effectively (Constantopoulos & Dallas, 2008). Metadata also help to ensure the long-term preservation and accessibility of CH data [(Zeng & Qin, 2022) pp. 490-491]. Providing metadata in a structured manner facilitates forms of aggregation, i.e. individuals and institutions being able to harvest and organise metadata from multiple sources or repositories into a centralised location (see Freire et al., 2017, 2021). In addition, the importance of metadata as a gateway to information is particularly compelling when the primary embodiment of a record is either unavailable or lost. In cases where resources, time constraints, sensitive content or strategic decisions prevent the digitisation of an item, metadata become the principal means of representation and access. If a physical record is lost or damaged, the metadata associated with that record act as a proxy for the record.
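The notion of aggregation described above can be illustrated with a minimal sketch in which structured records from several repositories are gathered into a single index. The repository names, record fields and values are invented for illustration only and do not correspond to any real dataset or harvesting protocol.

```python
from collections import defaultdict


def aggregate(sources):
    """Gather records from several repositories into one index keyed by identifier.

    `sources` maps a repository name to a list of metadata records (dicts),
    each of which is assumed to carry an "id" field.
    """
    index = defaultdict(list)
    for source_name, records in sources.items():
        for record in records:
            # Record where each description came from, so that provenance
            # is not lost once the records sit in a centralised location.
            index[record["id"]].append({**record, "source": source_name})
    return dict(index)


# Hypothetical records from two hypothetical repositories.
sources = {
    "library": [
        {"id": "obj-1", "title": "Alpine village", "date": "1935"},
    ],
    "archive": [
        {"id": "obj-1", "title": "Village in the Alps", "digitised": False},
        {"id": "obj-2", "title": "Harvest festival", "digitised": True},
    ],
}

central_index = aggregate(sources)

for identifier, descriptions in central_index.items():
    origins = [d["source"] for d in descriptions]
    print(identifier, "described by:", origins)
```

In this sketch, the item `obj-1` has not been digitised, yet it remains discoverable through two descriptions, which mirrors the point that metadata can serve as the principal means of representation and access.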

(Riley, 2017) [p. 5] discusses the transformation of libraries over time. Initially, libraries moved from search terminals to the modern web-based resource discovery systems we use today. This shift was driven by advances in computerisation. Libraries’ basic approach to metadata is ‘bibliographic’, deeply rooted in their traditional expertise in describing books. This approach involves providing detailed descriptions of individual items so that users can easily locate them within the library’s collection.

On the other hand, archives use ‘finding aids’, which are descriptive inventories of their collections, coupled with historical context. These aids are essential for users to understand the material and to find groups of related items within the archive. The metadata used in archives allows for the contextualisation of materials, particularly papers of individuals or records of organisations, providing a richer understanding of the content.

Similarly, museums actively manage and track their acquisitions, exhibitions and loans through metadata. Museum curators use metadata to interpret collections for visitors, explaining the historical and social significance of artefacts and describing the relationships and connections between different objects. This helps to enhance the overall visitor experience and understanding of the artefacts on display or the digital resources on a particular website.

3.2.2 Types and Functions

CHIs share common objectives and concerns related to information management, as highlighted by (Lim & Li Liew, 2011) [pp. 484-485]. These goals typically include facilitating access to knowledge and ensuring the integrity of CH data. However, it is important to note that CHIs also differ widely in how they deal with metadata. Different domains have unique approaches and standards for describing the materials they collect, preserve and disseminate, and even within a single domain there are significant differences.

There have been different attempts to categorise the metadata landscape. For instance, (Gilliland, 2016) identified the following five categories of metadata and their respective functions:

Meanwhile, (Riley, 2009), as illustrated in a comprehensive visualisation graph[29], suggested seven functions, i.e. the roles a standard plays in the creation and storage of metadata, and seven purposes, referring to the general type of metadata.

Almost a decade later, (Riley, 2017) [pp. 6-7] summarised metadata types into four groupings instead of the seven purposes previously mentioned. One purpose is removed from the list, and technical, preservation and rights metadata are now grouped into a newly created administrative metadata category.

  1. Descriptive metadata: For finding or understanding a resource
  2. Administrative metadata: Umbrella term referring to the information needed to manage a resource or that relates to its creation
    • 2.1 Technical metadata: For decoding and rendering files
    • 2.2 Preservation metadata: Long-term management of files
    • 2.3 Rights metadata: Intellectual property rights attached to content
  3. Structural metadata: Relationships of parts of resources to one another
  4. Markup languages: Integrate metadata and flags for other structural or semantic features within content[30].

This classification of metadata types and functions differs from the categories identified by (Gilliland, 2016), mostly due to the addition of structural metadata and markup languages as their own categories [(Zeng & Qin, 2022) p. 19]. Table 3.1 lists the major types of metadata according to (Riley, 2017) [p. 7] and includes example properties and their primary uses.

Table 3.1: Types of Metadata According to (Riley, 2017) [p. 7]
Metadata (sub)type | Example properties | Primary uses
1. Descriptive metadata | Title, Author, Subject, Genre, Publication date | Discovery, Display, Interoperability
2.1 Technical metadata | File type, File size, Creation date, Compression scheme | Interoperability, Digital object management, Preservation
2.2 Preservation metadata | Checksum, Preservation event | Interoperability, Digital object management, Preservation
2.3 Rights metadata | Copyright status, Licence terms, Rights holder | Interoperability, Digital object management
3. Structural metadata | Sequence, Place in hierarchy | Navigation
4. Markup languages | Paragraph, Heading, List, Name, Date | Navigation, Interoperability
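
The grouping in Table 3.1 can also be read as a simple lookup structure. The following is only an illustrative sketch: the dictionary copies the example properties from the table, while the `classify` helper and its case-insensitive matching are assumptions made for the example and not part of (Riley, 2017).

```python
# Example properties per metadata (sub)type, copied from Table 3.1.
METADATA_TYPES = {
    "1. Descriptive metadata": ["Title", "Author", "Subject", "Genre", "Publication date"],
    "2.1 Technical metadata": ["File type", "File size", "Creation date", "Compression scheme"],
    "2.2 Preservation metadata": ["Checksum", "Preservation event"],
    "2.3 Rights metadata": ["Copyright status", "Licence terms", "Rights holder"],
    "3. Structural metadata": ["Sequence", "Place in hierarchy"],
    "4. Markup languages": ["Paragraph", "Heading", "List", "Name", "Date"],
}


def classify(prop):
    """Return the (sub)type listing `prop` as an example property, or None.

    Hypothetical helper: matching is case-insensitive and only knows the
    example properties of the table, not a full metadata schema.
    """
    wanted = prop.lower()
    for metadata_type, examples in METADATA_TYPES.items():
        if wanted in (example.lower() for example in examples):
            return metadata_type
    return None


print(classify("Checksum"))
print(classify("Rights holder"))
```

In practice a property may serve several purposes at once, which is one reason why such classifications remain approximate.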

Ultimately, metadata can also be leveraged to create more inclusive and diverse representations of CH. For instance, metadata can be used to document and promote underrepresented communities and their heritage, providing greater visibility and recognition. This approach aligns with the principles of decolonising CH, promoting equity and social justice by recognising and valuing diverse cultural perspectives, especially given the prevailing anglophone and Western-centric standpoint in DH (Mahony, 2018; Philip, 2021).

Moreover, the distinction between data and metadata, as discussed in the work of (Alter et al., 2023), is not always clear-cut, leading to the concept of ‘semantic transposition’. This complexity is reflected in CH, where what is considered metadata in one context might be primary data in another, underscoring the necessity for adaptable frameworks in data management. This understanding is crucial for fostering inclusive and diverse representations in CH, ensuring that all cultural narratives are appropriately documented and acknowledged.

3.2.3 Standards

Metadata standards play a crucial role in ensuring that data are organised and consistent, facilitating mutual understanding between different stakeholders (Raemy, 2020). CHIs such as GLAMs typically follow established conventions or standards when organising their resources. Current methods of cataloguing have historical roots dating back to the nineteenth century, particularly with the development of cataloguing systems such as Antonio Panizzi’s at the British Museum and Charles Coffin Jewett’s efforts to mechanically duplicate entries at the library of the Smithsonian Institution [(Zeng & Qin, 2022) pp. 14-15].

Unique metadata standards, rules and models have been established and maintained within specific sub-fields. In addition, certain standards for information resources have been endorsed by authoritative bodies (Greenberg, 2005), and some are used exclusively within specific domain communities (Hillmann et al., 2008). (Riley, 2017) [p. 5] underscores the predilection of CH metadata – whether these standards emanate from libraries, archives, or museums – toward accentuating descriptive attributes. The foundational CH metadata standards, primarily conceived to [(Zeng & Qin, 2022) p. 11], manifest this thematic focus. Within the CH domain, metadata standards vary widely in scope, and a number of different standards have been developed to meet different needs and priorities[31] (Freire et al., 2018).

The following quoted passage sheds some light on the different approaches and levels of collaboration in metadata standardisation, namely among the library and museum sectors.

Despite the striving for homogeneity, in practice, the production of metadata among information specialists and the use of metadata standards is already marked by considerable diversity. This has come about for very pragmatic reasons. Different types of objects and collections require different types of metadata. The curatorial interest for particular information differs for example between images held in an art gallery and a library, as does the information specialists’ domain expertise. Accordingly, diversity in metadata practice seems to be greatest in museums as they are the institutions that govern the most diverse collections. While the library sector has ‘systematically and cooperatively created and shared’ metadata standards since the 1960s, the museum sector, mostly handling images and objects, has been slower to establish such collaboration and consensus. (Dahlgren & Hansson, 2020, p. 244)

In this context, I want to focus on some metadata standards that have proved vital across libraries, archives, museums and galleries. These standards, which I will briefly describe, serve as the foundation for organising, describing, and enabling efficient access to vast and diverse collections. In particular, I will take a closer look at CIDOC-CRM, as it serves as the cornerstone of Linked Art, a fundamental LOUD standard.

3.2.3.1 Library Metadata Standards

In libraries, several metadata standards have played crucial roles in organising and accessing collections over the years. The most prevalent historical standard, MARC[32], began as a pilot project in the 1960s, funded by the CLIR and led by the LoC, to structure cataloguing data and distribute them on magnetic tapes (Avram, 1968, p. 3). The standard evolved into MARC21 in 1999 [(Zeng & Qin, 2022) p. 418], providing a structured format for bibliographic records and related information in machine-readable form, as exemplified by Code Snippet 3.1. It uses codes, fields, and sub-fields to structure data. Another significant historical standard is the AACR, published in 1967 and revised in 1978, which provides sets of rules for descriptive cataloguing of various types of information resources.

Code Snippet 3.1: MARC21 Record of (Zeng & Qin, 2022) in the Swisscovery Platform

leader  01424nam a2200397 c 4500
001 991170746542405501
005 20220427104002.0
008 210818s2022    xxu      b    001 0 eng
010 ##$a  2021031231
020 ##$a9780838948750 $qBroschur
020 ##$a0838948758
035 ##$a(OCoLC)1264724191
040 ##$aDLC $bger $erda $cDLC $dCH-ZuSLS UZB ZB
042 ##$apcc
050 00$aZ666.7 $b.Z46 2022
082 00$a025.3 $223
082 74$a020 $223sdnb
100 1#$aZeng, Marcia Lei $d1956- $4aut $0(DE-588)136417035
245 10$aMetadata $cMarcia Lei Zeng and Jian Qin
250 ##$aThird edition
264 #1$aChicago $bALA Neal-Schuman $c2022
300 ##$axxvi, 613 Seiten $bIllustrationen
336 ##$btxt $2rdacontent
337 ##$bn $2rdamedia
338 ##$bnc $2rdacarrier
504 ##$aIncludes bibliographical references and index
650 #0$aMetadata
650 #7$aMetadata $2fast $0(OCoLC)fst01017519
650 #7$aMetadaten $2gnd $0(DE-588)4410512-5
776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937969
776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937952
700 1#$aQin, Jian $d1956- $4aut $0(DE-588)1056085541
856 42$3Inhaltsverzeichnis $qPDF $uhttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf
900 ##$aOK_GND $xUZB/Z01/202203/klei
900 ##$aStoppsignal FRED $xUZB/Z01/202203
949 ##$ahttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf
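
To make the structure of tags, indicators and sub-fields in Code Snippet 3.1 more tangible, the following minimal sketch splits one line of this textual display into its parts. It assumes the layout shown above (three-character tag, a space, two indicator characters, then `$`-prefixed sub-fields); the `Field` type and `parse_field` function are hypothetical names, the leader line is not handled, and this is not a parser for the binary MARC exchange format.

```python
import re
from typing import NamedTuple


class Field(NamedTuple):
    tag: str
    indicators: str
    subfields: list  # list of (code, value) pairs


def parse_field(line):
    """Split one display line such as '245 10$aMetadata $c...' into its parts."""
    tag, rest = line[:3], line[4:]
    if tag < "010":
        # Control fields (001-009) carry raw data without indicators or sub-fields.
        return Field(tag, "", [("", rest)])
    indicators, data = rest[:2], rest[2:]
    # Each sub-field starts with '$' followed by a one-character code.
    subfields = [(code, value.strip()) for code, value in re.findall(r"\$(\w)([^$]*)", data)]
    return Field(tag, indicators, subfields)


title = parse_field("245 10$aMetadata $cMarcia Lei Zeng and Jian Qin")
print(title.tag, title.indicators, title.subfields)
```

The same function can be applied to the `100` line to separate the author name (`$a`) from the authority identifier (`$0`).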
    

AACR is no longer maintained and was replaced by RDA[33] around 2010, a standard better adapted to contemporary needs. RDA, while not an encoding format like MARC, serves as a content standard that guides the description and discovery of resources, focusing on user needs and facilitating improved navigation of library collections. Its goal is to provide a flexible and extensible framework for the description of all types of resources, ensuring discoverability, accessibility, and relevance for users[34] (Sprochi, 2016, p. 130).

Libraries often leverage other standards to enrich their metadata practices. MODS[35], introduced in 2002, offers a more flexible XML-based schema for bibliographic description, allowing for better integration with other standards and systems. It was initially developed to carry [(Zeng & Qin, 2022) p. 423]. MODS provides a balance between human readability and machine processing, making it suitable for a wide range of resources and use cases (Guenther, 2003, p. 139). METS[36], on the other hand, is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library. METS, developed as an initiative of the DLF, provides a flexible and extensible framework for structuring metadata, allowing for the packaging of complex digital objects (Cantara, 2005, pp. 238–239). While MODS is primarily concerned with bibliographic information, METS focuses on structuring metadata for digital objects, making it particularly useful for digital libraries and repositories.

A further important standard is FRBR, a conceptual framework for understanding and structuring bibliographic data and access points. Originally developed by IFLA in 1997 as part of its functional requirements family of models, FRBR describes three main groups of entities, relationships, and attributes, as illustrated by Figure 3.2. The first group of entities is the foundation of the model and characterises four levels of abstraction: Work, Expression, Manifestation, and Item (WEMI) (Denton, 2006, p. 231). FRBR has had a significant impact on the development of RDA, which is loosely aligned with the principles and structures defined by the conceptual framework. However, as FRBR is not a data model per se, it does not prescribe how to record bibliographic information in day-to-day practice, and it focuses heavily on textual resources[37] (Sprochi, 2016, pp. 130–131). Furthermore, (Cossham, 2017) [p. 11] asserts that FRBR and RDA ‘don’t align well with the ways that users use, understand, and experience library catalogues nor with the ways that they understand and experience the wider information environment’.

Figure 3.2: The FRBR Conceptual Framework. Adapted from (Zou et al., 2018) [p. 36]
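
The four levels of abstraction of the first FRBR group can be sketched as nested data structures. This is an informal illustration only: FRBR is a conceptual model rather than a data model, so the class attributes below are assumptions chosen for the example, the ISBNs are taken from Code Snippet 3.1, and the item identifier is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    identifier: str  # a single exemplar, e.g. one library copy (hypothetical value below)


@dataclass
class Manifestation:
    isbn: str  # the physical or digital embodiment of an expression
    items: list = field(default_factory=list)


@dataclass
class Expression:
    language: str  # the intellectual realisation of a work
    edition: str
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str  # the distinct intellectual creation
    expressions: list = field(default_factory=list)


work = Work("Metadata", [
    Expression("eng", "Third edition", [
        Manifestation("9780838948750", [Item("copy-1")]),  # print edition
        Manifestation("9780838937969"),  # online edition, no copy recorded here
    ]),
])


def all_isbns(w):
    """Collect the ISBNs of every manifestation of every expression of a work."""
    return [m.isbn for e in w.expressions for m in e.manifestations]


print(all_isbns(work))
```

Whether an edition counts as a new expression or merely a new manifestation is a cataloguing decision that this sketch does not attempt to settle.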

A further important standard in the field of library science is the LRM, which was introduced as a comprehensive conceptual framework. It provides a broad understanding of bibliographic data and user-centric design principles, aligning with FRBR. LRM defines key entities, attributes, and relationships important for bibliographic searches, interpretation, and navigation – as shown in Figure 3.3. It operates at the conceptual level and does not dictate data storage methods. Attributes in LRM can be represented as literals or URIs. The model is presented in a structured document format to support LOD applications and reduce ambiguity. During its development, a parallel process created FRBRoo (see 3.2.3.3), a model that extends the original FRBR model by incorporating it into CIDOC-CRM. FRBRoo focuses on CH data and is more detailed than LRM, which is designed specifically for library data and follows a high-level, user-centric approach (Riva et al., 2017, pp. 9–13). The LRM model, known as LRMer[38], was released in 2020 by IFLA [(Zeng & Qin, 2022) p. 163].

Figure 3.3: Overview of Relationships in LRM (Riva et al., 2017, p. 86)

BibFrame[39] is another metadata standard in the library domain. It was initiated around 2011 by the LoC to be a successor of MARC, which had become obsolete (see Tennant, 2002) as well as being invisible to web crawlers and search engines preventing adequate discoverability of bibliographic resources (Sprochi, 2016, p. 132). BibFrame is a loosely RDF-based model (Sanderson, 2015), intending to improve the interoperability and discoverability of library resources. While the BibFrame model may not perfectly correspond with the WEMI entities outlined in FRBR, it is possible to effectively link BibFrame resources to FRBR entities, ensuring their compatibility (Sprochi, 2016, p. 133). BibFrame aims to transition from MARC by providing a more web-friendly framework, focusing on the relationships between entities, improving data sharing, and accommodating the digital environment. Conversely, (Edmunds, 2023) argues that BibFrame is unaffordable and leads to elitism within libraries, with the main beneficiaries being well-funded institutions, particularly in North America, while placing a financial burden on others. This approach, endorsed by bodies such as the LoC, is criticised for its high cost, impracticality, inequity and limited benefits for cataloguers, libraries, vendors and the public they serve. In addition, the author highlights BibFrame's lack of user friendliness, regardless of the intended users, and criticises the notion of adopting Linked Data for its own sake without substantial practical benefits.

3.2.3.2 Archival Metadata Standards

For archives, metadata standards like EAD[40] and ISAD(G)[41] have been pivotal. EAD, which originated in 1993 and whose first version was released in 1998, provides a hierarchical structure for representing information about archival collections, offering comprehensive descriptions that aid researchers, archivists, and institutions in managing and providing access to archival records. Its goal is to create a standard for encoding finding aids to improve accessibility and understanding of archival collections (Pitti, 1999, pp. 61–62). On the other hand, ISAD(G), released in its first version in 1994 by ICA, offers a more general international standard for archival description, providing a framework for describing all types of archival materials, including fonds, sub-fonds, series, files, and items (Shepherd & Smith, 2000, p. 57). ISAD(G) aims to establish consistent and standardised archival description practices on a global scale, facilitating the sharing and exchange of archival information.

PREMIS[42] is another metadata standard, initially released in 2005 (version 3.0, published in 2016, is the latest specification). It focuses on the preservation of digital objects and consists of four interrelated entities: Object, Event, Agent, and Rights (Caplan & Guenther, 2005, p. 111). The main objective of PREMIS is to help institutions ensure the long-term accessibility of data by capturing key details about their creation, format, provenance, and preservation events. It is seen as an elaboration of OAIS, which categorises information required for preservation in several functional entities and types of information package (see Lee, 2009, pp. 425–426), as illustrated by Figure 3.4. This elaboration is expressed through the mapping of preservation metadata onto the conceptual model [(Zeng & Qin, 2022) pp. 493-494].

Figure 3.4: OAIS Functional Model Diagram by (Mathieualexhache, 2021)
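
As a small illustration of how the PREMIS entities relate to one another, the sketch below computes a checksum for a digital object and records a fixity check as an event performed by an agent. The key names follow PREMIS semantic units (`messageDigestAlgorithm`, `messageDigest`, `eventType`, `eventDateTime`, `agentName`), but the dictionary layout and the `fixity_event` function are assumptions made for this example and do not constitute a valid PREMIS serialisation; the Rights entity is left out.

```python
import hashlib
from datetime import datetime, timezone


def fixity_event(data, agent_name):
    """Describe a fixity check on `data` (bytes) with Object, Event and Agent parts."""
    digest = hashlib.sha256(data).hexdigest()
    return {
        "object": {
            "messageDigestAlgorithm": "SHA-256",
            "messageDigest": digest,
        },
        "event": {
            "eventType": "fixity check",
            "eventDateTime": datetime.now(timezone.utc).isoformat(),
        },
        "agent": {"agentName": agent_name},
    }


# The byte string stands in for the content of a digitised photograph.
record = fixity_event(b"example image bytes", "Hypothetical ingest service")
print(record["object"]["messageDigest"])
```

Comparing the stored digest with a freshly computed one at a later date is what allows an institution to detect whether a file has been altered or corrupted.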

The latest development in metadata standards for archives is the creation of RiC, which has been developed since 2012 by ICA (Clavaud & Wildi, 2021, pp. 79–80). RiC is structured into four complementary parts (ICA Expert Group on Archival Description, 2023, p. 1) intended to cover and replace existing archival standards such as ISAD(G):

Figure 3.5: Global Overview of the Core Entities Defined by the RiC Conceptual Model. Slightly Adapted from https://github.com/ICA-EGAD/RiC-O

3.2.3.3 Museum and Gallery Metadata Standards

In the museum and gallery domain, various metadata standards and conceptual models have significantly contributed to the management, organisation, and accessibility of CH objects and artworks. Notable among these are CDWA, CCO, LIDO, CIDOC-CRM, as well as Linked Art.

CDWA[45], developed in the mid-1990s and maintained by the Getty Vocabulary Program, and CCO[46] created by the VRA[47], introduced in the early 2000s, primarily focus on describing art and cultural artefacts, providing a framework for recording essential details like artist, title, medium, date, and provenance. CDWA is a comprehensive set of guidelines for cataloguing and describing various cultural objects, including artworks, architectural elements, material culture items, collections of works, and associated images. While not a data model itself, it offers a conceptual framework for designing data models and databases, as well as for information retrieval. It then evolved into CDWA Lite, an XML schema for data harvesting purposes (Baca & Harpring, 2017, pp. 1–2).

CCO comprises both rules and examples of the CDWA categories and the VRA Core 4.0 for describing, documenting, and cataloguing cultural works and their visual surrogates[48] (Coburn et al., 2010, pp. 17–18). Both CCO and CDWA are standards that the CIDOC[49] recommends and supports for museum documentation.

LIDO[50] is a CIDOC standard introduced in the early 2000s which offers a lightweight XML-based serialisation used for describing museum-related information, as shown in Code Snippet 3.2. It provides a format for the interchange of data about art and CH objects, complementing CDWA and CCO as it integrates and extends CDWA Lite with elements of CIDOC-CRM (Stein & Balandi, 2019, p. 1025). Ultimately, LIDO's goal is to enhance interoperability, accessibility, and the sharing of collection information, enabling institutions to connect and showcase their collections in diverse contexts (Coburn et al., 2010, p. 3). LIDO is also the name of a CIDOC Working Group; such groups are created to tackle particular issues or areas of interest[51].

Code Snippet 3.2: Example of a LIDO Object in XML from (Lindenthal et al., 2023)

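<lido:lido
    xmlns:lido="http://www.lido-schema.org"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">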
    <lido:lidoRecID
        lido:source="ld.zdb-services.de/resource/organisations/DE-Mb112"
        lido:type="http://terminology.lido-schema.org/lido00099">
        ld.zdb-services.de/resource/organisations/DE-Mb112/lido/obj/00076417
    </lido:lidoRecID>
    <lido:descriptiveMetadata xml:lang="en">
        <lido:objectClassificationWrap>
            <lido:objectWorkTypeWrap>
                <lido:objectWorkType>
                    <skos:Concept
                        rdf:about="http://vocab.getty.edu/aat/300033799">
                        <skos:prefLabel
                            xml:lang="en">
                            oil paintings (visual works)
                        </skos:prefLabel>
                    </skos:Concept>
                </lido:objectWorkType>
            </lido:objectWorkTypeWrap>
        </lido:objectClassificationWrap>
        <lido:objectIdentificationWrap>
            <lido:titleWrap>
                <lido:titleSet>
                    <lido:appellationValue
                        lido:pref="http://terminology.lido-schema.org/lido00169"
                        xml:lang="en">
                        Mona Lisa
                    </lido:appellationValue>
                </lido:titleSet>
            </lido:titleWrap>
        </lido:objectIdentificationWrap>
    </lido:descriptiveMetadata>
</lido:lido>
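
To show how such a record can be consumed, the following sketch reads a condensed version of the LIDO example with Python's standard library and extracts the title and the object work type. It assumes that the `lido`, `skos` and `rdf` prefixes are bound to the namespaces declared in the string below; the `summarise` function is a hypothetical helper and the record is shortened to the elements needed for the example.

```python
import xml.etree.ElementTree as ET

# Namespace bindings assumed for the prefixes used in the record.
NS = {
    "lido": "http://www.lido-schema.org",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

LIDO_XML = """<lido:lido xmlns:lido="http://www.lido-schema.org"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <lido:descriptiveMetadata xml:lang="en">
    <lido:objectClassificationWrap><lido:objectWorkTypeWrap><lido:objectWorkType>
      <skos:Concept rdf:about="http://vocab.getty.edu/aat/300033799">
        <skos:prefLabel xml:lang="en">oil paintings (visual works)</skos:prefLabel>
      </skos:Concept>
    </lido:objectWorkType></lido:objectWorkTypeWrap></lido:objectClassificationWrap>
    <lido:objectIdentificationWrap><lido:titleWrap><lido:titleSet>
      <lido:appellationValue xml:lang="en">Mona Lisa</lido:appellationValue>
    </lido:titleSet></lido:titleWrap></lido:objectIdentificationWrap>
  </lido:descriptiveMetadata>
</lido:lido>"""


def summarise(xml_text):
    """Return the title and the object work type (URI and label) of a LIDO record."""
    root = ET.fromstring(xml_text)
    title = root.find(".//lido:appellationValue", NS)
    concept = root.find(".//skos:Concept", NS)
    label = concept.find("skos:prefLabel", NS)
    return {
        "title": title.text.strip(),
        # Namespaced attributes are addressed as '{namespace}localname'.
        "type_uri": concept.get(f"{{{NS['rdf']}}}about"),
        "type_label": label.text.strip(),
    }


print(summarise(LIDO_XML))
```

The work type is identified by an AAT URI rather than by a free-text string, which is what allows different institutions to refer to the same concept.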
    

CIDOC-CRM[52], developed since 1996 by the CIDOC and more specifically maintained by the CRM-SIG (which convenes quarterly[53]), is a formal and top-level ontology that offers a comprehensive conceptual framework for describing CH resources, allowing for a deep understanding of relationships between different entities, events, and concepts for museums (Doerr, 2003, pp. 75–76). It aims to provide a common semantic framework for information integration, supporting robust knowledge representation and fostering collaboration and interoperability within the CH sector as it can also mediate different resources from libraries and archives. The latest stable version of the conceptual model is version 7.1.2[54], published in June 2022, which comprises 81 classes and 160 properties[55] (see Bekiari et al., 2021).

Within the base ontology of CIDOC-CRM – or CRMBase – and despite the emergence of new developments and gradual changes, there is a fundamental and stable core that can be succinctly outlined. This fundamental structure acts as a basic orientation for understanding the way in which data is structured within CIDOC-CRM. Examining the hierarchical structure of CIDOC-CRM, one can identify the main top-level branches, namely:

Complemented by entities tailored for the documentation of E41 Appellation and E55 Type, the structure – as shown in Figure 3.6 – provides a potent set of means to capture a broad range of general-level CH reasoning in a holistic manner [(Bruseker et al., 2017) pp. 111-112].

Figure 3.6: CIDOC-CRM Top-Level Categories by (Bruseker et al., 2017) [p. 112]

CRMBase is supplemented by a series of extensions, sometimes referred to as the CIDOC-CRM family of models, intended to support various types of specialised research questions and documentation, such as bibliographic records or geographical data. These compatible models[56], ordered alphabetically, include both works in progress and models to be reviewed by CRM-SIG[57]. They are as follows:

Figure 3.7 shows CRMbase and eight of the extensions previously outlined in a pyramid shape, where the lower you go in the pyramid, the more specialised the concepts.

Figure 3.7: CIDOC-CRM Family of Models. Diagram done and provided by Maria Theodoridou (Institute of Computer Science, FORTH)

Linked Art[69], a recent addition to this landscape, is a community-driven initiative and a metadata application profile that has been in existence since the end of 2016 (Raemy, 2022, pp. 136–137). This community – recognised as a CIDOC Working Group – has created a common Linked Data model based on CIDOC-CRM for describing artworks, their relationships, and the activities around them (see 3.5.5).
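
To give a first impression of what a CIDOC-CRM-based description looks like in Linked Art, the following sketch expresses the painting from Code Snippet 3.2 as a JSON-LD document built in Python. It is an assumption-laden illustration: the `id` is a placeholder on example.org, only two properties are shown, and the `labels` helper is hypothetical. The class and property names (`HumanMadeObject`, `classified_as`, `identified_by`) follow the Linked Art JSON-LD context.

```python
import json

mona_lisa = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/object/1",  # placeholder identifier
    "type": "HumanMadeObject",  # corresponds to E22 in CIDOC-CRM
    "_label": "Mona Lisa",
    # classified_as corresponds to P2 has type, pointing to an E55 Type
    "classified_as": [
        {
            "id": "http://vocab.getty.edu/aat/300033799",
            "type": "Type",
            "_label": "oil paintings (visual works)",
        }
    ],
    # identified_by corresponds to P1 is identified by
    "identified_by": [{"type": "Name", "content": "Mona Lisa"}],
}


def labels(record):
    """Return the content of every Name attached to a record."""
    return [
        entry["content"]
        for entry in record.get("identified_by", [])
        if entry.get("type") == "Name"
    ]


print(json.dumps(mona_lisa, indent=2))
print(labels(mona_lisa))
```

The same AAT concept that classifies the object in the LIDO record is reused here, which hints at how shared vocabularies support reconciliation across serialisations.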

3.2.3.4 Cross-domain Metadata Standards

There are a few cross-domain standards that have been used to describe CH resources. For instance, the Dublin Core Elements, the original core set of fifteen basic elements, and its extension, the Dublin Core Metadata Terms[70], are widely used metadata standards for describing CH resources. They provide metadata properties and classes that are applicable to a wide range of resources (Weibel & Koch, 2000). Another good example is the EDM, which has been specified so that national, regional and thematic aggregators in Europe can deliver the resources of content providers to Europeana (see Charles & Isaac, 2015; Freire & Isaac, 2019).

Despite the presence of cross-domain standards and efforts to map between standards, whether from one version to another or across different domains, reconciling metadata from various sources remains a significant challenge in the CH sector. Institutions may collect metadata in different ways, using different standards and schemas, making it difficult to merge and compare metadata from different sources. Additionally, metadata may be incomplete, inconsistent, or contain errors, further complicating data reconciliation. To address these challenges, standardised, interoperable metadata are necessary: they enable data sharing and reuse and promote the long-term preservation and accessibility of CH resources. Controlled vocabularies, included in what (Zeng & Qin, 2022) [pp. 24-25] called ‘standards for data value’, also contribute to this goal. Examples include those maintained by the Getty Research Institute[71], namely the AAT, the TGN, and the ULAN, as well as various kinds of KOS (see 3.2.4). These vocabularies provide a common language for describing CH objects and can improve the interoperability of metadata across different institutions and communities.
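
Mapping between standards is often done through a crosswalk, i.e. a table stating which field of one standard corresponds to which element of another. The sketch below maps a handful of MARC21 fields from Code Snippet 3.1 to Dublin Core elements. The mapping is a deliberately small assumption made for illustration and not an official crosswalk; `CROSSWALK` and `to_dublin_core` are hypothetical names.

```python
# (MARC tag, sub-field code) -> Dublin Core element; illustrative subset only.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("700", "a"): "creator",
    ("264", "b"): "publisher",
    ("264", "c"): "date",
    ("650", "a"): "subject",
    ("020", "a"): "identifier",
}


def to_dublin_core(fields):
    """Convert (tag, code, value) triples into a dict of Dublin Core elements.

    Unmapped sub-fields are silently dropped and duplicate values are kept once,
    which illustrates the loss of detail typical of crosswalks.
    """
    record = {}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element and value not in record.setdefault(element, []):
            record[element].append(value)
    return record


marc_fields = [
    ("020", "a", "9780838948750"),
    ("100", "a", "Zeng, Marcia Lei"),
    ("100", "0", "(DE-588)136417035"),  # authority link, lost in this mapping
    ("245", "a", "Metadata"),
    ("264", "c", "2022"),
    ("650", "a", "Metadata"),
    ("650", "a", "Metadata"),
    ("650", "a", "Metadaten"),
    ("700", "a", "Qin, Jian"),
]

print(to_dublin_core(marc_fields))
```

The authority identifier in sub-field `$0` has no counterpart in this simple mapping, a concrete instance of the information that can be lost when reconciling records across standards.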

Alongside metadata reconciliation also comes the question of aggregation. Apart from LIDO in museums, the general and current operating model for aggregating CH (meta)data is still the OAI-PMH (see Raemy, 2020), an XML-based standard that was initially specified in 1999 and updated in 2002 (Lagoze et al., 2002). However, OAI-PMH does not align with contemporary needs (Van de Sompel & Nelson, 2015), and alternative web-based technologies for harvesting resources are slowly being adopted, such as AS (Snell & Prodromou, 2017), a W3C syntax and vocabulary for representing activities and events in social media and other web applications. AS can also be easily extended and used in different contexts, as is the case with the IIIF Change Discovery API (see 3.5.3.3) or with ActivityPub (Lemmer-Webber & Tallon, 2018), a decentralised W3C protocol leveraged by Mastodon[72], a federated and open-source social network.
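
The difference between the two harvesting approaches can be hinted at with a short sketch. The first function builds an OAI-PMH `ListRecords` request, in which a harvester asks a repository for records in a given format; the second builds a single AS `Update` activity of the kind a publisher could list to announce that a resource has changed. The base URL and resource identifier are placeholders on example.org, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no HTTP request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def update_activity(object_id, object_type, end_time):
    """Describe a change to a resource as an Activity Streams 2.0 activity."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "object": {"id": object_id, "type": object_type},
        "endTime": end_time,
    }


print(list_records_url("https://example.org/oai"))
print(update_activity("https://example.org/iiif/1/manifest", "Manifest", "2024-01-01T00:00:00Z"))
```

In the first case the response is an XML document wrapping the metadata records themselves, whereas the activity only points to the changed resource, which the consumer then retrieves in its own native format.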

Overall, the evolution of metadata standards in the CH domain paves the way for a more interconnected and accessible digital environment, thereby providing better access to disparate collections and facilitating cross-domain reconciliation. This transformation is complemented by a growing emphasis on web-based metadata aggregation technologies that are more suited to today’s needs.

3.2.4 Knowledge Organisation Systems

KOS, also known as concept systems or concept schemes, encompass a wide range of instruments in the area of knowledge organisation. They are distinguished by their specific structures and functions (Mazzocchi, 2018, p. 54). KOS include authority files, classification schemes, thesauri, topic maps, ontologies, and other related structures. Despite their differences in nature, scope and application, all share a common goal: to facilitate the structured organisation of knowledge and classification of information. According to (Zeng & Qin, 2022) [p. 284], KOS have a more important function: ‘to model the underlying semantic structure of a domain and to provide semantics, navigation, and translation through labels, definitions, typing relationships, and properties for concepts’. This overarching intent underpins the practice of information management and retrieval.

The term KOS ‘became even more popular after the encoding standard Simple Knowledge Organization System (SKOS) was recommended by W3C’, although the use of some such systems can be traced back over 100 years, whereas others were created with the advent of the web [(Zeng & Qin, 2022) p. 188]. According to (Hill et al., 2002) [pp. 46-47, citing (Hodge, 2000)], KOS can be divided into four main groups: term lists, metadata-like models, classification and categorisation, as well as relationship models.

Term lists encompass authority files, dictionaries, and glossaries, serving as controlled sources for managing terms, definitions, and variant names within a knowledge organisation framework. Metadata-like models encompass directories and gazetteers, offering lists of names and associated contact information as well as geospatial dictionaries for named places, which can be extended to represent events and time periods. The classification and categorisation group comprises categorisation schemes and classification schemes that organise content, subject headings that represent controlled terms for collection items, and taxonomies that group items based on specific characteristics. Finally, relationship models feature ontologies, semantic networks, and thesauri, each capturing complex relationships between concepts and terms [(Hill et al., 2002); (Zeng, 2008)]. Figure 3.8 gives an overview of the structure and functions of these four main groups, together with the subcategories of KOS previously mentioned. In this figure, the x characters indicate the extent to which each type of KOS embodies five key functions identified by (Zeng, 2008), such as eliminating ambiguity or controlling synonyms.

In this subsection, I will explore four subcategories of KOS, each representing a continuum from a more linear to a more structured network. These include folksonomy, taxonomy, thesaurus, and ontology. These KOS have been selected due to their significant impact on the organisation and interlinking of data within the contexts of CHI practices and LOD. Furthermore, the intent of these systems is to help bridge the gap between human understanding and machine processing.

Figure 3.8: Overview of the Structures and Functions of KOS [(Zeng, 2008) p. 161]

3.2.4.1 Folksonomy

Positioned at one end of the organisational spectrum, folksonomies, also known as community tagging or social bookmarking, are characterised by their user-generated nature. These systems rely on individual users’ tagging of content with keywords or tags that reflect their personal perspectives and preferences. Integrating or reconciling folksonomies is often hard to achieve [(Zeng & Qin, 2022) p. 401]. However, they do provide a wealth of source material for studying social semantics [(Zeng & Qin, 2022) p. 403] and can be used in parallel with more structured KOS.
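
The difficulty of reconciling user-generated tags can be illustrated with a few lines of code. In this sketch, the tags are invented for the example and the normalisation rule (trimming, lower-casing, replacing hyphens) is a naive assumption; real folksonomies also involve synonyms, different languages and ambiguous terms that such a rule cannot resolve.

```python
from collections import Counter


def tag_counts(tags, normalise=False):
    """Count tags, optionally after a naive normalisation step."""
    if normalise:
        tags = [tag.strip().lower().replace("-", " ") for tag in tags]
    return Counter(tags)


# Invented tags as different users might assign them to the same photograph.
user_tags = ["Alps", "alps", " alps ", "mountain-hut", "mountain hut", "Berghütte"]

print(tag_counts(user_tags))
print(tag_counts(user_tags, normalise=True))
```

After normalisation the spelling variants collapse, but the German tag still stands apart from its English equivalent, which is the kind of link a more structured KOS is designed to express.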

3.2.4.2 Taxonomy

Moving towards the centre of the spectrum, taxonomies present a more structured approach to knowledge organisation [(Zeng, 2008) p. 169].

Taxonomies employ hierarchical classifications to systematically categorise information into distinct classes and sub-classes, or in a parent/child relationship (SAA Dictionary, 2023) - as shown by Code Snippet 3.3 (NISO, 2010, p. 18). Taxonomy, in this context, extends beyond mere categorisation; it also establishes relationships.

Code Snippet 3.3: Taxonomy Hierarchy

Chemistry
    Physical Chemistry
        Electrochemistry
            Magnetohydrodynamics
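
The same parent/child hierarchy can be sketched as a nested data structure. The following Python fragment is illustrative only: the `TAXONOMY` dictionary and the `path_to` helper are hypothetical names introduced here, not part of (NISO, 2010).

```python
from typing import Optional

# Nested dictionary mirroring Code Snippet 3.3: each key is a class and
# each value holds its sub-classes (parent/child relationship).
TAXONOMY = {
    "Chemistry": {
        "Physical Chemistry": {
            "Electrochemistry": {
                "Magnetohydrodynamics": {},
            },
        },
    },
}


def path_to(term: str, tree: dict, trail: tuple = ()) -> Optional[list]:
    """Return the chain of classes from the root down to `term`, or None."""
    for label, children in tree.items():
        current = trail + (label,)
        if label == term:
            return list(current)
        found = path_to(term, children, current)
        if found is not None:
            return found
    return None


print(" > ".join(path_to("Electrochemistry", TAXONOMY)))
```

Walking the tree in this way makes the hierarchical position of a class explicit, which is the main service a taxonomy provides beyond a flat term list.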
    
3.2.4.3 Thesaurus

Moving further along the spectrum, thesauri offer a more detailed and formalised method of organisation. They include not only hierarchical relationships but also explicit semantic connections between terms, making them valuable tools for information retrieval. As defined by (NISO, 2010) [p. 9]:

A thesaurus is a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators.

For instance, consider a thesaurus related to photography, which encompasses categories for various aspects of photography, including photographic techniques, equipment, and materials. Within this thesaurus, ‘Kodachrome’ could be categorised not only as a specific type of colour film but also as a distinct photographic process. As a type, it could fall under the sub-category of ‘colour film photography’, and as a process, it would fit within the broader framework of ‘photographic techniques’.
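
The Kodachrome example, where one term sits under two broader terms, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `BROADER` mapping and both helper functions are hypothetical, and the top term ‘photography’ is added here only to close the hierarchy.

```python
# Hypothetical thesaurus fragment: each term lists its broader terms (BT).
# 'photography' as the top term is an assumption added for illustration.
BROADER = {
    "Kodachrome": ["colour film photography", "photographic techniques"],
    "colour film photography": ["photography"],
    "photographic techniques": ["photography"],
    "photography": [],
}


def narrower_terms(term):
    """Derive narrower terms (NT) by inverting the BT relationship."""
    return sorted(t for t, broader in BROADER.items() if term in broader)


def all_broader_terms(term):
    """Collect every broader term reachable from `term`, nearest first."""
    seen = []
    queue = list(BROADER.get(term, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(BROADER.get(current, []))
    return seen


print(all_broader_terms("Kodachrome"))
print(narrower_terms("photography"))
```

Unlike the strict tree of a taxonomy, a term here may have several parents, which is what allows a search for either ‘colour film photography’ or ‘photographic techniques’ to retrieve items indexed with ‘Kodachrome’.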

The AAT, commonly employed in the CH domain, stands as a significant example of a thesaurus (Harpring, 2010, p. 67). Homosaurus[73] is another example of a thesaurus with a distinct focus on enhancing the accessibility and discoverability of LGBTQ+ resources and related information. Leveraging Homosaurus in metadata can effectively contribute to diminishing biases present in such data, an essential step in promoting inclusivity and equity within information systems (see Hardesty & Nolan, 2021).

3.2.4.4 Ontology

At the structured end of the spectrum, ontologies define complex relationships and attributes between concepts, whereby a series of concepts have been chosen to express what we understand, so that a computer can start making sense of our world. Ontologies are formalised KOS, enabling advanced data integration and KR for more sophisticated applications. The term is drawn from philosophy, where ontology is the discipline concerned with studying the nature of existence, as articulated by (Gruber, 1993) [pp. 199-200]:

An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what “exists” is exactly that which can be represented.

There are different kinds of ontologies, including axiomatic formal ontologies, foundational ontologies, and domain-specific ontologies (Beretta, 2022). These different types of ontologies cater to various knowledge representation needs. Foundational ontologies, such as DOLCE[74], provide a high-level framework for modelling knowledge and offer a comprehensive system for representing entities, qualities, and relationships (Borgo et al., 2022; see Masolo et al., 2003).

DLs, a family of formal KR languages, also play a key role in developing ontologies and serve as the foundation for OWL (see 3.4.2), notably by providing a logical formalism. DLs are characterised by their ability to provide substantial expressive power that goes well beyond propositional logic, while maintaining decidable reasoning (Chang et al., 2014).

In computer science, the concepts of ABox and TBox, both statements in KBs, are relevant to the structuring and enrichment of KGs (Giacomo & Lenzerini, 1996)[75]. The ABox, representing the ‘assertion’ or ‘instance’ level, encapsulates concrete data instances and their relationships, contributing to the factual knowledge of a given system. Conversely, the TBox, representing the ‘terminology’ or ‘schema’ level, defines the conceptual framework and hierarchies that govern the relationships and attributes of the instances. These two complementary components work in harmony to improve data interoperability, reasoning and knowledge sharing. Figure 3.9 depicts a high-level overview of a KB representation system.

Figure 3.9: Knowledge Base Representation System Based on (Patrón et al., 2011) [p. 205]

Consider a scenario around artwork provenance held in a museum. The ABox strives to encapsulate the rich narratives of individual artworks, tracing their journey through time, ownership transitions and exhibition travels. At the same time, the TBox creates a conceptual scaffolding, imbued with classes such as Artwork, Creator, and Exhibition, painting an abstract portrait that contextualises each artefact within a broader cultural tapestry. It is here that the DL comes in, harmonising the symphony with its logical relationships and axioms, i.e. rules or principles widely accepted as obviously true (Baader & Lutz, 2007). A DL knowledge base is represented as 𝒦 = (𝒯, ℛ, 𝒜), where 𝒯 denotes the TBox (terminological axioms), ℛ the RBox (role axioms), and 𝒜 the ABox (assertional axioms).

This symbiotic interplay ensures that the provenance of each artwork is not just a static account, but a dynamic, interconnected narrative. The ABox-TBox relationship thrives in the realm of reasoning. Imagine an axiom embedded in the TBox: ‘A work of art presented in an exhibition curated by a distinguished patron is of heightened cultural significance’, or here phrased in DL terms: ∃ curates.Artwork.CulturalSignificancetrue. This axiom serves as a beacon to guide the system’s reasoning. When an ABox instance of an artwork is woven into an exhibition curated by a prominent authority, the DL-informed engine responds by inferring an enriched cultural value that resonates beyond the artefact itself. This is where the TBox takes data and gives it life, producing insights that transcend the boundaries of individual instances. The KB, 𝒦, captures this orchestration, encapsulating the logical relationships for meaningful interpretation and knowledge discovery.
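
The interplay between TBox and ABox can be made concrete with a minimal Python sketch. It is not a DL reasoner and does not reproduce the formalisation used in this thesis. It assumes one possible reading of the axiom, namely that an Artwork shown in some Exhibition curated by some DistinguishedPatron is culturally significant. All class, role, and individual names (`Painting`, `shownIn`, `curatedBy`, `painting_1`, and so on) are hypothetical.

```python
# TBox: schema-level axioms (hypothetical class names).
SUBCLASS_OF = {"Painting": "Artwork"}

# ABox: instance-level assertions (hypothetical individuals and roles).
TYPES = {
    "painting_1": {"Painting"},
    "painting_2": {"Painting"},
    "exhibition_1": {"Exhibition"},
    "patron_1": {"DistinguishedPatron"},
}
ROLES = {
    ("painting_1", "shownIn", "exhibition_1"),
    ("exhibition_1", "curatedBy", "patron_1"),
}


def classes_of(individual):
    """Asserted classes plus every superclass entailed by the TBox."""
    result = set()
    for cls in TYPES.get(individual, set()):
        while cls is not None:
            result.add(cls)
            cls = SUBCLASS_OF.get(cls)
    return result


def is_culturally_significant(individual):
    """Assumed axiom: an Artwork shown in an Exhibition that is curated
    by a DistinguishedPatron is inferred to be culturally significant."""
    if "Artwork" not in classes_of(individual):
        return False
    for subj, role, exhibition in ROLES:
        if subj != individual or role != "shownIn":
            continue
        if "Exhibition" not in classes_of(exhibition):
            continue
        for subj2, role2, curator in ROLES:
            if (subj2 == exhibition and role2 == "curatedBy"
                    and "DistinguishedPatron" in classes_of(curator)):
                return True
    return False


for artwork in ("painting_1", "painting_2"):
    print(artwork, is_culturally_significant(artwork))
```

Nothing in the ABox states that `painting_1` is an Artwork or that it is significant; both facts are derived by combining instance data with schema-level axioms, which is the kind of inference the scenario describes.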

Overall, the relationship between ABox and TBox in DL is vital for achieving semantic clarity, enabling meaningful data integration, and facilitating advanced reasoning mechanisms. The museum provenance scenario showcases a precisely orchestrated convergence of assertion, terminology, and rigorous logical reasoning. This engenders a computational landscape where historical artefacts intricately mesh within the complex network of human history’s data structures, seamlessly aligning with the underlying framework of algorithmic representation. These components enable software developers to harmonise disparate datasets, extract insightful knowledge, and support decision-making processes across a wide range of domains. In essence, the use of DL, ABox, and TBox in ontological KR enhances interoperability between different systems and allows for sophisticated reasoning and decision support.

Moving beyond these foundational concepts, it is noteworthy to consider the work of (Ehrlinger & Wöß, 2016), who address the need for a clear and standardised definition of KGs. They highlight the term’s varied interpretations since its popularisation by Google in 2012 and propose a definitive, unambiguous definition to foster a common understanding and wider adoption in both academic and commercial realms. They define a KG as follows: ‘A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge’. This definition crystallises the essence of KGs as dynamic and integrative systems that not only store but also process and enrich data through advanced reasoning. This conceptualisation underlines the transformative potential of KGs in various domains, bridging the gap between raw data and actionable insights.

Finally, it is important to recognise that the importance of ontologies extends beyond individual systems. Shared ontologies are a cornerstone of semantic interoperability, thus facilitating a paradigm shift in the way systems and applications communicate. As (Sanderson, 2013) argues: ‘shared ontologies increases semantic interoperability’ and ‘shared identity makes it possible for graph to merge serendipitously’. This shared understanding ensures that various entities can seamlessly connect and engage in meaningful interactions.
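
The point about shared identity can be sketched with plain sets of subject-predicate-object triples. The fragment below is a hedged illustration: the `https://example.org/` identifiers, the predicate names, and the two datasets are invented for the purpose, and no RDF library is involved.

```python
# Hypothetical base URI and datasets, invented for illustration.
EX = "https://example.org/"

museum_graph = {
    (EX + "artwork/1", "title", "Village square"),
    (EX + "artwork/1", "createdBy", EX + "person/42"),
}
archive_graph = {
    (EX + "person/42", "name", "Photographer A"),
    (EX + "artwork/1", "createdBy", EX + "person/42"),
}

# Because both sources use the same identifiers, merging is a set union:
# duplicate statements collapse and the descriptions connect.
merged = museum_graph | archive_graph


def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Follow the link from the artwork to the name of its creator,
# a path that exists in neither source graph on its own.
creator = objects(merged, EX + "artwork/1", "createdBy")[0]
print(objects(merged, creator, "name"))
```

Had the two institutions minted different identifiers for the same person, the union would still succeed but the path from artwork to name would be missing, which is why shared identity matters as much as shared vocabulary.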

3.3 Trends, Movements, and Principles

Technological trends, scientific movements, and guiding principles have played a crucial role in shaping the landscape of contemporary research. In recent years, there has been an increased emphasis on the need for academic and CH practices to be more transparent, inclusive, and accountable. This shift reflects a broader trend towards integrating advanced technological solutions and open-science principles in heritage management. As such, understanding the evolution of CH becomes imperative to comprehend how these practices have adapted and transformed in response to these guiding trends.

The evolution of CH has been characterised by a series of technological and methodological shifts. Initially, the primary focus was on digitising physical artefacts to preserve information from degrading originals. This phase was crucial for transitioning tangible CH into a digital format, mitigating the risk of loss due to physical degradation. Following this, efforts shifted towards ensuring the persistence of digitised resources. This stage involved addressing challenges related to digital preservation, including data degradation and format obsolescence, to ensure the longevity of digital cultural assets.

The advent of open data principles marked the next phase in CH development. This approach facilitated broader access to information, aligning with contemporary values of transparency and inclusivity in governmental, academic, and cultural contexts. Subsequently, the focus expanded to enhancing the utility of this data. This stage involved contextualising and enriching CH data, thereby increasing their applicability and relevance across various domains.

The current frontier in CH involves developing applications that leverage rich CH data. These applications serve not only as tools for engagement and education but also as justifications for the ongoing costs associated with data storage and archival. They illustrate the tangible benefits derived from preserving heritage resources, encompassing both cultural and economic returns. In summary, the trajectory of CH development mirrors broader technological and societal trends, transitioning from preservation to active utilisation. This progression underscores the dynamic nature of research and CH processes, highlighting the evolving requirements for transparency, inclusivity, and accountability in CH management.

While automation has significantly enhanced the efficiency of digitisation processes in CH, cataloguing and indexing remain complex challenges. The intricacies involved in accurately understanding and categorising resources necessitate more than just technological solutions; they require context-aware and culturally sensitive approaches. Here, ML offers promising perspectives. ML, particularly in its advanced forms like deep learning, can assist in cataloguing and indexing by analysing large datasets to identify patterns, categorise content, and even suggest metadata. This can be particularly useful in handling large volumes of CH data, where manual processing is time-consuming and prone to human error. Typical applications of ML in this field include image recognition for identifying and classifying visual elements in artefacts, NLP for analysing textual content, and pattern recognition for sorting and organising data based on specific characteristics. Furthermore, prospective developments may entail the refinement of metadata mapping and the enhancement of quality control mechanisms. Moreover, ML algorithms can be trained to recognise stylistic elements, historical contexts, and other nuances that are essential for accurate cataloguing in CH. However, it is crucial to note that the effectiveness of ML depends heavily on the quality and diversity of the training data. Biases in this data can lead to inaccuracies in cataloguing and indexing. Thus, a collaborative approach, where ML is supplemented by expert human oversight, is often the most effective strategy.

Overall, this section provides a comprehensive overview of three[26:1] technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and GLAMs should provide environments, services, and tools with a view to collecting and disseminating content. By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and CH processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users.

I will explore some current and emerging technological trends in CH, organised into three components: Linked Data, big data, and AI. Each represents a critical driver shaping the landscape and practices of heritage data. The three trends have been around for a few decades, with the ‘Linked Data’ principles and underlying standards coming from the late 1990s, ‘big data’ being coined in 1990 and AI in 1956.

Before considering the trends discussed hereafter, note that current technological developments do not exist in isolation, but tend to intertwine and act synergistically. A vivid example of this interplay can be seen in AI and its latent impact on the semantic web, particularly in facilitating more efficient querying and crawling processes such as the LinkedDataGPT proof-of-concept service[76] from Liip on the City of Zurich that combines ChatGPT — a generative AI solution — on top of a Linked Data portal to facilitate querying open datasets (Stocker, 2023). Inversely AI can be fed by data on the web to learn and reason, as outlined by

3.3.1.1 Linked Data

Linked Data, and more precisely LOD, is a set of design principles, adhering to RDF, for interconnecting data on the web in order to make semantic queries more useful (Berners-Lee et al., 2001). In other words, this standardisation allows data to be not only linked, but also openly accessible and reusable. As noted by (Gandon, 2019) [p. 115, citing (Gandon, 2017)]:

The Web was initially perceived and used as a globally distributed hypertext space for humans. But from its inception, the Web has always been more: its hypermedia architecture is in fact linking programs world-wide through remote procedure calls.

This deeper understanding of the web’s architecture as a conduit for linking programs on a global scale holds profound implications. It signifies that the web is not merely a medium for accessing information but a dynamic environment where data-driven programs interact, exchange data, and collaborate across geographical boundaries. In this context, Linked Data emerges as a powerful enabler, providing a structured and standardised approach for these programs to communicate and share meaningful data (Bizer et al., 2008).

In the context of CH, institutions such as museums, libraries and archives can publish their collections using Linked Data principles, enabling a web of linked information that is accessible to all. As this dissertation’s main topic revolves around Linked (Open) (Usable) Data, two dedicated sections have been written within this literature review in Section 3.4 and Section 3.5.

Beyond formal LOD, CHIs may also link their databases or collections in more informal ways. This interconnection may take the form of shared metadata, common identifiers, or simply hyperlinks. These links can enhance the user experience by supporting a more seamless navigation between related items or pieces of information. For instance, a parallel strategy is the use of graph-based data representation, i.e. a property graph, which consists of a set of objects or vertices and a set of arrows or edges connecting the objects, and which is most likely not RDF-compliant (see Bermès, 2023). Graph databases, such as Neo4j[77], which is quite prevalent in DH (Darmont et al., 2020; Drakopoulos et al., 2019; see Webber, 2012), allow for efficient storage and retrieval of interconnected data through nodes representing entities and relationships linking them.
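
To make the property graph model more tangible, here is a minimal in-memory sketch in Python. It does not use Neo4j or any graph database API. The `Node` and `Edge` classes, the `TOOK` relationship, and the sample data are hypothetical and serve only to show that both vertices and edges can carry key-value properties.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    # Edges carry their own properties, which a plain
    # subject-predicate-object triple cannot express directly.
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
nodes = {
    "p1": Node("p1", {"Person"}, {"name": "Photographer A"}),
    "i1": Node("i1", {"Photograph"}, {"title": "Village square"}),
}
edges = [Edge("p1", "i1", "TOOK", {"year": 1950})]


def outgoing(node_id, rel_type):
    """Return (target node, edge properties) pairs for one relationship type."""
    return [
        (nodes[e.target], e.properties)
        for e in edges
        if e.source == node_id and e.rel_type == rel_type
    ]


for target, props in outgoing("p1", "TOOK"):
    print(target.properties["title"], props)
```

The design choice to attach `year` to the relationship rather than to either node is typical of property graphs and is one reason such data does not map one-to-one onto RDF statements.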

3.3.1.2 Big Data

Big Data refers to extremely large and complex datasets that exceed the capabilities of traditional data processing methods and tools. It encompasses a massive volume of structured, semi-structured and unstructured data that is currently flooding across a variety of sectors, companies and organisations (see Emmanuel & Stanier, 2016). The characteristics of big data are often described by the three V model (Laney, 2001): volume, velocity, and variety.

In addition to the three V model, two more characteristics are often included (Saha & Srivastava, 2014, p. 1294): veracity and value.

Regarding the two latter dimensions, (Debattista et al., 2015) argue that Linked Data is the most suitable technology to increase the value of data over conventional formats, thus contributing towards the value challenge in Big Data. As for veracity, they describe a semantic pipeline with eight key metrics to address the veracity dimension. Building on this technological foundation, the integration of Linked Data and Big Data analytics takes centre stage.

Big data analytics can be employed on CH content to uncover insights and correlations that can be used in decision-making. (Barrile & Bernardo, 2022) [p. 2708] highlight the transformative potential of using big data by investigating how an analytical approach can enhance conservation strategies, aid resource allocation and optimise the management of CH resources. (Poulopoulos & Wallace, 2022) [pp. 188-189] emphasise that emerging technology trends, including big data, have a significant impact on related research areas such as CH. Big data primarily originates from sources such as social media, online gaming, data lakes[80], logs and frameworks that generate or use significant amounts of data. They stress that the incorporation of multi-faceted analytics in the CH domain is an area of active research, and present a data lake that provides essential user and data/knowledge management functionalities. However, they emphasise a crucial consideration: the need to bridge the theoretical foundations of disciplines such as cultural sociology with the technological advances of big data.

3.3.1.3 Artificial Intelligence

The term AI was coined by John McCarthy, an American computer scientist and cognitive scientist, during the 1956 Dartmouth Conference, which is often considered the birth of AI as an academic field (Andresen, 2002, p. 84). According to the (Oxford English Dictionary, 2023), AI is described as follows:

The capacity of computers or other machines to exhibit or simulate intelligent behaviour; the field of study concerned with this. In later use also: software used to perform tasks or produce output previously thought to require human intelligence, esp. by using machine learning to extrapolate from large collections of data.

While AI is not the central focus of my PhD thesis, I acknowledge its impact in several instances. As a rapidly developing technology, AI has the potential to significantly transform various aspects of society, including the way we describe, analyse, and disseminate CH resources. It is worth mentioning that I endeavour to engage in a broader discourse concerning the domain of AI. In this context, I use the acronym AI to talk about the overarching domain or its ethics, and ML to discuss the specifics of methodologies and algorithmic approaches, while refraining from delving into the intricacies of Deep Learning, which is a distinct subdomain within ML.

AI and ML offer great potential for digitising, curating and analysing CH, leveraging the vast digital datasets from CHIs. Some of the examples include text recognition mechanisms using OCR and HTR, NLP and NER for enriching unstructured text, as well as object detection methods for finding patterns within still and moving images (Neudecker, 2022; Sporleder, 2010). Textual works can also be analysed, for instance for sentiment analysis (see Susnjak, 2023), and generated using LLMs, a variety of NLP models such as BERT or ChatGPT that predict the likelihood of a word given the previous words present in recorded texts. However, challenges such as data quality and biases in AI persist (Neudecker, 2022).

In addition, there are still uncertainties regarding the licensing and reuse of CH datasets by ML algorithms[81]. (Neudecker, 2022) emphasises the importance of well-curated digitised CH resources that are openly licensed, accompanied by relevant metadata, and accessible through APIs or download dumps in various formats. These curated resources have the potential to address the existing gap in this domain.

Building on the theme of enhancing CH through digital technologies, (McGillivray et al., 2020) explore the synergies and challenges found at the intersection of DH and NLP. DH is aptly described as ‘a nexus of fields within which scholars use computing technologies to investigate the kinds of questions that are traditional to the humanities […] or who ask traditional kinds of humanities-oriented questions about computing technologies’ (Fitzpatrick, 2010). This broad characterisation encapsulates the transformative potential of digital tools, including ML techniques, in enriching humanities research.

(McGillivray et al., 2020) highlight the critical need for bridging the communication gap between DH and NLP to drive progress in both fields. They propose increased interdisciplinary collaboration, encouraging DH researchers to actively utilise NLP tools to refine their research methodologies. A primary challenge in this convergence is the application of NLP to the complex, historical, or noisy texts often encountered in DH research. They conclude by advocating for stronger cooperation between practitioners in these fields. This collaborative effort is vital for harnessing the full potential of ML in analysing and interpreting CH.

The use of ML scripts in the context of CH, and beyond, is inherently limited by their applicability, particularly when dealing with historical photographs. In such cases, the use of algorithms that are mostly trained and grounded in contemporary image data becomes quite incongruous due to the dissimilarity in temporal contexts. This dilemma is exemplified by datasets such as Microsoft’s Common Objects in Context (COCO)[82] (Lin et al., 2014), where the available data are predominantly contemporary photographic content, which is misaligned with the historical nuances inherent in most of the digitised CH images. (Coleman, 2020) corroborates that a sound approach would be for ML practitioners to collaborate with libraries as they can draw practical lessons from critical data studies and the thoughtful integration of AI into their collections, using guidelines from DH. She also argues that handing over datasets would be a disservice to library patrons and that ‘Librarians need to master the instruments of AI and employ them both to learn more about their own resources—to see and analyze them in new ways—and to help shape applications of AI with the expertise and ethos of libraries.’

Ethical concerns, particularly regarding social biases and racism, are prevalent in technologies like ImageNet, where facial recognition may yield AI statements with strong negative connotations (Neudecker, 2022). Addressing this, (Gandon, 2019) suggests the production of AI services that are ‘benevolent-by-design for the good of the Web and society’. Furthermore, (Floridi, 2023) introduces the double-charge thesis, asserting that all technology design is a moral act, challenging the neutrality thesis. He emphasises that technologies are not neutral and can be influenced by a dynamic equilibrium of values, predisposing them towards morally good or evil directions.

As mentioned previously, ML training datasets are often not representative enough to be properly leveraged in the CH sector (Strien et al., 2022). Fine-tuning is now receiving attention, however, and new ground truth datasets have been created and tailored to the needs of CH, such as Viscounth[83], a large-scale VQA dataset (i.e. a dataset containing open-ended questions about images which requires an understanding of vision, language and commonsense knowledge to answer (Goyal et al., 2017)) for CH in English and Italian (see Becattini et al., 2023).

(Jaillant & Caputo, 2022) argue that the governance of AI ought to be carried out in partnership with GLAM institutions. However, while this collaboration has been proposed as a promising way forward, it still requires further exploration and evaluation, particularly with regards to the specific challenges and opportunities that it presents. On the one hand, the involvement of GLAMs in AI governance could enhance the development of digital CH projects that promote social justice and equity. However, on the other hand, this collaboration raises several challenges, such as the need to address issues of privacy, data protection, and intellectual property rights, and to ensure that the values and perspectives of GLAM professionals are adequately represented in the development of AI algorithms and systems. Therefore, it is crucial to examine the specific challenges and opportunities of this collaboration and to develop appropriate frameworks and guidelines that enable effective and ethical governance of AI in the GLAM sector.

One platform that addresses these issues is AI4LAM, which is an international and participatory community focused on advancing the use of AI in, for and by libraries, archives, and museums[84]. The initiative was launched by the National Library of Norway and Stanford University Libraries in 2018, inspired by the success of the IIIF community. Another agency is the AEOLIAN Network[85], AI for Cultural Organisations, which investigates the role that AI can play to make born-digital and digitised cultural records more accessible to users (Jaillant & Rees, 2023, p. 582).

As an illustrative case, the LoC's exploration into ML technologies, as highlighted by (Allen, 2023), demonstrates a strategic commitment to enhancing the accessibility and utility of its diverse collections. This initiative reflects the LoC's acknowledgement of the transformative potential of ML, balanced with a cautious approach due to the necessity for accurate and responsible information stewardship. The LoC faces several challenges in applying ML, particularly the limitations of commercial AI systems in handling its varied materials and the requirement for substantial human intervention. This cautious exploration into ML is indicative of a broader trend in CHIs, where maintaining a balance between embracing technological advancements and preserving authenticity and integrity is crucial.

The specific experiments and projects undertaken by the LoC in the realm of ML are diverse and illustrative of the institution’s comprehensive approach to innovation. For instance, image recognition systems have been tested for identifying and classifying visual elements in artefacts, a task that requires a nuanced understanding of historical and cultural contexts. In another initiative, speech-to-text technology was employed to transcribe spoken word collections, confronting challenges such as accent recognition and audio quality variation. Additionally, the LoC explored the potential of ML in enhancing search and discovery capabilities through projects like Newspaper Navigator[86], which aimed to identify and extract images from digitised newspaper pages.

These experiments not only highlight the potential of ML in transforming the way LoC manages and disseminates its collections but also reveal the complexities and limitations inherent in these technologies. As (Allen, 2023) notes, the ongoing research and experimentation in ML at the LoC are critical in revolutionising access and discovery in the cultural heritage sector. These efforts, while facing challenges, represent a diligent integration of advanced technologies, upholding principles of responsible custodianship and setting a precedent for similar institutions globally in the adoption and adaption of ML and AI in CHIs.

The integration of LLM and KG presents a groundbreaking opportunity, particularly within the realm of CHIs, where there is already considerable expertise. This is aptly demonstrated in the work of (Pan et al., 2023), which elucidates the harmonisation between explicit knowledge and parametric knowledge, i.e. knowledge derived from patterns in data, as learned by models such as LLMs. The authors highlight three key areas for the advancement of KR and processing:

  1. Knowledge Extraction, where LLMs improve the extraction of knowledge from diverse sources for applications such as information retrieval and KG construction;
  2. Knowledge Graph Construction, which involves LLMs in tasks such as link prediction and triple extraction from data, albeit with challenges in precision and management of long tail entities;
  3. Training LLMs Using KGs, where KGs provide structured knowledge for LLMs, helping to build retrieval-augmented models on the fly, enriching LLMs with world knowledge and increasing their adaptability.

In a report for the University of Leeds in the UK, (Pirgova-Morgan, 2023) explores the potential and practical implications of AI in libraries. The project, forming part of the university’s ambitious vision for digital transformation, aims to understand how AI can be effectively integrated into library services. This research looks at both the use of general AI for long term strategic planning and specific AI applications for improving UX, process optimisation and enhancing the discoverability of collections. The methodology used in this study involves a multi-faceted approach including desk-based assessments, a university-wide survey and expert interviews. Specifically, the study highlights the following key findings:

These insights from the University of Leeds report illustrate the complex impact of AI on library services, from enhancing user interaction to influencing strategic decision-making, while also emphasising the importance of adapting AI applications to specific institutional needs.

It must also be stated that AI lacks inherent intelligence and consciousness, and has ultimately been built by people. An important concern, particularly with LLMs, is the perceptual illusion of cognitive interaction, where the machine appears to be engaging in dialogue and reasoning, when in fact it is generating content through predictive algorithms (see Ridge, 2023). Furthermore, regarding the topic of data colonialism, poor people in underprivileged nations are often burdened with the responsibility of cleaning up the toxic repercussions of AI, shielding affluent individuals and prosperous countries from direct exposure to its harmful effects[87].

Concluding this segment, it is essential to perceive ML algorithms as uncertain ‘socio-material configurations’, which can be seen as both powerful and inscrutable, demanding an axiomatic and problem-oriented approach in their understanding and application. (Jaton, 2017) elaborates on this by examining how these algorithms, while technologically complex, are firmly rooted in and shaped by the social, material, and human contexts in which they are developed. Beyond their computational complexity, these algorithms are deeply embedded in the process of constructing ‘ground truths’. These ground truths are not inherent or fixed; instead, they emerge from collaborative efforts that reflect the varied inputs of actors. This process underscores the algorithms as socio-material constructs, influenced by the characteristics and contexts of their creators. Understanding algorithms in this light highlights their deep integration with human actions and societal norms, offering a more nuanced view of their design and implementation (see Jaton, 2021, 2023).

3.3.2 Scientific Movements and Guiding Principles

First, 3.3.2.1 examines the movement towards more open and transparent forms of research. Open scholarship is a broad concept that encompasses practices such as open access publishing, open data, open source software, and open educational resources. The subsection explores the benefits and challenges of open scholarship, and how it can help to increase the accessibility and impact of research data.

Then, 3.3.2.2 explores the growing trend of involving members of the public in scientific research. Citizen science and citizen humanities involve collaborations between scientists and non-expert individuals, with the aim of generating new knowledge or solving complex problems. The subsubsection examines the benefits and challenges of citizen science and citizen humanities, and how they can help to democratise research.

3.3.2.3 examines the set of guiding principles designed to ensure that research outputs are FAIR. It explores the importance of each data principle for research integrity, reproducibility, and collaboration, and provides examples of how they can be implemented in practice.

3.3.2.4 explores the importance of ethical and culturally sensitive data governance practices for indigenous communities, as materialised through CARE. These principles provide a framework for managing data in a way that is consistent with the values and cultural traditions of indigenous communities. This part also explores the challenges and opportunities of implementing the CARE Principles for Indigenous Data Governance.

Finally, 3.3.2.5 explores the concept of ‘Collections as Data’, a perspective that has emerged from the practical need and desire to improve decades of digital collecting practice. This approach re-conceptualises collections as ordered digital information that is inherently amenable to computational processing.

3.3.2.1 Towards Open Scholarship

According to the FOSTER[88], Open Science can be described as ‘[…] the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.’ (FOSTER, 2019).

In recent years, the principles of Open Science, that historically include Open methodology, Open source, Open data, OA, Open peer review, as well as open educational resources, have become increasingly important as they emphasise transparency, collaboration and accessibility in scientific research (Bezjak et al., 2019). Open methodology refers to the sharing of research processes and methods, allowing other researchers to reproduce and build on existing work (see Vicente-Saez & Martinez-Fuentes, 2018). Open source software and tools enable researchers to collaborate, while open data practices promote the sharing of research data in ways that are accessible, discoverable and reusable by others[89]. Open access seeks to remove financial and other barriers to accessing scientific knowledge, while open peer review provides greater transparency and accountability in the publication process. Finally, open educational resources encourage the sharing of teaching and learning materials, thereby facilitating the dissemination of knowledge and skills.

(UNESCO, 2019) conducted a preliminary study of the technical, financial and legal considerations related to the promotion of Open Science. This research underscored the necessity for a holistic approach to Open Science and stressed the significance of tackling international legal matters, as well as the existing challenges stemming from unequal access to justice, which can hinder global scientific collaboration. This study laid the groundwork for a recommendation on making ‘[…] multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation’ (UNESCO, 2021, p. 7). UNESCO identified five types of access related to Open Science: infrastructures, societal actors, as well as associated and diverse knowledge systems where dialogue is needed. This includes acknowledging the rights of indigenous peoples and local communities to govern and make decisions on the custodianship, ownership, and administration of data on traditional knowledge and on their lands and resources. Figure 3.10 provides a visual summary of this.

Figure 3.10: Open Science Elements, Redrawn Slide from Presentation of Ana Persic [(Morrison, 2021) citing (Persic, 2021)]

While Open Science offers numerous benefits, it also presents challenges and potential drawbacks that warrant careful consideration. One major concern is the risk of exacerbating inequities between researchers from well-resourced institutions and those from less privileged backgrounds. Open access publishing often entails significant costs in the form of article processing charges, which can disproportionately burden researchers without adequate funding support (Burchardt, 2014). Additionally, Open Science practices relying on open protocols may be vulnerable to misuse, such as automated bots excessively crawling open repositories or datasets. This can lead to overloading systems, unauthorised data extraction, or unintended uses of research outputs (see Irish & Saba, 2023; Li et al., 2021). These risks underscore the importance of balancing openness with safeguards that ensure equitable participation and secure, sustainable access to research materials.

These challenges are particularly relevant in the context of DH, a field that harnesses the promise and impact of digital technologies and methodologies for the study and understanding of cultural phenomena. The adoption of Open Science principles has contributed to greater collaboration, transparency and accessibility in research practices in this field. Open data practices are particularly relevant, as they allow scholars to work with large and complex datasets, including digitised archives and social media data. Open educational resources can also be used to support the dissemination of CH literacy and skills, enabling wider audiences to engage with such resources. However, ensuring that such openness does not exacerbate inequities or introduce vulnerabilities requires thoughtful implementation.

In addition to the principles of Open Science, the concept of Open Scholarship has been introduced by (Tennant et al., 2020) as a broader approach that encompasses the arts and humanities and goes beyond the research community to the wider public. Open Scholarship emphasises the importance of making research and scholarship accessible to a wider audience, including non-experts, educators and policy makers. It can be particularly relevant to the arts and humanities, as they often deal with complex cultural materials and narratives that have wider societal implications. By making their work openly accessible and engaging with non-experts, humanities researchers can contribute to public discourse, promote cultural understanding, and inform policy and decision-making. Open scholarship can also support greater collaboration and innovation within the Arts and Humanities by enabling researchers to work collaboratively across disciplines and with a wide range of constituents. For instance, open educational resources can be used to develop collaborative teaching and learning materials that draw on the expertise of scholars and practitioners from different disciplines, while open data practices can facilitate the sharing and reuse of CH materials.

Conversely, (Knöchelmann, 2019) advocates for the term Open Humanities as a dedicated discourse within the humanities. Notably, he argues that Open Humanities should adapt key Open Science elements to the humanities’ unique context. In the case of preprints, the challenges in the humanities, such as limited discipline-specific preprint servers and linguistic diversity, require tailored solutions to encourage adoption. Open peer review in the humanities should accommodate the field’s subjectivity and diverse perspectives. Concerns about liberal copyright licences revolve around potential misrepresentation and plagiarism, highlighting the importance of maintaining scholarly integrity regardless of the chosen licence. Knöchelmann’s proposal underscores the need for context-sensitive approaches to promote openness and collaboration while respecting the humanities’ distinct characteristics.

Overall, the principles of Open Science provide a framework for promoting greater collaboration, transparency and accessibility in research practices. Yet, the challenges discussed underscore the need for careful adaptation to address inequities, cybersecurity concerns, and field-specific nuances. The concept of Open Scholarship, which stresses the importance of making research and scholarship accessible to wider audiences, can be instrumental in broadening the impact of research in both natural sciences and the humanities, as Open Science encourages greater collaboration and innovation across disciplines. Ultimately, this underscores the need for adaptation and positions all academic disciplines as essential contributors to societal understanding, cultural preservation and informed decision-making, while ensuring the sustainability and integrity of open practices.

3.3.2.2 Citizen Science, Citizen Humanities

Citizen Science and Citizen Humanities are approaches that involve the public in scientific and humanities research, respectively. They have become increasingly popular in recent years as a means of democratising research and engaging the public in academic initiatives.

Citizen Science, as articulated by (Irwin, 1995), embodies a fundamental commitment to sourcing knowledge beyond the confines of academia, with a deliberate focus on addressing the concerns and interests of the public. This perspective underscores the transformative power of Citizen Science, making it a catalyst for a more democratic approach to scientific endeavours.

(Bonney, 1996)’s perspective complements this vision by framing Citizen Science as a collaborative process where amateur enthusiasts actively participate in data collection for academic science, all the while gaining a deeper understanding of scientific principles and processes. In this light, Citizen Science emerges as an ideal vehicle for science education and a potent tool for enhancing public appreciation of scientific pursuits.

These viewpoints loosely align with the Oxford English Dictionary’s definition, which characterises Citizen Science as ‘scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions’ and traces the earliest evidence of the term to 1989 (Oxford English Dictionary, 2023). As such, Citizen Science stands as a harmonious intersection of public engagement, education, and scientific inquiry, amplifying the voice of non-academic contributors and democratising the scientific landscape.

The public can play a vital role in data collection, analysis, and interpretation. This involvement can take the form of participating in wildlife sightings tracking, monitoring water quality, or assessing air pollution. By participating in these activities, citizens become direct contributors to the generation of valuable scientific data. The transformative power of Citizen Science extends across a wide spectrum of scientific disciplines, emphasising its capacity to democratise and broaden the reach of scientific endeavours (see Pelacho et al., 2021).

As a form of co-creation, Citizen Science, whether viewed as an innovation-oriented means of value creation (Jansma et al., 2022) or as a more radical form of empowerment, reinforces the democratisation of the research process (Metz et al., 2019). It amplifies the voice of non-academic participants in scholarly pursuits, reflecting a profound shift in the way science is conducted. This collaborative model demonstrates how public engagement enriches the scientific landscape, allowing for the inclusion of different perspectives and a wider range of voices in the pursuit of knowledge. Furthermore, engaging in participatory practices also involves elements of ‘phronesis’[90] (see Mehlenbacher, 2022), encompassing moral, affective, and care-oriented dimensions.

Trust is also a foundational and indispensable element in the landscape of participatory initiatives (see Dahlgren & Hansson, 2020). The success and sustainability of projects within Citizen Science heavily rely on establishing and maintaining trust among all stakeholders involved. This trust extends in multiple directions. First and foremost, participants must trust the project organisers and platforms that host these initiatives. They must have confidence that their contributions will be used responsibly and ethically, with respect to their time and effort. When contributors are assured that their involvement is valued and that the data they provide serves a meaningful purpose, their motivation to participate and provide accurate information is bolstered. Conversely, project organisers and institutions also need to instil trust in participants. Transparency in project objectives, methodology and data use is paramount. Clear and consistent communication is essential to address participants’ concerns and provide feedback on the impact of their contributions. This two-way trust is the foundation of successful participatory projects and facilitates long-term engagement.

In Citizen Humanities, members of the public can participate in activities such as crowdsourced transcription, tagging, and annotation of digital CH materials. These activities can help to uncover new knowledge and insights, as well as to make CH materials more accessible to a wider audience (Strasser & Haklay, 2018). It is important to note that, within the context of these terms, Citizen Science is often regarded as the broader concept, encompassing both Citizen Science and Citizen Humanities. While the primary distinction between the two may, in some cases, appear to be terminological, in practice they both exemplify the principles of open and inclusive research, akin to the concepts of Open Science and Open Humanities discussed in the preceding subsection. These approaches foster collaboration and engagement between researchers and the public, deepening the public’s understanding and appreciation of the research process as a whole (Zourou & Ziku, 2022). Even though participatory activities have been more widely used in the natural sciences than in the humanities (Lowry & Stepenuck, 2021), this inclusive perspective underscores Citizen Science as an umbrella term encompassing both scientific and humanities endeavours, each enriched by the active participation of the public.

While Citizen Science and Citizen Humanities involve the public in research, they differ from crowdsourcing projects in several ways. Crowdsourcing typically involves the outsourcing of tasks to a large group of people, often through online platforms, with the aim of completing a specific task or project (Ridge, 2017). In contrast, Citizen Science focuses more on engagement and collaboration, with the goal of involving the public in the research process and generating new knowledge. That being said, there is also a convergence between Citizen Science and crowdsourcing projects. In many cases, Citizen Science initiatives may also involve crowdsourcing tasks, such as collecting or annotating data. Similarly, crowdsourcing projects may involve elements of Citizen Science, particularly when they aim to engage the public in scientific or CH research (Ridge et al., 2021).

For instance, (Haklay, 2013) [pp. 115-116] distinguishes four categories or levels of participation in Citizen Science projects, each serving as a rung on the ladder of public engagement. The levels are as follows:

When applied in the context of Citizen Humanities, public participation takes diverse forms. This involvement can encompass activities such as the public’s engagement in archaeological finds recording, as demonstrated by the Finnish Archaeological Finds Recording Linked Open Database (SuALT) project (Wessman et al., 2019). Another illustration is the case of the Citizen-Led Urban Environmental Restoration project where ‘young citizen scientists [in Jamaica and the United States] worked closely with museum scientists to restore two environmentally degraded urban sites’ (Commock & Newell, 2023). In terms of crowdsourcing of CH data or more broadly in the humanities, (Owens, 2013) [p. 121] discusses two primary challenges associated with integrating the concept. He highlights that both the terms ‘crowd’ and ‘sourcing’ pose certain problems. Successful crowdsourcing initiatives in libraries, archives, and museums, as he notes, typically do not rely on extensive crowds, and they are far from resembling traditional labour outsourcing endeavours. Furthermore, Owens emphasises that the central focus of such initiatives is not on amassing large crowds but rather on cultivating engagement and participation among individuals in the public who have a genuine interest.

As Citizen Humanities broadens its scope to encompass a wider public engagement in DH and CH research, successful collaborations between DH and relevant research infrastructures have shown promising results (Fišer & Wuttke, 2018; Simpson et al., 2014). Furthermore, the integration of scientific and curatorial knowledge plays a pivotal role in CH and humanities studies, uncovering previously unknown contextual information within original materials (France & Toth, 2014). As illustrated by institutions like the National Library of Estonia, the shift towards human-centred approaches and the development of DH services exemplify the expansion of Citizen Humanities (Andresoo, 2018).

Incorporating user-generated or user-enhanced metadata still presents several challenges (Raemy, 2021). One major challenge is ensuring the quality and consistency of the data. Another challenge is managing the large volume of data generated by users. With increasing numbers of participants and contributions, it can become difficult to process and organise the data in a way that is useful for research and for the broader public. As (Dahlgren & Hansson, 2020) argue:

Participatory metadata production has been valued for its potential to reduce the workload of the heritage institutions and make possible speedier digitization. However, in practice, little of the resulting metadata has been reinserted into the institutions databases and used in-house by information specialists.

This challenge is compounded by the fact that user-generated metadata may be unstructured, making it more difficult to analyse and interpret. To address these challenges, it can be helpful to have a robust data curation strategy, maintained by a team that can communicate with participants on a regular basis, as well as tools and technologies that enable efficient data processing and analysis. LOD can also be a useful approach for organising and linking diverse sources of information, enabling researchers to incorporate different perspectives and opinions into their analysis. This form of participation often involves micro-tasks, akin to ‘puzzle-like’ tasks, connecting users closely with the subjects they are describing (see Ridge, 2023). The dynamics of participatory projects are intriguing and multifaceted. As expressed by (Dahlgren & Hansson, 2020):

[It] is the often tightly curated top–down design of crowdsourcing platforms where participation is wide in terms of numbers of participants but small in terms of what those participants are allowed to do. The second involves the preconception that the crowd per se, because of its sheer size, in some ways represents a diversity of perspectives and experiences, an idea which is often put forward as one of the benefits of participatory metadata production.

Five recommendations outlined by (Ridge et al., 2023), specifically geared towards the CH domain, can serve as valuable guidance for various participatory endeavours. These recommendations encompass:

These recommendations offer a versatile framework that can be applied to various participatory efforts, transcending the boundaries of specific domains and promoting a more inclusive and effective approach to public engagement in research and collaborative initiatives. By adhering to these measures, Citizen Science projects can better flourish, fostering a collaborative and proficient community of practitioners. Rather than creating new infrastructure, research projects should leverage and extend existing ones, such as Zooniverse[91], a generic Citizen Science portal and FromThePage[92], a transcribing platform.

In summary, both Citizen Science and Citizen Humanities represent participatory methods of inquiry. While they have gained popularity, critical discussions regarding their potential limitations, notably in terms of diversity, are integral to their ongoing development. These critical discussions encompass issues such as the reliance on volunteer – thus unpaid – labour, the lack of diversity, and the need to counter the dominance of traditional, often exclusive scientific practices (see Stengers & Muecke, 2018). These conversations serve as essential drivers for the evolution of participatory approaches, prompting a reevaluation and refinement of their methodologies to ensure greater inclusivity and equity (Lewenstein, 2022).

3.3.2.3 FAIR Data Principles

The FAIR data principles[93] were developed to ensure that three types of entities – namely data, metadata, as well as infrastructures – are Findable, Accessible, Interoperable, and Reusable. The four key principles of FAIR and their underlying 15 sub-elements or facets are as follows (Wilkinson et al., 2016):

F. Findable — (Meta)data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services.

A. Accessible — Once the user finds the required data, she/he/they need to know how they can be accessed, possibly including authentication and authorisation.

I. Interoperable — The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.

R. Reusable — The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

Originally introduced to improve data management and sharing in the life sciences, the FAIR principles have evolved into a widely adopted framework that transcends research disciplines. They have been adopted in a wide range of fields, including astronomy (O’Toole & Tocknell, 2022), genomics (Corpas et al., 2018), environmental science (Crystal-Ornelas et al., 2022) and the humanities. In particular, FAIR principles have been applied to make historical archives, artworks or linguistic datasets more openly available for human users and search engines. Moreover, CHIs have embraced FAIR principles as guidelines and best practices, employing them in the deployment of repositories, virtual research environments or data platforms (Beretta, 2021; Hahnel & Valen, 2020).

Yet, the concept of FAIR data management practices in the humanities is not always straightforward, as demonstrated by (Gualandi et al., 2022) at the University of Bologna in Italy. In this study, 19 researchers from the Department of Classical Philology and Italian Studies were interviewed to investigate the concept of ‘data’ in the humanities, particularly in relation to the FAIR principles. The study identified 13 types of research data based on participant input, such as publications, primary sources (manuscripts, artworks), digital representations of CH resources, but also websites, events, or standards. This suggests that, within FAIR, ‘data’ should encompass all inputs and outputs of humanities research. The research also emphasised the importance of methodologies and collaboration in managing research effectively, underlining the need for clarity and consensus in applying FAIR data principles within the field.

Indeed, implementing FAIR can be complex due to the variety of data types and existing practices. Such complexity requires structured methods. As (Jacobsen et al., 2020, p. 11) point out, FAIR is not a standard, it is a guide that needs implementations based on interpretations.

Similarly, (Dunning et al., 2017) [pp. 187-188] emphasise the multifaceted nature of the FAIR Data Principles and the need to view compliance as an aspirational objective. Their research reveals challenges in achieving full compliance, with specific difficulty in the Interoperable and Reusable facets. They advocate for basic policy implementation in areas like PID, metadata, licensing, and protocols, alongside transparent documentation. Additionally, the authors underline the importance of using HTTPS – an extension of HTTP (see 3.4.1) – to ensure secure data transmission and accessibility. Finally, they stress the importance of collaboration between (data) archivists and researchers.

(GO FAIR, 2016) – an initiative that aims to implement the FAIR data principles – outlined a seven-step FAIRification process, which includes essential stages in the transformation of data, as illustrated by Figure 3.11. These steps begin with the retrieval of non-FAIR data, followed by in-depth analysis to understand the content and structure of the data. The process then requires the creation of a semantic model that accurately defines the meaning and relationships of the data in a computational way, often involving the integration of existing ontologies and vocabularies. The fourth step involves linking data through the application of Semantic Web technologies, thereby improving interoperability and integration with disparate data sources. In addition, the assignment of a clear licence is highlighted as a separate step, emphasising its key role in enabling data reuse and open access. As a sixth step, metadata need to be assigned in order to support data discovery and access. Finally, the FAIRified data is deployed or published with its associated metadata and licence, ensuring that it can be accessed and discovered by search engines, even if authentication and authorisation requirements are in place. As a result, the new FAIR dataset can now be more conveniently aggregated with other data sources, making it more straightforward to raise research questions across multiple sources.

Figure 3.11: The FAIRification Process. Adapted from (GO FAIR, 2016)
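
The seven steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GO FAIR's reference workflow: the input record, the `SEMANTIC_MODEL` mapping, the `fairify` function and the `https://example.org/photo/` base IRI are all hypothetical, and Dublin Core terms expressed in JSON-LD are only one of many possible modelling choices.

```python
import json

# Step 1 (retrieval): a hypothetical non-FAIR record, e.g. one spreadsheet row.
RAW_RECORD = {"id": "42", "title": "Alpine village", "photographer": "Unknown"}

# Step 3 (semantic model): map local column names to Dublin Core terms.
# The choice of vocabulary is an assumption made for this sketch.
SEMANTIC_MODEL = {"title": "dcterms:title", "photographer": "dcterms:creator"}


def fairify(record, base_iri, licence_iri):
    """Turn one raw record into a JSON-LD document (steps 2 to 6)."""
    # Step 2 (analysis): keep only the fields covered by the semantic model.
    known = {k: v for k, v in record.items() if k in SEMANTIC_MODEL}
    # Step 4 (linkable data): mint an IRI and declare the vocabulary in use.
    doc = {
        "@context": {"dcterms": "http://purl.org/dc/terms/"},
        "@id": base_iri + record["id"],
    }
    for field, value in known.items():
        doc[SEMANTIC_MODEL[field]] = value
    # Step 5 (licence): attach an explicit, machine-readable licence.
    doc["dcterms:license"] = {"@id": licence_iri}
    # Step 6 (metadata): keep the local identifier to support discovery.
    doc["dcterms:identifier"] = record["id"]
    return doc


fair_doc = fairify(
    RAW_RECORD,
    "https://example.org/photo/",  # hypothetical base IRI
    "https://creativecommons.org/licenses/by/4.0/",
)

# Step 7 (deployment): serialise the result so that it can be published.
serialised = json.dumps(fair_doc, indent=2, sort_keys=True)
print(serialised)
```

In practice each step involves far more deliberation than a dictionary lookup, especially the semantic modelling, but the sketch shows how the licence and the metadata end up travelling with the data rather than being documented elsewhere.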

A further illustration of how FAIR can be deployed is the conceptualisation of the FDO, which includes a strong binding of various types of metadata (Schultes & Wittenburg, 2019, pp. 7–9). The members of the (European Commission. Directorate General for Research and Innovation., 2018) [p. 35] underline that the establishment of a FAIR-compliant ecosystem hinges on the FDO concept, an implementation framework to develop scalable cross-disciplinary capabilities. As illustrated in Figure 3.12, data must be assigned PIDs and accompanied by detailed metadata to ensure reliable discoverability, usability and citation. They also argue for using widely accepted file formats and adhering to community-specific metadata standards and vocabularies to support interoperability and reuse.

Figure 3.12: The FDO Model [(European Commission. Directorate General for Research and Innovation., 2018)]
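
To make the binding between a PID and its metadata more tangible, the following Python sketch models a digital object and lists what it is still missing. It is only an illustration: the `DigitalObject` class, the `REQUIRED_METADATA` fields and the accepted PID prefixes are assumptions introduced here, not requirements taken from the FDO specifications or from the report cited above.

```python
from dataclasses import dataclass, field

# Assumed minimal metadata for this sketch; real profiles are community-specific.
REQUIRED_METADATA = ("title", "licence", "format")

# Assumed PID schemes (DOI and Handle resolvers, ARK identifiers).
PID_PREFIXES = ("https://doi.org/", "https://hdl.handle.net/", "ark:")


@dataclass
class DigitalObject:
    """A data item bound to a persistent identifier and descriptive metadata."""

    pid: str
    metadata: dict = field(default_factory=dict)

    def missing_requirements(self):
        """Return the names of the requirements this object does not meet."""
        problems = []
        if not self.pid.startswith(PID_PREFIXES):
            problems.append("pid")
        # Report every required metadata field that is absent or empty.
        problems.extend(k for k in REQUIRED_METADATA if not self.metadata.get(k))
        return problems


# Hypothetical examples: the DOI below uses a prefix reserved for testing.
complete = DigitalObject(
    "https://doi.org/10.5555/12345678",
    {"title": "Glass plate negative", "licence": "CC BY 4.0", "format": "image/tiff"},
)
incomplete = DigitalObject("local-42", {"title": "Glass plate negative"})

print(complete.missing_requirements())
print(incomplete.missing_requirements())
```

A check of this kind says nothing about the quality of the metadata, only about its presence, which is why community-specific standards and vocabularies remain necessary.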

(Soiland-Reyes et al., 2022) highlight the potential of LOD to drive the adoption of FDO within research infrastructures. While this approach provides specifications and tools, the proliferation of standards and metadata vocabularies poses challenges to interoperability and implementation. To address these hurdles, the authors present the use of FAIR Signposting[94], which enables straightforward navigation to core FDO properties, without the need for complex content negotiation heuristics.
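
As a rough illustration of how Signposting works, the Python sketch below reads typed links from an HTTP `Link` header, where relation types such as `cite-as`, `describedby` and `license` point to the PID, the metadata and the licence of a resource. The header value, the URLs and the `parse_link_header` helper are hypothetical, and the parser is deliberately simplified: it assumes that every parameter value is quoted, which real-world headers do not guarantee.

```python
import re

# One link: <target> followed by zero or more ;name="value" parameters.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*((?:;\s*[\w-]+\s*=\s*"[^"]*"\s*)*)')
PARAM_PATTERN = re.compile(r';\s*([\w-]+)\s*=\s*"([^"]*)"')


def parse_link_header(value):
    """Group the targets of a Link header by relation type."""
    links = {}
    for target, params in LINK_PATTERN.findall(value):
        attrs = dict(PARAM_PATTERN.findall(params))
        # A single link may carry several space-separated relation types.
        for rel in attrs.get("rel", "").split():
            links.setdefault(rel, []).append(
                {"target": target, "type": attrs.get("type")}
            )
    return links


# Hypothetical header, as a landing page might return it.
EXAMPLE_HEADER = (
    '<https://doi.org/10.5555/12345678>; rel="cite-as", '
    '<https://example.org/meta/1.jsonld>; rel="describedby"; '
    'type="application/ld+json", '
    '<https://creativecommons.org/licenses/by/4.0/>; rel="license"'
)

links = parse_link_header(EXAMPLE_HEADER)
for rel, targets in links.items():
    print(rel, [t["target"] for t in targets])
```

A client that understands these relation types can locate the identifier, the metadata record and the licence without inspecting the page itself, which is the navigational shortcut the authors describe.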

In summary, the FAIR data principles comprise four key principles and 15 facets that provide a comprehensive framework for data management and sharing. While processes are in place to facilitate their implementation, the path to FAIRness can be complex, with interoperability and compliance challenges. A key element is the thoughtful mapping of different metadata standards and the strategic incorporation of Linked Data technologies. The FDO approach is equally relevant to the CH sector, supporting the preservation, accessibility and sharing of CH data and resources. The sharing of code, accompanied by comprehensive documentation, also enhances such an ecosystem by facilitating the exchange of valuable technical knowledge and resources.

3.3.2.4 CARE Principles for Indigenous Data Governance

The CARE[95] Principles were developed to protect Indigenous data sovereignty (Carroll et al., 2020) as complementary guidelines to FAIR. The principles are as follows:

The concept of adhering to the CARE principles is vital for promoting equitable data practices. CARE are built upon existing data reuse principles like FAIR but also integrate the efforts of Indigenous-led networks focused on Indigenous data governance and research control. While FAIR emphasise data accessibility, CARE go beyond that by considering actions aligned with the needs and intentions of individuals and communities connected to the data (Carroll et al., 2021). By embedding CARE-informed data practices into project design, the ethical and responsible use of Indigenous data can be enabled to improve inclusive policies and services (Robinson et al., 2021).

3.3.2.5 Collections as Data

In the same vein as FAIR and CARE, the Vancouver Statement on Collections as Data should be mentioned. It originated from a meeting of GLAM practitioners in Vancouver, Canada in April 2023 and builds on the Santa Barbara Statement on Collections as Data[96] issued in 2017 (Padilla et al., 2017). The statement highlights the growing global engagement with collections as data. It promotes the responsible computational use of collections to empower memory, knowledge and data practitioners. It emphasises ethical concerns, openness and participatory design, as well as the need for transparent documentation and sustainable infrastructure. The statement, comprising ten principles, also recognises the potential impact of data consumption by AI, and the importance of considering climate impacts and exploitative labour (Padilla et al., 2023). More specifically, the following principles have been established for anyone with (meta)data stewardship responsibilities:

  1. Collections as Data development aims to encourage computational use of digitised and born digital collections.
  2. Collections as Data stewards are guided by ongoing ethical commitments.
  3. Collections as Data stewards aim to lower barriers to use.
  4. Collections as Data designed for everyone serve no one.
  5. Shared documentation helps others find a path to doing the work.
  6. Collections as Data should be made openly accessible by default, except in cases where ethical or legal obligations preclude it.
  7. Collections as Data development values interoperability.
  8. Collections as Data stewards work transparently in order to develop trustworthy, long-lived collections.
  9. Data as well as the data that describe those data are considered in scope.
  10. The development of collections as data is an ongoing process and does not necessarily conclude with a final version.

In a final report, Padilla et al. (2023) underscore the transformative potential of the Collections as Data paradigm, particularly in the context of GLAMs. The principles and case studies highlighted in the report offer a roadmap for organisations to engage responsibly and ethically with their collections in the digital era. Realising the paradigm's full potential is an ongoing journey that requires a commitment to continual evaluation and adaptation. This involves not only adhering to established principles but also being responsive to emerging technological trends, societal changes, and evolving ethical considerations. The role of AI in shaping the future of Collections as Data is particularly noteworthy. As AI continues to advance, it offers opportunities to enhance access to and insights into collections, while also requiring careful consideration of ethical implications such as bias and privacy. Furthermore, the growing emphasis on climate impacts and sustainable practices in data stewardship aligns with global efforts towards environmental responsibility.

Building on the discussion of the principles and initiatives surrounding Collections as Data, an in-depth analysis was carried out to assess the compliance of repositories, projects and platforms from six organisations with the checklist, namely the British Library[97], the National Library of Scotland[98], LoC[99], the Royal Danish Library[100], Meemoo[101], and the Miguel de Cervantes Virtual Library[102] (Candela et al., 2023, p. 13). Although several institutions have opened access to their collections through APIs, such as those specified by IIIF, challenges remain in fully embracing the Collections as Data principles. Barriers include resource limitations and the balance between making collections widely available through simplified access and downloads. In addition, different items within the checklist may require different levels of maturity and prioritisation, often requiring collaborative efforts. Initial results show that the checklist is a valuable tool for identifying relevant issues for individual institutions, although prioritisation may vary according to context and user needs. Collaborative initiatives between institutions are underway to improve the practical implementation and user experience, particularly in the structuring of datasets (Candela et al., 2023, pp. 20–21).

While there are still relatively few examples of institutions that have fully adopted the Collections as Data principles, several case studies – such as at the Royal Library of Belgium, which is materialised through DATA-KBR-BE[103] (see Chambers et al., 2021) – and initiatives offer valuable insights. For instance, Candela et al. (2023, p. 7) outline a checklist tailored to GLAM institutions to publish Collections as Data[104]. They devised 11 criteria, including the provision of clear licensing for dataset reuse without restrictions, citation guidelines, comprehensive documentation, the use of public platforms, sharing examples of dataset use, structuring the data, providing machine-readable metadata, participation in collaborative edition platforms, offering API access to the repository, developing a dedicated portal page, and defining clear terms of use. These recommendations serve as a structured framework to enhance accessibility, usability, and interoperability, fostering engagement with cultural and historical collections.

A further notable advance in publishing Collections as Data is a series of recommendations for developing datasheets, or modular templates, designed for CH datasets. This initiative is significant for GLAMs because it facilitates the structured organisation of their data, notably for integration with ML tools, for which the recommendations propose describing how content has been influenced by digitisation. The work highlights the need for documentation, focusing on tailored metrics, biases, and system integration. The proposed datasheets aim to detail the creation, selection, and digitisation processes, enhancing transparency and addressing the distinctive challenges of digital CH data. By emphasising a narrative approach to articulating biases, the recommendations acknowledge the complex historical context and ethical implications.

3.4 Open Web Platform and Linked Data

The web, created at CERN in 1989 by Tim Berners-Lee[105], has enabled scholars and CH practitioners to access and analyse vast amounts of data in new ways, thereby opening the door to the creation of federated datasets and KGs. At the heart of this transformation are two pivotal concepts: the Open Web Platform and Linked Data. The Open Web Platform refers to a set of technologies and standards that allow for the creation and sharing of content on the web. Linked Data, on the other hand, refers to a set of principles and technologies that enable the publication and interlinking of data on the web, creating a web of data that can be easily navigated and used by humans and machines alike.

Recognising the Web as an environment that supports a wide range of applications beyond traditional browser-based interactions is becoming increasingly important. Platforms such as social networks like Facebook, Twitter, Instagram, and Mastodon, streaming services such as Netflix and Disney+, as well as cloud-based applications, all leverage web technologies even when not accessed via a traditional web browser. These platforms are integral to the web ecosystem, highlighting the web’s role as a foundational platform for diverse digital interactions and data exchanges.

In the context of the Internet, it is important to note that much of what we know today about it is the result of developments by many individuals and organisations. However, a significant milestone was the development of the TCP/IP protocol by Vinton Cerf and Robert E. Kahn in the 1970s (see Cerf & Kahn, 1974). This protocol became the standard networking protocol on the ARPANET in 1983, marking the beginning of the modern Internet (Leiner et al., 1997). Understanding the differentiation between the Internet and the web is crucial. The former is a global network of interconnected computers that communicate using Internet protocols, forming the infrastructure that enables online communication. The web, or World Wide Web, is a service built on top of the Internet, leveraging HTTP to transmit data. While the Internet provides the underlying connectivity, the web offers a way to access and share information through websites and links. This differentiation is vital in comprehending how the web, as a part of the Internet, has evolved into a versatile and ubiquitous platform supporting a wide array of applications.

This section, divided into five subsections, explores some of the key concepts underlying the Open Web Platform and Linked Data, and their applications in the CH field.

First, 3.4.1 examines the foundational principles and technologies that underpin the Open Web Platform. This includes an overview of principles, protocols such as HTTP, and the use of URIs to identify resources on the web. This part also explores the different types of web architectures such as the client-server model or the concept of web services, which allow for the exchange of data and functionality across different applications and systems.

3.4.2 explores the vision of the web as a giant, interconnected database of structured data that can be queried and manipulated by machines. The subsection examines the technologies and standards that make up the Semantic Web, including RDF, RDFS, OWL, and SPARQL.

Subsection 3.4.3 examines the set of principles designed to promote the publication and interlinking of data on the web. The subsection explores the four principles of Linked Data - using URIs to identify resources, using HTTP to retrieve resources, providing machine-readable data, and linking data to other data.

Subsection 3.4.4 examines the set of criteria for publishing data on the web in a way that makes it easily discoverable, accessible, and usable. The subsection describes the Five-Star and Seven-Star deployment schemes, which include criteria such as providing data in a structured format, using open standards, and providing a machine-readable license.

Finally, 3.4.5 explores the specific application of LOD in the CH domain. The subsection provides examples of how CHIs such as museums, libraries, and archives are using Linked Data to make their datasets more accessible and discoverable on the web.

Overall, this section provides a comprehensive overview of the key concepts and technologies underlying the web as an open and linked platform, and of their applications in the CH field and, more broadly, in any scientific endeavour, given that the web started with ‘the philosophy that much academic information should be freely available to anyone’ (Berners-Lee, 1991; Nelson & Van de Sompel, 2022). Exploring these concepts gives a deeper understanding of how the web is evolving into a more open, interconnected, and data-driven platform, and how this evolution is transforming the way we access, use, and share information.

3.4.1 Web Architecture

Web architecture has played an important role in the development of scholarly research and CH practices, enabling new forms of collaboration, data sharing and interdisciplinary research. By providing a standardised and interoperable framework based on open standards for sharing and accessing data (Berners-Lee, 2010), it has facilitated the open exchange of information, even if citation has always been an issue, particularly for scholarly outputs (Lagoze et al., 2012, p. 2223).

Web architecture is a conceptual framework led by the W3C that underpins and sustains the World Wide Web (Jacobs & Walsh, 2004; Berners-Lee et al., 1994). It encompasses the architectural bases of identification, interaction, and format – the latter also referred to as representation – where HTTP provides the technical mechanisms for transmitting and accessing information.

The web architecture is based on a set of identifiers, such as URIs, which are used to uniquely identify resources on the web. These identifiers play a crucial role in enabling users to find, access, and share information on the web, and they help to ensure that web-based systems are both user-friendly and interoperable. Here, it is valuable to distinguish between three key concepts: URI, URL, and URN, as a URI can be further classified as a locator, a name, or both (Berners-Lee et al., 2005, p. 7) – as shown in Figure 3.13.

Figure 3.13: Overlap and Difference between URI, URL, and URN
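The distinction between locators and names can be made concrete with a small sketch. It assumes a deliberately simplified rule that is not taken from the cited specification: a URI with the urn scheme is treated as a name, a URI with one of a few network schemes is treated as a locator, and everything else is reported as a plain URI. The function name and the set of locator schemes are illustrative choices.

```python
from urllib.parse import urlparse

# Assumption: these schemes are treated as "locators" for illustration only.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify_uri(uri: str) -> str:
    """Roughly classify a URI as a locator (URL), a name (URN) or a plain URI."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "urn":
        return "URN"
    if scheme in LOCATOR_SCHEMES:
        return "URL"
    return "URI"


if __name__ == "__main__":
    for example in (
        "https://www.example.org/julien-a-raemy",
        "urn:isbn:0451450523",
        "mailto:John.Doe@example.com",
    ):
        print(example, "->", classify_uri(example))
```

The rule is intentionally coarse: in practice the same identifier can act as both a name and a locator, which is precisely the overlap that Figure 3.13 depicts.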

Interaction between web agents, i.e. a person or a piece of software acting in the information space on behalf of a person, entity or process, over a network involves URIs, messages and data. Web protocols, such as HTTP, are message-based. Messages can contain data, resource metadata, message data, and even metadata about the metadata of the message, typically for integrity checking (Jacobs & Walsh, 2004).

The Web Architecture allows for multiple Representations of a Resource. In this context, a data format specification becomes pivotal, encapsulating an agreement on how to correctly interpret the representation of data, as articulated by (Jacobs & Walsh, 2004):

A data format specification embodies an agreement on the correct interpretation of representation data. The first data format used on the Web was Hypertext Markup Language (HTML). Since then, data formats have grown in number. Web architecture does not constrain which data formats content providers can use. This flexibility is important because there is constant evolution in applications, resulting in new data formats and refinements of existing formats. Although Web architecture allows for the deployment of new data formats, the creation and deployment of new formats (and agents able to handle them) is expensive. Thus, before inventing a new data format (or “meta” format such as XML), designers should carefully consider re-using one that is already available.

Access can also be mediated by content negotiation, which is a mechanism employed in web communication to determine the most appropriate representation of a resource to be sent to a client based on the client’s preferences and the available representations (Lagoze et al., 2012, pp. 2223–2224).
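As an illustration of content negotiation, the following minimal sketch selects a representation from the value of an HTTP Accept header. It is an assumption-laden simplification rather than a full implementation of the HTTP rules: only exact media types and the `*/*` wildcard are handled, and the function names `parse_accept` and `negotiate` are hypothetical.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media type, q-value) pairs, best first."""
    preferences = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for parameter in fields[1:]:
            name, _, value = parameter.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))
    # sorted() is stable, so equal weights keep the order of the header
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


def negotiate(header: str, available: List[str]) -> Optional[str]:
    """Return the preferred available media type, or None if nothing matches."""
    for media_type, quality in parse_accept(header):
        if quality <= 0:
            continue
        if media_type == "*/*" and available:
            return available[0]
        if media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    accept = "text/turtle;q=0.9, application/ld+json, text/html;q=0.5"
    print(negotiate(accept, ["text/html", "text/turtle"]))
```

In this example the client would prefer JSON-LD, but since the server only offers HTML and Turtle, the Turtle representation is selected because it carries the higher weight.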

At its core, web architecture is based on a set of architectural principles that guide the design and development of web-based systems and applications. These principles include concepts such as orthogonality, extensibility, error handling, and protocol-based interoperability. Orthogonality allows the evolution of identification, interaction, and representation independently. Extensibility is key, enabling technology to adapt without compromising interoperability. Error handling addresses diverse errors, from predictable to unpredictable, ensuring seamless correction. Finally, the web’s protocol-based interoperability fosters communication across varied contexts, outlasting entities and facilitating the longevity of shared technology (Jacobs & Walsh, 2004). Overall, these principles help to ensure that the web remains robust, reliable, and flexible.

Web architectures can be categorised into several types, each offering a specific approach to designing and structuring web-based systems. Here, I will focus on the following three types of architectures – shown in Figure 3.14: the client-server model, the three-tier model, and SOA.

Figure 3.14: Types of Web Architectures: Client-server Model, Three Tier Model, SOA

The client-server model partitions the responsibilities between two key components: the client, which represents the user interface or user-facing part of the system, and the server, which is responsible for storing and serving data. In this model, clients and servers communicate to perform various functions, such as requesting and delivering information (Oluwatosin, 2014, pp. 67–68).

The three-tier model is another significant web architecture that introduces an additional layer between the client and server, resulting in a three-part structure. This architecture is designed to further segregate and manage the system’s components (Wijegunaratne & Fernandez, 1998, pp. 41–42). The three tiers typically consist of the presentation tier (the user interface), the application tier (responsible for logic and processing), and the data tier (where data storage and retrieval occur).

SOA is a web architecture that emphasises the creation and utilisation of services as the central building blocks of a system. Services in this context are self-contained, modular units of functionality that can be accessed and used independently by various components of a web application. These services are designed to be loosely coupled, meaning they can interact with other services without a deep dependency on one another. Overall, SOA is ‘a paradigm for organizing and packaging units of functionality as distinct services, making them available across a network to be invoked via defined interfaces, and combining them into solutions to business problems’ (Laskey & Laskey, 2009, p. 101).

SOA can be realised with various architectural styles and protocols, such as REST, a prominent architectural style for designing networked applications (Fielding, 2000) that primarily leverages HTTP. RESTful services, i.e. applications that comply with the REST constraints, are designed to work with existing capabilities rather than creating new standards, frameworks and technologies (Battle & Benson, 2008, p. 62). These services are built around a set of constraints, including statelessness, a uniform interface, resource-based identification, and the use of standard request methods such as GET, POST, PUT, and DELETE (Tilkov, 2017).

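The following are all the specified request methods enabling clients to perform a wide range of operations on resources[106] (Fielding et al., 2022, p. 72):

  1. GET: transfer a current representation of the target resource.
  2. HEAD: same as GET, but without transferring the response content.
  3. POST: perform resource-specific processing on the request content.
  4. PUT: replace all current representations of the target resource with the request content.
  5. DELETE: remove all current representations of the target resource.
  6. CONNECT: establish a tunnel to the server identified by the target resource.
  7. OPTIONS: describe the communication options for the target resource.
  8. TRACE: perform a message loop-back test along the path to the target resource.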

RESTful services, with their emphasis on using standardised HTTP methods and resource-based identification, offer a versatile means of designing web services and APIs. Their simplicity and compatibility with the web’s core protocols make them a practical choice for implementing various web-based applications. In the context of the Semantic Web, RESTful services can serve as a crucial component for accessing and exchanging graph data (see Lee & Kim, 2011).
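To make the uniform interface more tangible, here is a minimal sketch of how a handful of request methods could map onto operations on an in-memory store of resources. It is not a web server: the `handle` function, the store and the `/photographs/1` path are hypothetical, and only GET, PUT and DELETE are covered. Each call carries everything needed to process it, which loosely mirrors the statelessness constraint.

```python
def handle(method, path, store, body=None):
    """Apply one request to an in-memory resource store.

    Returns a (status code, content) pair using common HTTP status codes.
    """
    if method == "GET":
        # Safe and read-only: return the current representation if it exists
        return (200, store[path]) if path in store else (404, None)
    if method == "PUT":
        # Create the resource or replace its current representation
        created = path not in store
        store[path] = body
        return (201 if created else 200, body)
    if method == "DELETE":
        if path in store:
            del store[path]
            return (204, None)
        return (404, None)
    # Any other method is not supported by this sketch
    return (405, None)


if __name__ == "__main__":
    resources = {}
    print(handle("PUT", "/photographs/1", resources, {"title": "Untitled"}))
    print(handle("GET", "/photographs/1", resources))
    print(handle("DELETE", "/photographs/1", resources))
    print(handle("GET", "/photographs/1", resources))
```

Because the resource is identified by its path and the operation by the method alone, a client needs no knowledge of the service beyond these two elements.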

3.4.2 The Semantic Web

The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Berners-Lee et al., 2001, p. 35). It was already part of Berners-Lee's (1999) vision and prediction that the web, in its next phase, could be understood by machines, i.e. shifting from a traditional web of documents to a web of data. Bauer and Kaltenböck (2012, p. 25) articulate that ‘[t]he basic idea of a semantic web is to provide cost-efficient ways to publish information in distributed environments. To reduce costs when it comes to transferring information among systems, standards play the most crucial role.’

At the heart of the Semantic Web lies the foundation of RDF. The original RDF specification, known as the RDF Model and Syntax, serves as the underlying mechanism that establishes the basic framework of RDF. This framework provides the cornerstone to facilitate the exchange of data among automated processes (Lassila & Swick, 1999). A fundamental component within RDF is the RDF triple as shown in Equation 3.1, comprising three essential elements: the subject (s), the predicate (p), and the object (o). In an RDF triple, the subject is the resource or entity about which a statement is made, the predicate is the relationship or property describing that statement, and the object is the value or resource associated with the statement.

s p o

Equation 3.1: Triple Pattern Notation

RDF statements are reminiscent of the semiotic triangle of Ogden and Richards (1930, p. 11) – as illustrated in Figure 3.15 – where the referent is tantamount to the predicate of a triple. This analogy emphasises the intrinsic relationship between communication, representation and knowledge organisation. It highlights how both language and structured data rely on the establishment of connections and relationships to effectively convey meaning.

Figure 3.15: The Semiotic Triangle (Ogden & Richards, 1930)

Figure 3.16 is an RDF graph about myself and where I was born, leveraging mostly Schema.org[107], a collaborative project and Linked Data vocabulary used to create structured data markup on websites. This graph consists of vertices and edges, where vertices can be either URIs or literal values, and the edges represent relationships between them. In plain language, the graph asserts that there is a person represented by the URL https://www.example.org/julien-a-raemy, who has a given name ‘Julien Antoine’ and a family name ‘Raemy’. The person’s birthplace is specified as a URL from Wikidata, which is of type schema:Place. Additionally, there is a statement indicating that the birthplace is named ‘Fribourg’.

Figure 3.16: Example of an RDF Graph

In the subject-predicate-object syntax of RDF, the subject can be either a URI or a blank node[108]. The predicate is a URI, like schema:givenName, and its aim is to establish connections between subjects and objects, describing the nature of the relationship. The object is either a URI, a blank node or a literal, such as ‘Raemy’ or ‘Fribourg’. Objects can also act as subjects if they are identifiable, allowing for the expansion and interconnection of RDF graphs.
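The positional rules just described can be sketched in a few lines. The classes `IRI`, `BlankNode` and `Literal` and the function `is_valid_triple` are hypothetical names introduced for illustration; they do not come from any RDF library, and datatypes or language tags on literals are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


def is_valid_triple(subject, predicate, obj) -> bool:
    """Check which kinds of terms are allowed in each position of a triple."""
    return (
        isinstance(subject, (IRI, BlankNode))
        and isinstance(predicate, IRI)
        and isinstance(obj, (IRI, BlankNode, Literal))
    )


SCHEMA = "http://schema.org/"
PERSON = IRI("https://www.example.org/julien-a-raemy")
PLACE = IRI("https://www.wikidata.org/entity/Q36378")

GRAPH = [
    (PERSON, IRI(SCHEMA + "givenName"), Literal("Julien Antoine")),
    (PERSON, IRI(SCHEMA + "birthPlace"), PLACE),
    (PLACE, IRI(SCHEMA + "name"), Literal("Fribourg")),
]

# Terms that occur both as an object and as a subject connect statements
SHARED_NODES = {o for _, _, o in GRAPH} & {s for s, _, _ in GRAPH}

if __name__ == "__main__":
    print(all(is_valid_triple(*triple) for triple in GRAPH))
    print(SHARED_NODES)
```

Here the place is the object of one statement and the subject of another, which is how the graph grows beyond a flat list of properties.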

The original specification proved too broad, which led to confusion. A subsequent effort yielded an updated specification and new documents such as RDF/XML (Beckett, 2004), a syntax that expresses an RDF graph as XML. RDF/XML became a W3C Recommendation in 2004 and was revised in 2014 as part of the RDF 1.1 document set (Gandon & Schreiber, 2014), which also introduces the notion of an RDF dataset that can represent multiple graphs (Cyganiak et al., 2014). Code Snippet 3.4 is the RDF/XML serialisation of the earlier graph.

Code Snippet 3.4: RDF/XML Serialisation of the RDF Graph

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:schema="http://schema.org/">
  <schema:Person rdf:about="https://www.example.org/julien-a-raemy">
    <schema:givenName>Julien Antoine</schema:givenName>
    <schema:familyName>Raemy</schema:familyName>
    <schema:birthPlace>
      <schema:Place rdf:about="https://www.wikidata.org/entity/Q36378">
        <schema:name>Fribourg</schema:name>
      </schema:Place>
    </schema:birthPlace>
  </schema:Person>
</rdf:RDF>

Idehen (2017) highlights a significant concern regarding earlier representations of the Semantic Web and how it is portrayed. These portrayals often place undue emphasis on the role of XML as an ostensibly obligatory component in Semantic Web development. To him, this historical perspective, particularly prominent around the year 2000, erroneously positioned XML as a superior alternative to HTML for constructing the Semantic Web.

As illustrated by Figure 3.17, Idehen's (2017) revision embodies a Semantic Web layer cake that encompasses several technical or conceptual components.

Figure 3.17: Tweaked Semantic Web Technology Layer Cake by Idehen (2017)

Following on from the components outlined in Figure 3.17, I will look in more detail at further RDF features, serialisations, and RDF-based standards for representing, querying or validating graphs. In doing so, I will touch on some considerations related to the inference and reasoning of RDF graphs.

Code Snippet 3.5 is a Turtle serialisation of Figure 3.16. Turtle, a W3C standard, is a notation for expressing RDF data in a structured and machine-readable format (Wood et al., 2014, p. 44). It is a common syntax for representing RDF data and allows people to write statements in a friendlier manner than in an RDF/XML serialisation (Beckett et al., 2014). Two of its most important features are prefix declarations, which abbreviate long URIs into prefixed names such as schema:Person, and the semicolon, which lets several predicate-object pairs share the same subject.

Code Snippet 3.5: Turtle Serialisation of the RDF Graph

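@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .

# The person and the statements made about them
<https://www.example.org/julien-a-raemy>
    rdf:type schema:Person ;
    schema:givenName "Julien Antoine" ;
    schema:familyName "Raemy" ;
    schema:birthPlace <https://www.wikidata.org/entity/Q36378> .

# The birthplace, identified by a Wikidata URI
<https://www.wikidata.org/entity/Q36378>
    rdf:type schema:Place ;
    schema:name "Fribourg" .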

To query RDF-based graphs, SPARQL is leveraged. It is a query language as well as a protocol designed for querying and manipulating RDF data. It allows users to retrieve specific information from RDF datasets, making it a fundamental tool for working with Linked Data (Wood et al., 2014, pp. 99–100). The SPARQL query provided in Code Snippet 3.6 is aimed at extracting the name of a person from a given RDF graph. The query uses RDF predicates associated with the given name and family name properties, concatenating them to form the person’s complete name. The answer to the query, based on the provided RDF data, would be the complete name of the person, which in this case is ‘Julien Antoine Raemy’. It showcases the versatility of SPARQL, offering a flexible and expressive means of interacting with RDF data. By specifying patterns and conditions, this query identifies and combines relevant information.

Code Snippet 3.6: SPARQL Query Against the RDF Graph

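PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>

# Retrieve both name parts of the person and join them into one value
SELECT ?name
WHERE {
  <https://www.example.org/julien-a-raemy> schema:givenName ?givenName ;
                                           schema:familyName ?familyName .
  BIND(CONCAT(?givenName, " ", ?familyName) AS ?name)
}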
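What a query engine does with such a pattern can be approximated in plain Python. The sketch below is only an analogy, not a SPARQL implementation: terms are plain strings, variables are strings starting with a question mark, and the functions `match` and `full_names` are hypothetical.

```python
PERSON = "https://www.example.org/julien-a-raemy"

GRAPH = [
    (PERSON, "schema:givenName", "Julien Antoine"),
    (PERSON, "schema:familyName", "Raemy"),
    (PERSON, "schema:birthPlace", "https://www.wikidata.org/entity/Q36378"),
]


def match(graph, pattern):
    """Yield variable bindings for a single triple pattern."""
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must keep the same value within one pattern
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def full_names(graph, person):
    """Join the two patterns on the person and concatenate both name parts."""
    names = []
    for given in match(graph, (person, "schema:givenName", "?givenName")):
        for family in match(graph, (person, "schema:familyName", "?familyName")):
            names.append(given["?givenName"] + " " + family["?familyName"])
    return names


if __name__ == "__main__":
    print(full_names(GRAPH, PERSON))
```

The nested loops play the role of the join between the two triple patterns, and the string concatenation stands in for the BIND and CONCAT step.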

While RDF's simplicity is one of its strengths, it can also pose limitations when it comes to expressing nuanced details, tracing the origins of statements, and addressing the intricacies of logical reasoning. RDF's foundational model, which relies on triples to establish relationships between resources, may not always capture the full depth of knowledge and inferences that more complex KR languages or systems can achieve.

As such, RDF reification, as expounded by Massari et al. (2023), emerges as a pivotal mechanism to enrich RDF. It allows for the explicit representation of statements about statements, which facilitates the modelling of metadata or provenance information about RDF triples. In other words, RDF reification makes it possible to describe when and by whom a particular statement was made, adding a layer of context and traceability to RDF data. This is particularly valuable in scenarios where data lineage, trustworthiness, and attribution need to be maintained.
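The standard reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate and rdf:object) can be sketched as follows. The `reify` helper, the statement identifier, and the choice of Dublin Core terms with placeholder values for the provenance are assumptions made for the sake of the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT = "http://purl.org/dc/terms/"


def reify(triple, statement_id, annotations):
    """Describe a triple as a resource and attach annotations to it."""
    subject, predicate, obj = triple
    statements = [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]
    # Statements about the statement, e.g. who asserted it and when
    statements += [(statement_id, prop, value) for prop, value in annotations.items()]
    return statements


TRIPLE = (
    "https://www.example.org/julien-a-raemy",
    "http://schema.org/birthPlace",
    "https://www.wikidata.org/entity/Q36378",
)

if __name__ == "__main__":
    provenance = {
        DCT + "creator": "https://www.example.org/some-cataloguer",  # placeholder
        DCT + "created": "2024-01-01",  # placeholder date
    }
    for statement in reify(TRIPLE, "_:statement1", provenance):
        print(statement)
```

A single triple thus expands into four structural triples plus one per annotation, which illustrates both the expressiveness and the verbosity of this approach.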

Furthermore, RDF 1.2 introduces the concept of ‘quoted triples’, i.e. ‘an RDF triple that is used as the subject or object of another triple’ (Hartig et al., 2023). Quoted triples are a powerful extension to the RDF model, enabling the inclusion of entire RDF triples within other triples. This feature adds a layer of expressiveness to RDF, making it more versatile in representing complex relationships and data structures.

Deduction of entity class membership is an essential aspect of KR, and in this regard, RDFS plays a pivotal role (Brickley & Guha, 2014). RDFS, with its foundation in formal semantics, provides a framework for verifying properties like rdfs:domain and rdfs:range associated with given properties (Bruns et al., 2023).

Code Snippet 3.7 showcases how RDFS can be leveraged. In this example, RDFS classes and properties are specified: schema:Person and schema:Place are declared as classes using rdf:type rdfs:Class, and the properties are declared using rdf:Property, with their domains and ranges defined through rdfs:domain and rdfs:range, to create a basic RDFS vocabulary.

Code Snippet 3.7: Example of using RDFS to Determine Class Membership and Property Definitions

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .

# Class declarations
schema:Person rdf:type rdfs:Class .
schema:Place rdf:type rdfs:Class .

# Property declarations with their domains and ranges
schema:givenName rdf:type rdf:Property ;
    rdfs:domain schema:Person ;
    rdfs:range rdfs:Literal .

schema:familyName rdf:type rdf:Property ;
    rdfs:domain schema:Person ;
    rdfs:range rdfs:Literal .

schema:birthPlace rdf:type rdf:Property ;
    rdfs:domain schema:Person ;
    rdfs:range schema:Place .

schema:name rdf:type rdf:Property ;
    rdfs:domain schema:Place ;
    rdfs:range rdfs:Literal .

RDFS, when used in conjunction with other reasoning methods, enhances the overall inferential capabilities of RDF, enabling more profound insights and KR. For example, OWL allows authors to decide how expressive they want to be, given the computational realities involved, which is not the case with RDFS.
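How domain and range declarations lead to the deduction of class membership can be illustrated with a small sketch. It covers only the two entailment patterns for rdfs:domain and rdfs:range, uses prefixed names as plain strings, and leaves out ranges pointing to rdfs:Literal; the function `infer_types` is a hypothetical helper rather than part of any reasoner.

```python
RDF_TYPE = "rdf:type"

PERSON = "https://www.example.org/julien-a-raemy"
PLACE = "https://www.wikidata.org/entity/Q36378"

# Declarations mirroring the vocabulary above (literal ranges omitted)
DOMAINS = {
    "schema:givenName": "schema:Person",
    "schema:familyName": "schema:Person",
    "schema:birthPlace": "schema:Person",
    "schema:name": "schema:Place",
}
RANGES = {"schema:birthPlace": "schema:Place"}

DATA = [
    (PERSON, "schema:givenName", "Julien Antoine"),
    (PERSON, "schema:birthPlace", PLACE),
]


def infer_types(data, domains, ranges):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for subject, predicate, obj in data:
        if predicate in domains:
            # The subject of the property belongs to the property's domain
            inferred.add((subject, RDF_TYPE, domains[predicate]))
        if predicate in ranges:
            # The object of the property belongs to the property's range
            inferred.add((obj, RDF_TYPE, ranges[predicate]))
    return inferred - set(data)


if __name__ == "__main__":
    for statement in sorted(infer_types(DATA, DOMAINS, RANGES)):
        print(statement)
```

Note that nothing is rejected here: under RDFS, domains and ranges produce new statements rather than flagging data as invalid, which is one reason why validation is handled by a separate standard.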

Indeed, for modelling ontologies in the Semantic Web, OWL is the preferred language. It is a powerful and widely used language for developing ontologies that can be leveraged as metadata vocabularies on the web (Zeng & Qin, 2022, p. 63), and it provides a framework for creating, sharing, and reusing ontologies. As a key component of the Semantic Web, OWL enables the formal representation of knowledge and complex relationships. Firstly, it provides a well-defined formal semantics, allowing computers to infer new knowledge and make logical deductions based on the information encoded in the ontology. Additionally, OWL supports a rich vocabulary of constructs for defining classes, properties, and individuals, enabling the creation of complex and structured ontologies.

Code Snippet 3.8 is an example of an ontology using OWL in Turtle. Two classes, ex:Person and ex:Place, along with two properties, ex:hasName and ex:hasBirthPlace, have been defined. The ex:Person class is constrained to have a name, which is specified as a datatype property with a string value. Additionally, each ex:Person is required to have an ex:hasBirthPlace, which is defined as an object property with a reference to the ex:Place class.

Code Snippet 3.8: Example of an OWL-based Ontology

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/ontologies#> .

# Classes
ex:Person rdf:type owl:Class .
ex:Place rdf:type owl:Class .

# Properties
ex:hasName rdf:type owl:DatatypeProperty .
ex:hasBirthPlace rdf:type owl:ObjectProperty .

# Every person has at least one name that is a string
ex:Person rdfs:subClassOf [
    rdf:type owl:Restriction ;
    owl:onProperty ex:hasName ;
    owl:someValuesFrom xsd:string
] .

# Every person has at least one birthplace that is a place
ex:Person rdfs:subClassOf [
    rdf:type owl:Restriction ;
    owl:onProperty ex:hasBirthPlace ;
    owl:someValuesFrom ex:Place
] .

OWL is grounded in DL (Ma & Hitzler, 2009, p. 197), striking a balance between expressiveness and computational complexity, making it well-suited for automated reasoning. The original OWL specification comes in different sublanguages, namely OWL Lite, OWL DL, and OWL Full, which vary in terms of their expressive power and reasoning capabilities. OWL 2, introduced in 2009, extended the original OWL specification with additional features and improvements (W3C OWL Working Group, 2012). Figure 3.18 provides an overview of OWL 2, highlighting its main building blocks in terms of syntax as well as semantics.

Figure 3.18: Structure of OWL 2 (W3C OWL Working Group, 2012)

Turning to SHACL, a W3C standard defined in 2017, it provides a valuable framework for validating data graphs against specific shapes and constraints. Shapes are templates that specify the expected structure, properties, and validation rules for a particular class of resources, while constraints are rules or conditions applied to data instances to ensure they conform to the structure and validation criteria defined by a shape. Overall, SHACL is primarily used for data validation and quality assessment (Knublauch & Kontokostas, 2017).

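Code Snippet 3.9 is an example of how SHACL could be used to validate the RDF data graph introduced earlier. Two shapes have been defined: a person shape requiring at least one given name, one family name, and one birthplace that must be an IRI, and a place shape requiring at least one name.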

Code Snippet 3.9: Example of a SHACL Document in Turtle that can be Used to Validate the RDF Graph

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Shape applied to every instance of schema:Person
schema:PersonShape a sh:NodeShape ;
    sh:targetClass schema:Person ;
    sh:property [
        sh:path schema:givenName ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path schema:familyName ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path schema:birthPlace ;
        sh:minCount 1 ;
        sh:nodeKind sh:IRI ;
    ] .

# Shape applied to every instance of schema:Place
schema:PlaceShape a sh:NodeShape ;
    sh:targetClass schema:Place ;
    sh:property [
        sh:path schema:name ;
        sh:minCount 1 ;
    ] .
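The logic of these constraints can be mimicked in a few lines of Python. This is a rough sketch and not a SHACL engine: shapes are written as a hypothetical dictionary, terms are plain strings, and the IRI check is approximated by testing for an HTTP(S) prefix, which is an assumption of this example.

```python
PERSON = "https://www.example.org/julien-a-raemy"
PLACE = "https://www.wikidata.org/entity/Q36378"

# Hypothetical dictionary rendering of the two shapes above
SHAPES = {
    "schema:Person": {
        "schema:givenName": {"minCount": 1},
        "schema:familyName": {"minCount": 1},
        "schema:birthPlace": {"minCount": 1, "nodeKind": "IRI"},
    },
    "schema:Place": {"schema:name": {"minCount": 1}},
}

GRAPH = [
    (PERSON, "rdf:type", "schema:Person"),
    (PERSON, "schema:givenName", "Julien Antoine"),
    (PERSON, "schema:familyName", "Raemy"),
    (PERSON, "schema:birthPlace", PLACE),
    (PLACE, "rdf:type", "schema:Place"),
    (PLACE, "schema:name", "Fribourg"),
]


def validate(graph, shapes):
    """Return a list of (focus node, path, violated constraint) tuples."""
    violations = []
    for target_class, constraints in shapes.items():
        # Focus nodes are the instances of the targeted class
        focus_nodes = [s for s, p, o in graph if p == "rdf:type" and o == target_class]
        for node in focus_nodes:
            for path, rules in constraints.items():
                values = [o for s, p, o in graph if s == node and p == path]
                if len(values) < rules.get("minCount", 0):
                    violations.append((node, path, "minCount"))
                if rules.get("nodeKind") == "IRI":
                    for value in values:
                        # Assumption: an IRI is recognised by its HTTP(S) prefix
                        if not value.startswith(("http://", "https://")):
                            violations.append((node, path, "nodeKind"))
    return violations


if __name__ == "__main__":
    print(validate(GRAPH, SHAPES))
    incomplete = [t for t in GRAPH if t[1] != "schema:familyName"]
    print(validate(incomplete, SHAPES))
```

The complete graph yields no violations, whereas dropping the family name produces a single report for the person, in the same spirit as a SHACL validation report.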

One thing to realise, however, is that there are limitations when it comes to reasoning with controlled vocabularies and constraints in SHACL. While it excels in data validation and quality assurance, SHACL is not inherently a comprehensive reasoning framework. In their study, Sacramento et al. (2022, p. 129) noted that there is still a challenge in harnessing the potential of ontologies’ axioms to deduce information and derive implicit facts. While ontologies are instrumental in structuring knowledge and relationships between concepts, the efficient utilisation of axioms for reasoning remains an ongoing challenge.

3.4.3 Linked Data Principles

By exploiting the core structure of the web’s architecture, Linked Data maximises its potential to distribute information on a global scale. The approach makes effective use of the fundamental elements of the web and is consistent with its principles of simplicity, decentralisation and openness. As a result, content from disparate servers can be assembled into a unified repository of global knowledge. This convergence of principles has driven the remarkable growth of the web over the past few decades, demonstrating its ability to provide an integrated platform for information exchange. The four principles of Linked Data, as outlined by Berners-Lee (2006), are:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (e.g. RDF, RDFS, SPARQL, etc.)
  4. Include links to other URIs so that they can discover more things.

By following these principles, Linked Data enables structured data to be published and interlinked on the Web, creating a graph of data that can be accessed and navigated by both humans and machines. These principles are closely related to the Semantic Web, which is an extension of the web that aims to give information well-defined meaning, enabling computers and people to work together more effectively (Heath & Bizer, 2011).
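
The third and fourth principles can be illustrated with a small sketch. The code below replaces the actual Web with an in-memory dictionary, so no HTTP request is made; the URIs, the `name` and `links` keys, and both function names are hypothetical and only serve to show how following links leads a client from one description to the next.

```python
# Toy "web of data": each HTTP URI maps to the description a server might return.
# All URIs and keys are hypothetical; a real client would issue HTTP requests.
WEB = {
    "http://example.org/person/1": {
        "name": "Example Person",
        "links": ["http://example.org/place/basel"],
    },
    "http://example.org/place/basel": {
        "name": "Basel",
        "links": ["http://example.org/country/ch"],
    },
    "http://example.org/country/ch": {"name": "Switzerland", "links": []},
}


def look_up(uri):
    """Stand-in for dereferencing a URI and getting useful information back."""
    return WEB.get(uri)


def discover(start_uri):
    """Follow links breadth-first and return the names in visiting order."""
    seen, queue, names = set(), [start_uri], []
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        description = look_up(uri)
        if description is None:  # dangling link: nothing to dereference
            continue
        names.append(description["name"])
        queue.extend(description["links"])
    return names


found = discover("http://example.org/person/1")
```

Starting from a single URI, the client reaches the place and then the country purely by following the links embedded in each description.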

The glue that holds together the traditional document Web is the hypertext links between HTML pages. The glue of the data web is RDF links. (Bizer et al., 2008, p. 1265)

Some of the challenges highlighted by (Bizer et al., 2009) as a retrospective ten years after the launch of RDF were and still remain an issue:

These challenges serve as a foundation for exploring LOD practices in the subsequent subsection, where I introduce two deployment schemes.

3.4.4 Deployment Schemes for Open Data

LOD combines Linked Data and Open Data: it is data that is both interlinked and openly licensed. More precisely, LOD is defined as Linked Data released under an open licence that does not prevent it from being reused free of charge.

In 2010, Tim Berners-Lee introduced the Five-Star Open Data Deployment Scheme[113], also known as Five-Star Open Data or Five-Star Linked Data, to provide a structured framework for publishing and promoting data on the web. The scheme, as illustrated by Figure 3.19, comprises five progressively more demanding criteria that define the level of openness and accessibility of data.

Figure 3.19: Five-Star Deployment Scheme for Open Data
  1. The first star indicates that the data are available on the web in any format under an open licence, setting a basic level of accessibility.
  2. Moving on to the second star, data should be made available in a machine-readable, structured format that allows for easier processing and analysis. This eliminates the need to interpret unstructured information and promotes efficient use of data.
  3. The third star builds on the second by emphasising the importance of using non-proprietary formats such as CSV rather than vendor- or platform-specific formats such as Microsoft Excel. This promotes interoperability and avoids vendor lock-in, ensuring wider access and use of the data.
  4. The fourth star takes the criteria further by requiring that data be published using open standards specified by the W3C, in particular RDF and SPARQL. These standards improve the integration and linking of data across the Web, enabling richer and more meaningful analysis.
  5. The highest standard, represented by the fifth star, incorporates all of the previous criteria and requires that data not only be in a non-proprietary format and conform to W3C standards, but also be linked to other LOD. This interconnectedness enables a broader and richer web of data, increasing its value and utility in a global context.
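
Because each star presupposes the previous ones, the scheme can be read as a cumulative checklist. The following sketch encodes that reading; the `Dataset` fields and the `star_rating` function are hypothetical names introduced here for illustration and are not part of the scheme itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical self-description of a published dataset."""
    on_web: bool                # star 1: available on the web
    machine_readable: bool      # star 2: structured, machine-readable
    non_proprietary: bool       # star 3: e.g. CSV rather than Excel
    uses_w3c_standards: bool    # star 4: e.g. RDF and SPARQL
    linked_to_other_data: bool  # star 5: links to other LOD


def star_rating(dataset):
    """Count stars cumulatively: stop at the first criterion that is not met."""
    criteria = [
        dataset.on_web,
        dataset.machine_readable,
        dataset.non_proprietary,
        dataset.uses_w3c_standards,
        dataset.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


excel_sheet = Dataset(True, True, False, False, False)
csv_file = Dataset(True, True, True, False, False)
linked_rdf = Dataset(True, True, True, True, True)
```

Under these assumptions, a spreadsheet in a vendor-specific format earns two stars, a CSV file three, and an RDF dataset linked to other LOD all five.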

By adhering to these progressively stringent criteria, data providers can contribute to a more open and connected web, fostering a collaborative environment for data consumption and analysis. The Linked Data Principles and the Five-Star Open Data Deployment Scheme share commonalities, including adherence to open standards and linking data with other sources. However, the rating system offers greater specificity and provides a well-defined path for data publishers to improve openness and interconnection of data.

Yet, in their study assessing FAIR against the Five-Star Deployment Scheme, (Hasnain & Rebholz-Schuhmann, 2018) noted that while LOD emphasises data openness, FAIR additionally requires a stated licence for access, thereby tying reusability to licensing agreements and to contextual information such as provenance. As such, FAIR can be seen as an extension of LOD that focuses on user needs and on the separation of core data and metadata, including licensing considerations, while LOD remains an idealistic approach centred on open data.

In complement and in response to the Five-Star model, (Hyvönen et al., 2014) proposed an extended Seven-Star model to address a further key challenge in dataset reuse: the difficulty of assessing how suitable the data are for a particular application purpose. They highlighted that datasets often lack clear definitions or descriptions of the schemas or vocabularies they use, making it difficult to understand the characteristics of the data. Furthermore, even when both the data and its schema are available, assessing the data’s conformance to the schema is hampered by the data quality issues that are prevalent in the Semantic Web. To mitigate these challenges and to encourage data publishers, they introduced two additional stars.

  1. The sixth star is awarded if the schemas or vocabularies used in the dataset are explicitly described and published alongside the dataset, unless these schemas are already available elsewhere on the Web. This documentation improves the understanding of the structure and organisation of the data.
  2. To attain the seventh star, the quality of the dataset in terms of the schemas it uses must also be explicitly described. This information enables users to assess whether the quality of the data meets their specific requirements and provides a valuable tool for determining the suitability of the dataset for their intended applications.

The Seven-Star model thus extends Berners-Lee’s Five-Star Open Data Deployment Scheme to provide a comprehensive framework for improving the clarity, usability and applicability of open data on the web[114].

3.4.5 Linked Open Data in the Cultural Heritage Domain

In this subsection, I will present several illustrative cases of CH projects or CHIs that have engaged or are engaging with LOD, offering insights into some of the processes, benefits or challenges, such as issues related to sustainability and community engagement.

The application of LOD in the CH domain has received considerable attention due to its ability to improve the quality and visibility of data provided by institutions, especially from GLAMs (Candela et al., 2018, p. 481). Equally, the development of DH has seen a growing interest in Linked Data, where research projects have increasingly adopted semantic enrichment approaches to improve the quality, annotation, and visibility of their data such as the Pelagios network[115] (see Isaksen et al., 2014), while at the same time encouraging a more self-sustaining environment (see Zeng, 2019).

According to (Davis & Heravi, 2021) [p. 21:5], the CH sector’s involvement in Linked Data research began approximately a decade to 15 years ago, with a notable peak around 2015. (Davis & Heravi, 2021) [pp. 21:7-21:8] identified the key players within the CH sector that have been involved in the implementation of Linked Data, the most prominent being universities, university libraries, national libraries, government bodies, and museums. The authors also identified eight prominent themes that capture the motivations behind this interest: meeting research needs, exploring Linked Data, meeting the needs of users, improving discoverability, fostering interoperability, educational initiatives, meeting the needs of GLAM institutions, and prioritising preservation efforts [(Davis & Heravi, 2021) pp. 21:9-21:10]. Moreover, (Candela et al., 2023) assert that LOD is also a motivation for using advanced methods and techniques for publishing and reusing digital collections.

The deployment of LOD is primarily evident through the publication of datasets, which I will highlight here in three different ways, ranging from the most rudimentary form of publication to the most complex. First, records may be accompanied by an RDF representation that can be downloaded in a serialisation such as RDF/XML or Turtle. Second, a content negotiation mechanism may be established that enables both humans and machines to access metadata. Third, and more occasionally, platforms have implemented a SPARQL service allowing for federated queries, either on a dedicated web interface or through a CLI [(Raemy, 2022) pp. 132-133, citing (Papadakis et al., 2015)].
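
Content negotiation, the second approach, can be sketched as follows. The snippet shows a simplified server-side selection of a representation from an HTTP `Accept` header; the list of available media types and the function names are assumptions, and the parsing deliberately ignores partial wildcards such as `text/*` that a full implementation would have to handle.

```python
# Media types a hypothetical endpoint can serve, in order of server preference.
AVAILABLE = ["text/html", "text/turtle", "application/rdf+xml", "application/ld+json"]


def parse_accept(header):
    """Parse an Accept header into (media type, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type, q = pieces[0], 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if media_type:
            ranges.append((media_type, q))
    # sorted() is stable, so equal weights keep the client's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def negotiate(header, available=AVAILABLE):
    """Return the media type to serve, or None if nothing acceptable exists."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None


browser = negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8")
harvester = negotiate("application/ld+json;q=0.9, text/turtle;q=0.5")
```

With this logic, a browser-like request resolves to the HTML page, whereas a machine client asking for JSON-LD or Turtle receives an RDF serialisation of the same record under the same URI.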

This publication effort is carried out not only by libraries, archives, and museums but also by aggregation platforms such as Europeana[116] (Purday, 2009), which provides access to millions of digitised books, images, and audiovisual resources from European CHIs. Europeana uses LOD to create rich, interconnected descriptions of these resources, making it easier for users to discover and explore related resources across different institutions and collections [(Poulopoulos & Wallace, 2022) p. 6].

Europeana is a project funded by the European Union that aims to promote a sense of European identity by aggregating metadata from national and local heritage institutions onto a single digital platform (Freire et al., 2018; Raemy, 2020). The EDM is the core of the project, which leverages standardised thesauri and vocabularies to provide semantic contextualisation for CH resources and allow for semantic operations on metadata (Charles & Isaac, 2015). However, the overarching nature of EDM may not deliver the level of granularity that all CHIs require to accurately document their resources. Despite this, memory institutions accept the sacrifice of accuracy to be part of a Europe-wide collection that promotes a sense of Europeanness. The study by (Capurro & Plets, 2020) traces how a standardised European metadata structure plays a role in governing local and national heritage institutions and enacting a European mindset. The EDM may allow heritage stakeholders to benefit from Europeana’s online exposure while also contributing to the creation of a European identity.

The landscape of LOD research projects is dominated by prominent institutions and national aggregators within the Europeana network. For instance, the BnF has made substantial contributions through its data service[117] (Simon et al., 2013). Additionally, the Swedish National Heritage Board[118] has played a crucial role in metadata aggregation, as outlined in the work of (Smith, 2021) [pp. 67-69]. Furthermore, the Dutch Digital Heritage Network[119] has been actively engaged in the management of KOSs (Scharnhorst et al., 2023).

Another example of LOD in the CH domain that has already been mentioned is the Getty Research Institute’s controlled vocabularies, which provide access to data on art history, architecture, and related fields. The initiative uses Linked Data principles to create a network of interlinked resources, including artists, artworks, and publications, allowing researchers to explore complex relationships between different CH resources (Cobb, 2015; Harpring, 2010). Similarly, authority records such as the GND and the LCSH, as well as Wikidata, a collaboratively edited KG, have gained significance and have become hubs connecting various identifiers and promoting the use of related concepts (Chardonnens, 2020; Thalhath et al., 2021). Moreover, (Lodi et al., 2017) highlight that interconnected CH resources should be managed separately due to diverse definitions and classifications in the GLAM sector.

While quite a few (well-funded) libraries have played a crucial role in advancing Linked Data in the CH domain, there remain substantial challenges. (Raza et al., 2019) [p. 11] underline these challenges, including issues related to cataloguing, the adoption of new standards, a proliferation of vocabularies and ontologies, and the absence of data access agreements. Additionally, challenges include a scarcity of Linked Data tools and expertise, mapping difficulties, ownership disputes, and the management of multilingual and heterogeneous data. Similarly, (Banerjee, 2020)’s critique of LOD highlights that Linked Data terminology often seems unnecessarily complex and obscure. He emphasises that Linked Data is most effective for technical challenges with well-maintained metadata, ontologies and vocabularies. For instance, he considers the application of Linked Data in general library use quite inappropriate, given the complexity mismatch and limited practical benefits. The author suggests that its use should be reserved for specific domains with comprehensive vocabularies, and warns against its indiscriminate adoption for tasks better suited to simpler, established methods.

For instance, in Switzerland, efforts were made to enhance data accessibility and interoperability through the development of linked.swissbib.ch. This platform was designed to serve as the LOD counterpart to Swissbib, a comprehensive meta-platform of bibliographic resources (Prongué & Schneider, 2015, pp. 125–126). The initial process of creating this data complement involved transforming bibliographic MARC/XML records into JSON-LD, a structured data format conducive to Linked Data integration (Hipler et al., 2018, p. 166). This transformation played a foundational role in preparing the data for inclusion in linked.swissbib.ch. However, it is important to note that the meta-platform Swissbib, along with its envisioned Linked Data service, linked.swissbib.ch, is no longer in existence and at the time of writing, there is no active Linked Data service directly equivalent to linked.swissbib.ch for libraries in Switzerland.

Moving beyond the conventional logic of data publication, (Hyvönen, 2020) [pp. 187-190] argues that the use of Semantic Web portals needs to shift from a data publication logic to one of analysis and the serendipitous discovery of knowledge. He distinguishes three generations of portals:

  1. Portals for search and browsing which can handle data harmonisation, aggregation, search, and navigation.
  2. Portals with tools for distant reading that provide users with an integrated set of tools for interactive research problem-solving.
  3. Portals for serendipitous knowledge discovery based on AI and which may automatically address research queries based on constraints set by scholars and savvy users.

(Hyvönen, 2020) asserts that while DH and CHIs have contributed to the deployment of second-generation systems, the path forward necessitates a greater emphasis on source criticism and the proficient use of sophisticated computing tools to reach the third stage. This progression is essential to expand the scope and capabilities of DH and beyond.

(Bernasconi et al., 2023) [p. 12] corroborate this viewpoint to some extent, as they assert that LOD-driven GUIs should be able to integrate different interaction paradigms ‘to support the creation and exploration for data models by domain experts who may not pose a strong technical background on ontologies and RDF’. For example, the LOD4Culture project[120] is an initiative aimed at accessing CH data. LOD4Culture has implemented a RESTful API. However, it is important to note that, even though an API is used, the underlying technology driving the system remains SPARQL.

As highlighted earlier, Linked Data’s journey in the CH domain has witnessed significant attention, with its potential benefits and capabilities as well as its limitations. (Davis & Heravi, 2021) [p. 21:2, citing (Linden, 2015)] discuss that Gartner’s 2015 Hype Cycle[121] placed Linked Data in the Trough of Disillusionment and predicted that it would reach the Plateau of Productivity within the next 5 to 10 years. Linked Data has the potential to improve interoperability, information sharing and standard practices in the CH sector, in line with its goals of sharing resources and advancing knowledge.

Furthermore, (Davis & Heravi, 2021) [pp. 21:13-21:14]'s literature review on CH leveraging LOD reflects the alignment of collaboration, research needs, education, discoverability and interoperability with the ideals of the Linked Data movement and the overarching vision of the Semantic Web. However, their emphasis on the need for CHIs to prioritise collaboration for effective Linked Data project implementation underscores a critical aspect of this landscape.

In the context of my literature review, a notable research gap emerges, highlighting the importance of interactions with the data and the paramount role of collaboration. Furthermore, it is crucial to acknowledge that a number of LOD projects or platforms have not been actively maintained, and those that have are more likely to belong to wealthy institutions. Therefore, a comprehensive exploration of the LOUD design principles and their underlying specifications is necessary. This exploration is particularly valuable because beneath the surface of LOUD standards lies the potential for community-driven practices enabling different degrees of semantic interoperability, ranging from basic to sophisticated.

3.5 Linked Open Usable Data

According to (Sanderson, 2016), the Semantic Web remains a compelling concept, with RDF graphs offering powerful capabilities for modelling reality, enabling computers to infer novel information more effectively. However, Sanderson contends that querying RDF graphs is complex, requiring a prior understanding of the structure. He highlights challenges such as the lack of a clear start and end in XML serialisation compared to graphs, the difficulty of handling documents in triplestore storage, and sub-optimal visualisation practices. This aligns with (Target, 2018)’s argument that achieving the Semantic Web on a global scale is quite challenging. Yet, despite these challenges, some of the Semantic Web’s intent is now achievable thanks to the creation and deployment of JSON-LD. Notably, this serialisation can be treated as regular JSON, a lightweight data interchange format heavily used by software developers, or as a graph (Raemy & Sanderson, 2023), providing flexibility in representation and addressing some of the issues mentioned previously.
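
The dual nature of JSON-LD can be illustrated with a short sketch. The same record is first read as plain JSON and then turned into triples by a deliberately naive expansion that only handles a flat document with an inline context. The record, its URIs, and the `to_triples` helper are hypothetical, and the expansion is a simplification for illustration rather than the JSON-LD processing algorithms defined by the W3C.

```python
import json

# Hypothetical JSON-LD record with an inline context.
doc_text = """
{
  "@context": {
    "name": "http://schema.org/name",
    "birthPlace": {"@id": "http://schema.org/birthPlace", "@type": "@id"}
  },
  "@id": "http://example.org/person/1",
  "name": "Example Person",
  "birthPlace": "http://example.org/place/basel"
}
"""

# 1) As a document: ordinary JSON parsing and key access.
record = json.loads(doc_text)
name = record["name"]


# 2) As a graph: map each key to its IRI via the context (naive expansion).
def to_triples(doc):
    """Return (subject, predicate, object, kind) tuples for a flat document."""
    context, subject = doc["@context"], doc["@id"]
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):  # skip JSON-LD keywords
            continue
        term = context[key]
        if isinstance(term, dict):
            predicate = term["@id"]
            kind = "iri" if term.get("@type") == "@id" else "literal"
        else:
            predicate, kind = term, "literal"
        triples.append((subject, predicate, value, kind))
    return triples


triples = to_triples(record)
```

A developer can thus work with `record["name"]` without any knowledge of RDF, while the context still allows the same data to be interpreted as statements about `http://example.org/person/1`.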

Transitioning into the domain of community-driven and JSON-LD-based specifications and their transformative potential, LOUD emerges as a key player. It is a term coined by (Sanderson, 2019), who has been involved in the conception and maintenance of the standards discussed in this section, which is organised into six parts.

Beginning with 3.5.1, I explore the five core design principles or characteristics of LOUD. Then, 3.5.2 presents a systematic review of LOUD. Subsection 3.5.3 provides an overview of IIIF, encompassing its inception, community, specifications, and the IIIF ecosystem. Subsection 3.5.4 delves into the specifics of WADM, a W3C standard that provides a structured framework for representing annotations on the web. Then, 3.5.5 goes into the history, conceptual model, API and standards for Linked Art. It also explores the community around Linked Art and its current or planned implementations. Finally, 3.5.6 discusses the implications and opportunities that LOUD brings to the field of CH and the humanities at large.

3.5.1 LOUD Design Principles

One of the main purposes of LOUD is to make the data more easily accessible to software developers, who play a key role in interacting with the data and building software and services on top of it, and to some extent to academics. As such, it becomes imperative to strike a delicate balance between data completeness and accuracy, which depend on the underlying ontological construct, and the pragmatic considerations of scalability and usability. Similar to Tim Berners-Lee’s Five-Star Open Data Deployment Scheme, five design principles underpin LOUD[122].

  1. The right Abstraction for the audience: Developers do not need the same level of access to data as ontologists, in the same way that a driver does not need the same level of access to the inner workings of their car as a mechanic. Use cases and requirements should drive the interoperability layer between systems, not ontological purity.
  2. Few Barriers to entry: It should be easy to get started with the data and build something. If it takes a long time to understand the model, ontology, SPARQL query syntax and so forth, then developers will look for easier targets. Conversely, if it is easy to start and incrementally improve, then more people will use the data.
  3. Comprehensible by introspection: The data should be understandable to a large degree simply by looking at it, rather than requiring the developer to read the ontology and vocabularies. Using JSON-LD lets us talk to the developer in their language, which they already understand. In other words, the data can be treated as a document serialised in JSON and/or as a graph.
  4. Documentation with working examples: You can never intuit all of the rules for the data. Documentation clarifies the patterns that the developer can expect to encounter, such that they can implement robustly. Example use cases allow contextualisation for when the pattern will be encountered, and working examples let you drop the data into the system to see if it implements that pattern correctly.
  5. Few exceptions, instead many consistent patterns: Every exception that you have in an API (and hence ontology) is another rule that the developer needs to learn in order to use the system. Every exception is jarring, and requires additional code to manage. While not everything is homogeneous, a set of patterns that manage exceptions well is better than many custom fields.

The concerns articulated by (Hyvönen et al., 2014) in their Seven-Star Model in terms of schema and data validation are also indirectly addressed by the LOUD design principles. Conceptualisations of LOUD specifications and their representation, mostly through usable APIs, echoing the human-centred approach for developing them as articulated by (Myers & Stylos, 2016), address schema concerns. For data validation, best practices and validators developed by the IIIF and Linked Art communities come into play. Moreover, LOUD indirectly responds to the needs of scientists who emphasise treating research objects, including data, as first-class citizens for reproducibility purposes (see Bechhofer et al., 2013).

Highlighting the success of the practices that guided IIIF, (Sanderson, 2020) identifies three systems adhering to the LOUD design principles: IIIF and specifically the third version of the Presentation API, WADM, and Linked Art. These three systems are complementary and can be used either separately or in conjunction. Figure 3.20 illustrates a high-level overview of an infrastructure combining all three of the LOUD specifications.

Figure 3.20: Example of a LOUD-Driven Infrastructure (Felsing et al., 2023, p. 43)

In summary, the LOUD design principles, guided by considerations of accessibility, ease of use, comprehensibility, documentation, and consistency, not only address crucial concerns raised by Linked Data practitioners but also respond to the evolving needs of the CH and scientific communities emphasising data reproducibility. In the following subsection, I will explore the presence and impact of LOUD and their underlying principles in the scholarly landscape.

3.5.2 LOUD: Systematic Review

This subsection deals with a systematic review of academic references and ongoing projects related to LOUD. It examines how the concept is mentioned in the academic literature.

To conduct this systematic review, I employed the weight of evidence framework[123] developed by (Gough, 2007) [pp. 218-219]. This framework consists of nine criteria, starting with formulating the review question and developing a protocol. In this case, the review question was formulated as a Boolean query, as shown in Equation 3.2, due to the absence of an existing systematic review for LOUD. The criteria also include defining inclusion and exclusion criteria for papers and conducting a systematic search strategy across academic databases. Screening, mapping, data extraction, quality and relevance appraisal, synthesis, and communication and engagement are the other essential steps in this framework.

For the systematic review, I used a specific Boolean search query[124] to identify relevant research papers and resources. The query, as presented in Equation 3.2, was applied across academic databases, including Google Scholar[125], Semantic Scholar[126], Web of Science, and Scopus. Specifically, the query seeks references containing either ‘Linked Open Usable Data’ or ‘LOUD Design Principles’ or ‘LOUD Principles’ or instances of ‘LOUD’ combined with any of the three recognised standards adhering to its design principles: IIIF, Linked Art, and WADM.

Q = “Linked Open Usable Data” ∨ “LOUD Design Principles” ∨ “LOUD Principles” ∨
     (“LOUD” ∧ “IIIF”) ∨ (“LOUD” ∧ “Linked Art”) ∨ (“LOUD” ∧ “Web Annotation”)

Equation 3.2: Executed Boolean Query
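
As a sketch of how the query in Equation 3.2 could be evaluated against a title or abstract, the function below encodes the three phrase alternatives and the three conjunctions. The function name is hypothetical, and treating ‘LOUD’ as a case-sensitive whole word (to avoid matching the ordinary adjective) is an assumption of this sketch; the databases queried for the review apply their own matching rules.

```python
import re

PHRASES = ("linked open usable data", "loud design principles", "loud principles")
STANDARDS = ("iiif", "linked art", "web annotation")


def matches_query(text):
    """Return True if the text satisfies the Boolean query of Equation 3.2."""
    lowered = text.lower()
    # First three disjuncts: any of the exact phrases, case-insensitively.
    if any(phrase in lowered for phrase in PHRASES):
        return True
    # Remaining disjuncts: the acronym LOUD combined with one of the standards.
    # Assumption: the acronym is matched case-sensitively as a whole word.
    has_loud = re.search(r"\bLOUD\b", text) is not None
    return has_loud and any(standard in lowered for standard in STANDARDS)


examples = [
    "An overview of Linked Open Usable Data",
    "LOUD and IIIF in practice",
    "A loud noise near the IIIF server",
    "IIIF Presentation API 3.0",
]
results = [matches_query(example) for example in examples]
```

Only the first two example titles satisfy the query: the third contains the adjective rather than the acronym, and the fourth mentions a standard without LOUD.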

In Google Scholar, 60 results were returned, with 41 deemed relevant. Semantic Scholar’s results were not available due to difficulties with the search process as it yielded more than 800k results. Web of Science returned no relevant results, and Scopus provided four results, of which three were unique items. I also queried the Zenodo repository[127], resulting in 25 hits, including two conference papers, one journal article, and one working paper. The remaining results included presentations or datasets, mostly associated with my own work.

Table 3.2: Summary of Database Search Results on LOUD
| Database | Total Items | Relevant Entries | Unique Entries |
| --- | --- | --- | --- |
| Google Scholar | 60 | 41 | 41 |
| Semantic Scholar | N/A | N/A | N/A |
| Web of Science | 0 | 0 | 0 |
| Scopus | 4 | 4 | 3 |
| Zenodo | 25 | 4 | 2 |
| Total | 89 | 48 | 46 |

As of mid-November 2023, the systematic review yielded a total of 46 results, covering the period between 2018 and 2023. These results are summarised in Table 3.2. For more detailed information about each reference, including their type, language, source database, and categorisation, refer to Table 3.3. The references were predominantly written in English (37), followed by German (5), French (2), and Japanese (1), with one further reference being bilingual in German and French. I also highlighted in yellow the nine references I took part in.

Table 3.3: Summary of the LOUD Systematic Review
| Reference | Type | Lang. | DB | Cat. |
| --- | --- | --- | --- | --- |
| (Alexiev, 2018) | Journal Article | eng | Google Scholar | α1 |
| (Harpring, 2018) | Conference Paper | eng | Scopus | α1 |
| (Newbury, 2018) | Conference Paper | eng | Google Scholar | α3, β |
| (Pohl et al., 2018) | Journal Article | ger | Google Scholar | α4, γ |
| (Cossu, 2019) | Journal Article | eng | Google Scholar | α1, β |
| (France & Forsberg, 2019) | Conference Paper | eng | Scopus | α1, β |
| (Hofmann et al., 2019) | Journal Article | ger | Google Scholar | α1 |
| (Klammt, 2019) | Journal Article | ger, fre | Google Scholar | α2 |
| (Klic, 2019) | PhD Thesis | eng | Google Scholar | α1, β |
| (Nakamura, 2019) | Conference Paper | jpn | Google Scholar | α1, β |
| (Sanderson, 2019) | Conference Paper | eng | Google Scholar | α3, β |
| (Thiery, 2019) | Working Paper | eng | Zenodo | α4 |
| (Ginhoven & Rasterhoff, 2019) | Journal Article | eng | Google Scholar | α1 |
| (Brown & Martin, 2020) | Conference Paper | eng | Google Scholar | α1 |
| (Delmas-Glass & Sanderson, 2020) | Journal Article | eng | Google Scholar | α3, β |
| (Paquet, 2020) | Journal Article | eng | Google Scholar | α2 |
| (Romein et al., 2020) | Journal Article | eng | Google Scholar | α1 |
| (Schmidt & Thiery, 2020) | Conference Paper | eng | Google Scholar | α1 |
| (Thiery, 2020) | Journal Article | ger | Google Scholar | α1 |
| (Thiery et al., 2020) | Conference Paper | eng | Google Scholar | α1, γ |
| (Thiery et al., 2020) | Conference Paper | eng | Google Scholar | α1, γ |
| (Brown et al., 2021) | Conference Paper | eng | Google Scholar | α1 |
| (Delmas-Glass, 2021) | Book Chapter | eng | Google Scholar | α2, β |
| (Pohl, 2021) | Journal Article | eng | Google Scholar | α1 |
| (Raemy, 2021) | Conference Paper | eng | Google Scholar | α1, β |
| (Roberts et al., 2021) | Journal Article | eng | Google Scholar | α1 |
| (Adamou, 2022) | Journal Article | eng | Google Scholar | α3 |
| (Binding et al., 2022) | Journal Article | eng | Google Scholar | α1 |
| (Hopkins, 2022) | Journal Article | eng | Google Scholar | α1 |
| (Lemos et al., 2022) | Journal Article | eng | Scopus | α1 |
| (Middle, 2022) | PhD Thesis | eng | Google Scholar | α4 |
| (Raemy, 2022) | Conference Paper | fre | Google Scholar | α3, β |
| (Roke & Tillman, 2022) | Journal Article | eng | Google Scholar | α1 |
| (Schmidt et al., 2022) | Journal Article | eng | Google Scholar | α2, β |
| (Wigg-Wolf et al., 2022) | Journal Article | ger | Google Scholar | α1 |
| (Cornut et al., 2023) | Journal Article | eng | Google Scholar | α3, β |
| (Felsing et al., 2023) | Conference Paper | eng | Google Scholar | α2, β |
| (Llewellyn et al., 2023) | Conference Paper | eng | Zenodo | α1 |
| (Manz et al., 2023) | Conference Paper | eng | Google Scholar | α3, β |
| (Petz, 2023) | Journal Article | ger | Google Scholar | α3 |
| (Raemy, 2023) | Report | eng | Google Scholar | α2, β |
| (Raemy & Gautschy, 2023) | Conference Paper | fre | Google Scholar | α1, β |
| (Raemy et al., 2023) | Conference Paper | eng | Google Scholar | α1, β |
| (Raemy & Sanderson, 2023) | Pre-print | eng | Google Scholar | α4, β |
| (Tóth-Czifra et al., 2023) | Report | eng | Google Scholar | α1, β |
| (Vitale & Rainer, 2023) | Book Chapter | eng | Google Scholar | α1 |

To categorise the papers, I employed a classification scheme with three main categories: α, β, and γ. Within α, there are four subcategories: α1 (Mention) for papers mentioning LOUD, α2 (Description) for those providing descriptions, α3 (Principles) for explanations of the design principles, and α4 (Analysis) for comparative analyses. β (Standards) represents papers mentioning recognised LOUD standards, i.e. IIIF, WADM, and Linked Art, while γ (Application) covers applications of LOUD or reflections on other standards. Some papers fall into multiple categories. Figure 3.21 displays the distribution of references across these categories, grouped by year.
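
The classification lends itself to simple set operations. The sketch below takes five rows from Table 3.3, parses their category labels into sets, and derives per-year counts of α subcategories as well as category intersections. The variable and function names are hypothetical, and the code illustrates the tallying logic only; it is not the procedure used to produce the figures.

```python
from collections import Counter

# Five rows taken from Table 3.3: (reference, year, category labels).
rows = [
    ("Newbury, 2018", 2018, "α3, β"),
    ("Pohl et al., 2018", 2018, "α4, γ"),
    ("Sanderson, 2019", 2019, "α3, β"),
    ("Paquet, 2020", 2020, "α2"),
    ("Raemy & Sanderson, 2023", 2023, "α4, β"),
]

# Parse the comma-separated labels of each row into a set.
parsed = [
    (reference, year, {label.strip() for label in labels.split(",")})
    for reference, year, labels in rows
]

# Distribution of α subcategories per year.
by_year_alpha = Counter(
    (year, label)
    for _, year, labels in parsed
    for label in labels
    if label.startswith("α")
)


def intersection(first, second):
    """References that carry both category labels."""
    return [ref for ref, _, labels in parsed if first in labels and second in labels]


principles_and_standards = intersection("α3", "β")
```

For this subset, two references both explain the design principles and mention a LOUD standard, while only one combines a comparative analysis with an application.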

Figure 3.21: LOUD References Ordered by Year and α Sub-categories

Figure 3.22 depicts Venn diagrams illustrating all six intersections of these categories, revealing how different aspects of LOUD are covered in the literature.

Figure 3.22: LOUD References as Venn Diagrams for Each of the Six Intersections

Several papers related to archaeology also made references to LOUD in the context of the CAA Data Dragons Interest Group’s work on semantics and LOUD[129]. Furthermore, a search on popular search engines, specifically Bing and Google, revealed the existence of a research project named simply LOUD. This project, initiated in 2019 by the Royal Museum for Central Africa in Belgium[130], was delayed due to the COVID-19 pandemic and is now being considered for future re-initiation.

Further research should explore projects that implement a combination of the above standards, without necessarily mentioning LOUD. It is important to note that a quick search will show many IIIF implementations that reference WADM, but this is mainly due to the Presentation API data model’s dependence on this specification. Such mentions of WADM often do not indicate active adoption of web annotations for broader purposes. In these cases, while these projects still adhere to the design principles of LOUD by implementing IIIF, I would argue that the full potential of LOUD is realised through a combination of specifications, especially when it comes to achieving semantic interoperability[131].

As I move on to explore IIIF, WADM, and Linked Art in the following subsections, it is worth noting that these systems, as exemplified by (Sanderson, 2020), embody and adhere to the foundational principles of LOUD, showcasing their practical applicability in diverse contexts.

3.5.3 International Image Interoperability Framework (IIIF)

In essence, IIIF[132] is both a model for presenting and annotating content and a global community that develops shared APIs, implements them in software, and exposes interoperable content.

To explore these aspects, this subsection on IIIF is divided into four parts. First, in 3.5.3.1, I look at the genesis of the framework. Next, 3.5.3.2 unravels the IIIF community. Moving forward, 3.5.3.3 offers an overview of the existing and forthcoming APIs, extensions, and the development process of IIIF standards. Finally, in 3.5.3.4, I outline the ecosystem, encompassing some of the compliant software, as well as the institutions and projects that have implemented IIIF specifications – sometimes referred to as the IIIF universe.

3.5.3.1 Inception

IIIF was initially established in 2011. It emerged as a community-based initiative that crystallised from the convergence of two pivotal endeavours. One strand of this narrative revolved around the imperative to facilitate the seamless exchange of high-definition images over the internet (Snydman et al., 2015, p. 16). This aspiration arose as a practical solution to mitigate the proliferation of duplicated images required for distinct projects. The desire to avoid sending substantial volumes of image data via conventional methods, such as mailing terabytes (TB) of data on hard drives, led to the contemplation of a web-based approach for sharing images that could break down silos.

In his talk during the 2019 IIIF Annual Conference, (Cramer, 2019) underscored the genesis of this facet of IIIF's inception, also highlighting the [133] where a plan for distributing image-based resources at scale was hatched informally in 2011 over dinner between technologists from the Stanford University Library, Oxford University and the British Library (Raemy, 2017, p. 13). As (Emanuel, 2018, p. 125) asserts: ‘a clear, focused emphasis on collaboration and interoperability, between both institutions and technical approaches, can provide significant relief from this conundrum[134], and can provide value for organisational members and technological end-users alike’.

The second strand, interwoven with the first, emanated from the explorations and experiments surrounding the interoperability of digitised medieval manuscripts. The DMSTech at Stanford, operational from 2010 to 2013, provided the fertile ground for these reflections (Robineau, 2019). These deliberations ultimately coalesced into the formulation of the Shared Canvas Data Model[135], which employs a Linked Data approach based on Open Annotations and OAI-ORE[136] (see Sanderson et al., 2011), a set of standards and protocols to facilitate the description and exchange of aggregations of web resources. The model is used to collaboratively describe digital facsimiles of physical objects, primarily in the CH domain, where instances of the model are consumed by rendering platforms with a view to ‘understanding the relationships between the constituent text, image, audio or other resources’ (Sanderson & Albritton, 2013). The roots of this modelling effort can be traced back to (Sanderson, 2003)’s PhD thesis, where the first manuscript description was crafted based on an electronic edition of Froissart’s Chronicles, a prose history of the Hundred Years’ War written in the 14th century.

Robert Sanderson’s recollections, shared on the IIIF Slack Workspace[137] in April 2022, provide an invaluable account of the early phases. The narrative begins in the mid-2000s, when discussions commenced between Tom Cramer, Stuart Snydman (both at Stanford University at the time), and Sanderson at conferences such as the DLF Forum. This discourse centred on the challenges and prospects of interoperability for medieval manuscripts. Concurrently, LANL was engaged in developing Djatoka, an image server leveraging JPEG2000 and an OpenURL-based API[138] (Chute & Van De Sompel, 2008). Sanderson’s move to Los Alamos in 2008 marked a turning point, as the convergence of ideas and resources paved the way for tangible progress. Funding from the Mellon Foundation[139] catalysed the evolution of IIIF. A parallel Mellon grant bolstered the development of Open Annotation, a precursor to the Web Annotation standard. This confluence set the stage for IIIF's embryonic phase. As the project progressed, collaboration expanded to include a wide range of partners. Lessons learned from manuscript interoperability were extrapolated and refined, culminating in pragmatic adjustments. Key refinements included the adoption of JSON-LD as the sole manifest format and the establishment of a coherent API structure (Snydman et al., 2015, p. 18).

Tracing the development of IIIF, as highlighted in this timeline[140], reveals a sequence of pivotal milestones. Among these, the establishment of the Shared Canvas Data Model, a subject previously touched upon, stands out as a significant step. This timeline, while informative, is not all-encompassing but effectively underscores the essential moments shaping IIIF. The journey from visionary dialogues to the concerted collaborative efforts showcases the collective ingenuity pivotal in enhancing digital image accessibility and interoperability. A notable milestone in this journey was the first officially listed IIIF working meeting, which took place in Cambridge, England, in September 2011. This event symbolises the global commitment and the evolutionary stride of the IIIF initiative.

Stanford University played a pivotal role in rebuilding the Mirador viewer[141], a key milestone that exemplifies the spirit of community-driven effort in the IIIF initiative (Zundert, 2018, p. 2). This endeavour, later augmented through a collaborative partnership with Harvard University Library, highlights the cooperative dynamics and collective contributions that have been fundamental to the success and advancement of IIIF. The UV[142] served as an additional client, validating IIIF's interoperability aspirations (Raemy, 2017, p. 30). The availability of the people behind OSD[143] facilitated these advancements (Raemy & Sanderson, 2024, p. 76).

The maturation of IIIF was marked by incremental achievements, with Stanford’s active stewardship from 2014 to 2016 playing a pivotal role. The project’s multifaceted requirements found a harmonious synthesis within a supportive community. Financial resources, technical infrastructure, a coherent data model, and the collective dedication of engaged participants converged to transform IIIF from an abstract concept into a tangible reality (Hadro, 2019).

3.5.3.2 IIIF Community

The IIIF community stands as the cornerstone of the framework’s ongoing development and widespread adoption. It encompasses a broad spectrum of institutions and individuals. Their collaborative efforts are pivotal in crafting, upholding, and advocating for the IIIF specifications and tools. This community-led approach guarantees that IIIF aligns with the dynamic requirements of CH entities and the wider DH sphere.

While the majority of IIIF participants hail from North America, the United Kingdom, and Western Europe, the framework’s reach is continually extending globally, with new adopters implementing IIIF-compliant solutions worldwide (Raemy, 2017, pp. 14–15). Initially centred around GLAMs, IIIF is increasingly garnering interest from various domains, including organisations in the STEM sector (Kiley & Crane, 2016; see Moutsatsos, 2017), such as delivering digital pathology data by leveraging the IIIF Image API behind the scenes through an in-browser tiling mechanism (Bhawsar et al., 2023) or by converting tiles from storage instances compliant with the DICOM[144] standard (Jodogne, 2023).

Within the IIIF community, several dedicated groups focus on both community engagement and technical specifications[145]. These groups, fundamental to the discourse and evolution of IIIF-related matters, convene regularly and welcome participation from all interested parties (Hadro, 2022).

The structure of these groups falls into two categories. Community Groups engage in discussions on themes and topics pertinent to their focus areas, often showcasing demos of related work and recent implementations. Current active groups include 3D, Archives, A/V, Maps, Museums, and Outreach, with the Manuscripts, Design, and Newspapers groups currently on hiatus. TSGs are concentrated on collaborative efforts towards specific objectives related to the IIIF APIs. Current active TSGs are the 3D TSG, Maps TSG, Authorization Flow, and Content Search. Furthermore, IIIF has witnessed the successful completion of several TSGs. The Discovery TSG completed its work in 2023, leaving behind a number of specifications and deliverables. The Text Granularity TSG completed its mission in 2019 with the publication of the Text Granularity Extension. The A/V TSG concluded its work with the release of the Presentation API 3.0 in June 2020, and subsequently evolved into the A/V Community Group. Both group types thrive on an ethos of open community interaction, encouraging active involvement from all members of the IIIF community. New groups can emerge in accordance with the IIIF Groups Framework[146].

IIIF has also orchestrated various events[147], including working group meetings, training sessions, online gatherings, and annual in-person conferences rotating amongst continents (at the moment, IIIF conferences have only been held in either North America or Europe), to facilitate ongoing development and broaden institutional engagement.

The IIIF-C[148], established in 2015 by 11 founding institutions – including the three original institutions – aims to achieve several objectives to steer and sustain the IIIF initiative (Cramer, 2015): providing uniform and rich access to digitised resources, defining and maintaining APIs that support interoperability, and developing shared technologies for an enhanced UX in handling digitised materials. Although originating within libraries, the IIIF community has expanded to encompass museums, archives, CH aggregators, commercial entities, and technology companies, thereby fostering novel opportunities for interaction and collaboration across different sectors.

The IIIF community’s structure also includes key committees for editing the specifications and coordinating community processes, each playing a vital role in the framework’s governance and progress.

As of the current writing, the IIIF-C consists of 67 members, illustrating the growing global interest and investment in the IIIF initiative. The governance of the IIIF-C is overseen by three key committees, each playing a distinct role in the consortium’s operations.

These committees, through their collective efforts, play a crucial role in the structured and effective governance of IIIF-C. Their guidance and strategic decision-making are instrumental in steering the mission of IIIF and ensuring the fulfilment of its overarching goals. This foundation of leadership and collaboration within the consortium sets the stage for the development and refinement of IIIF specifications, the subject of the next subsection. As I transition to this topic, I will explore the technical backbone of the framework, highlighting how these specifications have become essential in advancing interoperability across CH platforms.

3.5.3.3 IIIF Specifications

Currently, IIIF has defined six APIs[152]. The most prominent are the Image and Presentation APIs, both updated to version 3 in June 2020 and often referred to as the core IIIF APIs. Complementing these are the Content Search and Authorization Flow APIs, both in version 2 (released in 2022 and 2023, respectively), and slated for future updates to align with the core APIs. In addition, the Change Discovery and Content State APIs, both in version 1.0, are relevant to the discovery, aggregation, and sharing of IIIF resources. All recent specifications rely on JSON-LD 1.1, which is, for instance, consistent with i18n language handling[153].

The development of the IIIF specifications is firmly rooted in the IIIF design principles (Appleby et al., 2018). These principles, which have influenced similar methodologies such as the LOUD design principles, are articulated across 13 main components. They encapsulate a methodology that prioritises clear, practical goals, aligning with shared use cases to ensure the specifications meet developers’ needs effectively. This approach, focusing on simplicity, minimal barriers to entry, and adherence to good web and Linked Data practices, has been crucial in making the APIs both user-friendly and technically sound. Notably, the design principles advocate for ease of implementation and internationalisation, promoting a flexible yet cohesive structure. These principles do not directly express usability but rather represent objective constructs that naturally lead to usable and accessible data[154] (Raemy & Sanderson, 2023, pp. 7–8).

The Editorial Committee of IIIF, responsible for writing these specifications, adheres to these principles and a specific process[155], ensuring that the resulting APIs are not only performant but also integrate seamlessly with existing technologies and standards. The IIIF Editorial Process encompasses two primary components: the Community Process and the Editorial Committee Process. The Community Process involves open discussions on IIIF-Discuss[156], GitHub[157], and various group meetings for proposals and feedback. Essential to this process is the demonstration of use cases by at least two institutions, rigorous evaluation and testing of new features, and a thorough community review overseen by the TRC. Specifications are subject to continuous revision with a release frequency guided by semantic versioning, balancing community input with the need for stability.

The editors make use of web best practices, including terminologies as defined in RFC 2119 (see Bradner, 1997) of the IETF, a standards organisation mainly responsible for the Internet protocol suite. Additionally, the process involves clear documentation of editorial participation, with a structured approach for suggesting changes primarily via GitHub. Changes undergo strict acceptance criteria, requiring consensus among editors before merging. Regular meetings, both in-person and virtual, are crucial for editors to discuss ongoing work and coordinate efforts. This dual-component process ensures thoughtful, inclusive, and responsive evolution of IIIF specifications, catering to the diverse needs of the community.

In the following paragraphs I will undertake an exploration of each of the IIIF APIs. Special emphasis will be placed on the core APIs due to their integral role in the IIIF ecosystem. While providing a rich insight into the Image and Presentation APIs, I will also discuss the particularities of the Content Search, Authorization Flow, Change Discovery, and Content State APIs, highlighting their contributions to the overall breadth of the IIIF specifications.

The IIIF Image API, defined as a RESTful web service, responds to standard HTTPS requests with an image (Appleby et al., 2020). In other words, it is an agreement between an image client and an image server to get the pixels (Robson, 2021).

The API uses its own URI syntax and offers two primary modes of interaction – both of which are hereafter exemplified with the syntax and a URL working with the specification – so that users can easily pan, deep zoom, manipulate, cite, and share regions of interest of an image.

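The two modes of interaction are the image request, which follows the syntax {scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}, and the image information request, which follows the syntax {scheme}://{server}{/prefix}/{identifier}/info.json.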

Figure 3.23 shows the sequence of parameters of the URI syntax that needs to be interpreted by compliant software (region, then size, then rotation, then quality, then format), as well as showcasing different manipulations that are available. In this case, the image has been cropped to the region 125,15,120,140, scaled to a width of 90 pixels, mirrored and then rotated by 345 degrees, and returned in a grey – or rather gray, according to the specification – quality. This would give the following URL: https://example.org/image42/125,15,120,140/90,/!345/gray.jpg

Figure 3.23: IIIF Image API Order of Implementation (Appleby et al., 2020)
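
The order of parameters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names build_image_url and build_info_url, as well as their parameter names, are hypothetical, and no validation of the values against the specification is attempted.

```python
def build_image_url(base, identifier, region="full", size="max",
                    rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL.

    The path segments follow the order region / size / rotation /
    quality.format. This is a hypothetical helper: it does not check
    that the values are valid or supported by a given image server.
    """
    root = base.rstrip("/")
    return f"{root}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"


def build_info_url(base, identifier):
    """Assemble the URL of the image information document (info.json)."""
    root = base.rstrip("/")
    return f"{root}/{identifier}/info.json"


# Reproduce the example from the text: crop a region, scale to a width
# of 90 pixels, mirror and rotate by 345 degrees, return in gray as JPEG.
example = build_image_url(
    "https://example.org", "image42",
    region="125,15,120,140", size="90,", rotation="!345", quality="gray",
)
print(example)
```

Keeping each parameter as a plain string mirrors the URI syntax itself, in which the mirroring flag (!) and the trailing comma of the size parameter are part of the value.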

Furthermore, it is mandatory for the image information document to delineate the extent of API support, i.e. the set of parameters and features that are supported by the image service, by denoting the compliance level as the value of the profile property. This compliance level must align with one of the three levels (0, 1, or 2) outlined in the Image API Compliance document[158].

The chosen compliance level should represent the highest level that meets all the requirements. Some of these go beyond image manipulation and include HTTP features such as server response behaviours and CORS, a mechanism that allows or restricts requests for resources on a web page from a domain other than the one that served the first resource. CORS is particularly relevant for IIIF services, as they involve accessing and displaying images hosted on different servers or domains, which necessitates careful management to ensure seamless access while maintaining web security (see Kesteren, 2023). To achieve or verify compliance with the IIIF Image API, service providers can refer to the Image API Validator[159].
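
As a minimal sketch of how a client might read the declared compliance level, the following Python function inspects the profile property of an image information document. It assumes an Image API 3.0 document in which profile is one of the strings level0, level1, or level2; the function name and the sample document are hypothetical.

```python
def compliance_level(info):
    """Return the compliance level (0, 1 or 2) declared in an info.json dict.

    Assumes an Image API 3.0 document whose `profile` is one of the
    strings "level0", "level1" or "level2"; raises ValueError otherwise.
    """
    levels = {"level0": 0, "level1": 1, "level2": 2}
    profile = info.get("profile")
    if not isinstance(profile, str) or profile not in levels:
        raise ValueError(f"Unexpected profile value: {profile!r}")
    return levels[profile]


# Hypothetical image information document, trimmed to a few properties.
sample_info = {
    "@context": "http://iiif.io/api/image/3/context.json",
    "id": "https://example.org/image42",
    "type": "ImageService3",
    "protocol": "http://iiif.io/api/image",
    "profile": "level1",
    "width": 4032,
    "height": 3024,
}
print(compliance_level(sample_info))
```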

The result is that the IIIF Image API provides a well-defined image retrieval format that is easy to implement. It provides sufficient detail to drive zoomable image viewers. The URLs are designed to be intuitive, allowing easy manipulation by users. In addition, these URLs are optimised for caching and scalability, improving performance and efficiency.

As for the IIIF Presentation API, based on JSON-LD, it provides the necessary information about the structure and layout of objects or collections to drive a remote viewing experience. As such, its primary purpose is to display human-readable descriptive information, i.e. some minimal descriptive and legal metadata, rather than offering semantic metadata for search engines[160] (Appleby et al., 2020). Loosely based on the Shared Canvas Data Model, it incorporates four main resource types: Collection, Manifest, Canvas, and Range, each serving distinct functions within the API framework. In addition, the specification makes use of types that are defined by WADM, such as AnnotationCollection, AnnotationPage, Annotation, and Content (see 3.5.4). A diagram of the Presentation API data model is shown in Figure 3.24.

Figure 3.24: IIIF Presentation API Data Model (Appleby et al., 2020)

The IIIF Manifest plays a key role as it encapsulates the structure and attributes of a compound object. It contains essential data required for a client application to accurately display the content to users. Typically, each Manifest is dedicated to delineating the presentation of a single complex entity, which could be diverse in nature, ranging from a book or a sculpture to a musical album. The Manifest thus serves as a crucial intermediary, translating complex object properties into a format that is both accessible and comprehensible to users engaging with digital representations of these objects. In the context of the Presentation API, the Manifest is perhaps one of the most important resource types, acting as the cornerstone for how digital content is structured and presented within the IIIF ecosystem. Code Snippet 3.10 shows an example of a basic Manifest from the cookbook recipes[161], which are published by the community to encourage publishers to adopt common patterns when modelling IIIF resources. To provide a clearer understanding, the following is a detailed breakdown of the structure of this IIIF Manifest.

Code Snippet 3.10: Example of a IIIF Manifest

{
  "@context": "http://iiif.io/api/presentation/3/context.json",
  "id": "https://iiif.io/api/cookbook/recipe/0005-image-service/manifest.json",
  "type": "Manifest",
  "label": {
    "en": [
      "Picture of Göttingen taken during the 2019 IIIF Conference"
    ]
  },
  "items": [
    {
      "id": "https://iiif.io/api/cookbook/recipe/0005-image-service/canvas/p1",
      "type": "Canvas",
      "label": {
        "en": [
          "Canvas with a single IIIF image"
        ]
      },
      "height": 3024,
      "width": 4032,
      "items": [
        {
          "id": "https://iiif.io/api/cookbook/recipe/0005-image-service/page/p1/1",
          "type": "AnnotationPage",
          "items": [
            {
              "id": "https://iiif.io/api/cookbook/recipe/0005-image-service/annotation/p0001-image",
              "type": "Annotation",
              "motivation": "painting",
              "body": {
                "id": "https://iiif.io/api/image/3.0/example/reference/918ecd18c2592080851777620de9bcb5-gottingen/full/max/0/default.jpg",
                "type": "Image",
                "format": "image/jpeg",
                "height": 3024,
                "width": 4032,
                "service": [
                  {
                    "id": "https://iiif.io/api/image/3.0/example/reference/918ecd18c2592080851777620de9bcb5-gottingen",
                    "profile": "level1",
                    "type": "ImageService3"
                  }
                ]
              },
              "target": "https://iiif.io/api/cookbook/recipe/0005-image-service/canvas/p1"
            }
          ]
        }
      ]
    }
  ]
}

In summary, this IIIF Manifest describes a Canvas with a single IIIF image, which is associated with the Canvas through an Annotation with the painting motivation and which references an image service. The Manifest also includes metadata such as a label and unique identifiers for its various components.
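
To illustrate how a client might walk this structure, the following Python sketch traverses a Manifest down to its painting Annotations. The function names and the trimmed sample document are hypothetical, and the sketch assumes that each Annotation has a single body, as in Code Snippet 3.10.

```python
def painted_images(manifest):
    """Yield (canvas id, image id, image service ids) per painting Annotation.

    Walks Manifest -> Canvas -> annotation page -> Annotation and assumes
    a single body per Annotation.
    """
    for canvas in manifest.get("items", []):
        for page in canvas.get("items", []):
            for anno in page.get("items", []):
                if anno.get("motivation") != "painting":
                    continue
                body = anno.get("body", {})
                services = [s["id"] for s in body.get("service", [])]
                yield canvas["id"], body.get("id"), services


def label_for(resource, lang="en"):
    """Return the first label for `lang`, falling back to any language."""
    labels = resource.get("label", {})
    values = labels.get(lang) or next(iter(labels.values()), [])
    return values[0] if values else None


# Hypothetical Manifest, trimmed to the properties used above.
sample_manifest = {
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "label": {"en": ["Sample object"]},
    "items": [{
        "id": "https://example.org/canvas/p1",
        "type": "Canvas",
        "items": [{
            "id": "https://example.org/page/p1/1",
            "type": "AnnotationPage",
            "items": [{
                "id": "https://example.org/annotation/p0001-image",
                "type": "Annotation",
                "motivation": "painting",
                "body": {
                    "id": "https://example.org/image42/full/max/0/default.jpg",
                    "type": "Image",
                    "service": [{"id": "https://example.org/image42",
                                 "type": "ImageService3",
                                 "profile": "level1"}],
                },
                "target": "https://example.org/canvas/p1",
            }],
        }],
    }],
}

print(label_for(sample_manifest))
for row in painted_images(sample_manifest):
    print(row)
```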

In addition to the core functionalities, the IIIF Presentation API is augmented by three formally approved extensions.

IIIF Presentation API resources such as Manifest and Collection can be validated against a dedicated Presentation API Validator[163]. This tool ensures that these resources conform to the specification, guaranteeing their correctness within the framework.

Although Version 3.0 of the Presentation API was released only a few years ago, the need for a specification that can formally display and disseminate 3D objects has been widely recognised across various domains. Addressing this demand, ongoing work on an upgraded version of the IIIF Presentation API (Version 4.0) to accommodate 3D resources holds great promise (Haynes, 2023). This work is particularly relevant to the CH sector, as highlighted by (Raemy & Gautschy, 2023) and (Manz et al., 2023), where conventional dissemination approaches struggle to effectively convey the depth and complexity of 3D artefacts, but there are specific challenges beyond visual representation. These challenges include accurately representing the lighting conditions that affect the object’s appearance and capturing the spatial orientation, such as handedness (i.e. defining the coordinate system: where the x, y, and z axes lie and in which direction they point). Such enhancements of the forthcoming IIIF Presentation API have the potential to empower institutions and industries beyond CH.

Certain key features have already been firmly established, particularly in regard to the integration of 3D content. A new resource type known as a Scene will be introduced to better accommodate 3D content and sits beside Canvas in the updated data model. This Scene is characterised by a boundless space, where the origin (0,0,0) is at the centre rather than in the top-left corner, as is the case with still images, and is structured around a right-handed coordinate system. Within this system, movement along the Y-axis is interpreted as upward, while movement along the Z-axis signifies forward progression. Importantly, Canvases can be nested within these Scenes, and likewise, Scenes can be nested within other Scenes (Haynes et al., 2023). As (Rossenova, 2023) importantly argues, ‘the 3D specification would need to ensure that viewers can both implement a “minimum viable product” for user interaction and remain interoperable, as well as include more complex interactions and still specify those in a standards-compliant way’. This advancement could potentially enable the presentation of collections in immersive formats, transforming access, engagement and cross-platform interoperability.

Figure 3.25 illustrates the integration of the core APIs within Mirador. Elements highlighted in blue are rendered via the Image API, while those in red are facilitated by the Presentation API. It is important to note that the Image API can function independently or in conjunction with the Presentation API for additional viewing capabilities. Additionally, there is no direct counterpart to the Image API for A/V content; instead, A/V specifics are conveyed using the duration property of the Presentation API.

Figure 3.25: The Interaction of the Core IIIF APIs in Mirador

Shifting focus to the additional IIIF APIs, the Content Search API plays a vital role in enhancing the framework’s functionality. The specification facilitates searching within textual annotations and provides the mechanism for performing these searches within the IIIF context (Appleby et al., 2022). To add a Content Search service to a IIIF Manifest, a URI must be provided through the service property. Then the client, such as Mirador or the UV, will use it to create a valid query. An auto-complete service can also be embedded, further enriching the UX by streamlining the search process.
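
A minimal sketch of what a client does with this service property could look as follows in Python. It assumes a Content Search API 2.0 service declared with the type SearchService2 and queried through the q parameter; the function names and the sample Manifest fragment are hypothetical.

```python
from urllib.parse import urlencode


def find_search_service(manifest):
    """Return the id of the first SearchService2 in `service`, or None."""
    for service in manifest.get("service", []):
        if service.get("type") == "SearchService2":
            return service.get("id")
    return None


def build_search_url(service_id, terms):
    """Append the `q` query parameter to the search service URI."""
    query = urlencode({"q": terms})
    return f"{service_id}?{query}"


# Hypothetical Manifest fragment declaring a search service.
sample_manifest = {
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "service": [{
        "id": "https://example.org/services/manifest/search",
        "type": "SearchService2",
    }],
}

service_id = find_search_service(sample_manifest)
print(build_search_url(service_id, "bird house"))
```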

The Authorization Flow API outlines four mechanisms for accessing restricted content, which serve either to initiate an interaction with an access control system or to give a client enough information about the user’s state with respect to the content provider. These mechanisms are the access service, the token service, the probe service, and the logout service. The access service typically manifests as a user interface provided by the content host, which the client opens in a new tab, usually leading to the provider’s login page. The token service plays a crucial role wherein the client acquires an access token, necessitating presentation of the same authorising credentials to the token service as the browser would to the related access-controlled resource. The probe service, as defined in this specification, is an endpoint used by the client to ascertain the user’s association with the access-controlled resource it represents. This is applicable for both access-controlled content resources and image information documents (info.json) that produce access-controlled image responses. Additionally, there is an optional logout service, designed to facilitate user logout (Appleby et al., 2023).
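
The following Python sketch shows one way a client could locate these four services on a resource. It rests on assumptions that should be checked against the specification: the service type names (AuthProbeService2, AuthAccessService2, AuthAccessTokenService2, AuthLogoutService2) reflect my reading of version 2.0, services are assumed to be nested inside one another through the service property, and the sample resource with its URIs is entirely hypothetical.

```python
# Assumed mapping between Authorization Flow API 2.0 service types and roles.
AUTH_TYPES = {
    "AuthProbeService2": "probe",
    "AuthAccessService2": "access",
    "AuthAccessTokenService2": "token",
    "AuthLogoutService2": "logout",
}


def collect_auth_services(resource):
    """Recursively gather auth service ids, keyed by their role.

    The first service found for each role wins; nested `service`
    arrays are explored depth-first.
    """
    found = {}
    for service in resource.get("service", []):
        role = AUTH_TYPES.get(service.get("type"))
        if role is not None:
            found.setdefault(role, service.get("id"))
        for nested_role, nested_id in collect_auth_services(service).items():
            found.setdefault(nested_role, nested_id)
    return found


# Hypothetical access-controlled image with nested auth services.
sample_resource = {
    "id": "https://example.org/image42/full/max/0/default.jpg",
    "type": "Image",
    "service": [{
        "id": "https://example.org/auth/probe/image42",
        "type": "AuthProbeService2",
        "service": [{
            "id": "https://example.org/auth/login",
            "type": "AuthAccessService2",
            "service": [
                {"id": "https://example.org/auth/token",
                 "type": "AuthAccessTokenService2"},
                {"id": "https://example.org/auth/logout",
                 "type": "AuthLogoutService2"},
            ],
        }],
    }],
}

print(collect_auth_services(sample_resource))
```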

The Change Discovery API is designed for machine-to-machine interfaces, primarily serving IIIF-aware systems. It enables content providers to efficiently publish lists of links to their content, which facilitates easy discovery and synchronisation for consuming systems (Appleby et al., 2021). The API is based on the W3C AS standard, ensuring a consistent and universally understandable pattern for publishing content changes. The API designates IIIF Collections and Manifests as primary access points for published content. Activities describe changes to these resources, and potentially to other types such as IIIF Image API endpoints.

Central to the API's utility is its ability to allow content providers to articulate the specifics of how their content has evolved. This includes providing details on content that has been deleted or is no longer available. This optimisation significantly helps consuming systems in retrieving only those resources that have been modified since their last update, thereby streamlining the update process. The specification defines three distinct levels of conformance, each offering a different depth of information about the changes to IIIF resources.

Code Snippet 3.11: Example of a Level 2 Activity Compliant with the IIIF Change Discovery API (Appleby et al., 2021)

{
  "type": "Create",
  "object": {
    "id": "https://example.org/iiif/1/manifest",
    "type": "Manifest"
  },
  "endTime": "2017-09-20T00:00:00Z"
}

In the Change Discovery API, the rdfs:seeAlso property from the Presentation API is a vital feature that enhances the discoverability and understanding of IIIF resources (Raemy, 2020, p. 16). This property is used to reference external documents containing richer and more detailed information about the content being presented. The specification also provides a structured approach for the organisation and processing of OrderedCollection and OrderedCollectionPage. This includes clear directives on how activities within these structures are managed and accessed.
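
How a consuming system might use such activities to retrieve only what has changed can be sketched as follows in Python. The sketch assumes that activities are listed from oldest to newest and carry a UTC endTime, as in Code Snippet 3.11; the function names and the sample activities are hypothetical, and paging through OrderedCollectionPage resources is left out.

```python
from datetime import datetime, timezone


def parse_time(value):
    """Parse a UTC timestamp such as 2017-09-20T00:00:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def changed_since(activities, last_crawl):
    """Return {object id: activity type} for activities after `last_crawl`.

    `activities` is assumed to be ordered from oldest to newest, so the
    list is read backwards and only the most recent activity per object
    is kept.
    """
    latest = {}
    for activity in reversed(activities):
        if parse_time(activity["endTime"]) <= last_crawl:
            break  # everything earlier was handled by the previous crawl
        latest.setdefault(activity["object"]["id"], activity["type"])
    return latest


# Hypothetical activities for two Manifests.
sample_activities = [
    {"type": "Create", "endTime": "2017-09-20T00:00:00Z",
     "object": {"id": "https://example.org/iiif/1/manifest", "type": "Manifest"}},
    {"type": "Update", "endTime": "2017-09-21T00:00:00Z",
     "object": {"id": "https://example.org/iiif/1/manifest", "type": "Manifest"}},
    {"type": "Delete", "endTime": "2017-09-22T00:00:00Z",
     "object": {"id": "https://example.org/iiif/2/manifest", "type": "Manifest"}},
]

last_crawl = parse_time("2017-09-20T12:00:00Z")
print(changed_since(sample_activities, last_crawl))
```

Reading the list backwards means that a harvester can stop as soon as it reaches an activity it has already processed, which is the optimisation described above.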

In terms of network considerations, the Change Discovery API addresses critical aspects such as handling media types and activities for access-restricted content. These considerations ensure that the network interactions remain efficient and reliable, even in scenarios where content accessibility is dynamically changing. Additionally, the IIIF community deployed a IIIF Registry[164] to give access to IIIF resources and streams that are compatible with the Change Discovery API.

Finally, the Content State API enables deep-linking into objects and annotations from search results, aiding users in sharing specific IIIF content (Appleby et al., 2022). Specifically, a content state is a JSON-LD data structure that can be leveraged by IIIF clients to load and present a particular part of a resource, such as a region of a Canvas in a Manifest, playing a particular point in a recording, displaying multiple targets from a comparison view, and embedding search results. Because the content states are serialised in JSON, these must be encoded in the base64url format, a variant of the base64 encoding designed to be safe for use in URLs as it avoids using special characters (see Josefsson, 2006). In summary, as (Crane, 2020) argues, the specification is .
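
The encoding step can be illustrated with a short round-trip sketch in Python. It is a sketch under stated assumptions rather than a reference implementation: the JSON is percent-encoded before being base64url-encoded without padding, the exact set of characters to percent-encode should be checked against the specification, and the function names and the sample content state are hypothetical.

```python
import base64
import json
from urllib.parse import quote, unquote


def encode_content_state(state):
    """Serialise a content state and return it as unpadded base64url."""
    plain = json.dumps(state, separators=(",", ":"))
    uri_encoded = quote(plain, safe="")  # assumption: encode all reserved characters
    encoded = base64.urlsafe_b64encode(uri_encoded.encode("utf-8"))
    return encoded.decode("ascii").rstrip("=")


def decode_content_state(encoded):
    """Reverse encode_content_state, restoring the stripped padding first."""
    padded = encoded + "=" * (-len(encoded) % 4)
    uri_encoded = base64.urlsafe_b64decode(padded).decode("utf-8")
    return json.loads(unquote(uri_encoded))


# Hypothetical content state pointing at a region of a Canvas in a Manifest.
sample_state = {
    "id": "https://example.org/canvas/p1#xywh=125,15,120,140",
    "type": "Canvas",
    "partOf": [{"id": "https://example.org/manifest.json", "type": "Manifest"}],
}

token = encode_content_state(sample_state)
print(token)
print(decode_content_state(token) == sample_state)
```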

Figure 3.26 displays a IIIF Manifest, specifically of The Life of Buddha, first book (Shaka no Honji, jō), hosted by e-codices[165]. This resource is accessed through Jalava[166], a directory tree-based interface that aggregates IIIF-compliant repositories. It also includes a custom viewer, which enables the sharing of resources based on the IIIF Content State API in JSON and base64url formats.

Figure 3.26: Cologny, Fondation Martin Bodmer, Cod. Bodmer 600a: The Life of Buddha, First Book (Shaka no Honji, jō) Embedded in Jalava, a Customised Client from Durham University Compliant with the IIIF Content State API

Collectively, these APIs, extensions and validators are not only fundamental components that enable the framework, they are also anchored by a set of well-defined design principles and an editorial process. This combination ensures improved interoperability and utility across different digital platforms, while maintaining consistent and usable standards that reflect the evolving needs of the community.

In order to effectively deploy IIIF, it is important to have a comprehensive understanding of the different APIs involved and of the system building blocks they require. For a basic IIIF solution, the images need to be delivered in compliance with the Image API through a web service. Then, a series of scripts can be used to generate IIIF resources, which combine the relevant metadata and data structure. These can be displayed within IIIF-compliant viewers. Deploying a more advanced IIIF architecture that can manage and save annotations, restrict access to certain objects, inform the IIIF community of changes made within its own collection, or search for OCR text requires a more complex environment. This involves deploying multiple types of servers and different libraries. Figure 3.27 illustrates an example of a IIIF architecture with the six APIs.

Figure 3.27: Example of a IIIF Architecture Implementing the Six APIs
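To make the basic deployment scenario more concrete, the sketch below shows what such a generation script could look like: it builds Image API request URLs and wraps one image per Canvas in a Presentation API 3.0 Manifest. The base URL, the identifiers, the `level1` compliance profile and the helper names `image_url` and `build_manifest` are assumptions made for illustration, not part of any particular implementation.

```python
import json

# Hypothetical base URL of an Image API 3.0 service.
IMAGE_BASE = "https://example.org/iiif/image"


def image_url(service_id, region="full", size="max", rotation="0",
              quality="default", fmt="jpg"):
    """Build an Image API URL: {id}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{service_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def build_manifest(manifest_id, label, images):
    """Combine a label and a list of image descriptions into a Manifest."""
    canvases = []
    for index, image in enumerate(images, start=1):
        canvas_id = f"{manifest_id}/canvas/{index}"
        service_id = f"{IMAGE_BASE}/{image['identifier']}"
        canvases.append({
            "id": canvas_id,
            "type": "Canvas",
            "width": image["width"],
            "height": image["height"],
            "items": [{
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/page/image",
                    "type": "Annotation",
                    "motivation": "painting",  # the image is the Canvas content
                    "body": {
                        "id": image_url(service_id),
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": image["width"],
                        "height": image["height"],
                        "service": [{
                            "id": service_id,
                            "type": "ImageService3",
                            "profile": "level1",  # assumed compliance level
                        }],
                    },
                    "target": canvas_id,
                }],
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": canvases,
    }


manifest = build_manifest(
    "https://example.org/iiif/manifest/1",
    "Example manuscript",
    [{"identifier": "page-001", "width": 3000, "height": 4000}],
)
print(json.dumps(manifest, indent=2))
```

The resulting JSON could be served as a static file and opened in a IIIF-compliant viewer, provided that the image service it points to actually exists.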

As I move from the underlying technical structure to its practical application, the next subsection takes a closer look at the IIIF ecosystem. This includes a diverse range of compliant software and implementations, each contributing to the widespread adoption and versatility of IIIF. By exploring these elements, it is possible to gain a comprehensive understanding of how IIIF is being operationalised and integrated into various environments.

3.5.3.4 IIIF Ecosystem

This segment explores the composition of the IIIF ecosystem, particularly focusing on how its suite of shared APIs has been integrated into software and the institutions that have made their resources compliant with the specifications.

The adoption of IIIF by diverse organisations marks a significant turning point and plays a crucial role in dismantling silos. This fosters enhanced communication and interoperability between various software platforms. As a result, users gain access to a richer array of options for viewing and interacting with digital resources. This transformation, catalysed by the widespread adoption of IIIF, is illustrated in Figure 3.28, showcasing the dynamic and interconnected nature of the IIIF ecosystem.

Figure 3.28: Example of a IIIF Ecosystem of Software

The suite of IIIF-compliant software encompasses a diverse range of tools, each catering to specific aspects of the framework. Central to this suite are image servers, predominantly tailored to align with the IIIF Image API, such as IIPImage[167], Cantaloupe[168], or SIPI[169]. Complementing these servers are image viewers, designed to be compatible with either the Image API alone, like OSD and Leaflet-IIIF[170], or with both of the core IIIF APIs. With the release of version 3.0 of the Presentation API, A/V players have also been developed, which frequently comply with the Image API as well. Examples of known clients that support the Image and Presentation APIs 3.0 are Mirador, the UV, Annona[171], Clover[172], and Ramp[173]. As the support and behaviour vary, a dedicated viewer matrix[174] was created by Glen Robson, the IIIF-C Technical Coordinator.

Additionally, the ecosystem includes an array of tools and libraries which offer pre-written code segments essential for constructing and manipulating IIIF resources. For narrative and website creation, there are specialised exhibition and guided viewing tools such as Storiiies[175] and Exhibit[176], adding a storytelling dimension to the experience. Moreover, validators play a critical role in ensuring compliance and accuracy, while annotation servers and clients offer advanced functionalities for enriching digital resources with contextual and interpretive information. Together, these components form a comprehensive and versatile toolkit, enabling a wide range of applications and enhancing the utility of IIIF resources.

The evolving software landscape has witnessed a notable expansion in IIIF compatibility, where a larger number of software applications now align with the IIIF specifications. This shift is evident in the growing support for one or more of the IIIF APIs, with a particular emphasis on the Image API or both core APIs (Rabun, 2016). In May 2016, recognising the need to provide guidance and support to both newcomers and software developers within the IIIF community, a curated collection[177] was initiated. This resource compiles a comprehensive array of IIIF-compliant tools, tutorials, and descriptive links, organised to facilitate easy navigation and understanding of the IIIF ecosystem. It offers insights and directions for those embarking on their journey with IIIF, and helps developers find the tools and information needed to integrate IIIF specifications effectively into their projects.

IIIF has been embraced by a variety of institutions worldwide, spanning from state and national libraries to aggregators, museums, archives, and universities. Examples include national libraries such as Austria’s national library, the British Library, BnF, and Israel’s national library; aggregators like Europeana, the Digital Public Library of America, and the Wikimedia Foundation; museums such as the Victoria & Albert Museum, the Walters Art Museum, and the YCBA; archives such as the Swiss Federal Archives, the Blavatnik Foundation Archive, and the National Film and Sound Archive of Australia; as well as academic institutions such as Leiden University, University of Edinburgh, and Université Paris Cité. These implementations showcase the versatility of IIIF, illustrating its ability to cater to a wide spectrum of CH needs.

In Switzerland, IIIF has seen significant uptake, primarily within the library and academic sectors. Institutions like DaSCH, e-manuscripta, e-rara, Kunstmuseum Basel, the Swiss Federal Archives, and the University of Geneva have all adopted IIIF to some extent. Notably, both core IIIF APIs, in their Version 2.0, saw their initial implementation by e-codices[178], the Virtual Manuscript Library of Switzerland, in December 2014 (Raemy, 2017, p. 24). It marked a strategic shift towards integrating modern open-source technologies, such as OSD, in the overhaul of their digital platform (Fritschi, 2017, pp. 245–246). This early adoption not only demonstrated e-codices’ commitment to using the latest state-of-the-art digital tools, but also paved the way for the wider adoption of IIIF standards in digital libraries.

Scoping the whole so-called IIIF universe is complex as there is no mandatory requirement – and there probably shouldn’t be one – to tell the IIIF community that you have implemented the APIs or that you have developed tools that are compliant with the specifications. In terms of specifications, it is mostly the core IIIF APIs that are implemented, as shown in different investigations (see Raemy, 2023; Raemy & Schneider, 2019). The IIIF community has also carried out several surveys targeted at institutions implementing the standards, in 2017, 2020, and 2023 (Raemy, 2023, p. 2). In 2017, the number of images compatible with the IIIF Image API was estimated to be over 335 million (Raemy, 2017, p. 27), which probably under-represented the real number. At the time of writing, it is likely that there are over 2 billion images compliant with the Image API, as the Internet Archive alone hosts over 1 billion of them.

The latest 2023 IIIF Implementation Survey, which received 80 responses, showcased a diverse range of institutions including 26 libraries, 11 museums (some encompassing combinations like museum and research library), 16 universities or research institutions, seven archives, five service providers, as well as four software providers. It is quite representative of what is known to IIIF-C and the wider community, with participation and implementations mostly from libraries, museums, and universities from North America and Europe. Most surveyed institutions currently support IIIF, with a few in the planning stage. The number of IIIF-compatible resources varies widely, ranging from fewer than a thousand to over a million in some cases. Popular IIIF-compliant image servers include Cantaloupe and IIPImage. Mirador and UV are commonly used viewers. JPEG2000 and TIFF are the predominant image formats used across the surveyed institutions. A portion of the institutions actively contribute staff time or funding to shared IIIF tools and projects, highlighting a collaborative approach within the IIIF community.

Navigating the world of IIIF resources can be a challenging endeavour, given their vast and varied nature. To assist in this task, a comprehensive guide has been developed, providing valuable insights and directions on how to locate IIIF resources effectively[179]. As an illustration, Figure 3.29 presents the entry of the Cultural Japan platform[180], which aggregates IIIF-compliant resources of and about Japan (see Machiya et al., 2023).

Figure 3.29: Screenshot of the Cultural Japan Platform in the IIIF Guides

While the IIIF ecosystem is dynamic and evolving, it is important to note that its expansion and development are predominantly observed within GLAM and academic institutions, mostly in wealthy countries. This skew indicates that while IIIF specifications have gained traction and are proving to be influential in the DH and CH domains, their adoption is not uniformly distributed globally. The concentration of IIIF tools and implementations in certain regions underscores the need for broader outreach and support to enable more diverse and widespread adoption. Despite its current geographical and sectoral concentration, the framework’s capability to enhance access, interaction, and preservation of digital resources remains significant, offering a promising avenue for the future of digital archival work on an international scale.

3.5.4 Web Annotation Data Model

The WADM[181], a W3C standard established in 2017 and influenced by the outcomes of the OAC[182] resulting from several years of effort (see Haslhofer et al., 2011), provides an extensible and interoperable framework for creating and sharing annotations across various platforms.

WADM is designed to support both simple and complex use cases, such as linking content to specific data points or multimedia resources. It standardises annotations for interoperability and easy integration into existing collections, using JSON-LD for serialisation, which integrates annotations into the web’s structured data ecosystem. The model adopts Linked Data principles, emphasising interoperable and flexible structures for annotations (Sanderson et al., 2013).

Annotations in WADM link resources, referred to as body and target. The target is the resource being annotated, while the body contains the annotation content, which can be text, images, or other media. This structure allows a single annotation to relate to multiple resources, ensuring compatibility with web architecture and facilitating cross-platform sharing (Sanderson et al., 2017). Figure 3.30 illustrates this three-part model.

Figure 3.30: WADM High-level Overview (Sanderson et al., 2017)

The model outlines various components of an annotation (Sanderson et al., 2017):

Code Snippet 3.12 is an example of a basic annotation.

Code Snippet 3.12: Example of Basic Annotation According to WADM (Sanderson et al., 2017)

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno1",
  "type": "Annotation",
  "body": "http://example.org/post1",
  "target": "http://example.com/page1"
}

Annotations can link to different types of resources, including external web resources, embedded text, or specific segments of a resource. They can also include lifecycle information, agent details, intended audience, accessibility features, motivations, rights information and other identities.

Code Snippet 3.13 shows an example of an annotation with a motivation and a purpose.

Code Snippet 3.13: Example of an Annotation According to WADM with a Motivation and a Purpose (Sanderson et al., 2017)

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno15",
  "type": "Annotation",
  "motivation": "bookmarking",
  "body": [
    {
      "type": "TextualBody",
      "value": "readme",
      "purpose": "tagging"
    },
    {
      "type": "TextualBody",
      "value": "A good description of the topic that bears further investigation",
      "purpose": "describing"
    }
  ],
  "target": "http://example.com/page1"
}
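As a hedged illustration of how a client might consume such an annotation, the sketch below normalises the `body` property (which may hold a single value or a list) and groups the values of `TextualBody` resources by their `purpose`. The helper names `as_list` and `textual_bodies_by_purpose` are hypothetical, and the sketch assumes that each body carries at most one purpose.

```python
from collections import defaultdict


def as_list(value):
    """Return a list regardless of whether a property holds one value or many."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def textual_bodies_by_purpose(annotation):
    """Group the values of embedded TextualBody resources by their purpose."""
    grouped = defaultdict(list)
    for body in as_list(annotation.get("body")):
        # Bodies given as plain IRIs (strings) carry no embedded text; skip them.
        if isinstance(body, dict) and body.get("type") == "TextualBody":
            grouped[body.get("purpose")].append(body.get("value"))
    return dict(grouped)


annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno15",
    "type": "Annotation",
    "motivation": "bookmarking",
    "body": [
        {"type": "TextualBody", "value": "readme", "purpose": "tagging"},
        {
            "type": "TextualBody",
            "value": "A good description of the topic that bears further investigation",
            "purpose": "describing",
        },
    ],
    "target": "http://example.com/page1",
}

print(annotation["motivation"], textual_bodies_by_purpose(annotation))
```

Here the motivation applies to the annotation as a whole, whereas each purpose qualifies the role of an individual body.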

Annotations can be organised into collections or pages for easy reference and management. An AnnotationCollection manages a potentially large number of annotations, maintaining its own descriptive and creation information, and includes references to the first page of annotations. Each AnnotationPage contains an ordered list of some or all annotations within the collection.
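The relationship between a collection and its pages can be sketched as follows. This is a minimal illustration rather than a reference implementation: the page URI pattern and the function name `paginate` are assumptions, and only a handful of the properties defined by WADM for collections and pages are used.

```python
import math

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def paginate(collection_id, label, annotations, page_size):
    """Split annotations into AnnotationPages linked from an AnnotationCollection."""
    page_count = math.ceil(len(annotations) / page_size)
    # Hypothetical URI pattern for pages; any stable URIs would do.
    page_ids = [f"{collection_id}/page{n}" for n in range(1, page_count + 1)]

    collection = {
        "@context": ANNO_CONTEXT,
        "id": collection_id,
        "type": "AnnotationCollection",
        "label": label,
        "total": len(annotations),
    }
    if page_ids:
        collection["first"] = page_ids[0]
        collection["last"] = page_ids[-1]

    pages = []
    for i, page_id in enumerate(page_ids):
        start = i * page_size
        page = {
            "@context": ANNO_CONTEXT,
            "id": page_id,
            "type": "AnnotationPage",
            "partOf": collection_id,
            "startIndex": start,
            "items": annotations[start:start + page_size],
        }
        if i > 0:
            page["prev"] = page_ids[i - 1]
        if i + 1 < page_count:
            page["next"] = page_ids[i + 1]
        pages.append(page)
    return collection, pages


# Five placeholder annotations, two per page.
annos = [
    {"id": f"http://example.org/anno{n}", "type": "Annotation",
     "body": f"http://example.org/post{n}", "target": "http://example.com/page1"}
    for n in range(1, 6)
]
collection, pages = paginate("http://example.org/collection1", "Example collection", annos, 2)
print(collection["first"], len(pages))
```

A client would typically fetch the collection first, follow `first`, and then iterate through the pages via `next` until no further page is referenced.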

WADM has played a key role in several areas, particularly in extending the functionality of the IIIF Presentation API data model. Beyond the IIIF space, the influence of WADM extends into other areas, demonstrating its versatility and wide applicability.

In the field of knowledge management, Rossenova et al. (2022, p. 10) use WADM to annotate 3D content within an instance of Wikibase[183], a suite of KB software used by Wikidata for managing LOD. Additionally, WADM finds application in the field of musicology, where it aids in the annotation of musical notes, thus enriching the study and understanding of music through a digital lens (Weigl et al., 2021, pp. 26–27). WADM has also proven important in the domain of data quality assessment (Wei et al., 2016, p. 24). Moreover, the VR gaming industry has embraced WADM to enrich the generated KGs (Rousi et al., 2021, pp. 100–101). This application demonstrates the standard’s potential in creating immersive and interactive digital experiences.

Taking all the discussed aspects of WADM together, it is evident that this model offers a comprehensive and flexible framework for web annotations. The WADM's capacity to handle diverse annotation types, from simple text to complex multimedia segments, and its consideration of aspects like resource lifecycle, rights information, and accessibility, make it a robust tool for web-based annotations. Envisioning WADM-driven storytelling, it could serve as a potent means to convey hypotyposes, that is, picturesque descriptions of scenes or events. This aligns with the concept of multimedia assemblage of archival objects, as discussed by Bachimont (2021) and revisited in the context of web annotations[184], highlighting the potential of WADM in transforming the presentation and interpretation of content into dynamic, contemporary narratives.

In contrast to IIIF and Linked Art, WADM stands out as the only recognised LOUD standard that doesn’t concurrently function as an active community. In the event that updates to WADM are required, a dedicated W3C working group would need to be re-established or revamped[185]. This necessity contrasts with the community-driven evolution seen in IIIF and Linked Art, where ongoing community engagement and collaboration play a crucial role in their development and updating processes.

3.5.5 Linked Art

Linked Art[186] is a community-driven initiative that defines a LOUD specification for describing CH, primarily artworks. The specification is designed to operate across CHIs, facilitating the publication and use of a common body of knowledge. As such, Linked Art presents itself as ‘providing a standards based metadata profile, which consistently solves problems from real data, is designed for usability and ease of implementation, which are prerequisites for sustainability’ (Sanderson, 2023). Specifically, the community develops and maintains a shared data model, or profile, based on CIDOC-CRM as well as a JSON-LD-based API to interact with the data (Cornut et al., 2023, p. 10).

Linked Art presents a layered framework that distinguishes between the conceptual and implementation aspects of its model. This stratification is key to understanding how the Linked Art profile and its API are situated within a broader context of shared abstractions and sustainable implementations (Sanderson, 2020).

Figure 3.31 illustrates these layers, delineating the transition from shared abstractions (in blue) to their sustainable implementations (in green) in the Linked Art ecosystem.

Figure 3.31: Abstraction and Implementation Level of Linked Art. Adapted from Sanderson (2021)

This subsection on Linked Art unfolds in four distinct yet interconnected segments. First, 3.5.5.1 offers a historical perspective on the development and evolution of Linked Art. This is followed by 3.5.5.2, which provides an in-depth overview of the model, detailing its structure and patterns. Subsequently, 3.5.5.3 explores the architectural and design principles of the Linked Art API, including its protocol, core constructs, and the array of entity endpoints it recommends. The subsection culminates with 3.5.5.4, where I look at the community dynamics surrounding Linked Art and how the specification has been, or will be, implemented.

3.5.5.1 History

Linked Art officially came into existence in 2017 (Raemy, 2023, pp. 1–2). However, its roots can be traced back to November 2016 with the creation of its first GitHub repository[187], marking the beginning of the community. The initiative gained significant momentum in January 2019, when it commenced hosting open community calls, inviting broader participation and engagement[188].

In the same year, Linked Art received financial support in the form of grants from both the Kress Foundation[189], known for its dedication to advancing the field of art history, and the AHRC[190], a major funding body in the United Kingdom. This support from two esteemed organisations has strengthened Linked Art’s capabilities and resources, allowing it to expand its scope and organise five face-to-face meetings: three in 2019 and two in 2023, the latter delayed due to the COVID-19 pandemic.

The official recognition of Linked Art as a CIDOC Working Group in 2020 marked another milestone in its development. This recognition not only confirms the importance and relevance of Linked Art within the international museum sector, but also increases its visibility.

The evolution of Linked Art was significantly influenced by two key precedents: the AAC[191] and Pharos[192], the International Consortium of Photo Archives. AAC, a consortium of fourteen American art museums, played a critical role in promoting the adoption of agreed-upon LOD practices for artworks (Knoblock et al., 2017). On the other hand, Pharos brought together a network of photographic archives (Delmas-Glass & Sanderson, 2020).

Fink (2018, p. 35) highlights the philosophical alignment between Linked Art and AAC. She notes that ‘[Linked Art] describes the philosophy that shaped the AAC target model. It indicates the model will be updated by applying it to other data sets, such as the Getty Museum and Pharos, The International Consortium of Photo Archives, with the intention of having it serve as a resource for the broad museum community’. While AAC’s alignment with Linked Art’s philosophy is evident, Pharos’ engagement with the community has followed a distinct path. Initially closely aligned with Linked Art, Pharos eventually branched off, adopting unique modelling practices and deciding to deploy its own instance of ResearchSpace[193], an open source semantic web platform, to aggregate and align data from partner institutions.

The creation of AAC dates back to 2014, and in 2016 it published its target model based on CIDOC-CRM. This model embodied a design philosophy similar to that of Linked Art, characterised by its adaptability to different museum infrastructures. The well-structured and strategic design of the AAC target model laid a solid foundation for the development of functional applications and influenced the approach and methodology adopted by Linked Art in its subsequent evolution.

Linked Art’s development and growth are also significantly intertwined with the trajectory of IIIF (see Daga et al., 2022). This connection is evident in the way Linked Art has emulated IIIF's principles and methodologies. In addition, a few core people collaborate in both initiatives (Raemy, 2023, p. 11).

The release of the Linked Art API V1.0, initially scheduled for December 2024[26:2] and later for February 2025[194], represents a significant turning point in the evolution of the standard, but it is worth noting that several earlier implementations had already laid the groundwork for its adoption. These implementations not only provided valuable insights into the data model and the API's potential but also served as a catalyst for its refinement and enhancement.

3.5.5.2 Linked Art Data Model

At its core, Linked Art is a data model[195] or metadata application profile[196] that draws extensively from the RDF implementation of the version 7.1.3 of CIDOC-CRM (see Bekiari et al., 2024). The Getty Vocabularies – namely AAT, ULAN, and TGN – are leveraged as core sources of identity for domain-specific terminology. JSON-LD 1.1 is chosen as the preferred serialisation format, promoting clarity and interoperability. This framework constructs common patterns, integrating conceptual models, ontologies, and vocabulary terms. These elements are derived from real-world scenarios and contributions from the diverse participants and institutions within the Linked Art community (Sanderson, 2019).

In the area of data provenance, the work of Ram & Liu (2009) on the W7 model stands as an example. This ontological framework conceptualises data provenance through seven interconnected facets: ‘what’, ‘when’, ‘where’, ‘how’, ‘who’, ‘which’, and ‘why’. Each of these elements serves as a means to monitor and understand events impacting data throughout its existence. Importantly, the W7 model demonstrates flexibility and adaptability in capturing the nuances of provenance for data across diverse domains. The Linked Art data model, in its domain-specific application, particularly resonates with five of these facets: ‘what’, ‘where’, ‘who’, ‘how’, and ‘when’ (Sanderson, 2020). This focus facilitates a nuanced tracking and interpretation of the key aspects of data provenance, as visually captured in Figure 3.32[197].

Figure 3.32: Linked Art from 50,000 Feet (Raemy et al., 2023, adapted from Sanderson, 2020)

There are a few base patterns[198] that every resource should have for it to be a useful part of the world of Linked Data (Raemy & Sanderson, 2023, p. 11). The core properties, types and classifications are listed hereafter.

The JSON-LD serialisation displayed in Code Snippet 3.14 represents a practical application of the properties previously discussed, exemplifying the modelling of Rembrandt’s The Night Watch[199]. This particular use case was not only a central focus during the Linked Art face-to-face meeting held in Amsterdam in October 2023 (Sanderson, 2023) but also serves as an illustrative example in various core scenario modellings highlighted here. This extract and the ones that follow offer a glimpse of how Linked Art effectively captures artwork metadata and presents it in a structured format.

Code Snippet 3.14: The Night Watch: Representation in the Linked Art Data Model Including URI; Class; Label; and Classification

{
  "@context": "https://linked.art/ns/v1/linked-art.json",
  "id": "https://example.org/object/42",
  "type": "HumanMadeObject",
  "_label": "The Night Watch",
  "classified_as": [
    {
      "id": "http://vocab.getty.edu/aat/300033618",
      "type": "Type",
      "_label": "Painting"
    },
    {
      "id": "http://vocab.getty.edu/aat/300133025",
      "type": "Type",
      "_label": "Work of Art"
    }
  ]
}

Linked Art categorises controlled vocabulary terms[200] into three distinct classes to facilitate validation and interoperability. This classification is key to ensuring that data adheres to consistent rules across different implementations.

Code Snippet 3.15 shows how names, identifiers and statements are represented in Linked Art. Names, distinct from the _label property which serves as internal documentation, are essential for user-facing resources. Every entity that should be visible to an end-user, be it an object, person, group, or event, is recommended to have at least one specific name. This is achieved using the identified_by property with a Name resource, where the actual name is specified in the content property. Identifiers are handled similarly but employ the Identifier class. They are often classified to distinguish between various types, like internal system numbers and accession numbers. This classification helps in differentiating and understanding the origins and nature of each identifier. Statements come into play where data does not support the specificity that the full ontology allows or when information is best conveyed in human-readable form. This could include descriptions like medium or materials, rights or usage statements, dimensions, or edition statements. In scenarios where there is descriptive text about the resource, regardless of the type of description, the referred_to_by property is used to record this textual information.

Code Snippet 3.15: The Night Watch: Representation in the Linked Art Data Model Including Names; Identifiers; and Statements

{
  "@context": "https://linked.art/ns/v1/linked-art.json",
  "id": "https://example.org/object/42",
  "type": "HumanMadeObject",
  "_label": "The Night Watch",
  (...),
  "identified_by": [
    {
      "type": "Name",
      "_label": "The Night Watch (English)",
      "content": "The Night Watch",
      "classified_as": [
        {
          "id": "http://vocab.getty.edu/aat/300404670",
          "type": "Type",
          "_label": "Primary Name"
        }
      ],
      "language": [
        {
          "id": "http://vocab.getty.edu/aat/300388277",
          "type": "Language",
          "_label": "English"
        }
      ]
    },
    {
      "type": "Identifier",
      "_label": "Night Watch Object Identifier",
      "content": "SK-C-5",
      "classified_as": [
        {
          "id": "http://vocab.getty.edu/aat/300404621",
          "type": "Type",
          "_label": "Repository number"
        }
      ]
    }
  ],
  "referred_to_by": [
    {
      "type": "LinguisticObject",
      "classified_as": [
        {
          "id": "http://vocab.getty.edu/aat/300435429",
          "type": "Type",
          "_label": "Material Statement",
          "classified_as": [
            {
              "id": "http://vocab.getty.edu/aat/300418049",
              "type": "Type",
              "_label": "Brief Text"
            }
          ]
        }
      ],
      "content": "Oil on Canvas",
      "language": [
        {
          "id": "http://vocab.getty.edu/aat/300388277",
          "type": "Language",
          "_label": "English"
        }
      ]
    }
  ]
}
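From a consumer's point of view, the patterns for names, identifiers and statements can be read back with a few small helpers. The sketch below is an assumption-laden illustration: the function names are hypothetical, the record is an abridged version of the snippet above, and the preference for the AAT ‘Primary Name’ classification simply follows the pattern shown there.

```python
AAT = "http://vocab.getty.edu/aat/"
PRIMARY_NAME = AAT + "300404670"  # classification used for the primary name above


def classification_ids(entity):
    """Collect the URIs of all classifications attached to an entity."""
    return {c.get("id") for c in entity.get("classified_as", [])}


def primary_name(record):
    """Return the content of the primary Name, or of the first Name, or None."""
    names = [e for e in record.get("identified_by", []) if e.get("type") == "Name"]
    for name in names:
        if PRIMARY_NAME in classification_ids(name):
            return name.get("content")
    # _label is internal documentation, so it is deliberately not used as a fallback.
    return names[0].get("content") if names else None


def identifiers(record):
    """Return the content of every Identifier attached to the record."""
    return [e["content"] for e in record.get("identified_by", [])
            if e.get("type") == "Identifier" and "content" in e]


def statements(record):
    """Map each statement's classification label to its human-readable content."""
    result = {}
    for statement in record.get("referred_to_by", []):
        for c in statement.get("classified_as", []):
            result[c.get("_label", c.get("id"))] = statement.get("content")
    return result


# Abridged version of the record shown above.
record = {
    "id": "https://example.org/object/42",
    "type": "HumanMadeObject",
    "_label": "The Night Watch",
    "identified_by": [
        {"type": "Name", "content": "The Night Watch",
         "classified_as": [{"id": PRIMARY_NAME, "type": "Type", "_label": "Primary Name"}]},
        {"type": "Identifier", "content": "SK-C-5",
         "classified_as": [{"id": AAT + "300404621", "type": "Type",
                            "_label": "Repository number"}]},
    ],
    "referred_to_by": [
        {"type": "LinguisticObject", "content": "Oil on Canvas",
         "classified_as": [{"id": AAT + "300435429", "type": "Type",
                            "_label": "Material Statement"}]},
    ],
}

print(primary_name(record), identifiers(record), statements(record))
```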

The use of an intermediate Activity entity, rather than direct links, allows for richer descriptions and associations. This approach enables linking multiple actors, places, techniques, and timespans to a single activity, enhancing the depth of cultural heritage data representation. The relevant properties are the following:

Furthermore, Linked Art incorporates a minimal TimeSpan model for activities, using the properties begin_of_the_begin and end_of_the_end, which record the earliest possible beginning and the latest possible end of the TimeSpan, respectively. Activities, here encapsulated within the produced_by property, and timespans are both exemplified in Code Snippet 3.16.

Code Snippet 3.16: The Night Watch: Representation in the Linked Art Data Model Including Activities and Timespans

{
  "@context": "https://linked.art/ns/v1/linked-art.json",
  "id": "https://example.org/object/42",
  "type": "HumanMadeObject",
  "_label": "The Night Watch",
  (...),
  "produced_by": {
    "type": "Production",
    "carried_out_by": [
      {
        "id": "http://vocab.getty.edu/ulan/500011051",
        "type": "Person",
        "_label": "Rembrandt, Harmensz van Rijn"
      }
    ],
    "took_place_at": [
      {
        "id": "http://vocab.getty.edu/tgn/7006952",
        "type": "Place",
        "_label": "Amsterdam"
      }
    ],
    "technique": [
      {
        "id": "http://vocab.getty.edu/aat/300053343",
        "type": "Type",
        "_label": "Painting"
      }
    ],
    "timespan": {
      "type": "TimeSpan",
      "begin_of_the_begin": "1642-01-01",
      "end_of_the_end": "1642-12-31",
      "identified_by": [
        {
          "type": "Name",
          "content": "1642",
          "classified_as": [
            {
              "id": "http://vocab.getty.edu/aat/300404669",
              "type": "Type",
              "_label": "Display Label"
            }
          ]
        }
      ]
    }
  }
}
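The TimeSpan pattern lends itself to a small helper that derives the two boundaries from an imprecise date. The following sketch assumes date-only strings, as in the snippet above (an implementation may prefer full timestamps), and the function name `timespan` is hypothetical.

```python
import calendar
from datetime import date

DISPLAY_LABEL = "http://vocab.getty.edu/aat/300404669"  # as used in the snippet above


def timespan(year, month=None, display=None):
    """Build a TimeSpan whose bounds enclose a whole year or a whole month."""
    if month is None:
        begin, end = date(year, 1, 1), date(year, 12, 31)
    else:
        last_day = calendar.monthrange(year, month)[1]
        begin, end = date(year, month, 1), date(year, month, last_day)
    span = {
        "type": "TimeSpan",
        "begin_of_the_begin": begin.isoformat(),  # earliest possible start
        "end_of_the_end": end.isoformat(),        # latest possible end
    }
    if display is not None:
        span["identified_by"] = [{
            "type": "Name",
            "content": display,
            "classified_as": [
                {"id": DISPLAY_LABEL, "type": "Type", "_label": "Display Label"}
            ],
        }]
    return span


production = {"type": "Production", "timespan": timespan(1642, display="1642")}
print(production["timespan"]["begin_of_the_begin"],
      production["timespan"]["end_of_the_end"])
```

Keeping the display name alongside the machine-readable bounds preserves the way the date was originally expressed, while still allowing date-based queries.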

In essence, Linked Art’s basic patterns revolve around a few principles: URIs for web identification of entities and records, a small yet impactful set of classes (ontology) coupled with an extensive array of classifications (vocabulary). Core elements such as names, identifiers, statements, and classifications are universally applicable to every entity. The concepts of activities and partitioning enable connections and detailed specificity between entities.

Within the data model, further key patterns have been recognised and established. These include object descriptions, people and organisations, places, digital integration (such as describing IIIF-compliant resources and pointing to their JSON-LD API context – as shown in Figure 3.33), provenance of objects, collections and sets, exhibitions of objects, primary sources of information, assertion level metadata, and dataset level metadata (Raemy & Sanderson, 2023, p. 12). This classification provides a robust framework to distinguish between the physicality of objects, the abstract nature of works, and the human or collective agents involved.

Figure 3.33: Linked Art Digital Integration: Possible Description of Available IIIF resources

Particularly, objects are differentiated as HumanMadeObject[201], physical entities with tangible presence, and DigitalObject, representing digital files. Works include PropositionalObject (abstract concepts not tied to text or visuals), LinguisticObject (textual works like ‘The Lord of the Rings’), and VisualItem (visual works, such as ‘The Night Watch’). Actors are classified as Person (individuals with intentional action) or Group (collectives).

Linked Art extends the notion of actors to non-humans who are responsible, albeit temporarily, for activities under the category of persons. Place entities are defined as fixed geographical locations with identifiable coordinates. For concepts, Type represents broad categories or classifications, supplemented by specific concepts such as Language, Material, Currency and MeasurementUnit. The Set class is defined as an unordered group that is unique to Linked Art, as opposed to E78 Curated Holding in CIDOC-CRM, which emphasises physical and purposefully preserved sets.

Furthermore, activities in Linked Art, especially Provenance and Exhibition, offer explicit connections between entities, setting Linked Art apart from other data models. Here, objects are distinct from works and provide a context, while people, groups, and places are entities in their own right (Sanderson, 2023).

Each of these patterns has an essential function in the structured representation and organisation of a wide range of data. This includes data about artworks, the artists who created them, the places associated with those artworks, digital representations, historical background, as well as information about collections, exhibitions and the metadata that supports this rich tapestry of CH (Raemy & Sanderson, 2023, p. 12).

3.5.5.3 Linked Art API and Standards for Linked Art

In Linked Art, a clear distinction exists between the model and the API. The model is inherently flexible, allowing for a wide range of representations, while the API is tailored for software developers, providing a more defined and structured interaction mechanism. In this section I will look at the characteristics of the Linked Art API and other specifications relevant to Linked Art, such as HAL or AS.

Linked Art, drawing inspiration from IIIF, espouses a set of design principles[202] aimed at fostering usability and interoperability (Sanderson, 2023). These principles emphasise the importance of defining scope through shared use cases and designing for internal utility, while adhering to the axiom of simplicity. They advocate for making easy tasks straightforward and complex tasks achievable, without over-reliance on specific technologies. The guidelines are grounded in RESTful architecture and leverage web caching for an efficient use of network resources. As such, the Linked Art API revolves around the use of URIs as both identifiers and locators. It is also important not to infer information about the publisher based on the URI structure and to prefer HTTPS for security. Additionally, the principles prioritise designing for JSON-LD, employing LOD methodologies, and adhering to established standards and best practices wherever feasible. A key focus is on defining criteria for success to facilitate extensibility, thereby ensuring a robust and adaptable framework for Linked Art’s implementation.

Further delving into the specifics of the Linked Art API in terms of requirements, four key areas are listed: trivial to implement, consistency across representations, division of information, and URI requirements (Raemy & Sanderson, 2023, p. 13).

Linked Art advocates for trivial implementation. The framework is designed such that even hand-crafted files on disk are feasible for deployment, though automation may be preferred for managing larger data volumes. A key aspect is the consistency across representations, ensuring that each relationship is contained within a single document. This approach aids in maintaining clarity and coherence in data structuring. Moreover, Linked Art emphasises the division of information across different representations, adopting a structured approach that scales from many to few. For example, in the context of a book, the structure would flow from the page level to the book, and eventually to the collection, delineating a clear hierarchy of information. Additionally, the identity and URI requirements in Linked Art are strategically designed. One-to-one relationships are embedded directly and do not necessitate separate URIs. This simplifies the data model and enhances its accessibility. Furthermore, the URIs for records are devoid of any internal structure, adhering to a principle of simplicity and straightforward usability. This approach ensures that the model remains user-friendly and easily navigable, making it an efficient tool for representing and managing CH data.
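
To make the identity and URI requirements more tangible, the following is a minimal sketch in Python of what such a record could look like. It is an illustration under stated assumptions rather than an official example: every example.org URI, the labels, and the helper `referenced_uris` are hypothetical, while the context URL and the keys (`identified_by`, `produced_by`, `carried_out_by`) follow my reading of the Linked Art documentation and should be checked against the current release.

```python
import json

# Hypothetical record for a painting; all example.org URIs are placeholders.
record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/object/1",  # opaque URI, nothing to infer from it
    "type": "HumanMadeObject",
    "_label": "Example Painting",
    # One-to-one relationship: the name is embedded and carries no URI.
    "identified_by": [{"type": "Name", "content": "Example Painting"}],
    "produced_by": {
        "type": "Production",
        # Reference to a separate record: a full URI, no repeated definition.
        "carried_out_by": [
            {
                "id": "https://example.org/person/7",
                "type": "Person",
                "_label": "Example Artist",
            }
        ],
    },
}


def referenced_uris(node, top=True):
    """Collect the URIs of other records referenced from within a record."""
    found = []
    if isinstance(node, dict):
        if "id" in node and not top:
            found.append(node["id"])
        for value in node.values():
            found.extend(referenced_uris(value, top=False))
    elif isinstance(node, list):
        for item in node:
            found.extend(referenced_uris(item, top=False))
    return found


print(json.dumps(referenced_uris(record)))
```

The embedded name has no identifier of its own, whereas the person is only referenced by a full URI, so a client would retrieve that record separately instead of finding its definition duplicated here.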

The division of the graph in Linked Art ensures no duplication of definitions across records. Full URIs are used for references, simplifying client processing. Embedded structures, even if potentially identifiable, do not carry URIs to avoid unnecessary complexity. At the moment, there are eleven endpoints[203] in the Linked Art API, loosely based on the conceptual model presented previously:

  1. Concepts: Types, Materials, Languages, and others, as full records rather than external references
  2. Digital Objects: Images, services and other digital objects
  3. Events: Events and other non-specific activities that are related but not part of other entities
  4. Groups: Groups and Organisations
  5. People: Individuals
  6. Physical Objects: Physical things, including artworks, buildings or other architecture, books, parts of objects, and more
  7. Places: Geographic places
  8. Provenance Activities: The various events that take place during the history of a physical thing
  9. Sets: Sets, including Collections and sets of objects used for exhibitions
  10. Textual Works: Texts worthy of description as distinct entities, such as the content carried by a book or journal article
  11. Visual Works: Image content worthy of description as distinct entities, such as the image shown by a painting or drawing

Each of these endpoints is accompanied by detailed documentation outlining the required and permitted patterns, complemented by a corresponding JSON schema[204].
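
As a rough illustration of how a record's class could determine the endpoint it belongs to, here is a small Python sketch. The endpoint names come from the list above; the class names on the left-hand side and the function `endpoint_for` are my own assumptions drawn from the Linked Art model, and the mapping is deliberately incomplete.

```python
# Endpoint names follow the list above. The class names are assumptions drawn
# from the Linked Art model and are not exhaustive. Provenance Activities are
# left out because their class overlaps with the one used for other activities,
# so the class alone would not be enough to route them.
ENDPOINT_FOR_TYPE = {
    "Type": "Concepts",
    "Material": "Concepts",
    "Language": "Concepts",
    "DigitalObject": "Digital Objects",
    "Event": "Events",
    "Group": "Groups",
    "Person": "People",
    "HumanMadeObject": "Physical Objects",
    "Place": "Places",
    "Set": "Sets",
    "LinguisticObject": "Textual Works",
    "VisualItem": "Visual Works",
}


def endpoint_for(record):
    """Return the endpoint name a record would belong to, or None if unknown."""
    return ENDPOINT_FOR_TYPE.get(record.get("type"))


print(endpoint_for({"type": "Person", "_label": "Example Artist"}))
```

A real implementation would rely on the JSON schema published for each endpoint rather than on a lookup table of this kind.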

Linked Art has investigated the use of both HAL, an IETF Internet-Draft, and AS for managing back-links and harvesting content. HAL, a media type for representing resources and their relations with hyperlinks (see Kelly, 2023), is simple and reliable, with existing tooling including validation. HAL links should play a fundamental role in addressing redundancy for Linked Art by mitigating bloat across API responses. They are placed at the top level of JSON-LD documents, not treated as properties but as part of the API (Raemy & Sanderson, 2023, p. 17)[205]. AS, already implemented by the IIIF Change Discovery API, provides a common paging model across various standards, allowing for flexible data aggregation.
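
The two mechanisms can be sketched together in Python. This is a simplified illustration, not the Linked Art or IIIF reference behaviour: the relation name `ex:objectsProduced`, all example.org URIs, the in-memory `PAGES` store and the `harvest` function are hypothetical. Only the `_links`/`href` structure (HAL) and the `OrderedCollectionPage`, `orderedItems` and `next` terms (AS) are taken from the respective specifications.

```python
# Top-level HAL links on a hypothetical person record. The relation name
# "ex:objectsProduced" and all example.org URIs are invented for illustration.
person = {
    "id": "https://example.org/person/7",
    "type": "Person",
    "_label": "Example Artist",
    "_links": {
        "self": {"href": "https://example.org/person/7"},
        "ex:objectsProduced": {"href": "https://example.org/search/produced-by/7"},
    },
}

# In-memory stand-in for an Activity Streams collection split into pages.
PAGES = {
    "https://example.org/activity/page-1": {
        "type": "OrderedCollectionPage",
        "orderedItems": [
            {"type": "Create", "object": {"id": "https://example.org/object/1"}}
        ],
        "next": {"id": "https://example.org/activity/page-2"},
    },
    "https://example.org/activity/page-2": {
        "type": "OrderedCollectionPage",
        "orderedItems": [
            {"type": "Update", "object": {"id": "https://example.org/object/1"}}
        ],
    },
}


def harvest(first_page_uri, fetch=PAGES.get):
    """Follow `next` links and yield (activity type, object URI) pairs."""
    uri = first_page_uri
    while uri:
        page = fetch(uri)
        for activity in page["orderedItems"]:
            yield activity["type"], activity["object"]["id"]
        uri = page.get("next", {}).get("id")


for activity_type, object_uri in harvest("https://example.org/activity/page-1"):
    print(activity_type, object_uri)
```

The back-link lives outside the descriptive properties of the record, so the record itself stays small, and the same paging loop would work for any collection that follows the AS paging model. A harvester built on the IIIF Change Discovery API would, as far as I understand the specification, usually start from the last page and follow `prev` links instead.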

Overall, the Linked Art APIs prioritise JSON for syntax, appealing to the software developer audience. By implementing Linked Art records in JSON-LD and using HAL links for back-links, along with HAL for search and aggregation, the specifications offer consistency, usability, and ease of implementation without the need for specialised technologies, while retaining semantic richness. Essentially, implementing those specifications means adopting the model itself.

3.5.5.4 Community and Implementations

The Linked Art community[206] boasts a diverse and international membership primarily from museums and universities in North America and Europe. Key institutions such as the National Gallery of Art (USA), the J. Paul Getty Trust, the Museum of Modern Art, the Frick, the Rijksmuseum, and the Victoria and Albert Museum, alongside universities and research centres like Oxford, Yale, FORTH, DHLab, and ETH Zurich, are among the early and active participating institutions in the community. Other organisations such as Europeana and the American Numismatic Society have also been involved.

Linked Art has seen significant project-based support and investigation, notably from the University of Oxford e-Research Centre[207]. This includes initiatives like the [208] and [209], both supported by AHRC funding. These projects exemplify the collaborative and research-focused ethos at the heart of the Linked Art initiative. In the area of collaborative projects, Linked Art lists partners such as Pre-Raphaelites Online[210] and Linked Conservation Data[211] on its website. These partnerships highlight the diverse applications of Linked Art across different domains, from art history to conservation practices, underlining its adaptability and relevance to a wide range of CH endeavours.

From a library and archives perspective, the LD4 Art & Design Affinity Group in the USA[212], extending beyond museums to encompass libraries and archives, has listed Linked Art as a relevant initiative emphasising its cross-disciplinary appeal.

Several institutions have already implemented Linked Art, each tailoring it to their specific needs, also because the release of Version 1.0 was delayed. Yale’s LUX Collection Discovery platform stands out as a flagship implementation (Raemy, 2023, p. 26). Additionally, digital platforms like The Art and Life of Georgia O’Keeffe, the Getty Museum Collection, the Getty Vocabularies, Van Gogh Worldwide, and the Rijksmuseum have either implemented or are planning to implement Linked Art.

However, the adoption of Linked Data in European museums and beyond remains limited, with few institutions displaying digital objects online. For instance, France and Forsberg (2021) argued that structuring data internally using the Linked Art model might be premature due to its ongoing development. This highlights the importance of establishing a stable implementation of Linked Art to facilitate broader adoption and unlock its full potential in the CH sector. Moreover, while Klic (2019, p. 23) mentions that Linked Art utilises the CIDOC-CRM ontology to its full expressiveness, there appears to be a lack of clarity or common understanding among readers regarding the precise nature and characteristics of Linked Art, which is in fact a streamlined profile.

One potential driver for wider implementation is the integration of Linked Art with existing standards, akin to EODEM[213], a framework that enables museum databases to export and import object data more easily. Establishing standardised mechanisms for data exchange could enhance the accessibility and applicability of Linked Art for more institutions. Additionally, the SARI Reference Models[214] and platforms like Arches[215], developed by the Getty Conservation Institute[216] and World Monuments Fund[217], present promising opportunities for implementing Linked Art in various software and partnership settings.

Overall, while Linked Art does not aim for perfect semantic interoperability, it strives for practical generalisation, positioning itself as a potential benchmark for accessing and aggregating digital collections. Its development and implementation across a spectrum of CHIs underscore its role as a transformative tool in the field. As the community continues to grow and evolve, Linked Art holds the promise of shaping the future of digital CH, fostering greater connectivity and understanding across diverse collections and practices.

3.5.6 Implications and Opportunities for the Cultural Heritage Domain and the Humanities

In envisioning the ideal scenario for the CH domain and the humanities, the integration of LOUD technologies presents unprecedented opportunities. While IIIF has laid a foundational path, the full spectrum of LOUD specifications potentially catalyses a more profound transformation in how we interact with and understand CH data. This transformation prompts us to ponder two critical questions:

The implications of LOUD extend beyond technological innovation; they invite a reexamination of our methodologies, perspectives, and frameworks. KR is inherently subjective, influenced by the context and intent behind data collection and modelling. While it is crucial to understand the epistemic underpinnings of knowledge, the LOUD principles advocate a balanced approach with usability at the forefront.

Does LOUD, and its precursor LOD, represent a paradigm shift in the CH domain and for DH at large, akin to the transformative changes described by Thomas Kuhn in his seminal work on the structure of scientific revolutions (Kuhn, 1994)? This ecosystem of models suggests a substantial shift not only in technological capabilities but also in epistemological frameworks. LOUD serves as a unifying common denominator, simplifying the complex landscape of digital CH to hopefully make it more accessible for exploration and interconnection.

3.6 Characterising Community Practices and Semantic Interoperability

In this section, I will define the two main axes or perspectives of my thesis, chiefly in 3.6.1 and in 3.6.2. These two concepts, while distinct, are deeply interwoven, each playing a critical role in the advancement and application of CH.

Community practices in CH are foundational to the collaborative and collective efforts that drive the field, ranging across various stakeholders, from DH practitioners, librarians, archivists, and curators to CHIs. This focus on community practices leads us to consider how these collaborative networks communicate and operate effectively on a technical level, which brings us to the concept of semantic interoperability.

Semantic interoperability is crucial in elevating community practices to a higher intellectual plane. It represents the technical backbone that enables these diverse communities to not only share data but also to understand and interpret data consistently and accurately across different systems. This seamless integration and interpretation of semantic data are what allow community practices to transcend beyond simple data exchange to more meaningful and impactful collaborations. By bridging these communal aspects of collaboration with the technical necessities of data exchange, semantic interoperability ensures that the collective efforts in CH are not only shared but also effectively leveraged and understood.

Thus, the transition from community practices to the intricacies of semantic interoperability reflects a natural progression from collaborative engagement to the technical enablers that make such sophisticated interactions possible.

3.6.1 Community Practices

This subsection explores the concepts of community, practices, and community practices, as well as the distinctions and overlaps between community practices and CoP. To fully grasp the nuances of community practices, it is essential first to understand what constitutes a community and what is meant by practices. Following this, I will examine how these elements synergise within community practices. Lastly, this exploration will contrast these practices with CoP, highlighting both their characteristics and areas of intersection.

A community is a group of people who share common interests, values, goals or characteristics and come together to form a social unit. Communities can be based on geographical location, shared hobbies, cultural background or any other common factor. In an effort to capture the multifaceted nature of communities, I have selected three definitions from the Oxford English Dictionary (2023), each highlighting a different aspect of what constitutes a community. These definitions reflect the diversity and complexity inherent in the concept of a community.

Community is:

The first definition emphasises the traditional, geographical-based community, often bound by a shared cultural or ethnic identity. This perspective, rooted in the physical coexistence of individuals within a specific locale, aligns with the observations by Jewkes and Murcott (1996) that community can be variably interpreted, often constructed differently by its members and external observers. They argue that these varied interpretations, although they may not be consistent, are nonetheless meaningful, as each interpreter gives the concept of community a specific, albeit subjective, significance.

The second definition broadens the scope to include groups formed around shared interests, pursuits, or occupations. This acknowledges that communities are not solely confined to geographic boundaries but can be formed around any shared interest or activity, regardless of physical proximity. McMillan and Chavis (1986) expand on this by emphasising the psychological aspects of community, such as a sense of belonging, influence, and shared emotional connection. They highlight the critical role of boundaries, personal investment, and shared symbols in fostering a strong sense of community, illustrating how these elements contribute to the formation and sustenance of community ties.

Lastly, the third definition brings in the contemporary dimension of online communities. Here, Hammond (2017) points out that online communities are distinguished by their reliance on technology for interaction, yet share fundamental aspects with traditional communities, such as a sense of connection, trust, and reciprocal interaction. Hammond underscores that online communities vary in strength and depth, reflecting the evolving nature of community in the digital age. This perspective highlights the complexity of communities in today’s world, where interactions transcend physical boundaries and are increasingly mediated by digital platforms.

Looking at practices, it becomes apparent that they encompass a wide range of activities, including routines, habits, skills, methods and regular activities undertaken by individuals. These elements are deeply influenced by a potent combination of cultural, social and personal factors that play a key role in shaping both individual behaviour and societal norms. The Oxford English Dictionary (2023) provides insightful definitions of practice that illuminate its multifaceted nature.

Practice is:

Building upon the earlier discussion of practices, Foucault’s (1982, pp. 780–781) theory offers a critical lens through which to further examine their societal impact. Foucault’s perspective suggests that practices are not simply personal or cultural habits but are deeply embedded within, and influenced by, broader societal power structures and knowledge systems (Haugaard, 2022, p. 345). In this light, the definitions provided previously by the Oxford English Dictionary (2023) acquire additional layers of meaning. The first definition, which views practice as the practical application of ideas, beliefs, or methods, resonates with Foucault’s idea that practices are where theoretical concepts materialise into actions that both reflect and reinforce existing power dynamics and knowledge paradigms. These practices are not merely acts of execution or operation; they are instrumental in the embodiment of power relations within everyday activities. The second definition, highlighting the habitual nature of practices, aligns with Foucault’s view on the subtlety of power. Here, routine behaviours and customary actions are seen as vehicles through which power is exerted and normalised, subtly shaping individual behaviours and societal norms. This habitual aspect of practices contributes to the ongoing process of subject formation, subtly guiding the normalisation of certain discourses and power structures. Thus, from a Foucauldian perspective, practices bridge the theoretical and the habitual, playing a pivotal role in the dynamics of power and knowledge within society (see Demenchonok, 2018).

As for community practices, they encompass a diverse array of shared activities, behaviours, and rituals that play a key role in strengthening the bonds among community members and defining their collective identity. These practices serve as the fabric that weaves individuals together within a particular community, shaping its character and reinforcing its sense of belonging. Moreover, community practices are not static but rather dynamic, evolving over time to adapt to changing needs, values, and goals of the community. Communities formed around shared interests, pursuits, or occupations, such as hobbyist groups, professional associations, or activist collectives, engage in practices that are geared toward collaborative efforts and the advancement of common goals.

In line with Weil (1996), building community practices involves reinforcing the connections among individuals, groups, organisations, and communities in both geographic and functional terms. The author emphasises the need for activities and policies that foster positive relationships and support problem-solving resources within communities. This is particularly relevant in addressing the challenges posed by socio-economic disparities, racial tensions, and the disappearance of jobs for low-skilled workers. Weil’s approach to community building advocates for proactive social work at grassroots and inter-organisational levels, focusing on empowerment and social development. These practices are aimed at enabling communities to not only cope with their problems but also to create sustainable solutions that align with the values of social justice and participatory democracy. Especially pertinent is the emphasis on economic development strategies that can provide communities, particularly those on the economic margins, with greater security and sustainability. This aligns with the broader social work goal of fostering inclusive, equitable, and non-discriminatory communities.

CoP represent a distinct form of community characterised by their members sharing a common professional or intellectual interest. In such communities, individuals collaborate with the goal of advancing their expertise within the specific domain they are passionate about. Key features of CoP include a shared body of knowledge, a cohesive community of people who actively engage in interaction and collective learning, and the presence of shared practices that organically evolve over time, as described by Wenger (2011). Notably, the entire DH community has been likened to a CoP (see Siemens, 2016), and specific initiatives, such as LINCS[218], have recognised themselves as such (Brown et al., 2021, p. 1), echoing Wenger’s (2011) description of people who ‘share a concern or passion for something they do and learn how to do it better as they interact regularly’.

The evolution of CoP, as critiqued by Li et al. (2009), reveals a journey from a focus on situated learning and the novice-expert dynamic to broader applications in organisational development and competitiveness. Initially, CoP centred on the development of professional identity through interactions in social environments. This focus shifted to personal growth, trajectories of participation (peripheral versus core), and eventually, to CoP as tools for enhancing organisational effectiveness. The adaptability and evolving interpretation of CoP, however, present challenges in defining and maximising their effectiveness. This evolution suggests a need for strategies that balance personal growth with organisational goals and optimise the CoP characteristics, such as fostering interactions, knowledge sharing, and a sense of belonging. Interventions that enhance relationship building and promote knowledge exchange are essential for optimising the function of CoP in various settings.

According to Leigh Star et al. (2003), the convergence of information artefacts and communities of practice, especially in library science, is a dynamic and reciprocal process. Information artefacts not only shape but are also shaped by the practices and needs of these communities. The interaction between CoP and information systems, as discussed, is crucial for the development of effective and transparent information systems. This is similar to the collaborative practices seen in groups like IIIF, where shared standards and open communication among diverse institutions enhance access and interaction with digital collections. For library science, understanding and supporting CoP and these collaborative practices is essential for adapting to technological advancements and evolving user needs. The socio-technical perspective offered highlights the importance of considering both social interactions and technical aspects in designing and managing information systems within libraries.

To provide a comprehensive understanding, let’s examine the essence of the concepts of community practices and CoP:

Within this context, it is essential to assess the mechanisms by which actors become entangled in consensus-making processes. Community practices often serve as a foundational framework for consensus-building, providing the common ground upon which diverse stakeholders converge. These practices not only define a community’s identity but also shape the way decisions are made, knowledge is shared, and values are upheld.

The next subsection explores the second axis of this thesis, which is intricately intertwined with community practices. LOUD specifications, especially IIIF and Linked Art standards, represent community-driven endeavours that hold the key to achieving semantic interoperability. These standards provide a shared foundation upon which organisations, individuals, and apparatuses can build, facilitating more effective communication, and reinforcing the consensus-making processes that have been discussed.

3.6.2 Semantic Interoperability

This subsection serves as a baseline exploration of the concepts of semantics, interoperability, and the powerful synergy they form in the space of information exchange known as semantic interoperability.

To begin, semantics can range from simple, general definitions of data elements to highly specific and complex definitions that fully describe the context, relationships, and constraints of the data within a particular domain or system. According to the Oxford English Dictionary (2023), semantics in linguistics is ‘the meaning of signs; the interpretation or description of such meaning; (chiefly Semiotics) the study of the meaning of signs, and of the relationship of sign vehicles to referents’, as well as the branch of linguistics or philosophy concerned with meaning in language. In computing, by contrast, it relates ‘to the meaning of the strings in a programming language’. It is essential to recognise that semantics provides the basis for shared understanding, both in language and technology.

In terms of a more formal definition, the SDI can serve as a basis for what semantic information is. Floridi (2005, p. 367) describes semantic information as ‘well-formed, meaningful, and truthful data’. This definition can be formally represented using FOL as:

∀x [W(x) ∧ M(x) ∧ T(x)] → S(x)

Equation 3.3: Semantic Information

Here is a breakdown of Equation 3.3 and the three criteria that data (x) must satisfy to be considered as representing semantic information S(x):

  1. x must be well-formed, denoted as W(x).
  2. x must be meaningful, denoted as M(x).
  3. x must be truthful, denoted as T(x).

In simpler terms, ‘for all data x, if x is well-formed, meaningful, and truthful, then x represents semantic information’.
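
For readers who prefer code to notation, the three criteria can be written as a small Python predicate. This is only a sketch: the `Datum` class and its boolean fields are hypothetical stand-ins for judgements that, in practice, require validation and interpretation, and the function treats the conjunction as a definition of S(x), which is a slightly stronger reading than the implication in Equation 3.3.

```python
from dataclasses import dataclass


@dataclass
class Datum:
    """Hypothetical container recording the three SDI judgements for some x."""

    well_formed: bool  # W(x)
    meaningful: bool   # M(x)
    truthful: bool     # T(x)


def is_semantic_information(x: Datum) -> bool:
    """S(x), read here as: W(x) and M(x) and T(x)."""
    return x.well_formed and x.meaningful and x.truthful


print(is_semantic_information(Datum(True, True, True)))
print(is_semantic_information(Datum(True, True, False)))
```

A datum that is well-formed and meaningful but not truthful fails the test, which mirrors the requirement that all three criteria hold at once.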

Interoperability, for its part, extends beyond basic compatibility, ranging from systems that can exchange data without errors (Fafalios et al., 2023) to more sophisticated levels where systems can effectively use the data being exchanged to support a wide range of tasks and operations.

Drawing on the perspective of the W3C (see Etemad & Rivoal, 2023, sec. 6.3.2), Raemy and Sanderson (2023, p. 22) have taken a stance in positioning LOUD through the prism of interoperability: ‘Interoperability is a state in which two or more tested, independently developed technological systems can interact successfully according to their scope through the implementation of agreed-upon standards’.

To delve deeper into the nuances of interoperability, it is important to understand that it can take various forms. Some examples and descriptions, drawing insights from various sources, are explored here:

Turning to the practical application of interoperability in STEM, HIMSS[219], an American non-profit organisation founded in 1961 and dedicated to transforming the global health ecosystem through information and technology, outlines four levels of interoperability (HIMSS, 2020):

These levels illustrate the multifaceted nature of interoperability and its role in enhancing data exchange in common models.

Moreover, open standards play a pivotal role in fostering interoperability, aligning with the enrichments policy of Europeana (Manguinhas et al., 2023, p. 7). Yet, the openness of a standard alone may not guarantee widespread adoption. As highlighted by Nelson and Van de Sompel (2020, 2022), a core ingredient of a successful interoperability specification is a substantial presence, whether through commercial influence or active community engagement. In their words:

Because of [the growing global adoption of open standards] by GLAM institutions, especially [IIIF specifications stand] as a testimony that rich interoperability for distributed resource collections is effectively achievable. But other promising specifications that aim for the same holy grail are struggling for adoption, and, many times, lack of resources is mentioned as a reason. While that undoubtedly plays a role, it did not stand in the way of rapid adoption of protocols that have emerged from large corporations, such as the Google-dominated schema.org. This consideration re-emphasizes that a core ingredient of a successful interoperability specification, and hence of achieving an interoperable global information web, is a large megaphone, either in the guise of commercial power or active community engagement. (Nelson & Van de Sompel, 2022, pp. 8–9)

In the context of semantic interoperability, CIDOC-CRM defines it as ‘the capability of different information systems to communicate information consistent with the intended meaning’. More specifically, the intended meaning comprises the following elements (Bekiari et al., 2021, p. 18):

  1. the data structure elements involved;
  2. the terminology appearing as data;
  3. the identifiers used in the data for factual items such as places, people, objects, etc.

Furthermore, Sacramento et al. (2022, p. 124) provide a comprehensive definition of semantic or content interoperability, emphasising the simultaneous exchange of data and meaning. This is achieved through metadata and controlled, shared vocabularies.

Semantic (content) interoperability is the ability of computer systems to unambiguously exchange data using a shared meaning, independently of their hardware, software, or platforms. It is concerned not just with the packaging of data (syntax), but the simultaneous transmission of the meaning with the data (semantics). This is accomplished by adding data about the data (metadata) and linking each data element to a controlled, shared vocabulary (ontology).
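
The idea of linking each data element to a shared vocabulary can be illustrated with a short Python sketch. Everything here is hypothetical (the two records, their local keys, the example.org URI and the `expand` helper), apart from the two Dublin Core Terms URIs, which exist; the choice of these particular terms is purely illustrative.

```python
# Two hypothetical systems describe the same painting with different local keys.
record_a = {"titel": "Example Painting", "kuenstler": "https://example.org/person/7"}
record_b = {"name": "Example Painting", "maker": "https://example.org/person/7"}

# Each system publishes a mapping from its local keys to shared vocabulary terms.
context_a = {
    "titel": "http://purl.org/dc/terms/title",
    "kuenstler": "http://purl.org/dc/terms/creator",
}
context_b = {
    "name": "http://purl.org/dc/terms/title",
    "maker": "http://purl.org/dc/terms/creator",
}


def expand(record, context):
    """Rewrite local keys as shared vocabulary URIs."""
    return {context[key]: value for key, value in record.items()}


print(expand(record_a, context_a) == expand(record_b, context_b))
```

Once both records are expressed against the same vocabulary, they become directly comparable even though their original packaging differs, which is the distinction between syntax and semantics drawn in the definition above. JSON-LD contexts generalise this kind of mapping.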

As such, by following Floridi’s (2005, 2011) logic of SDI, semantic interoperability can be defined as the seamless exchange of well-formed, meaningful, and truthful data between distinct systems.

The principle that governs the exchange of data between entities needs to be defined. It ensures that when two entities, represented by y and z, exhibit semantic interoperability denoted as I(y) and I(z) respectively, and there exists a seamless exchange mechanism E between them, then certain conditions must hold true for the exchanged data x. In FOL, it can be formally expressed as follows:

∀y ∀z [I(y) ∧ I(z) ∧ E(y,z)]

Equation 3.4: Information Exchange

Equation 3.4 asserts that for all entities y and z, if they exhibit semantic interoperability and there is a seamless exchange mechanism between them, certain criteria regarding the exchanged data must be met. Specifically, it implies that for any data x being exchanged between y and z, the three conditions of SDI presented in Equation 3.3 must also be satisfied.

The complete implication can be expressed as:

∀y ∀z [I(y) ∧ I(z) ∧ E(y,z)] → [∀x (W(x) ∧ M(x) ∧ T(x)) → E(y,z) ∧ S(x)]

Equation 3.5: Semantic Interoperability

Simply put, Equation 3.5 can be summarised in this way: ‘For all y, for all z, if both y and z exhibit semantic interoperability I and there is a seamless exchange E between y and z, then for any data x being exchanged between them, x must be well-formed, meaningful, and truthful’.
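
The plain-language reading of Equation 3.5 can be mirrored in a few lines of Python. This sketch follows the paraphrase rather than the exact logical form of the formula, and the function name and the representation of each datum as a triple of booleans are assumptions made for illustration.

```python
def exchange_is_semantic(y_interoperable, z_interoperable, exchanged):
    """Check an exchange between two systems y and z.

    `exchanged` is an iterable of (well_formed, meaningful, truthful) triples,
    one per datum x passed between the two systems.
    """
    if not (y_interoperable and z_interoperable):
        return False
    return all(w and m and t for w, m, t in exchanged)


print(exchange_is_semantic(True, True, [(True, True, True), (True, True, True)]))
print(exchange_is_semantic(True, True, [(True, True, True), (True, False, True)]))
```

A single datum that fails one of the three SDI conditions is enough for the exchange as a whole to fall short of semantic interoperability in this reading.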

Moreover, building upon this definition of semantic interoperability, it is crucial to recognise that the exchange of well-formed or meaningful data hinges on specific constructs within information systems. Delving into this relationship, Scharnhorst et al. (2023, p. 1115) emphasise the close connection between the creation of controlled vocabularies and the paradigms emerging within research. They articulate that:

The creation of specific KOS is intrinsically connected to the innovativeness of new systems of thought in research and inherently non-interoperable. For service providers supporting humanities research and providing access to cultural heritage data the challenge lies in balance between accommodating specific communities and their needs and fostering data exchange across communities. Part of this balance is realized via an emerging network of connected services. This problem is actually shared by all knowledge domains, and not specific to the humanities. What is needed to further foster the implementation of semantic interoperability is to explicitly discuss the context in which it is introduced, including its main target (findability, accessibility, or re-use), its function for research and beyond, and levels of maturity and sustainability when it comes to new technological solutions.

This observation underscores the imperative nature of addressing these socio-technical requirements to enhance semantic interoperability. Achieving this capability enables different systems to communicate information consistently with the intended meaning, thereby promoting seamless and meaningful data exchange.

In conclusion, the concept of semantic interoperability, encompassing semantics and interoperability, forms a crucial axis of my research. The synergy between these elements and their profound influence on data exchange will be central to my exploration of LOUD through various lenses.

3.7 Summary and Preliminary Insights

This section provides a summary of what has been discussed in this literature review as well as some preliminary insights with regard to the LOUD ecosystem, chiefly the design principles, communities, standards, and the implementations. It follows the flow of the present chapter and is organised into five subsequent parts. Finally, in 3.7.7, I end with a few reflections on why we ought to care about CH data in the wider sense.

3.7.1 Cultural Heritage Data

CH data are unique and require different – or adjacent – methods of analysis from those used for quantitative scientific data. The diversity of CH data, including tangible, intangible, and natural heritage, presents a challenge in preserving and promoting these resources effectively. Representing CH data in digital form can also be challenging, as it may lead to a loss of context and complexity. To address these challenges, a comprehensive understanding of the various types of heritage resources, their meanings, and values is necessary, along with an effective preservation and valorisation strategy that considers their heterogeneity. For this, I find that we need a vision that takes into account the diversity of human and non-human entities, entangled in community-based socio-technical activities, whether local, interdisciplinary or global. The following provides a concise summary of the key aspects explored in the first section, addressing the distinctive nature of CH data.

3.7.2 Cultural Heritage Metadata

A variety of different metadata standards have emerged, tailored to different functions and purposes, with some dating back to the 1960s, particularly in library settings. However, the majority of other traditional CH metadata standards emerged between the 1990s and the 2000s. In terms of descriptive metadata standards and conceptual models, MARC, ISAD(G), CDWA, and CIDOC-CRM are being supplemented – or gradually substituted – by models and profiles like LRM, RiC, and Linked Art, which – to some extent – align with Linked Data principles, enabling richer relationships and enhancing access to vast and diverse collections.

These evolving metadata standards support a more interconnected, open, and accessible information landscape, which should hopefully enable researchers, practitioners, and the public to navigate and explore CH resources more effectively[220], especially when they are used in conjunction with controlled vocabularies, facilitating cross-domain reconciliation. As for aggregation of CH (meta)data, web-based alternatives to OAI-PMH, AS in particular, appear to be viable options.

3.7.3 Trends, Movements, and Principles

CH data is intrinsically intertwined with broader trends including Big Data and AI, serving as both a reflection and a product of these influences. Yet it is vital to recognise that CHIs and DH practitioners are – or should be – instrumental in shaping and curating such data. They not only shape and curate them, but also inform and drive the very trends that permeate across our data landscape. Their values, actions and methodologies represent critical drivers in the process, guiding the collection, preservation, and interpretation of CH data.

The scientific movements and guiding principles of Open Science/Open Scholarship, Citizen Science/Citizen Humanities, FAIR, CARE, and Collections as Data collectively permeate and shape CH data. While the first two represent dynamic movements actively involving scholarly practices or the public, the other three are more concerned with the system in which data is situated. LOUD design principles and standards, which are more content-centric, bolster the effectiveness of these principles and movements, fostering open and accessible CH data through their real-world applicability.

3.7.4 Linked Open Data

The Open Web Platform and Linked Data are foundational to the evolution of scholarly research and CH practices, enabling the creation of federated datasets and KGs. The section explores these concepts and their applications in the CH domain.

The web architecture, underpinned by principles, protocols, and identifiers like URI, URL, and URN, facilitates the exchange of data and functionality across applications and systems. It emphasises architectural principles such as orthogonality and protocol-based interoperability and explores various web architectures, including the client-server model and SOA.

The Semantic Web, a vision for a web understood by machines, relies on standards like RDF, RDFS, OWL, and SPARQL. The limitations of RDF in complex KR can be mitigated to some extent with RDF reification and quoted triples. SHACL is also introduced for data graph validation.
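To make the reification pattern concrete, the following is a minimal sketch in plain Python that is not tied to any RDF library. It assumes a hypothetical `https://example.org/` namespace and hypothetical `depicts` and `assertedBy` properties; only `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` come from the RDF vocabulary itself.

```python
# Plain-Python sketch of RDF reification; triples are simple 3-tuples of IRIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "https://example.org/"  # hypothetical namespace, for illustration only


def reify(triple, statement_id):
    """Return the four triples that describe `triple` as an rdf:Statement."""
    subject, predicate, obj = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


# A hypothetical claim: a photograph depicts a place.
claim = (EX + "photo/1", EX + "depicts", EX + "place/basel")

# Describe the claim as a statement node, then say who asserted it.
statement = EX + "statement/1"
graph = reify(claim, statement)
graph.append((statement, EX + "assertedBy", EX + "person/archivist"))

for s, p, o in graph:
    print(s, p, o)
```

The reified description talks about the claim without asserting it, so the original triple would still have to be added to the graph if it is meant to hold. Quoted triples aim at the same kind of statement-level annotation with a more compact syntax.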

Linked Data principles promote the publication and interlinking of data on the web, creating a web of data that is navigable and usable. Challenges in Linked Data implementation include GUIs, application architectures, schema mapping, link maintenance, licensing, trust, quality, relevance, and privacy; addressing them would enhance the web’s potential as an open, interconnected platform.

Deployment schemes like the Five-Star and Seven-Star models provide criteria for publishing open data. These models address the clarity, usability, and applicability of open data, emphasising schema documentation and data quality.
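As an illustration of how such a deployment scheme can be read as a cumulative checklist, the sketch below encodes the commonly cited Five-Star criteria (open licence, machine-readable structure, non-proprietary format, use of URIs, links to other data). The `Dataset` class, its field names, and the `five_star_rating` function are hypothetical and introduced only for this example; the Seven-Star extension is not covered.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    open_licence: bool            # 1 star: on the web under an open licence
    machine_readable: bool        # 2 stars: structured, machine-readable data
    non_proprietary_format: bool  # 3 stars: open format such as CSV
    uses_uris: bool               # 4 stars: things are identified with URIs
    links_to_other_data: bool     # 5 stars: linked to other datasets


def five_star_rating(dataset):
    """Count the stars earned; each star requires all the previous ones."""
    criteria = [
        dataset.open_licence,
        dataset.machine_readable,
        dataset.non_proprietary_format,
        dataset.uses_uris,
        dataset.links_to_other_data,
    ]
    stars = 0
    for criterion_met in criteria:
        if not criterion_met:
            break
        stars += 1
    return stars


# A hypothetical CSV export: open and structured, but without URIs or links.
csv_export = Dataset(True, True, True, False, False)
print(five_star_rating(csv_export))
```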

Finally, the application of LOD in the CH sector is explored through examples like Europeana. Despite its potential in improving data quality and visibility, challenges persist, including issues related to cataloguing, adoption of new standards, and the complexity of Linked Data terminology. The section underscores the need for collaboration and community-driven practices for effective LOD implementation.

3.7.5 LOUD: Design Principles, Communities, Standards

LOUD focuses on improving data accessibility primarily for software developers. It balances data completeness and practical considerations like scalability and usability. Its design principles are:

  1. Right Abstraction for the Audience: Tailoring data access to specific user needs.
  2. Few Barriers to Entry: Simplifying initial engagement with the data.
  3. Comprehensibility by Introspection: Ensuring data is largely self-explanatory.
  4. Documentation with Working Examples: Providing clear guidelines and practical use cases.
  5. Consistent Patterns Over Exceptions: Reducing complexity through uniform patterns.

The systematic review of LOUD in scholarly literature employed the weight of evidence framework. A Boolean query identified relevant papers, with findings showing 46 relevant references from 2018–2023, mainly in English. These papers were grouped into four main categories: mentions of LOUD, descriptions of LOUD, explanations of LOUD design principles, and comparative analyses where the LOUD principles have been reused in various applications.

LOUD integrates technologies, mostly community-driven, like IIIF, WADM, and Linked Art. IIIF facilitates the sharing of high-resolution images and audiovisual content through a series of specifications, WADM provides a standard for creating and sharing annotations across various platforms, and Linked Art provides a model and an API specification for semantically describing CH. Together, they demonstrate a transformative potential in how CH data is interacted with and understood, reshaping traditional humanities and opening new research opportunities.
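How these specifications can work together may be sketched with two small JSON-LD records built in Python: a WADM annotation whose target is a region of a IIIF canvas, and a Linked Art description of the object shown. This is only an outline under stated assumptions: every `https://example.org/` identifier, the annotation text, and the `split_target` helper are hypothetical, and the records are far from complete.

```python
import json

# Hypothetical identifier of a IIIF canvas.
canvas_id = "https://example.org/iiif/photo-1/canvas/1"

# A W3C Web Annotation commenting on a rectangular region of the canvas.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "https://example.org/annotations/1",
    "type": "Annotation",
    "motivation": "commenting",
    "body": {
        "type": "TextualBody",
        "value": "A street scene, probably photographed in Basel.",
        "format": "text/plain",
        "language": "en",
    },
    # The fragment selects a region of the canvas: x, y, width, height.
    "target": canvas_id + "#xywh=100,150,400,300",
}

# A very small Linked Art record for the photograph itself.
linked_art_object = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/object/photo-1",
    "type": "HumanMadeObject",
    "_label": "Photograph 1",
}


def split_target(target):
    """Split a target IRI into the canvas IRI and its xywh region, if any."""
    canvas, marker, region = target.partition("#xywh=")
    if not marker:
        return canvas, None
    x, y, w, h = (int(value) for value in region.split(","))
    return canvas, (x, y, w, h)


canvas, region = split_target(annotation["target"])
print(json.dumps(annotation, indent=2))
print(json.dumps(linked_art_object, indent=2))
print(canvas, region)
```

Because each record is plain JSON with a shared set of conventions (`id`, `type`, a documented `@context`), a developer can inspect and combine them without an RDF toolchain, which is the kind of usability the LOUD design principles aim for.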

3.7.6 Community Practices and Semantic Interoperability

In the exploration of community dynamics and the intricacies of data exchange, two axes or perspectives come into focus in my PhD: community practices and semantic interoperability. These axes represent not just isolated concepts but comprehensive frameworks that influence how communities function and thrive in an interconnected world. Community practices, as shared activities and rituals, weave the fabric of collective identity within communities, while semantic interoperability acts as the bridge for meaningful and truthful data exchange between different systems.

Both community practices and semantic interoperability will permeate the empirical parts of the thesis as I explore LOUD for CH through different prisms. These axes serve as critical lenses through which we can dissect and analyse the dynamics, challenges, and opportunities that arise within this landscape.

3.7.7 The Case for Cultural Heritage Data

One thing that has been partially touched upon but not strongly asserted in this chapter is ‘why do we really care about cultural heritage?’

The importance of CH data as primary and secondary sources for DH practitioners transcends the mere definition, interlinking, and preservation of these sources. The undertaking, notwithstanding technological assistance, can be inherently challenging, contingent upon numerous interdependencies.

Moreover, it is deeply rooted in our response to many pressing global challenges, including the far-reaching consequences of climate change largely caused and accelerated by human activities. In an era characterised by profound disruptions, exemplified by events such as heatwaves, fires, droughts, floods, rising sea levels, and the resultant migrations driven by these environmental changes, which affect not only humans but the entire biosphere, it becomes increasingly important to emphasise the societal responsibilities that accompany our engagement with CH as well as DH practices in the Anthropocene (see Nowviskie, 2015).

4. Exploring Relationships through an Actor-Network Theory Lens

As Jim Clifford taught me, we need stories (and theories) that are just big enough to gather up the complexities and keep the edges open and greedy for surprising new and old connections. (Haraway, 2016, p. 101)

This chapter serves as the theoretical framework of the dissertation, and its primary goals are to elucidate the theoretical underpinnings and provide a comprehensive toolbox for addressing the identified problem. In the preceding literature review chapter, I highlighted the issue that necessitates attention around interlinking CH. The theoretical framework, sometimes referred to as the ‘toolbox’, can be likened to a set of ‘tools’ that will be employed to understand and address this problem.

Here, the primary purpose of this chapter is to offer an in-depth exploration of the tools – which comprise various theories, propositions, and concepts – delineating their characteristics, behaviours, historical applications, interrelationships, relevance to the study’s objectives, and potential limitations. Subsequently, the next chapter will elucidate how these tools will be operationalised in the research process.

The theoretical framework of this study is firmly rooted in ANT, which will be pursued systematically throughout the research. ANT is a constructivist approach that seeks to elucidate the fundamental dynamics of societies. Unlike traditional perspectives that restrict the concept of an ‘actor’ – or ‘actant’ – to individual humans, ANT expands this notion to encompass non-human and non-individual entities (see Callon, 1999, 2001; Latour, 1996, 2005).

ANT goes beyond the mere identification of actors and networks; it embodies a comprehensive methodology for exploring the intricate interplay of socio-technical systems. ANT distinguishes itself not only by recognising both human and non-human entities – from individuals and technological artefacts to organisations and standards – as actors (or actants), but also by examining their roles within heterogeneous networks of aligned interests. This approach facilitates a nuanced understanding of enrolment and translation processes, where diverse interests are aligned to form cohesive networks, and of the concept of irreversibility, which describes the stabilisation of these networks over time. In addition, ANT introduces the concepts of black boxes and immutable mobiles, highlighting the persistent yet mobile nature of network elements such as software standards that transcend spatial and temporal boundaries (Walsham, 1997, pp. 468–470). These concepts are instrumental in dissecting the dynamics of IIIF and Linked Art specifications, which can be considered either full-fledged actors or immutable mobiles, depending on the context of the network under consideration. This dual perspective underscores ANT’s role as both a theoretical lens and a methodological tool, providing a robust framework for dissecting the fabric of socio-technical assemblages and enriching our understanding of DH and CH interconnections.

Additionally, the theoretical framework is enriched by integrating complementary perspectives from Donna Haraway’s SK, Susan Leigh Star’s BO, and Luciano Floridi’s PI. Each of these frameworks contributes uniquely to our understanding of LOUD and its socio-technical landscape. Haraway’s approach emphasises the contextually-embedded nature of knowledge, underscoring the importance of diverse perspectives in shaping our understanding of technological phenomena (see Haraway, 1988). The concept of BO provides a framework for examining the role of LOUD technologies as mediators among varied groups, highlighting the importance of flexibility and adaptability in technological systems (Star, 2010; see Star & Griesemer, 1989). Meanwhile, PI offers a foundational perspective, viewing information as an intrinsic part of the reality that shapes and is shaped by technologies (see Floridi, 2011). Collectively, these theories complement the ANT-based approach by providing a multi-faceted understanding of the complexities inherent in LOUD, its technologies, and the communities involved.

This ANT-grounded toolbox is composed of three elements:

  1. Demonstrating how non-human entities exert agency.
  2. Identifying the human and non-human actors involved in these processes.
  3. Investigating the concept of translation and the process by which a network can be represented by a single entity.

For example, when considering the design of the PIA data model, a full-fledged actor in its own right, and more broadly for any KR system, pertinent questions arise about the influences exerted by the various groups of individuals involved in the process. These questions concern not only their interactions with each other but also their impact on the manifestation of the model and, consequently, how KRs can influence the various actors, both during its implementation and throughout its creation. In addition, the data model is composed of parts always bigger than the sum of their individual characteristics, as each part not only contributes to the overall functionality but also embodies a complex network of relationships and interactions. This viewpoint, inspired by (Latour et al., 2012), asserts that in the realm of social connections, following Gabriel Tarde’s monadological approach – i.e., viewing each individual or element as a self-contained universe or ‘monad’ with its own unique properties and relations (see Tarde, 2000) – individual elements (such as stakeholders or data points) often carry more information and potential than what is apparent when they are viewed solely as components of a larger system. In this perspective, the complexity and richness of each individual element often surpasses the aggregate. Thus, the model, potentially acting as a boundary object when not aligned with standardised processes, serves as a site of negotiation and alignment among different stakeholders. As (Haraway, 2016) [p. 104] poetically puts it, software could also be defined as ‘imploded entities, dense material semiotic “things”’, a notion that underscores the entanglement of information, technology, and materiality.

This chapter is organised into four sections, corresponding to the three aforementioned aspects and an additional section focusing on the revised epistemological foundations.

First, Section 4.1 explores the dissolution of rigid distinctions between human and non-human actors, emphasising the dynamic and interdependent nature of such relationships. This exploration is fundamental in understanding the broader implications of standards and community engagements in any field.

Then, Section 4.2 examines how collectives composed of differing actors can be assembled into a cohesive network where each entity’s agency and influence are recognised. In this section, the concepts of quasi-object and BO are introduced to elucidate the role of shared objects and concepts in mediating and facilitating interactions among diverse groups within a network.

Section 4.3 investigates the translation process where actors negotiate, modify, influence, and align their interests and identities in the formation and maintenance of networks. Additionally, PI, particularly the SLMS approach, is introduced here to provide a structured understanding of how information and knowledge are conceptualised, managed, and communicated within these networks, offering a deeper insight into the dynamics of standard adoption and community interaction in collaborative efforts.

Finally, Section 4.4 revises the epistemological foundations to address the nuanced inquiries presented later in the empirical chapters. In this section, SK is introduced, emphasising the importance of context-specific and perspective-driven knowledge in shaping our understanding of technological and cultural phenomena. This concept, developed by Donna Haraway, advocates for a more critical and reflexive view of knowledge, recognising that all knowledge is situated within specific cultural, historical, and personal contexts. This approach challenges the notion of objective or universal knowledge, asserting that all understanding is partial, located, and contingent. The incorporation of SK is crucial for comprehending how different actors’ perspectives and experiences influence the implementation and interpretation of LOUD standards. It helps in examining how these standards and community participation in IIIF and Linked Art are perceived and enacted differently across various social fabrics, particularly in contrast with settings where these standards and communities are not engaged. Central to this discussion is a question that serves as a common thread, leading into the main research question: ‘How to situate Linked Open Usable Data and to what extent has LOUD shaped or will shape the perception of Linked Data in the broader context of cultural heritage and digital humanities?’ Throughout these sections, ANT forms the underlying common thread, with the other theories augmenting and enriching this comprehensive theoretical framework.

Overall, the theoretical framework will be drawn upon to explore what (Manovich, 2017) [pp. 60-61] refers to as ‘everything and everybody’. Borrowing from Haraway’s concept of Tentacular Thinking, this approach recognises the interconnectedness and interdependence of all elements within the research scope, from the minutiae of technical details to the broader societal implications (see Haraway, 2016). This comprehensive view is essential for addressing the detailed nuances of technical implementations, yet it is also crucial for understanding their wider societal implications, and for considering the multi-layered complexities involved in the implementation and perception of LOUD.

4.1 Implosion of the Boundaries: Non-humans have Agency

The (Oxford English Dictionary, 2023) assigns two primary sets of meanings to the term ‘agency’. The first pertains to ‘a person or organisation acting on behalf of another, or providing a particular service’. The second, of greater relevance to this discussion, relates to an ‘action, capacity to act or exert power’. This second definition is further elaborated as an ‘action or intervention producing a particular effect; means, instrumentality, mediation’.

The concept of agency has been a central theme in various philosophical and sociological discourses. In ancient philosophy, both Plato and Aristotle contributed foundational ideas to the concept of agency, each offering distinct perspectives that have significantly influenced subsequent thought.

Plato, known for his theory of ideal forms, presented a dualistic view of reality, distinguishing between the world of forms (ideas) and the physical world. Within this framework, he saw agency as the soul’s ability to recognise and conform to these ideal forms (Watson, 1975, p. 209). For Plato, true agency involved transcending the physical and sensory world and directing one’s actions according to reason and intellect. This pursuit of knowledge and truth was seen as the highest form of agency, with actions aimed at realising eternal and immutable truths. Plato’s vision of agency is closely tied to knowledge, virtue and the pursuit of the good, as seen in his portrayal of the philosopher-king in the Republic, who governs himself and the state with wisdom and insight (see Plato, n.d.).

Aristotle, Plato’s student, offered a more practical and empirical approach. He incorporated agency into his broader ethical framework, placing emphasis on the ability to act virtuously and to make decisions in accordance with a telos or purpose. Aristotle regarded every action and choice as directed towards an end, and this teleological approach is key to understanding his concept of agency. Agency in Aristotle’s philosophy is deeply intertwined with the notions of potentiality and actuality, where potentiality represents inherent capabilities and actuality is their realisation through action. This perspective reinforces the importance of rational deliberation, moral virtue, and the realisation of potentiality in human life. In addition, Aristotle emphasised the role of choice (prohairesis) and practical wisdom (phronesis) in guiding deliberate, rational and virtuous action (see Charles, 2017).

These ancient philosophical perspectives, with Plato’s focus on reason and alignment with the ideal, and Aristotle’s emphasis on practical wisdom and virtue, set the stage for later philosophical explorations of agency by modern thinkers such as David Hume (1711-1776), Immanuel Kant (1724-1804), and Georg Wilhelm Friedrich Hegel (1770-1831). In their perspectives, agency is understood as an individual’s capacity to act in the world, based on intentionality and rationality (Pippin, 1991). Hume’s empiricist approach saw agency as closely tied to the experiences and perceptions of the individual, underlining the role of personal choices and mental states (see Schier, 1986). Kant’s critical philosophy stressed the importance of autonomy and moral law in agency, which he understood as the capacity of individuals to act according to universal moral principles derived from reason. Hegel offered a more dialectical approach, seeing agency as part of a broader historical and social process in which the actions of the individual are interwoven with the unfolding of rational will in history. These classical views of agency focus primarily on human agents and their conscious, intentional actions.

In contrast to these traditional perspectives, the advent of ANT and the works of Bruno Latour, John Law, Madeleine Akrich, and Michel Callon mark a significant shift. These academics argue for a more inclusive understanding of agency, where non-human entities – ranging from technological artefacts to animals and even ideas – can also be agents that influence and shape the course of social events. This perspective is a departure from the anthropocentric view of agency and opens up new ways of understanding social dynamics and networks.

The concept of the actant, as introduced by Algirdas Julien Greimas (1917-1992), is pivotal in this context. In Greimas’ semiotic theory, an actant can be any entity, human or non-human, that contributes to the progress of a narrative. This concept significantly expands the traditional narrative framework established by Vladimir Propp (1895-1970), which focused mainly on human characters and their roles in folk tales. Propp’s analysis focused on the actions and roles of these character types, which he categorised into a standardised framework. Greimas’ approach to narratives, as referenced by (Boullier, 2018), emphasises the potential of any entity to play a role in a story, thereby expanding the narrative scope.

The agency’s move is based on a well-known but seldom mentioned loan from Greimas’ 1966 semiotics. The concept of “actant” allowed the potential arrangement of any entity that populated the narratives to be aligned beyond Propp’s tradition. While Greimas’ formalism was certainly not preserved, the principle allowed for more open stories to be told and the concept of “allies” to be formalised, in particular, which extended the idea of “adjuvants” and “opponents” (without this being done from a strategic perspective, contrary to some interpretations). (Boullier, 2018)

This idea of actant resonates strongly within ANT, as it aligns with the theory’s aim to dissolve the strict dichotomy between human and non-human actors. In ANT, actants are not limited to individuals or even sentient beings; they include any entity that can affect or be affected by the network. This redefinition of agency through the lens of ANT and the concept of the actant is a cornerstone in understanding the complex, interconnected networks that constitute social and technological realms. It allows for a broader and more nuanced understanding of how various elements within a network interact and influence one another, regardless of their traditional classification as human or non-human.

ANT provides a radical redefinition of agency, challenging the modernist and post-modernist interpretations. It proposes an amodern viewpoint (Latour, 1990), dissolving the dichotomy between human and non-human agency, and focusing on the network dynamics in a society where ‘contemporary techno-science consist of intersections or “hybrids” of the human subject, language, and the external world of things, and these hybrids are as real as their constituent’ (Bolter & Grusin, 1999, p. 57). Agency, in ANT, includes not just intentional actions but the capacity of any entity to affect or be affected in a relational network (Latour, 1996). This expansive view of agency, influenced by the concept of the actant and informed by Greimas’ semiotics, offers a more holistic understanding of actors within networks.

Adopting a Latourian approach, researchers such as anthropologists and sociologists are encouraged to observe the balance between human and non-human properties within networks. This balanced observation is crucial for a full understanding of the dynamics within these structures (Latour, 1993, p. 96). By acknowledging the agency of both human and non-human agents, and by recognising the blurred boundaries between subjects and objects, researchers can gain deeper insights into the complex interplay of forces that shape social realities.

4.2 Assembling the Collective

Following the exploration of agency in the previous section, the assembly of collectives is examined. This process involves identifying a myriad of actors, both human and non-human, and understanding how they coalesce into actor-networks. The assembly of such a collective, a differing cosmos of socio-technical agents, is predicated on the recognition of each actor’s unique agency and the dynamic interplay of relationships that bind them together.

The transformation challenges of tools for DH and object knowledge, as discussed by (Camus et al., 2013), highlight the complexities involved in integrating DH with traditional scholarly practices. The authors emphasise the need for a nuanced approach to the digitisation and dissemination of CH resources, underscoring the pivotal role of collaborative efforts in bridging the gap between technology and humanities scholarship. This analysis aligns with the ANT perspective, which advocates for recognising the contributions of diverse actors within the wider DH ecosystem.

Within the LOUD ecosystem, a diverse set of actors comprises individual contributors, institutions, and several groups and committees, each with its own set of objectives. This ensemble also includes specifications and compatible software that facilitate interoperability, as well as end users. Interestingly, the majority of these end users remain unaware — whether through seamless integration or simply because their interaction does not require conscious recognition — that their digital interactions are often mediated by or compliant with LOUD standards. This diversity underscores the importance of understanding how different actors, their objectives, and their contributions shape the development and adoption of LOUD specifications and practices.

Moving on from the LOUD ecosystem’s varied participants, (Gandon, 2019) envisions a web architecture composed of diverse natural intelligences – such as humans, connected animals and plants – and artificial intelligences, including entities capable of reasoning and learning. This shift marks a deeper recognition of the layered interactions that form the backbone of digital platforms, and points to a future where technology adapts to embrace a wider range of intelligences within the structure of the web.

Classifying these diverse actors involves understanding their roles and interactions within the network. Quasi-objects and BOs provide frameworks for this classification, enabling a nuanced understanding of the socio-technical assemblage and facilitating communication among its varied components.

The foundation established by ANT incorporates the concept of the quasi-object from (Serres, 2014), which challenges traditional categorisations by embodying characteristics of both subjects and objects. This conceptual framework is crucial for appreciating how non-human entities can exhibit agency and actively participate in social networks, thus broadening our understanding of actor-network dynamics.

Quasi-objects, existing in a state of flux and embodying characteristics of both subjects and objects, challenge our conventional understanding of agency. Concurrently, the concept of BOs is introduced, enriching the ANT-grounded toolbox by highlighting the role of shared objects and concepts in mediating and facilitating interactions among diverse groups within the network. Unlike quasi-objects, which symbolise a hybrid state between subjectivity and objectivity, BOs focus on interaction and communication. They are crucial in collaborative efforts, especially in diverse and interdisciplinary settings, by maintaining a common identity across various contexts while being interpreted differently in each. Understanding the distinction and the interplay between quasi-objects and BOs is vital for comprehending the dynamics of actor-networks. The introduction of BO in this section elucidates their role in mediating complex socio-technical interactions, highlighting the importance of BO in community-based initiatives and their broader impact. Star’s reflection on BOs underscores their significance:

Boundary objects are objects which are both plastic enough to adapt to local needs and constraints of the several parties employing them, yet robust enough to maintain a common identity across sites. They are weakly structured in common use, and become strongly structured in individual-site use. They may be abstract or concrete. They have different meanings in different social worlds but their structure is common enough to more than one world to make them recognizable, a means of translation. The creation and management of boundary objects is key in developing and maintaining coherence across intersecting social worlds [(Star, 1999) p. 393]

The relevance of BO extends to the restructuring of residual categories through cycles of standardisation attempts that create said boundary objects as illustrated in Figure 4.1. This cycle emphasises the negotiation and alignment among different stakeholders, underscoring the adaptive and flexible nature of BO in managing the complexities of standardisation and the varied interpretations across social worlds (see Star, 2010).

Figure 4.1: Relationships Between Residual Categories, BOs, and Standardisation Attempts. Adapted from (Star, 2010)

The integration of concepts such as quasi-objects and BOs within the ANT-based toolbox necessitates a re-evaluation of existing ontological frameworks. This re-evaluation involves redefining our understanding of agency, action and influence to include a wide range of actors, both human and non-human. It leads to a more nuanced and interconnected understanding of society, where traditional boundaries between different types of actors are actively re-imagined and the central role of diverse actors in forming networks is recognised. In this way, ANT argues for a network-like ontology and social theory that can fully integrate the influence and interactions of disparate actors within society (Latour, 1996, p. 370).

In engaging with the complexities of assembling the collective, we encounter the necessity to cease replicating a principle of the archive discussed by Derrida. This principle, which dictates the preservation and accumulation of knowledge under a single authority or location, is challenged by the fluid and distributed nature of ANT's networks. Derrida’s critique invites a rethinking of how knowledge and information are curated and disseminated, echoing the ANT perspective that stresses the distributed, multifaceted interactions of actors within a network. The dialogue with Derrida’s deconstruction of the archive complements the ANT approach by advocating a more open, inclusive understanding of how socio-technical collectives are formed and maintained.

For instance, the examination of BOs in the context of queer identities illustrates the potential of these concepts to challenge and redefine traditional typologies, offering new perspectives on identity and community formation within socio-technical networks (see Junginger & Dörk, 2021).

Having assembled our collective and identified the diverse actor-networks, the focus shifts to exploring the relationships and communication mechanisms among them. This exploration, conducted in the subsequent section, is crucial for understanding how different actors – ranging from quasi-objects to boundary objects – contribute to and shape the collective narrative and functionality of a given network.

4.3 The Translation Process

The translation process within ANT refers to the dynamics of establishing associations and networks among diverse actors. Latour describes translation as a specialised relation that transforms mediators into coexistent entities without directly transferring causality. This concept underscores the complexity of interactions within networks, emphasising the creation and maintenance of associations that extend beyond mere causal relationships.

[The] word “translation” now takes on a somewhat specialised meaning: a relation that does not transport causality but induces two mediators into coexisting (Latour, 2005, p. 119)

Associations within ANT evolve continuously, demonstrating the ongoing and emergent nature of actor-networks. They materialise through the interactions and negotiations among actors. This perspective broadens the understanding of networks by stressing their fluid nature.

Understanding the dynamics of actor-networks requires an exploration of the ways in which actors influence each other. This is especially important in order to capture the nuanced roles and interactions within these networks. A key contribution in this area has been made by Latour, who suggests a compelling structure for disentangling these interactions as ‘mediation’. Latour’s conceptualisation is particularly insightful in that it breaks down the nature of influence into distinct yet interrelated processes. The essence of his argument can be summarised as follows:

This capacity of actors to influence each other was defined by (Latour, 2005) as mediation, further broken down into four types: interference, composition, black boxing, and delegation. Interference appears when one actor interferes with the goal of another. In composition, the actors influence the common goal of the network together. Black-boxing is when gradual complexification of actors (and their interrelations) reaches a point where treating the constellation as a single actor becomes more meaningful, and delegation is when meaning and expression is delegated to non-human objects. (Czahajda et al., 2022, p. 3)

In the context of the LOUD space, treating a constellation of actors necessitates a focus on delegation, a process pivotal for understanding how meanings and functions are assigned within networks. This emphasis on delegation underscores the necessity of expanding our epistemological horizons to encompass PI. Integrating PI offers a comprehensive framework for analysing how information not only mediates relationships within these networks but also influences the reality of digital ecosystems. Such an expanded perspective is essential for a thorough understanding of the dynamics within the LOUD ecosystem and its broader implications.

PI is conceived as a groundbreaking approach that examines the nature and dynamics of information. It explores how information fundamentally structures reality and thereby shapes the entities within it, termed inforgs, or informational organisms. These entities are embedded in information environments where they engage and interact in a vast information ecosystem. (Floridi, 2010)’s work illuminates the ways in which information underpins and transforms our understanding of reality, and suggests that living in the information age means recognising our role and identity as part of a complex, interconnected informational world. This perspective invites a deeper reflection on the implications of living in the midst of vast information networks, and urges a re-evaluation of how information influences human identity, society and our wider interaction with the digital and natural worlds.

Diving deeper into Floridi’s PI, the LoA concept emerges as a critical tool for dissecting the intricacies of complex computational systems, including LOUD-compliant ecosystems. This approach, fundamental to PI (Ganascia, 2015), aids in navigating the multifaceted layers and perspectives inherent in these systems (Angius et al., 2021), particularly those structured around client-server architectures. LoA provide a structured way to analyse and understand complex systems by breaking them down into different layers or perspectives. Each level focuses on specific aspects of the system, allowing for a clearer analysis of its components and their interactions. By advocating for a separation of concerns, the LoA framework equips us with a strategic method to manage and simplify complexity, enabling a focused examination of distinct abstraction levels within digital ecosystems (Leeuwen, 2014).

Building on the foundation laid by Floridi’s LoA, (Selbst et al., 2019) critique the fair ML field’s reliance on abstraction and modular design for achieving fairness, identifying five abstraction traps that highlight the challenges of applying computational interventions to societal contexts without considering the interplay between social context and technical systems. This critique underscores the relevance of incorporating a socio-technical perspective in this thesis, emphasising the need to make use of STS methodologies in the design process to avoid these abstraction traps.

Considering the insights from (Selbst et al., 2019), this thesis will explore four LoA and one transversal dimension within LOUD ecosystems, acknowledging that each can act as its own actor-network or that they can collectively form a single network. These levels, from lower to higher abstraction, are Algorithmic and Computational Processes, Infrastructure, Data Model, and Representation and Display. Societal implications, integrated across all levels, will address the broader cultural and social impacts.

In the exploration of LOUD ecosystems, the concepts of ‘immediacy’ and ‘hypermediacy’, as delineated by (Bolter & Grusin, 1999), provide additional insightful perspectives on the role of interfaces as LoA. Immediacy refers to the design of interfaces that aim to create a seamless, transparent UX, making the technology invisible and allowing for direct interaction with the content. This is evident in interfaces such as OSD, which displays high-resolution images and strives to provide a smooth and immersive viewing experience by minimising the perceptibility of the API-compliant resource. On the other hand, hypermediacy emphasises the presence and visibility of the medium, drawing attention to the various forms of mediation. Interfaces embodying hypermediacy offer a multi-layered, heterogeneous presentation, making users aware of the different media elements and their interactions, e.g. Exhibit for storytelling purposes. This duality enriches the user’s digital encounter, underscoring the need for LOUD-compliant tools and services to mediate these experiences effectively. By embracing these concepts, the framework can strategically leverage interfaces to either conceal or reveal the intricacies of the digital medium, facilitating a nuanced engagement with information ecosystems. Figure 4.2 presents a comprehensive view of the LOUD ecosystem’s LoA, enriched by the inclusion of societal implications as a cross-cutting dimension and the incorporation of immediacy and hypermediacy as critical concepts at the representation and display level for understanding user interaction and interface design.

Figure 4.2: Exploring Levels of Abstraction in LOUD Ecosystems: Integrating Societal Implications with the Concepts of Immediacy and Hypermediacy

The SLMS scheme, which includes a framework for each identified LoA, equips this research with a comprehensive lens for analysing computational systems, revealing how they can be effectively combined with ANT. This methodology, as depicted in Figure 4.3, enriches the exploration of digital ecosystems. This combination offers a unique perspective on understanding the interplay between computational entities and the broader networks they inhabit. The SLMS scheme can be summarised as follows:

Figure 4.3: The SLMS According to (Floridi, 2008)

(Gobbo & Benini, 2016) expand on the SLMS scheme by highlighting the challenges and intricacies of quantifying and qualifying computational information. They advocate for a comprehensive methodology that appreciates both the physical and conceptual dimensions of data, facilitating a deeper understanding of programmable artefacts and their informational content. This perspective not only complements the analytical capabilities of ANT but also opens new avenues for investigating the dynamics of information and technology.

As I venture to revise the epistemological foundations and introduce Haraway’s concept of SK, it becomes increasingly manifest that the integration of ANT with Floridi’s PI and the insights of computational information theory provides a robust framework for exploring the complexities of digital and networked environments. This interdisciplinary approach lays the groundwork for a comprehensive exploration of the digital world, emphasising the importance of situated, contextual knowledge in understanding and navigating the digital landscape.

4.4 Epistemological Foundations

This section establishes the epistemological foundations, presenting ANT, BO, and PI, alongside Donna Haraway’s SK. Rather than synthesising these theories, this chapter places an emphasis on situating LOUD within a feminist perspective to construct a new materialistic foundation reminiscent of Haraway’s ‘alignment in tentacular worlding’ [(Haraway, 2016) p. 42]. This assemblage seeks to navigate the controversies and mappings within LOUD-like communities, applying a Tardian approach to trace the spreadability of ideas (Latour et al., 2012).

To analyse the relevant actor-networks effectively, especially being part of both the IIIF and Linked Art communities, a particular lens is required. ANT, while expansive, has faced criticism for its perceived flatness in analysing networks. Here, Haraway’s SK becomes instrumental, providing a stance that enriches the ANT-grounded theoretical framework with a comprehensive lens that prioritises context in shaping knowledge. SK emphasises that knowledge is always situated, partial, and contextually produced, offering a critical perspective on determining relevance within networks. SK complements ANT by adding depth to the analysis of actor-networks. It highlights the significance of context – both human and non-human – in the production of knowledge, thereby enriching the theoretical framework with a nuanced lens for exploring the dynamics within the LOUD space.

SK, as articulated by (Haraway, 1988), emphasises the contextual nature of knowledge and challenges the pursuit of an objective, universal truth divorced from the position of the knower. Haraway’s framework, which integrates standpoint theory, concedes that knowledge is inherently shaped by its social, cultural and historical context, and supports an understanding of knowledge as partial and situated. This approach, noting the influence of epistemological privilege and intersectionality, argues against universalism by stressing the importance of being conscious of the specific perspectives and biases that inform one’s understanding. The relevance of SK to ANT lies in its complementary perspective of acknowledging the diverse, context-specific factors that influence knowledge production within networks, enriching ANT's analysis of actor-networks by incorporating a critical, reflexive lens on the situatedness of knowledge.

In forging this theoretical framework, I seek to transcend the notion of merely ‘cobbling together’ disparate ideas. Instead, I aim to weave their contributions into a coherent web of thought, ensuring a seamless and comprehensive framework that embodies the essence of their respective insights, one that transcends disciplinary boundaries. The theoretical framework can be synthesised as follows, elegantly interweaving distinct yet complementary perspectives to enrich our understanding of socio-technical ecosystems:

Figure 4.4 illustrates how these theories intertwine to form the epistemological foundation of this research, demonstrating the synergistic potential of combining ANT with SK, BO, and PI to navigate the complexities of digital ecosystems in the CH field.

Figure 4.4: A Sympoiesis of Theories: ANT Entangled with SK, BO, and PI

Concluding this chapter, the developed toolbox lays a coherent foundation for empirical research, poised to explore the dynamics within actor-networks. This exploration is not entirely novel in the context of CH, as exemplified by (Guillem et al., 2023)’s use of ANT to spotlight the keystones that were destroyed by the fire at Notre-Dame de Paris. As this narrative unfolds into Chapter 5, the ANT-based toolbox, enriched with amodern and feminist perspectives, will be instrumental in navigating the forthcoming empirical landscapes.

5. Research Scope and Methodology

This chapter delineates the Research Scope and Methodology, laying the groundwork for the empirical exploration within this thesis.

Structured into five key sections, this chapter outlines the methodological framework adopted to pursue the purposes of the research and specifies the systematic approach used to consider the outcomes. It begins with the presentation of the research questions in Section 5.1 and the hypotheses in Section 5.2, both of which are crucial for evaluating LOUD within the CH context. Following this, Section 5.3 details the objectives to be achieved on the basis of the research findings. The methodology, elaborated in Section 5.4, encompasses the research methods, data creation, collection, curation, modelling and analysis, alongside considerations of limitations and ethics. The final section, 5.5, defines some of the scholarly implications of the thesis, highlighting its contributions to DH and the broader field of the humanities, and affirming its value and positioning within the scholarly community.

Throughout this thesis, the two pivotal perspectives, Community Practices and Semantic Interoperability, serve as threads interwoven across chapters. While certain research questions, hypotheses, and objectives are specifically aligned with one perspective, others bridge both, showcasing their interconnected relevance.

5.1 Research Questions

Within this thesis, the primary research question seeks to critically evaluate the role and impact of LOUD, exploring how it influences and integrates within the fields of CH and DH. This research seeks, in part, to uncover the depth of LOUD's contribution to these disciplines, and to explore its potential to reshape the perception and use of Linked Data. Framing the central inquiry of this investigation, the primary research question is articulated as follows:

How can the concept of Linked Open Usable Data be situated within the broader framework of cultural heritage and digital humanities, and to what extent has it influenced – or is likely to influence – the perception and use of Linked Data in these fields?

In order to explore this overarching question, three specific research questions have been formulated to probe deeper into the socio-technical dynamics of LOUD:

These questions collectively aim to illuminate the multifaceted implications of LOUD, paving the way for a nuanced understanding of its benefits and challenges. The subsequent section will transition into the hypotheses derived from these research inquiries, further delineating the expected insights and contributions of this investigation.

5.2 Hypotheses

Transitioning from the research questions, this section posits a main assumption based on the expectation that LOUD design principles, being grassroots-driven and based on consensus and transparency, will foster greater acceptance within CH and DH fields. Despite the nascent recognition of LOUD as a distinct entity and its perception as a technical utility rather than part of a broader socio-technical ecosystem, these hypotheses aim to explore the multifaceted impacts of LOUD standards on these disciplines.

The hypotheses outlined below seek to elucidate the dynamics and potential outcomes identified in the research questions:

In summary, these hypotheses underpin the aim of the thesis to examine the influence and operationalisation of LOUD, setting the stage for a detailed exploration of the objectives in the following section.

5.3 Objectives

This section aims to define the concrete objectives that guide the empirical exploration of LOUD. Drawing on the ANT-based epistemological foundations, these objectives are crafted to dissect various aspects of LOUD's integration, adoption, and impact. Each objective targets a specific dimension, from community consensus and participation to the practical implementation within PIA and the evaluation of LOUD standards at Yale. More specifically, five objectives have been identified. They do not map one-to-one onto the research questions and hypotheses presented earlier, as two objectives each are linked to the first and to the third research question and hypothesis.

These objectives are explored extensively in three empirical chapters, offering a detailed insight into the operationalisation and impact of LOUD standards. In the following section, the methodology used to achieve these objectives is discussed in greater detail.

5.4 Methodology

This section is designed to rigorously address several pivotal questions that underpin the research approach, reflecting on the epistemological and ontological stances guiding this study. It explores the selection and application of data collection tools, detailing their relevance and the implications of their use, alongside the timing and demographic details of data collection. This section also looks at the analytical tools used, discussing their significance, implications and any ethical considerations that arose.

The methodology of this research is streamlined into three parts. This consolidation provides a holistic overview, starting with an outline of the research methods presented in each of the empirical chapters in 5.4.1, through a combined exploration of data creation, collection, curation and modelling, to an examination of data analysis. The section concludes by addressing the overall limitations and ethical considerations of the research in 5.4.2 and 5.4.3, ensuring a careful and ethically sound approach to the study of LOUD.

5.4.1 Outline of the Research Methods

This subsection clarifies the multifaceted approach to data collection, creation, curation, modelling and analysis that is central to the empirical chapters that follow. By interweaving quantitative and qualitative methods, this research not only navigates the landscapes of cultural processes, but also bridges the micro-macro divide in the study of cultural constellations. Each empirical pursuit is dissected in detail, spotlighting the ways and means used to unveil the underlying dynamics at play.

At its core, this thesis seeks to unravel the intricate web of cultural processes and constellations through a lens that is both micro in perspective and rich in empirical evidence. It emphasises an actor and practice-centred inquiry, as articulated by (Wietschorke, 2014) [p. 160]. This study aligns with such a perspective, aiming to dissect and understand cultural phenomena through a detailed examination of specific cosmologies, like those of LOUD-driven communities, the PIA research project, and Yale’s LUX.

The attempt to understand a particular cosmology requires immersion in the practices and dynamics of the communities at its heart. Inspired by (Latour & Woolgar, 1986), this research situates itself within the vibrant ecosystems of the IIIF and Linked Art communities. As an active participant, the research reflects the intricacies of a laboratory where the fusion of practices unfolds, providing a rich tapestry of insights into the research objectives.

Embracing the role of a ‘praxiographer’ [221], in the vein of (Mol, 2002), this exploration goes beyond mere observation. It addresses the of research, emphasising the situated, embodied practices that shape knowledge production. This praxiographic lens foregrounds the complex, often tacit, processes through which the communities’ cosmologies are enacted and understood. It acknowledges that these practices are not just observed but are actively participated in, thereby crafting a nuanced narrative of the research landscape. This approach aligns with Mol’s advocacy for a more embodied and engaged form of inquiry, one that recognises the researcher as an integral component of the knowledge ecosystem, contributing to and being shaped by the evolving dynamics of community practices.

Recognising the transformative potential of digital methods, this study also echoes (Venturini & Latour, 2009)’s call for a paradigm shift. It emphasises the need to fuse digital tools and data with traditional approaches, thereby enriching micro-analyses of interactions with insights into broader macro-structures. This not only challenges existing methodologies, but also invites a reappraisal of the dichotomies that have long governed the social sciences and humanities at large, and argues for a deeper engagement with the complexities of the digital sphere.

As outlined in Table 5.1, the empirical chapters are meticulously aligned with the research’s overarching axes, offering a panoramic view of the questions, hypotheses, and objectives steering this inquiry. This segmentation underscores the nuanced exploration of community practices, semantic interoperability, and their intersections, paving the way for a comprehensive understanding of the cultural and technological fabrics under investigation.

Table 5.1: Empirical Chapters: Research Axes and Scope Overview
Chapter | Empirical Research | Community Practices | Semantic Interoperability | Research Scope
Chapter 6 | Assessment of Practices in IIIF and Linked Art Communities | YES | To some extent | RQ01/HYP01 – OBJ01.1, OBJ01.2
Chapter 7 | Deployment of LOUD within PIA | To some extent | YES | RQ02/HYP02 – OBJ02
Chapter 8 | Review of a Large-Scale LOUD Implementation and Data Consistency within LUX | YES | YES | RQ03/HYP03 – OBJ03.1, OBJ03.2

Embarking on this research journey requires a versatile approach that encompasses data creation, collection, curation, modelling and analysis. This approach not only ensures a robust empirical foundation, but also fosters a dynamic interaction with the data, allowing new insights and perspectives to emerge. The following narrative will explore these dimensions, highlighting the central role of each in shaping the contours of this scholarly enterprise. Through this lens, the research not only contributes to academic discourse, but also navigates the challenges and opportunities presented by LOUD in the CH field.

In the following three parts (sub-subsections 5.4.1.1, 5.4.1.2, and 5.4.1.3), dealing with the empirical research, a diagram is provided to illustrate the process from data collection or generation to the results that will be explored in later chapters. Figure 5.1 provides a visual template detailing the specific shapes and colours used to represent different stages and components of the research process. This schematic representation helps to fully understand the progression from initial data gathering to the analytical findings that form the basis of subsequent discussions.

Figure 5.1: Research Outline Visual Template

5.4.1.1 Assessment of Practices in IIIF and Linked Art Communities

Chapter 6 engages with the social fabrics of IIIF and Linked Art, aiming to achieve two objectives through a mixed-methods approach. The research process and steps undertaken are detailed in Figure 5.2.

Figure 5.2: Research Outline for Assessing the IIIF and Linked Art Practices

For the first objective, which is to analyse capacity-building and advocacy initiatives within the IIIF and Linked Art communities, data was collected from what are generically referred to here as guidelines on each initiative’s website, such as principles and cookbook recipes that help create resources, as well as from meeting minutes produced in Google Docs and from GitHub issues. The meeting minutes from Google Docs were extracted with the help of a tool and a series of Python scripts. As for the guidelines, a matrix was prepared for the qualitative data, while the more quantitative data was extracted using the GitHub API. A first analysis was performed by aggregating and structuring all the data in CSV files according to the levels of abstraction discussed in the previous chapter, here in terms of category (e.g. viewer, type of support, JSON similarity, etc.). Further analysis was carried out using NetworkX[222], a Python library designed for the analysis of networks between social entities and their information exchange (Hagberg et al., 2008). This tool allows the generation of visual representations that aim to achieve the goals of SNA. SNA is understood here more as an analytical goal realised through visual means than as a methodological tool (see Akhtar, 2014). It complements the theoretical framework based on ANT, enriching it with a quantitative and visual approach to studying the interconnectedness of actors and the influence of these connections on the distribution of information and resources. By incorporating SNA, this research seeks to uncover the rich network of relationships and patterns between actors within LOUD ecosystems, with visualisations that can be effectively realised through tools such as Gephi[223], a network analysis and visualisation software (Bastian et al., 2009). For GitHub issues, a case study was conducted to show how an issue passes through the IIIF review process.
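
To make the network analysis step more tangible, the following minimal sketch shows how co-attendance at meetings could be turned into a weighted graph with NetworkX and summarised with degree centrality. It is an illustration under stated assumptions, not the scripts used in this research: the meeting identifiers, actor names and the `build_co_attendance_graph` helper are all hypothetical.

```python
import networkx as nx

# Hypothetical rows, as they might appear in an aggregated CSV file:
# (meeting identifier, participant pseudonym). Not data from the thesis.
attendance = [
    ("call-01", "actor_a"),
    ("call-01", "actor_b"),
    ("call-01", "actor_c"),
    ("call-02", "actor_a"),
    ("call-02", "actor_b"),
    ("call-03", "actor_c"),
    ("call-03", "actor_d"),
]


def build_co_attendance_graph(rows):
    """Link two actors whenever they attended the same meeting."""
    by_meeting = {}
    for meeting, actor in rows:
        by_meeting.setdefault(meeting, set()).add(actor)

    graph = nx.Graph()
    for actors in by_meeting.values():
        ordered = sorted(actors)
        for i, source in enumerate(ordered):
            for target in ordered[i + 1:]:
                # The edge weight counts the number of shared meetings.
                if graph.has_edge(source, target):
                    graph[source][target]["weight"] += 1
                else:
                    graph.add_edge(source, target, weight=1)
    return graph


graph = build_co_attendance_graph(attendance)
# Share of the other actors each actor is directly connected to.
centrality = nx.degree_centrality(graph)
```

A graph built this way could then be written out with `nx.write_gexf` and opened in Gephi for visual exploration, although the layout and styling choices would remain a matter of interpretation.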

As for the second objective, it consists of analysing the surveys carried out within the IIIF community over the years on the adoption of its specifications, and of creating my own survey on LOUD's socio-technical characteristics. For the latter, Google Forms was used and the raw data was made available after a pseudonymisation stage. Graphs and statistics were created using Python, in particular its pandas library[224], while RStudio[225] was used to produce various files for publication in a final report.
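
The pseudonymisation stage mentioned above can be sketched as follows. This is a standard-library illustration only, assuming a salted hash as the pseudonymisation technique; the field names (`email`, `role`), the salt and the `pseudonymise` helper are hypothetical, and the actual processing relied on pandas.

```python
import hashlib
from collections import Counter

# Hypothetical secret value; it would be kept out of the published data.
SALT = "replace-with-a-secret-value"


def pseudonymise(identifier, salt=SALT):
    """Replace an identifier with a stable, non-reversible label."""
    digest = hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
    return "resp_" + digest[:10]


# Invented responses standing in for a survey export.
raw_responses = [
    {"email": "a@example.org", "role": "developer"},
    {"email": "b@example.org", "role": "curator"},
    {"email": "c@example.org", "role": "developer"},
]

# Only the pseudonym and the answers are kept for publication.
published = [
    {"respondent": pseudonymise(row["email"]), "role": row["role"]}
    for row in raw_responses
]

# A simple descriptive statistic computed on the pseudonymised data.
role_counts = Counter(row["role"] for row in published)
```

Because the same input always yields the same label, answers from one respondent remain linkable across questions without exposing the original identifier.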

Finally, the main scripts and outcomes of this empirical research were made available through a GitHub repository[226].

5.4.1.2 Deployment of LOUD within PIA

The deployment of LOUD standards and the underlying tools and software required to implement them within the PIA project is covered in Chapter 7. The implementation and analysis process has a single objective, captured in Figure 5.3.

Figure 5.3: Research Outline for Deploying LOUD Standards in the PIA Project

In keeping with its title, the chapter describes a laboratory effort focused on creating and transforming data into resources that conform to LOUD standards. Initially, data was extracted from Salsah[227], a digital platform developed by DaSCH that stores images and metadata from the CAS photographic archives. This involved using a Python script to extract CSV, JSON and TIFF files. The images then had to be converted from TIFF to JPEG2000 so that they could be served via SIPI, a IIIF Image API server developed by the DHLab. In addition, the IIIF Presentation API 3.0 and Change Discovery API 1.0 resources were generated by an early-stage microservice. In parallel with the development of Linked Art data, a workflow prototype[228] was created in collaboration with members of the University of Oxford, based on a preliminary JSON API designed for the PIA project. At the same time, a revised data model was developed to ensure compatibility with the requirements of DSP, setting the stage for the migration of (meta)data from Salsah. This foundational work enabled the subsequent updating of resources to conform to the Linked Art standard, taking advantage of the established workflow.
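
As an illustration of what generating a IIIF Presentation API 3.0 resource entails, the following sketch assembles a minimal Manifest with a single Canvas painted by one image. It is not the PIA microservice: the `build_manifest` helper, the `example.org` URLs, the label and the dimensions are hypothetical, and only the overall shape follows the Presentation API 3.0.

```python
import json


def build_manifest(base_url, identifier, label, image_service, width, height):
    """Return a minimal IIIF Presentation API 3.0 Manifest as a dict."""
    manifest_id = f"{base_url}/{identifier}/manifest.json"
    canvas_id = f"{base_url}/{identifier}/canvas/1"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{
                "id": f"{canvas_id}/page/1",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/annotation/1",
                    "type": "Annotation",
                    # "painting" marks the image as the Canvas content.
                    "motivation": "painting",
                    "body": {
                        "id": f"{image_service}/full/max/0/default.jpg",
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": width,
                        "height": height,
                        # Points clients to the IIIF Image API service.
                        "service": [{
                            "id": image_service,
                            "type": "ImageService3",
                            "profile": "level2",
                        }],
                    },
                    "target": canvas_id,
                }],
            }],
        }],
    }


# Hypothetical values, used for illustration only.
manifest = build_manifest(
    "https://example.org/iiif",
    "photo-0001",
    "Example photograph",
    "https://example.org/image/photo-0001.jp2",
    4000,
    3000,
)
manifest_json = json.dumps(manifest, indent=2)
```

The nesting of Manifest, Canvas, AnnotationPage and Annotation is what allows a viewer to resolve the image through the Image API service referenced in the annotation body.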

5.4.1.3 Review of a Large-Scale LOUD Implementation and Data Consistency within LUX

The aims of Chapter 8 are to analyse how Yale University has rolled out the LOUD specifications as part of LUX, Yale Collections Discovery, and to assess the consistency of Linked Art and IIIF resources within the LUX platform. These two objectives are depicted in Figure 5.4.

Figure 5.4: Research Outline for Reviewing LOUD Implementation and Data Consistency within LUX

The first objective was to document the collaborative efforts of Yale employees involved in the development of LUX. To achieve this, I conducted both group and one-on-one interviews and recorded these discussions. These recordings were first transcribed using Whisper[229], an advanced speech recognition tool developed by OpenAI. To ensure accuracy, I meticulously checked and amended the transcriptions. I then performed thematic analysis on the transcripts, now formatted as CSV files, and consolidated the revised verbatim transcripts into a Markdown document. This extensive process provided the basis for writing a report comprising selected quotes, as well as an analysis using simple topic modelling to extract key themes (see Benabdelkrim et al., 2020), which provides distant reading insights.
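
The step from transcription to CSV files ready for thematic coding could look like the sketch below. It assumes that each transcript segment is a dictionary with `start`, `end` and `text` keys, which mirrors the segment list returned by the openai-whisper package; the sample sentences, the `speaker` and `theme` columns and the `segments_to_csv` helper are hypothetical.

```python
import csv
import io

# Invented segments standing in for a transcription result.
segments = [
    {"start": 0.0, "end": 4.2, "text": " We started with the data model."},
    {"start": 4.2, "end": 9.8, "text": " Then we looked at the interface."},
]


def segments_to_csv(segments, speaker="unknown"):
    """Serialise transcript segments as CSV text, one row per segment."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["start", "end", "speaker", "text", "theme"],
        lineterminator="\n",
    )
    writer.writeheader()
    for segment in segments:
        writer.writerow({
            "start": f"{segment['start']:.1f}",
            "end": f"{segment['end']:.1f}",
            "speaker": speaker,
            "text": segment["text"].strip(),
            # Left empty so that themes can be assigned during manual coding.
            "theme": "",
        })
    return buffer.getvalue()


csv_text = segments_to_csv(segments, speaker="interviewee_1")
```

Keeping an empty `theme` column is one possible design choice: it lets the corrected transcript and the thematic codes live in the same file.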

The second objective was to verify the consistency of Linked Art and IIIF resources. Once the data had been collected, an effort was made to template the identified patterns by selecting various resources qualitatively and validating them quantitatively using a JSON validator. For IIIF resources, the Presentation API Validator[230] designed by IIIF-C was leveraged. All of this data, along with the scripts, was compiled and analysed in a GitHub repository[231].
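
A much simplified version of such a consistency check is sketched below. It only tests for the presence of a few top-level keys and for HTTP(S) identifiers, and it is neither the JSON validator nor the IIIF Presentation API Validator mentioned above; the `REQUIRED_KEYS` table, the `check_resource` helper and the sample records are hypothetical.

```python
# Top-level keys expected for each kind of resource in this sketch.
REQUIRED_KEYS = {
    "linked_art": ("@context", "id", "type", "_label"),
    "iiif_manifest": ("@context", "id", "type", "label", "items"),
}


def check_resource(resource, kind):
    """Return a list of human-readable problems found in a JSON resource."""
    problems = [
        f"missing key: {key}"
        for key in REQUIRED_KEYS[kind]
        if key not in resource
    ]
    identifier = resource.get("id", "")
    if identifier and not str(identifier).startswith(("http://", "https://")):
        problems.append("id is not an HTTP(S) URI")
    return problems


# Invented records, used for illustration only.
complete_record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/data/object/1",
    "type": "HumanMadeObject",
    "_label": "Example object",
}
incomplete_record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "object/1",
    "type": "HumanMadeObject",
}

complete_problems = check_resource(complete_record, "linked_art")
incomplete_problems = check_resource(incomplete_record, "linked_art")
```

Running the same check over many resources and counting the reported problems gives a rough quantitative picture that can be set against the qualitatively selected patterns.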

5.4.2 Overall Limitations

The overall limitations of this thesis cover several key areas, reflecting the constraints encountered in the study of LOUD.

The extent of initiatives and infrastructure supporting LOUD standards is a critical limitation. The variability in support and implementation across platforms and projects may affect the generalisation of the study’s findings, suggesting the need for broader engagement and standardisation efforts amongst CH practitioners. Moreover, the influence of platforms’ formatting power on the dissemination and reception of LOUD standards cannot be overlooked (Boullier, 2018), highlighting the need for a nuanced understanding of how platform-specific affordances shape scholarly communication and knowledge exchange. Furthermore, according to (McCarty, 2023), at , the integration of LOUD within DH faces challenges despite its seemingly low-tech orientation in practice and tool adoption. This situation echoes McCarty’s discussion of the need for a , suggesting that while some of the tools developed by the wider IIIF community are accessible, they still require a significant level of technical expertise. It emphasises the importance of harmonising traditional scholarly methods with digital tools to enrich humanistic research, and urges a critical evaluation of the limitations and potential of digital tools to ensure that they complement rather than constrain scholarly inquiry.

Obtaining data from individuals and their CHIs through generic investigation echoes Bourdieu’s (1988) theory of fields[232] and therefore may not capture the full socio-technical landscape of LOUD ecosystems. This approach may overlook the wider range of actors and their interactions that are crucial for a holistic understanding of the impact of LOUD.

The implementation of LOUD standards and practices within the PIA project served as an experimental ground for testing the semantic model and data reuse in a participatory manner, revealing the gap between theoretical constructs and their practical implementation. This discrepancy underscores the challenges of deploying layered data models in real-world interfaces, and highlights potential limitations in empirical validation and UX insights.

Digital methods, while instrumental, may not fully capture the nuanced diffusion of ideas, a central tenet in ANT where non-humans are given agency. This gap points to the need for methodological innovations that can better account for the dynamics of idea diffusion within and across networks. In addition, the ‘zooming effect’ introduces methodological constraints, demonstrating that the depth of analysis is often limited by the computational resources available. This limitation affects the scale at which data can be analysed, potentially distorting the understanding of network dynamics and interaction patterns (see Boullier, 2018).

In summary, while these identified areas represent the primary limitations of this research, it is important to acknowledge that the range of limitations may not be exhaustive. The intrinsic multilayered nature of LOUD necessitates ongoing review and negotiation across modes of research that bridge theoretical frameworks with empirical realities.

5.4.3 Ethical Considerations

Moving on from exploring general limitations, it is equally essential to address the ethical considerations inherent in undertaking such research. The LOUD community, predominantly composed of members from the Global North and characterised by a lack of diversity in terms of gender and ethnicity, presents specific ethical challenges. This composition raises concerns about inclusivity and representation, and points to the need for conscious efforts to ensure that the development and application of LOUD standards do not inadvertently perpetuate existing inequalities or biases.

Furthermore, the dominance of privileged circles within the LOUD community underscores the importance of adopting ethical research practices that actively seek to amplify underrepresented voices. Engaging with a wider range of stakeholders, including those from marginalised communities, is crucial to fostering a more inclusive and equitable DH and CH landscape. This approach not only enriches research with diverse perspectives, but also aligns with broader ethical commitments to social justice and equality.

Ultimately, ethical considerations extend to the methods used to analyse and interpret data within LOUD ecosystems. Ensuring transparency, respecting privacy and acknowledging the limitations of one’s own research stance are fundamental to ethical scholarship. By addressing these ethical dimensions, the thesis aims to contribute thoughtfully and responsibly to the ongoing discourse around LOUD in the heritage domain and to advocate for practices that promote diversity, inclusivity and equity.

While this thesis navigates the realities of LOUD for CH and addresses overarching limitations and ethical considerations, it is paramount to acknowledge the inherent privilege of conducting such research as a PhD candidate in DH in Switzerland. While this work aims to highlight and mitigate issues of inclusivity and representation, the ability to address all facets of marginalisation is limited. This recognition serves as a basis for advocating for a more inclusive and equitable approach in future research endeavours.

5.5 Contribution to the (Digital) Humanities

My thesis aims to contribute to the humanities, increasingly digital, by situating itself at the intersection of DH and STS. Addressing the disillusionment with digital methods identified by (Venturini et al., 2015, 2018), my research addresses issues such as the overly narrow yet ambitious notions of digital traces, the fluctuating levels of awareness and confidence in their production, the misconception of equating digital methods with mere automation, and the misplaced hope of easily applying traditional quantitative methods to digital traces.

CH resources, documented and preserved by cross-border memory institutions, are increasingly interconnected through digital platforms. The role of web architecture, in particular LOD technologies and henceforth standards conforming to LOUD design principles, is central in this context. The incorporation of shared methodologies, as seen in the IIIF community, exemplifies the socio-technical interactions that are central to the scholarly and heritage domains. These community-driven technologies facilitate not only the description and contextualisation of CH records, but also their integration into scholarly discourse, thus underlining the vital role of community practices and semantic interoperability in improving accessibility, interoperability and engagement within the humanities.

In an effort to transcend the conventional boundaries of academic dissemination, I have decided to produce both a PDF version of the thesis intended for the university’s repository and an HTML version of the thesis, and I plan to make the structured data of the dissertation available via Linked Art API endpoints, time permitting. This endeavour not only embraces the ethos of open scholarship and self-publishing as discussed by (Capadisli, 2020), but also ventures into experimental territories of knowledge representation. By considering diverse and niche use cases, this work seeks to extend the methodological framework of DH and encourage a broader and more nuanced engagement with LOUD.

6. The Social Fabrics of IIIF and Linked Art

This chapter unfolds as a foray into the social fabrics of the IIIF and Linked Art communities, and represents the first of three empirical chapters within this doctoral thesis. In this endeavour, I seek to disentangle the many threads that make up these communities, with the aim of identifying the essential strands that contribute to their collective weave. Drawing on (Haraway, 2016, p. 116), this investigation explores the extent to which their fabric encompasses an entanglement of ‘people, critters and apparatuses’ (Haraway, 2016, pp. 128-129), examining dependencies as highlighted by (Latour, 2022, p. 71). In doing so, I attempt to shed light on the interplay of relationships that form the backbone of IIIF and Linked Art, unveiling the complex tapestry of collaboration and interconnectedness that defines them.

With respect to scope, this chapter considers the socio-technical dynamics required to produce specifications that are aligned with the LOUD design principles. The central hypothesis is that although grassroots efforts to forge active communities require substantial and sustained commitment, these efforts typically succeed in the medium to long term. The analysis focuses on how consensus building and advocacy within the IIIF and Linked Art communities promote the adoption of LOUD. It also examines the role of community participation in the evolutionary adoption of these standards.

Most of the (technical) documentation related to this empirical investigation, such as Python scripts, raw data, and diagrams, is made accessible through GitHub[233]. This open repository serves not only as a resource for replicating and verifying the findings but also exemplifies a commitment to transparency, in keeping with the ethos of the IIIF and Linked Art communities.

The chapter begins in Section 6.1, where I delve into some of the foundations of the IIIF and Linked Art communities. The first objective, concerning consensus building and advocacy within these digital communities, is addressed in more detail in Section 6.2. This section unpacks the collaborative strategies and initiatives that are central to promoting standards adoption and community cohesion. Following this, Section 6.3 focuses on the second objective by analysing community participation and adoption of LOUD standards.

In Section 6.4, the chapter synthesises the findings from the previous analyses, weaving together the threads of the research to provide a comprehensive overview of the terrain explored. The chapter concludes in Section 6.5, where reflections and forward-looking perspectives consider the wider implications of the study’s findings for the future of the IIIF and Linked Art communities, anticipating the challenges and opportunities on the horizon.

6.1 Exploring the Foundations

The core objective of the IIIF and Linked Art communities is the creation and maintenance of specifications. Through these specifications, which act as mediators, there is the potential for a harmonious symphony of technology and purpose. At the heart of this ecosystem is the API, mediated by the JSON-LD @context, which provides clarity and cohesion to the standards being developed and implemented. Validators ensure compliance with the specifications, with software conforming to these standards and encapsulating the layered abstractions that define the technological landscape.
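The mediating role of the JSON-LD context can be made concrete with a short Python sketch that reads the `@context` of a resource and reports which specifications it declares. The context URIs are those published by the respective specifications; the function name and the sample resource are illustrative assumptions, and the check is far shallower than what an actual validator performs.

```python
KNOWN_CONTEXTS = {
    "http://iiif.io/api/presentation/3/context.json": "IIIF Presentation API 3.0",
    "http://iiif.io/api/image/3/context.json": "IIIF Image API 3.0",
    "http://iiif.io/api/discovery/1/context.json": "IIIF Change Discovery API 1.0",
    "http://www.w3.org/ns/anno.jsonld": "W3C Web Annotation Data Model",
    "https://linked.art/ns/v1/linked-art.json": "Linked Art",
}


def declared_specs(resource):
    """List the known specifications named in a resource's @context."""
    context = resource.get("@context", [])
    # JSON-LD allows a single context string or a list of contexts.
    if isinstance(context, str):
        context = [context]
    return [KNOWN_CONTEXTS[c] for c in context
            if isinstance(c, str) and c in KNOWN_CONTEXTS]


# Hypothetical, abridged Manifest that combines two contexts.
manifest = {
    "@context": [
        "http://www.w3.org/ns/anno.jsonld",
        "http://iiif.io/api/presentation/3/context.json",
    ],
    "id": "https://example.org/iiif/manifest/1",
    "type": "Manifest",
}
specs = declared_specs(manifest)
```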

For this potential to be fully realised, however, it is imperative that those within these communities possess resources that not only comply with these shared APIs, but also embody them. This embodiment is facilitated by an underlying infrastructure that includes servers, scripts, micro-services and other technological components, as well as viewers, players and browsers that render compliant (meta)data.

In this respect, while specifications serve as the foundational goal, the broader vision of achieving harmony through technology and purpose requires the active participation of community members. They are tasked with fostering an ecosystem where specifications are not only developed and maintained, but brought to life through practical implementation. This requires a commitment to true collaboration within and across community boundaries, and a concerted effort to implement best practices, especially in user interface design. Such a commitment ensures that the integration of vision and execution is not just a theoretical ideal, but a lived reality that reflects the shared aspirations of both IIIF and Linked Art and their respective constituencies.

The complexity of the IIIF community provides a rich area for exploration, in contrast to Linked Art, where community interactions are more streamlined and occur primarily through fortnightly calls[234]. This complexity is vividly illustrated in Figure 6.1, which describes the range of entities within the framework. The decision to focus on IIIF is motivated by its rich constellation of committees, working groups and supporting institutions, providing a deeper dive into the development of digital standards in a robust community setting.

Figure 6.1: Overview of the IIIF Community

In the diagram, a colour scheme helps to identify the different components of the IIIF ecosystem: entities in purple represent people, resources and institutions that are part of the IIIF-C. Light pink highlights the committees associated with the consortium, reflecting the structural elements that support the governance and operational dynamics of the consortium. Soft blue denotes the community committees, including the CoCo and the Editorial Committee, which are central to guiding community initiatives and maintaining specifications. Green identifies the community and TSGs, highlighting the collaborative spirit that drives the development of IIIF. Finally, yellow identifies essential resources and tools, such as the specifications or cookbook recipes, highlighting the practical outputs that enable the adoption and implementation of the APIs.

This visual representation serves not only as a guide to the organisational structure of the IIIF community, but also as a representation of the vibrant interaction between disparate entities. It highlights the integral role of both the formal IIIF-C bodies and the wider community in fostering a culture around interoperability, facilitating collaboration and sharing best practice. This comprehensive approach to community engagement and standards development reflects the concerted effort needed to move the digital scholarship and CH fields forward. The interconnectedness depicted also points to the platforms on which both communities rely and which facilitate their evolution. For instance, both the IIIF and Linked Art communities maintain accessible Google Drives, where meeting minutes on Google Docs, often linking to GitHub issues, document the technical discourse within these initiatives. In addition, the use of dedicated Slack workspaces for continuous conversations further reinforces the importance of regular engagement.

The following sections explore in more depth the processes of consensus building and the advocacy initiatives as well as the engagement of the community that not only promote adherence to specifications, but also encourage best practices. Such endeavours are foundational to fostering a LOUD ecosystem characterised by community-driven commitment to a few standards, yet embracing a multitude of implementations. This rich tapestry of connected knowledge and practices ensures the digital heritage domain is both universally accessible and deeply engaging.

6.2 Consensus Building and Advocacy Initiatives

The analysis examines the mechanisms of consensus building and advocacy within the IIIF and Linked Art communities, focusing on three key areas.

Initial research centres on Section 6.2.1, exploring the different types of meetings that play a crucial role in both community engagement and the wider collaborative process, with illustrations from the IIIF Discovery TSG and Linked Art meeting minutes. Attention then shifts to Section 6.2.2, highlighting the role of GitHub in facilitating code and use case documentation, particularly within the TRC, as a testament to the community’s collaborative spirit and commitment to transparency. Finally, Section 6.2.3 examines community guidelines and best practice efforts, discussing how each community provides resources that transcend the standards, such as the IIIF cookbook recipes and Linked Art patterns, to guide and encourage further community contributions and foster interoperability.

6.2.1 Types of Meetings and Engagement Opportunities

Meetings within the IIIF and Linked Art communities are essential for collaboration, sharing insights and steering the direction of projects and standards. These predominantly virtual meetings are orchestrated by designated chairs and follow structured agendas to maximise productivity and inclusivity. IIIF meetings vary in frequency – fortnightly, monthly or quarterly – reflecting the dynamic needs of different parties. Linked Art, which maintains a consistent rhythm, meets fortnightly. Each meeting begins with a roll call, which allows participants to register their presence or send their regrets in advance, a gesture that underlines their interest even if they are absent. To further enhance the collaborative nature of these discussions, one or more participants are asked to take the role of note-taker for each meeting.

A foundational element of these meetings is the ongoing, mostly asynchronous, collection of use cases, primarily facilitated through GitHub. Although the initial sessions heavily focus on gathering these use cases, the practice continues across all stages of development and discussion. Each meeting is structured around a predefined agenda, ensuring focused and productive sessions that typically span one hour. Additionally, meetings are announced at least a day in advance. This process ensures that discussions about the implications of use cases and their potential integration into documents – such as specifications, early drafts or recipes – are grounded in a broad and inclusive collection of community inputs, pending subsequent approval or implementation.

In addition to virtual sessions, face-to-face meetings provide invaluable opportunities for deeper engagement. Within IIIF, the annual conference acts as a catalyst for stakeholders to meet – whether as part of the official programme or in parallel sessions. Ad-hoc in-person meetings have also been instrumental in driving significant progress, such as the inclusion of audiovisual resources in the Presentation API 3.0 and the ongoing efforts to include 3D materials in the forthcoming version of this specification.

The Linked Art initiative, although newer, has held five major face-to-face meetings, three before the COVID-19 pandemic and two in 2023. The fourth meeting, held in New Haven, CT, USA, focused on resolving outstanding issues to finalise the first version of the standard. The practical implications of developing the Linked Art API, alongside the deployment of the LUX platform, illustrate the tangible outcomes of these meetings. Many of the issues discussed were long-standing and had been exacerbated by the pandemic. For instance, discussions around rights statements[235] and API Displacements and/or Transclusions[236] have highlighted the evolving requirements and challenges faced by the community. It also became clear that discussions on identity – encompassing nationality, gender, and ethnicity – had matured, enabling new insights and considerations. These discussions are delicate, often challenged by inherent biases and assumptions in metadata assignment, highlighting the need for a more inclusive and diverse approach to identity representation within CH.

Furthermore, while discussions about extending the CIDOC-CRM hierarchy to include new properties or classes – such as the Transfer class and the inclusion of non-human actors – take place within the broader CRM-SIG, Linked Art approaches these considerations from a different perspective[237]. Unlike CRM-SIG, where ontological purity is a major concern, Linked Art prioritises more practical applicability and flexibility in modelling CH data. This leads to decisions where Linked Art has, for pragmatic reasons, created its own classes and leveraged existing classes designed for human entities to encompass non-human actors. This approach, while departing from strict ontological standards, aims to address immediate modelling needs within the community, recognising that it may occasionally compromise semantic interoperability for the sake of broader and more inclusive data representation practices.

Transitioning from these theoretical and practical considerations, I now turn to detailed case studies of community interactions. This section explores two series of meetings through the notes taken during them: in 6.2.1.1, the meetings of the IIIF Discovery TSG[238], a completed IIIF group, and in 6.2.1.2, the ongoing fortnightly meetings of the Linked Art initiative.

This exploration provides a lens into the number of meetings, the people involved and the outcomes that shape the evolving landscapes of both communities. Meeting minutes from both groups were extracted using a tool developed by (Bainbridge, 2021) to streamline the process of extracting texts from Google Docs into organised spreadsheets. This data then underwent a cleaning and curation process to ensure accuracy and consistency for analysis and visualisation[239].

6.2.1.1 IIIF Discovery TSG Meetings

The IIIF Discovery TSG formally ended its work in summer 2022, but continued to meet several times until March 2023. The purpose of the Discovery TSG was to address the critical need for interoperable resources to be easily discoverable. Recognising the importance of standardised discovery processes, the group aimed to develop specifications to improve the findability and reusability of IIIF resources. This initiative focused on adopting existing techniques and tools, facilitating their widespread use within the community, and supporting the creation of infrastructure such as registries. Key deliverables from the group comprise the Change Discovery API and the Content State API.
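To give a sense of how a harvester might consume a Change Discovery API endpoint, the Python sketch below replays the activities of one page and keeps the resources that still exist afterwards. The page structure follows the Activity Streams pattern used by the specification, but the URLs and timestamps are hypothetical and the function is a simplified assumption: a real client would also follow the links between pages of the collection.

```python
def live_resources(activities):
    """Replay activities chronologically and return the ids not deleted."""
    last_seen = {}
    # endTime values are assumed to be UTC timestamps in one ISO 8601 format,
    # so sorting them as strings also sorts them by time.
    for activity in sorted(activities, key=lambda a: a["endTime"]):
        last_seen[activity["object"]["id"]] = activity["type"]
    return sorted(rid for rid, kind in last_seen.items() if kind != "Delete")


m1 = {"id": "https://example.org/iiif/manifest/1", "type": "Manifest"}
m2 = {"id": "https://example.org/iiif/manifest/2", "type": "Manifest"}

# Hypothetical page of an activity stream.
page = {
    "@context": "http://iiif.io/api/discovery/1/context.json",
    "id": "https://example.org/activity/page-1",
    "type": "OrderedCollectionPage",
    "orderedItems": [
        {"type": "Create", "object": m1, "endTime": "2023-01-10T00:00:00Z"},
        {"type": "Update", "object": m1, "endTime": "2023-02-01T00:00:00Z"},
        {"type": "Create", "object": m2, "endTime": "2023-02-05T00:00:00Z"},
        {"type": "Delete", "object": m2, "endTime": "2023-03-01T00:00:00Z"},
    ],
}
remaining = live_resources(page["orderedItems"])
```

Here the second Manifest is created and later deleted, so only the first one remains to be indexed.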

In addition to these APIs, the TSG focused its efforts on producing essential documentation and support materials to facilitate the wider adoption and implementation of these standards. This includes the creation of a Registry of (Metadata) Profiles[240], which lists discovery-related metadata profiles that can be referenced through the seeAlso property in IIIF Manifests. A note on SEO guidelines was also produced[241], with a view to improving search engine ranking of web pages containing IIIF resources.
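The way a profile is attached to a Manifest can be sketched as follows in Python. The Manifest is abridged (its canvases are omitted) and every URL in it, including the profile URI, is a hypothetical placeholder rather than an entry from the registry.

```python
PROFILE = "https://example.org/profiles/descriptive-metadata"  # hypothetical

# Abridged Manifest: the items (canvases) are left out for brevity.
manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/manifest/1",
    "type": "Manifest",
    "label": {"en": ["Example photograph"]},
    "seeAlso": [
        {
            "id": "https://example.org/data/1.json",
            "type": "Dataset",
            "label": {"en": ["Machine-readable description"]},
            "format": "application/ld+json",
            "profile": PROFILE,
        }
    ],
}


def see_also_by_profile(resource, profile):
    """Return the ids of seeAlso entries that declare the given profile."""
    return [entry["id"] for entry in resource.get("seeAlso", [])
            if entry.get("profile") == profile]


matches = see_also_by_profile(manifest, PROFILE)
```

A harvester that recognises the profile knows what kind of document it will retrieve before dereferencing the link, which is the purpose a shared registry serves.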

Figure 6.2 shows that 84 people, represented as individual horizontal bars, attended a total of 101 Discovery TSG meetings. Each person attended an average of 10.49 meetings, but the median attendance was 2. This illustrates a significant imbalance in participation: a small core group was highly active and contributed to the majority of discussions, while a larger number of participants attended less frequently.
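The gap between the mean and the median is typical of a long-tailed distribution. The following Python sketch uses invented figures (not the actual attendance data) to show how a handful of very active participants can pull the mean well above the median.

```python
from statistics import mean, median

# Hypothetical number of meetings attended by seven participants.
meetings_attended = {"A": 40, "B": 35, "C": 3, "D": 2, "E": 2, "F": 1, "G": 1}

counts = list(meetings_attended.values())
average = mean(counts)    # pulled upwards by participants A and B
middle = median(counts)   # reflects the typical, occasional participant
```

In this invented example the mean is 12 meetings per person while the median is 2, the same kind of imbalance as the one reported for the Discovery TSG.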

Figure 6.2: Stacked Bar Chart Showing IIIF Discovery TSG Meeting Attendance

A closer look at meeting attendance over the years reveals a gradual decline in participation, providing insights into the operational dynamics of the Discovery TSG and engagement trends. Initially, in 2017, there were 20 meetings with 42 unique participants and an average attendance of 12.35 per meeting. In 2018, while the number of meetings increased to 22, the number of unique participants decreased to 35 and the average attendance decreased to 8.73.

In 2019, the number of meetings decreased to 18 with 26 unique participants, and the average attendance further decreased to 7.78. In 2020, there were 15 meetings and 20 unique participants, while the average attendance rose slightly to 8.2. In 2021, there were 17 meetings with a slightly larger pool of 23 unique participants and an average attendance of 7.94.

The most significant decline occurred in 2022, when only 6 meetings were held, with 13 unique participants and an average attendance of 5. This decline continued in 2023, with only 3 meetings, 7 unique participants and an average attendance of 4.67. This significant reduction in engagement can be explained by the fact that the majority of the group’s objectives and outputs had been achieved by the summer of 2022, leading to a natural reduction in the need and frequency of meetings as the project neared completion.
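The yearly figures reported above can be cross-checked against the overall numbers given for Figure 6.2. The short Python sketch below multiplies each year's number of meetings by its average attendance to recover the approximate total number of attendances, then divides that total by the 84 participants and by the total number of meetings. It is a verification aid written for this section, not one of the scripts used in the analysis.

```python
# (number of meetings, average attendance per meeting) as reported per year.
yearly = {
    2017: (20, 12.35),
    2018: (22, 8.73),
    2019: (18, 7.78),
    2020: (15, 8.2),
    2021: (17, 7.94),
    2022: (6, 5.0),
    2023: (3, 4.67),
}
UNIQUE_PARTICIPANTS = 84

total_meetings = sum(n for n, _ in yearly.values())
# Averages are rounded to two decimals, so each product is rounded back
# to a whole number of attendances.
total_attendances = sum(round(n * avg) for n, avg in yearly.values())

per_person = round(total_attendances / UNIQUE_PARTICIPANTS, 2)
per_meeting = round(total_attendances / total_meetings, 2)
```

The yearly meetings add up to 101 and the reconstructed total is 881 attendances, which gives 10.49 meetings per person, matching the reported average, and about 8.72 attendees per meeting across the whole period.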

The data also highlights the centrality of github.com in discussions and collaborations, as evidenced by over 400 hyperlinks to GitHub issues, pull requests and other GitHub-related content mentioned during meetings. In addition to GitHub, other frequently referenced domains include iiif.io with 169 mentions and docs.google.com with 139 mentions, reflecting the use of official documentation and shared documents for referencing standards. In addition, www.w3.org was mentioned 60 times, demonstrating its relevance for W3C standards aligned with IIIF practices. This frequency underscores GitHub’s essential role as a central platform for facilitating communication and project development within the IIIF Discovery TSG, alongside these other critical resources.
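Counting the domains referenced in meeting minutes is straightforward once the text has been extracted. The Python sketch below shows one possible approach using a regular expression and the standard library; the pattern is deliberately simple and the sample text is invented, so it should be read as an approximation rather than the exact procedure followed here.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified pattern: a URL ends at whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def count_domains(text):
    """Count how often each domain appears among the URLs found in text."""
    return Counter(urlparse(url).netloc for url in URL_PATTERN.findall(text))


# Invented excerpt of meeting minutes.
minutes = (
    "Discussed https://github.com/example/repo/issues/1 and the follow-up "
    "https://github.com/example/repo/pull/2 (spec: https://iiif.io/api/discovery/1.0/)"
)
domains = count_domains(minutes)
```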

As the Discovery TSG wrapped up its active phase, its contributions to the IIIF community proved significant, setting critical standards for interoperability. However, the full impact of its work remains to be seen. To date, only a limited number of institutions have implemented the Change Discovery API, and while some viewers support the basics of the Content State API, the full intent of the standard has been realised predominantly through demo clients. This scenario underscores the ongoing challenge within the community: not only to develop these capabilities, but also to deploy and use them widely.

6.2.1.2 Linked Art Meetings

Now transitioning into the Linked Art meeting minutes, some parallels with the IIIF Discovery TSG meetings can be observed, yet with distinct dynamics reflective of its ongoing activity. This analysis provides a snapshot of the Linked Art meetings from January 2019 to March 2024. The group convenes fortnightly to deliberate on refining the model and the API, with a strong emphasis on releasing the first stable version and integrating the patterns and specification into various platforms.

Figure 6.3 indicates participation trends over this period, showing that 130 different individuals attended the 115 meetings held, predominantly in a virtual format, with five of these meetings being face-to-face[242]. Each participant attended an average of 13.57 sessions, but the median attendance was only 2, indicating a very long tail of participation. This suggests a core group of highly engaged members who contribute consistently, while a larger number of participants engage more sporadically. This distribution highlights the challenges and successes of sustaining engagement over time, particularly in a volunteer-driven context where continuous engagement can be difficult to maintain.

Figure 6.3: Stacked Bar Chart Showing Linked Art Meeting Attendance

A detailed examination of meeting attendance over the years for the Linked Art initiative reveals active yet fluctuating participant engagement. In 2019, 26 meetings were held with 53 unique participants and an average attendance of 17.88 per meeting. This high level of involvement suggests robust initial interest and active participation in the foundational phases of the initiative.

However, subsequent years showed a slight decrease in engagement: 2020 saw 23 meetings with 34 unique participants and an average attendance dropping to 15.48. In 2021, the number of meetings increased slightly to 25 with 48 unique participants, averaging 13.92 attendees per session. The year 2022 saw 19 meetings with 46 unique participants and an average attendance of 14.74, showing a stabilisation in engagement levels despite fewer meetings.

The year 2023 continued with 18 meetings, and an increase in unique participants to 57, suggesting a widening of the community base, with an average attendance of 14.11 per meeting. The early data for 2024 shows 4 meetings with 25 unique participants and a relatively high attendance rate of 15.25, indicating sustained, albeit selective, engagement.

The data also underscores the significance of github.com as a pivotal platform for discussion and collaboration within the Linked Art meetings, with over 431 links pointing to GitHub, highlighting its use for project management and technical discussions. Other frequently mentioned domains include linked.art with 44 mentions and docs.google.com with 35 references, reflecting their roles in project-specific discussions and documentation, respectively. Domains such as www.getty.edu and vocab.getty.edu were also significant, cited 10 and 9 times respectively, indicating the integration of controlled vocabularies in standardising terminologies within the Linked Art context.

While the Linked Art group originally had optimistic goals of releasing the first stable version of the specifications by 2020 and then 2021, the actual release is now expected to occur in 2024. This extended timeline, while longer than originally planned, has had an unforeseen benefit: it has allowed various platforms to experiment more extensively with implementing the evolving model, even in the absence of a finalised standard. Such experimental implementations have not only helped to refine the specifications through practical application, but have also fostered a richer understanding among actors of how the model can be adapted and used. Although Linked Art focuses primarily on museum artworks and their relationships, treating them as first-class citizen resources within the digital representation sphere, there is an observable shift towards embracing the broader GLAM domain. This broadening of scope suggests significant potential for Linked Art to impact a wider range of CHIs, improving cross-domain semantic interoperability while maintaining a hands-on orientation.

6.2.2 Collaborative Development through GitHub Documentation

Within IIIF and Linked Art, GitHub has solidified its position as a central hub for collaboration, further illustrating the pragmatic approach adopted towards research and development (see Scroggie et al., 2023). This preference for GitHub, complemented by the use of universally accessible tools like Google Drive for documentation, reflects a shift towards platforms that facilitate broad participation and collaboration. Such practices are integral to the collaborative process, enhancing transparency and efficiency in addressing and resolving issues (Escamilla et al., 2023).

Figure 6.4 illustrates the journey from a conceptual use case to its technical realisation within both the IIIF and Linked Art communities, highlighting the shared methodologies despite the different governance structures. The diagram begins with the community’s identification and documentation of a use case, which leads to the creation of a corresponding GitHub issue. This issue acts as a nexus for collaborative discussion, leading to the creation of a GitHub pull request proposing specific changes or additions to the specifications. The figure highlights the critical role that community editors and reviewers play in this sequence, and the collaborative vetting process that ensures that proposals are consistent with the technical and philosophical ethos of the communities. For IIIF, this process extends to formal review mechanisms involving the TRC, which ensures adherence to established standards and, where needed, validation against a validator. In contrast, Linked Art, while using a similar collaborative and open platform for discussion and documentation, manages this process without a formal TRC, relying on broader community engagement and consensus, especially from its Editorial Board.

Figure 6.4: Illustrative Diagram of the GitHub-Based Process for Approving and Integrating Use Cases

Figure 6.5 details the process undertaken by the IIIF community to review and integrate a specific GitHub issue[243]: the validation of the Image and Presentation APIs 3.0, which took place in May 2020[244]. This detailed illustration shows the relationships and interactions between the various actors, and highlights the structured approach to consensus building, specification development and implementation by compatible software. The process begins with a human collective, such as the IIIF Editorial Committee, identifying and proposing changes, which are then carefully discussed and voted on by TRC members using platforms such as GitHub and Zoom. The involvement of the TRC is critical, providing the formal review and approval necessary for the adoption of changes, marked by key milestones such as the release of the API 3.0 Release Candidate. This figure emphasises the essential role of compliance and compatibility checks against the APIs’ JSON-LD @context and the relevant validators, ensuring that updates are seamlessly integrated into the wider IIIF ecosystem, supporting a wide range of compliant clients.

Figure 6.5: Detailed Illustration of the GitHub-Enabled Approval Process by the TRC for the Image and Presentation APIs 3.0
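
As a minimal sketch of the kind of compatibility check mentioned above, the following Python snippet tests whether a resource declares the Presentation API 3.0 JSON-LD context. It is not the validator used by the IIIF community, only an illustration; the function name and the abridged manifest are hypothetical, and a real validator checks far more than the @context.

```python
# Presentation API 3.0 JSON-LD context URI, as published by IIIF.
PRESENTATION_3_CONTEXT = "http://iiif.io/api/presentation/3/context.json"


def declares_presentation_3(resource: dict) -> bool:
    """Return True if the resource's @context includes the Presentation 3 context.

    The @context value may be a single string or a list of strings/objects.
    """
    context = resource.get("@context")
    if isinstance(context, str):
        return context == PRESENTATION_3_CONTEXT
    if isinstance(context, list):
        return PRESENTATION_3_CONTEXT in context
    return False


# Hypothetical, heavily abridged manifest used only for illustration.
manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/manifest.json",
    "type": "Manifest",
    "label": {"en": ["Example"]},
}

print(declares_presentation_3(manifest))
```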

The approval of both specifications by 30 members of the TRC marks a significant milestone, reflecting a substantial level of consensus for such a key initiative. While this level of participation is remarkable in the context of other use cases, it also demonstrates the difficulty of achieving widespread participation. The fact that not all TRC members were present or voted on these crucial specifications, and that not every institution within the IIIF-C consistently sends a representative, suggests areas for improvement in engagement and representation within the decision-making processes of the community[245].

Despite these challenges, there’s a great deal of trust within the community in those who manage the development and implementation of the specifications. Many institutions and developers are diligently adapting their content and software to meet or upgrade to the latest specifications. However, without a concentrated authority to enforce compliance, deviations occur. These range from instances where cross-institutional interoperability remains achievable despite unorthodox approaches, to scenarios where outright anti-patterns are deployed, underscoring the diverse adherence to the APIs[246].

While there is no formal policing of implementations within the IIIF or Linked Art community, understanding and addressing these deviations is crucial for the ecosystem’s health. Moreover, as both communities maintain open and shared specifications, the responsibility falls on the community itself to oversee and assess how the standards are leveraged across different settings.

6.2.3 Community Guidelines and Best Practice Efforts

As highlighted in the literature review, IIIF and to some extent Linked Art publish guidelines, primarily on their website. These resources are designed to facilitate the discovery of standards-compliant resources across different platforms and to assist individuals, projects and organisations in implementing these standards. This dissemination of knowledge plays a key role in enabling a wide range of users to effectively engage with and use IIIF and Linked Art specifications, in many cases mediated by external interfaces.

These best practices are critical as they demonstrate the full potential of the specifications and, in the case of Linked Art, demonstrate the full promise of the model. However, when moving from one client to another, one often encounters BOs, as parts of a resource may be supported differently or not at all (see Raemy, 2022). In the context of Linked Art, starting with the model before considering the implementation of API endpoints can lead to complex situations. This complexity arises when considering cases of transclusion, as the rigidity of the API requires careful consideration of how different aspects of the model are represented and accessed across endpoints, necessitating a nuanced approach to integrating the model with API capability.

To further understand the nuances of these guidelines and best practices, in March 2024 I undertook an in-depth exploration of the IIIF cookbook recipes (Section 6.2.3.1) and the Linked Art patterns (Section 6.2.3.2). This exploration was reasonably grounded in SNA, using NetworkX and Gephi Lite, so that commonalities within this set of guidelines could be discovered and visualised.

6.2.3.1 IIIF Cookbook Recipes

Figure 6.6 illustrates a network derived from the IIIF cookbook recipes and their support across various viewer clients, including UV, Mirador, Annona, Ramp, Clover, and the navPlace Viewer[247]. In this graph, nodes represent either specific IIIF Cookbook Recipes, identified by unique slugs and labelled with descriptive titles, or viewer clients, serving as the primary agents interacting with these recipes. The edges between nodes are colour-coded to signify the degree of support – ‘yes’ for full support or ‘partial’ – for each recipe by the respective client.

Figure 6.6: Exploring IIIF Cookbook Recipes by Support with Gephi Lite

The allocation of node colours, predicated on their modularity class, categorises the recipes, elucidating their thematic relevance and grouping within the IIIF ecosystem. This classification sheds light on the recipes’ roles and their significance to the community, from simple manifest creation to more complex implementations like audio, video, and multi-volume work representations. The size of each recipe node quantitatively reflects the extent of client support, visually emphasising recipes that have garnered widespread adoption or those that may require further advocacy. In a more analytical lens, the graph unveils the pivotal roles of Mirador and UV within the framework, underscoring their centrality and indispensability in the broad spectrum of implementing the core APIs. Concurrently, it delineates specific trajectories, such as Annona and Mirador’s inclination towards annotations, and Ramp’s association with audiovisual content.
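
The structure of such a graph can be sketched with the Python standard library alone. The example below is an illustration under stated assumptions rather than the code used for Figure 6.6: the recipe slugs, the support levels and the edge colours are invented, and the analysis itself relied on NetworkX and Gephi Lite. It shows how edges carry a support level and how the size of a recipe node can be derived from the number of clients connected to it.

```python
from collections import defaultdict

# Hypothetical support matrix: (recipe slug, viewer client, support level).
# Slugs and levels are invented for illustration, not taken from the dataset.
support = [
    ("recipe-a", "Mirador", "yes"),
    ("recipe-a", "UV", "yes"),
    ("recipe-a", "Clover", "partial"),
    ("recipe-b", "Mirador", "yes"),
    ("recipe-b", "Annona", "partial"),
    ("recipe-c", "Ramp", "yes"),
]

# Assumed colour coding for the two support levels.
EDGE_COLOURS = {"yes": "green", "partial": "orange"}


def build_graph(rows):
    """Return the edge list and the degree of every node (recipes and clients)."""
    edges = []
    degree = defaultdict(int)
    for recipe, viewer, level in rows:
        edges.append({"source": viewer, "target": recipe, "colour": EDGE_COLOURS[level]})
        degree[recipe] += 1
        degree[viewer] += 1
    return edges, dict(degree)


edges, degree = build_graph(support)

# Recipe node size: number of clients offering full or partial support.
recipe_sizes = {recipe: degree[recipe] for recipe, _, _ in support}
print(recipe_sizes)
```

In this toy data, the client with the highest degree plays the role that Mirador and UV play in the actual graph; modularity classes would require a community detection step that is not reproduced here.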

In terms of JSON Path analysis, a thorough examination was conducted to discern patterns and insights regarding the structure and use of properties across the IIIF Cookbook Recipes. Although this analysis was meticulously performed, it did not yield any particularly striking results or unexpected insights[248].
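
To make the idea of a JSON Path analysis concrete, the following sketch lists a JSONPath-like string for every property of a JSON document and counts how often each path occurs. It assumes one plausible way of conducting such an analysis and is not the procedure used in this study; the function name and the abridged manifest are hypothetical.

```python
from collections import Counter


def json_paths(node, prefix="$"):
    """Yield a JSONPath-like string for every property, collapsing array indices to [*]."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}"
            yield path
            yield from json_paths(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from json_paths(item, f"{prefix}[*]")


# Hypothetical, heavily abridged manifest used only for illustration.
manifest = {
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "items": [
        {"id": "https://example.org/canvas/1", "type": "Canvas", "height": 100},
        {"id": "https://example.org/canvas/2", "type": "Canvas", "height": 200},
    ],
}

path_counts = Counter(json_paths(manifest))
for path, count in sorted(path_counts.items()):
    print(path, count)
```

Aggregating such counters over all recipes would show which properties are used most often and at which depth.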

For content similarity, leveraging the cosine similarity scores from the analysis, two pairs of recipes emerged as notably similar, showcasing the closest thematic and content-based connections within the dataset. The recipes ‘Embedding HTML in descriptive properties’[249] and ‘Rights statement’[250] shared the highest similarity score of 0.672318, indicating a significant overlap in their content and approach. This high degree of similarity suggests that these recipes, while distinct in their focus, employ comparable structures or narratives. Another pair, ‘Support Deep Viewing with Basic Use of a IIIF Image API Service’[251] and ‘Image in Annotations’[252], also demonstrated a notable similarity score of 0.541640.
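
Cosine similarity measures the angle between two term vectors, so that a score of 1 indicates identical term distributions and 0 indicates no shared terms. The sketch below assumes raw term counts as the vector representation; how the recipes were actually tokenised and weighted is not specified here, and the two text snippets are hypothetical stand-ins for recipe content.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Lower-case the text and count alphanumeric tokens (raw term frequencies)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical snippets standing in for the text of two recipes.
text_a = "rights statement"
text_b = "rights metadata"
score = cosine_similarity(term_vector(text_a), term_vector(text_b))
print(round(score, 6))
```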

Additionally, the examination sheds light on the significant re-use of fixtures or content links across the recipes, which underscores the foundational role of these materials within the IIIF ecosystem. The repeated utilisation of specific resources, such as the image captured during the 2019 IIIF Conference in Göttingen, Germany[253], which appears 41 times across various guides, exemplifies how readily available and central these resources have become in demonstrating IIIF capabilities. Similarly, audio and video fixtures from Indiana, such as the Mahler Symphony Audio[254] and the Lunchroom Manners Video[255], highlight the framework’s adaptability and richness in managing a diverse range of multimedia content. However, it also highlights a certain tendency towards using readily available or well-known resources, often those directly provided by the IIIF community or known to be in the public domain. This practice, while practical, sometimes leads to a lack of diversity in the resources used, underscoring an opportunity for the IIIF Cookbook authors to explore and integrate a wider array of materials.
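
Counting the re-use of fixtures amounts to tallying how often each link occurs across the recipes. The following sketch assumes that the recipes are available as plain text or serialised JSON; the URLs, the recipe names and the regular expression are illustrative assumptions, not the procedure behind the figures reported above.

```python
import re
from collections import Counter

# Simple URL matcher: stops at whitespace, quotes and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def count_links(documents: dict) -> Counter:
    """Count every occurrence of every URL across all documents."""
    counts = Counter()
    for text in documents.values():
        counts.update(URL_PATTERN.findall(text))
    return counts


# Hypothetical recipe excerpts with invented fixture URLs.
recipes = {
    "recipe-a": 'body: "https://example.org/fixtures/photo.jpg", '
                'thumbnail: "https://example.org/fixtures/photo.jpg"',
    "recipe-b": 'body: "https://example.org/fixtures/photo.jpg", '
                'audio: "https://example.org/fixtures/audio.mp4"',
}

link_counts = count_links(recipes)
print(link_counts.most_common(1))
```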

6.2.3.2 Linked Art Patterns

Turning the analytical gaze to Linked Art, I extend my exploration to the patterns that underpin this vibrant ecosystem. This progression makes it possible to discern the web of relationships that animate the Linked Art data model, mirroring here the methodological approach through graphs with the IIIF cookbook recipes.

Figure 6.7 presents a graph where nodes symbolise Linked Art patterns[256]. The edges in this visualisation represent the relationships derived from the patterns’ data, notably in terms of model components and their (sub)category. The colour of each node is determined by the pattern’s category, signifying the thematic or functional group to which it belongs. Meanwhile, the edge colour is informed by the source nodes, providing insight into the directionality and origin of each relationship.

Figure 6.7: Exploring Linked Art Patterns by Category with Gephi Lite

A diverse array of 136 patterns emerges, categorised under a range of themes[257], which serve as the foundation for resource description and interaction. The analysis identifies a comprehensive list of patterns, each addressing specific aspects of CH data modelling. Notably, some patterns function as sub-patterns, inheriting their parent name as one of their categories. Exploration of pattern similarity within the Linked Art model reveals noteworthy connections[258], reflecting the integrated and comprehensive nature of the model. While these findings underscore the versatility and depth of the framework, its true value extends to practical applications. The insights gained from these similarities provide a rich resource for optimising Linked Art implementations in real-world scenarios, suggesting a broad potential for reuse in various CH contexts.

It’s important to recognise that, like the IIIF cookbook recipes, this graph represents a snapshot in time, reflecting the entities and relationships documented up to a certain point. While this visualisation provides a window into the modelling efforts within Linked Art, other examples using different entities could be explored, recognising the dynamic and evolving nature of the initiative.

6.3 Community Participation and LOUD Standards: Adoption and Impact

This section addresses the objective of examining community participation, alongside the evolutionary adoption of LOUD standards and their potential impact with respect to scientific movements and principles. In pursuit of this objective, data were collected from adoption surveys conducted within the IIIF community, enriched further by insights drawn specifically for this study on the socio-technical characteristics of both IIIF and the Linked Art community. This comprehensive dataset (see Raemy, 2023) serves as a foundation for this exploration, offering a unique lens through which to examine the dynamics of community engagement, technological adoption, and the overarching influence of LOUD standards.

Following the publication of a detailed survey report on the characteristics of IIIF and Linked Art (see Raemy, 2023), this section presents a comprehensive summary of its findings. The survey has provided insightful revelations into the dynamics of community engagement, preferred practices and tools, barriers to wider participation and the resulting implications for the CH sector. These findings underscore the central role of open standards and collaborative efforts in shaping practices, and provide a nuanced understanding of how these communities contribute to the development and adoption of LOUD standards.

Conducted between March and May 2023[259], the online survey attracted 79 participants from 20 countries[260]. It was targeted at individuals involved in IIIF and Linked Art, but also at people who had heard of the communities or the specifications. The survey’s structure included sections tailored to gauge involvement in either or both communities, with branching logic to ensure relevance to participants’ experiences. It captured data on socio-demographics, organisational affiliation, and the extent of participants’ involvement in community events, activities, and calls, offering a window into the communities’ composition and engagement patterns.

Results highlighted a global distribution of participants, with a noticeable concentration in the Global North, and indicated a wide range of engagement levels, from passive awareness to active participation in community events and activities. The findings underscore the communities’ commitment to fostering global collaboration, despite challenges such as geographical and institutional representation biases. The survey serves as a foundational effort to map the landscape of these communities, identifying areas for growth and increased inclusivity.

The survey findings reveal a core group of individuals actively involved in both the IIIF and Linked Art communities, underscoring the depth of engagement among key members. This involvement reflects a collaborative spirit and a shared dedication to advancing the digital cultural heritage landscape.

Figure 6.8 illustrates the dynamics of this engagement within the IIIF community, mapping out the timeline of involvement and highlighting the active participation in community calls. It shows the year of IIIF involvement along the x-axis against participant IDs on the y-axis, representing engagement through point size for IIIF call participation over the past year. It highlights a pattern of greater participation from members involved since 2016 or earlier, with one instance of engagement traced back to 2009, demonstrating the impact of long-term involvement and hinting at early contributions to the IIIF community’s evolution. Participants who are only involved in the Linked Art community are not included in this plot.

Figure 6.8: Year of Involvement in the IIIF Community (Raemy, 2023, p. 20)

Following on from the visualisation of IIIF's community engagement, the responses of 52 participants are analysed in Figure 6.9, which situates IIIF's contributions within broader scientific and data governance movements. The data reveals a strong recognition of IIIF's pivotal role in several areas: a substantial majority affirm its importance to Open Science and Citizen Science initiatives, illustrating its far-reaching benefits. Participants also highlighted IIIF's substantial contribution to the FAIR principles, particularly in the areas of interoperability and reusability, demonstrating its effectiveness in promoting a more connected and accessible digital research environment. However, further research is needed to fully assess IIIF's adherence to the CARE Principles for Indigenous Data Governance, suggesting a potential area for future developmental focus.

Figure 6.9: Situating IIIF to Movements and Principles (Raemy, 2023, p. 21)

Shifting the emphasis to Linked Art, analysis of the engagement of 16 participants in Figure 6.10 provides a nuanced view of the role of Linked Art in relation to the same scientific and governance movements. Although the pool of participants is smaller, resulting in a more limited data set, the responses still provide useful insights. Participants recognise Linked Art’s potential contributions to Open Science and Citizen Science, albeit to a lesser extent than IIIF. In addition, the feedback highlights Linked Art’s strong capabilities in interoperability, in line with the FAIR principles, particularly in the areas of discoverability, accessibility and reusability. These findings highlight the need for further empirical studies to validate and refine Linked Art’s alignment with CARE, and provide a clear direction for future research and community engagement strategies.

Figure 6.10: Situating Linked Art to Movements and Principles (Raemy, 2023, p. 22)

Following the survey’s broad findings on participant distribution and engagement, we transition to a focused exploration of the IIIF and Linked Art communities’ characteristics. Through this research, I seek to gain a better understanding of the contributions and challenges faced by these communities.

6.4 Synthesis and Insights

This section synthesises the present chapter and provides some insights that can be drawn from the social fabrics of IIIF and Linked Art.

Behind the visible activities like meetings lies a preponderance of preparatory work managed by co-chairs, editorial boards, or driven by community-generated use cases. This foundational work often predetermines the direction and outcomes of formal gatherings. Meeting patterns indicate initial high attendance, which peaks at certain project milestones before declining. Scheduling meetings to accommodate broader global participation, particularly from Asia and Oceania, should be considered to enhance inclusivity, although the primary interaction remains asynchronous.

GitHub serves as a critical hub for community engagement, with a core group of active contributors who often consistently attend meetings. This platform lowers the barrier to decision-making within the community, although it also reflects biases similar to those found in FLOSS communities in terms of inclusiveness. The assertion that ‘standards are made by the people who show up, and not nearly enough people are showing up’, as posited by Cramer (2016), underscores the critical role of active participants in shaping these standards. However, when individuals from privileged backgrounds wield disproportionate influence in these participatory roles, the issue of over-representation comes into sharp focus.

This demographic homogeneity can give rise to several challenges, including the perpetuation of biases and the neglect of pivotal issues of relevance to underrepresented or marginalised groups. Such imbalances can lead to standards that inadequately reflect the comprehensive spectrum of societal needs and realities. Furthermore, participation in these standardisation processes is itself a privilege, a point underscored by Capadisli (2023), suggesting the need for greater inclusivity in these influential roles. Moreover, the assumption that internet access and digital devices are universally available is critically examined in works such as Czahajda et al. (2022), which use ANT to reveal key actors in the digital landscape. This echoes the issues of the IIIF community, where the generation of IIIF resources presupposes means that may not be accessible to all, particularly outside the Global North and among non-English speakers.

The IIIF Cookbook recipes and Linked Art patterns demonstrate the tension between the creation of advanced specifications and their practical implementation by platforms. While new recipes and patterns continue to emerge, actual implementation in software and tools lags behind, reflecting a gap between ideation and practical application. For instance, the IIIF community is actively developing customised tools, particularly for annotations, aimed at catering to users with advanced needs. These tools are designed to enhance usability and functionality but have yet to achieve widespread deployment. This ongoing development indicates that the community is still navigating the optimal routes to broad adoption and interoperability, striving to balance sophisticated requirements with general usability. As another example, the question of how digital frameworks represent non-human entities remains a point of discussion within Linked Art, reflecting wider societal challenges in ensuring that diverse perspectives are included in digital and cultural narratives.

Survey findings also underscore the need for ongoing efforts to develop LOUD standards that foster an inclusive, dynamic digital ecosystem. Future strategies should include creating educational resources and frameworks that support interdisciplinary collaboration and reduce barriers to participation.

In summary, as the IIIF and Linked Art communities progress, their continued relevance will increasingly depend on their ability to embrace diverse perspectives and adapt to changing technological and cultural landscapes. The concept of sympoietic collaboration, inspired by Donna Haraway’s ideas (Haraway, 2016, pp. 86–87), illustrates this path forward. ‘Sympoiesis’, or making-with, in contrast to ‘autopoiesis’, or self-making, portrays these communities not just as isolated entities but as interdependent networks. This interdependence must extend to incorporating new voices into ideas, meetings, and implementations, fostering vibrant ecosystems where diverse contributions enrich the shared narrative. By prioritising mutual collaboration over individual autonomy, sympoietic collaboration aims to enrich the field, ensuring that digital heritage remains dynamic and inclusive. Pragmatically, this discussion will be continued in the following section, where I will explore specific perspectives and suggest potential pathways forward.

6.5 Perspectives

This section reflects on the empirical findings of this chapter and outlines prospective avenues for both the IIIF and Linked Art communities, emphasising strategic steps forward in community engagement, technical innovation, and culminating in discussing integration practices.

6.5.1 Enhancing Community Engagement

Both the IIIF and Linked Art communities face challenges in transforming increased membership into active participation. Despite a tenfold increase in members on platforms like the IIIF Slack workspace over the past seven to eight years, this growth has not translated into a proportional increase in contributions. Advocacy efforts need to focus on making these platforms more welcoming and integrating passive members into active roles. One of these efforts in the IIIF community is the Ambassadors programme[261], which could be expanded. For Linked Art, it is essential to increase engagement through clear role definitions and regular interactive sessions that go beyond the technicalities of the model, including workshops and webinars. These efforts should aim to demystify the participation process and encourage more diverse community involvement.

While the primary tools for collaboration in the IIIF and Linked Art communities, namely GitHub and Google Drive, are hosted by GAFAM, there are growing calls to consider alternatives that may better align with open-source values and data sovereignty concerns. Codeberg[262], as an alternative to GitHub, represents one such possibility. However, transitioning to new platforms would be a significant undertaking, requiring considerable effort, adaptation and logistics. Exploring these options could nonetheless help mitigate the risks associated with reliance on major tech conglomerates and foster a more decentralised approach to digital resource management.

6.5.2 Technical Development

The technical backbone of both the IIIF and Linked Art communities relies on the continuous development and maintenance of essential tools.

For IIIF, this includes servers and clients such as Cantaloupe, Mirador, and UV. These tools not only need regular updates to stay compliant with evolving standards but also require careful management of dependencies, such as upgrading to the latest versions of underlying technologies like React for Mirador, to avoid security vulnerabilities (see White, 2023). Similarly, Linked Art’s integration efforts must focus on enhancing interoperability with broader standards like RiC and Schema.org, which will improve its usability across various DAMs. This effort should be supported by the development of targeted guidelines that cater to different technical levels and user groups, and the creation or availability of comprehensive workflows and tools essential for the uptake of the standard.

These measures are crucial for ensuring the broader adoption and sustainability of IIIF and Linked Art. By keeping technological infrastructure up-to-date and secure, and by enhancing interoperability with other standards, the communities can accommodate a wider range of platforms and users. Additionally, providing clear guidelines and accessible tools is essential for lowering entry barriers, thereby facilitating a more widespread and effective implementation of these standards across diverse environments.

6.5.3 Integration Practices

Robust governance is essential for sustaining the technical infrastructure and community cohesion in both IIIF and Linked Art. Establishing diverse expert governance bodies can guide strategic developments and ensure adherence to best practices. These bodies should also address the preservation of institutional knowledge to maintain continuity amidst changes. The role of initiatives like iiif-commons[263] is critical here, as they lower barriers to entry and support the development of both general-purpose and specialised tools, accommodating a wide range of community needs[264].

LOUD entities, in the context of this analysis, can be perceived as ‘[a construction of] new properties to perceive in the world’ (McCarty, 2023, p. 251, citing Anderson, 2014), suggesting that these entities do more than just exist; they actively shape our understanding and engagement with CH. By rearranging data and metadata in unprecedented ways, LOUD instances allow us to see connections and interactions that were previously obscured. This conceptual framework not only enriches our interaction with digital archives, but also challenges us to rethink how these technologies can be used to foster a more inclusive and dynamically interconnected cultural landscape.

Ultimately, the synthesis of insights from the IIIF and Linked Art initiatives provides a robust framework for future growth. By fostering an environment that encourages sympoietic collaboration and integrates diverse perspectives, these communities can continue to thrive and lead in the creation of open ecosystems. Integrating new voices and adapting to the shifting technological landscape will be crucial for the continued relevance of the IIIF and Linked Art communities.

The following two chapters look at the practical implementation of their respective APIs in a research project and on larger systems, illustrating their versatility and implications in a number of different settings.

7. PIA as a Laboratory

[D]igital environments are, of course, themselves socio-technical assemblages, with agency, affective and material qualities. (Edwards, 2011, p. 48)

This chapter focuses on an empirical exploration of the implementation and evaluation of LOUD standards within the PIA project, which runs from February 2021 to January 2025. Specifically, the discussion covers the refinement of the data model, the creation of bespoke IIIF-compliant resources, Linked Art templating and the technological framework to undertake the digitisation, cataloguing and indexing of the associated CAS photographic collections (SGV_05, SGV_10 and SGV_12).

This is an attempt to consider how the PIA project’s implementation of LOUD standards enhances our understanding of their potential to facilitate data reuse and wider participation. I argue that the implementation of shared APIs allows for a prompt understanding of the benefits and limitations of these standards, drawing on the expertise and contributions of the extended communities. Expected outcomes include improved accessibility and reusability of heritage data, as well as wider engagement with diverse user communities. The aim is to implement and evaluate these standards in an exploratory setting to assess their impact on long-term engagement. The choice of PIA as a testing ground is suited both to the fluidity of research environments and to the possibility of experimenting with different modes of practice. Research agendas are inherently transient and flexible, making PIA a dynamic space for the uptake and evaluation of LOUD standards. This approach not only deepens the understanding of these technologies, but also increases the flexibility to adapt to emerging capabilities.

In Section 7.1, I begin by providing an overview of the infrastructure baseline that was established at the start of the project and the data that was aggregated to set the stage for a thorough assessment of the research. The application and impact of LOUD standards within the research project will be explored in Section 7.2, a section that is crucial to deriving actionable insights and understanding the wider impact of LOUD in a real-world and relatively short-term context. The synthesis of these findings in Section 7.3 provides a comprehensive overview of the impact and potential benefits of LOUD standards on participatory knowledge practices.

The chapter concludes with Section 7.4, which offers insights into the future implications of this work for research and practice, for what will be done until the end of the PIA project or beyond.

7.1 Data and Infrastructure Baseline at the Project Onset

The journey of the analogue archives of CAS began at the end of the nineteenth century with the creation of the first archives by the SSFS, now known as CAS, in 1896. The society began collecting photographs in the 1930s and now holds approximately 300,000 objects spread across 30 collections. This first phase, which continued well into the era of digitisation, consisted of maintaining and expanding analogue archives. Over time, as social and cultural contexts evolved, so did archival practices and technologies. Early photographic collections were central to shaping national identities and cultural origins during periods of significant industrial and social change (see Caraffa & Serena, 2015). These collections were not simply repositories of existing cultural elements, but were instrumental in constructing the very notions of culture they purported to document through the systematic collection, archiving and interpretation of images and materials.

In the early 2010s, the second stage began, involving the digitisation of approximately 105,000 photographs, in collaboration with the DHLab and the restoration atelier of Anklin & Assen (Graf et al., 2019). This phase included the development of a database for the digitised images and the addition of metadata to these images. The move to digital archives offered many benefits, including easier classification, access from anywhere, and increased security through redundant storage (see Rosenthaler & Fornaro, 2016). Moreover, as the physical life of analogue originals diminishes and reproduction becomes more difficult, digitisation has become an essential strategy not only for preservation but also for redefining collections as integral components of CH in today’s knowledge society.

The digital transformation of the CAS archives has involved a network of interconnected actors, primarily centred around the University of Basel. The analogue CAS archives are stored on one of the university’s premises, underlining the collaborative ties and shared resources between the Institute of Cultural Anthropology and European Ethnology and the broader university framework (Eggmann & Bischoff, 2010; Leimgruber et al., 2011). The digitisation initiative, overseen by the DHLab, reinforces the university’s key role in the digital preservation and management of CH (Fornaro & Chiquet, 2020). After its inception as a project within the DHLab, DaSCH became an independent entity in 2021, operating as a national research data infrastructure with substantial support from the SNSF, with the University of Basel continuing as the hosting institution (Gautschy, 2022; Rosenthaler et al., 2015). This transition underlines the integral role of the University of Basel in the management of digital archives. In addition, the PIA project involves similar stakeholders, such as the Institute and the DHLab, and now also the HKB as design experts, with DaSCH overseeing the long-term preservation of the (meta)data (Cornut et al., 2023).

Among the practices for cataloguing and indexing the photographic archives of CAS was, and still is, spreadsheet-based processing of metadata according to a specific data model agreed upon with the DHLab, and uploading this along with the associated data, i.e. the digitisations, onto the Salsah platform, which was the first virtual research environment developed for DaSCH. The metadata uploads could be corrected at any time using an online mass editor on Salsah, and sometimes the digitised objects were uploaded first, sometimes in parallel with the metadata, depending on the needs and resources available.

Figure 7.1 shows a rather elementary data model of SSFS, which was still in use at the beginning of the PIA project. It is primarily centred on photographic objects, through sgv:Image, sgv:Album or sgv:Tonbildschau, although it already had links to other types of entities, whether people or subjects, or links to Geonames[265] for geographical referencing. While this model aligns with the spirit of Linked Data, its integration was quite basic; apart from using Dublin Core for titles (dc:title), there were no other subproperties or mappings being actively implemented [266].

Figure 7.1: Overview of the Legacy SSFS Data Model

Additionally, the naming convention for properties did not fully adhere to camel case[267], a convention that combines words or phrases by capitalising all words following the first word and removing spaces (e.g., camelCase), and that is often used to improve readability and compatibility in coding environments. The eight classes of the SSFS data model are as follows:

Furthermore, a series of lists completes this legacy data model, notably on the accepted formats (through has:Format), medium (via hasModel), and types of photographic objects (via hasObjecttype), in which a hierarchy was created and is maintained by CAS.
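To make the camel case convention mentioned above concrete, the following minimal Python sketch joins a list of words into lower camel case. The helper name `to_camel_case` and the sample words are illustrative assumptions; they are not part of the SSFS data model or of any DaSCH tooling.

```python
def to_camel_case(words):
    """Join words into lower camel case: first word lowercase, later words capitalised."""
    if not words:
        return ""
    first, *rest = [word.lower() for word in words]
    return first + "".join(word.capitalize() for word in rest)


# Hypothetical illustration: under a strict camel case convention, a property
# built from the words "has", "object", "type" would be spelled as printed here.
print(to_camel_case(["has", "object", "type"]))
```

A property name such as hasObjecttype differs from the result of this helper only in the capitalisation of the last word, which is the kind of small inconsistency described above.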

A dedicated website has also been created to present the digitised photographic archives, aggregating (meta)data from the Salsah API. On this interface, and still at the time of writing in April 2024, visitors can access the images digitally preserved by DaSCH, although some of these have not yet been catalogued and/or digitised.

This phenomenon can be clearly illustrated by the three corresponding CAS collections studied over the course of the PIA research project. In the SGV_05 collection, which comprises approximately 90,000 objects, none had been catalogued or digitised prior to the initiation of PIA, with these processes beginning concurrently once the project commenced. However, not all items in this collection are slated for digitisation or fine cataloguing. In the case of the SGV_10 collection, although over 12,000 photographs were digitised and catalogued, the level of detail in this cataloguing remains minimal. Regarding the Kreis Family collection, approximately fifty photo albums had yet to be digitised. For the SGV_12 collection, while more than 47,000 black and white negatives were successfully digitised and catalogued, there remained over 10,000 prints yet to be processed. These prints, a curated selection by the photographer Brunner, are organised within boxes according to a nomenclature established by the photographer himself. Additionally, some of these objects from these three collections needed to be restored prior to digitisation[268].

The following consideration regarding the assignment of identifiers within the CAS photographic archives is intended to provide additional contextual information. Understanding the structure of these identifiers is important, particularly for images without extensive associated metadata, as the identifiers themselves provide valuable clues as to the original format of the items and the collection to which they belong. This systematic approach to naming helps to infer details that are not always explicitly documented elsewhere within the metadata.

Initially, the database migration from Salsah to DSP was scheduled to be completed by the start of the project in 2021. However, delays meant that this task extended nearly to the project’s conclusion, necessitating some extraction and handling of (meta)data, especially at the beginning of PIA. This ongoing requirement significantly influenced the project workflow and data management strategies, underscoring the dynamic nature of managing large-scale digital archives.

The migration to DSP also provided an opportunity to redesign the data model to better meet the needs of both CAS and PIA. A balance had to be struck between minting new PIA-specific properties and keeping the project-driven extension of the model in check, so as not to overload the CAS data model. DSP was developed as a replacement for Salsah and consists of three main components: DSP-APP, DSP-API and DSP-TOOLS. DSP-APP is the GUI that allows users to view and edit data directly in their browser, improving accessibility and interaction. DSP-API acts as the core of the software stack, an RDF database that provides API access to data, enabling data interactions. Finally, DSP-TOOLS is a command-line tool for uploading data models and large datasets to a DSP server, streamlining the data integration process.

One of the challenges was to integrate LOUD standards within these processes and to coordinate this implementation alongside the myriad satellite activities of the research project. In the following section, I will delve into the specifics of these efforts and discuss how collaboration and technological adaptation were managed concurrently with the main objective tied to my thesis within PIA.

7.2 LOUD Standards in an Explorative Setting

PIA has envisioned the development of processes, tools and interfaces to generate and make visible knowledge in a participatory manner. The purpose is to enable intuitive access and exploration, based on the example of three collections from the CAS photographic archives. To this end, we have used a variety of face-to-face, virtual and hybrid formats: plenary meetings every six weeks, group and thematic meetings at different intervals, meetings between PhD students, sprint meetings among developers or designers, project-related courses, as well as internal and external workshops. Additionally, we had to navigate the creation of a technical infrastructure that would allow us to begin our research activities independently or in tandem with the ongoing database migration, restoration, and digitisation efforts. This infrastructure setup was critical to ensure that our research could proceed without too much delay, even as fundamental records management tasks were being completed.

With a view to achieving semantic interoperability affordances beyond or in parallel with the development of GUI-based tools, I, with the help of my colleagues, have approached the deployment and establishment of LOUD specifications and practices within the project incrementally, as a way of addressing the needs of different audiences, whether human or machine, but also as an attempt to extend participation beyond the set of tools being developed specifically for PIA. As such, the two main IIIF APIs were first implemented to streamline access to image-based resources and to facilitate user annotation of resources in accordance with WADM. Then, an endpoint compatible with the IIIF Change Discovery API to support ongoing updates and changes was deployed. On the Linked Art front, a conceptual model mapping the current classes and properties, and boilerplate documents were created, followed by the roll-out of a preliminary version to coincide with the ongoing development of the initiative’s first stable specification release. Not only does Linked Art serve as a benchmark for accessing CH semantic metadata, but also as an alternative endpoint that expands the scope and accessibility of data interactions.
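To give a sense of what the combination of the two main IIIF APIs looks like in practice, the sketch below builds a minimal IIIF Presentation API 3.0 Manifest with a single Canvas whose painting Annotation points to an image served through a IIIF Image API service. It is only an illustration under stated assumptions: the base URL `https://example.org/iiif`, the identifier, the label and the function name `build_manifest` are hypothetical and do not reproduce the PIA implementation.

```python
import json

BASE = "https://example.org/iiif"  # hypothetical base URL, not the PIA endpoint


def build_manifest(identifier, label, width, height):
    """Return a minimal Presentation API 3.0 Manifest with one Canvas and one image."""
    manifest_id = f"{BASE}/{identifier}/manifest"
    canvas_id = f"{BASE}/{identifier}/canvas/1"
    image_service = f"{BASE}/image/{identifier}"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{
                "id": f"{canvas_id}/page/1",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/annotation/1",
                    "type": "Annotation",
                    "motivation": "painting",  # the image is the Canvas content
                    "body": {
                        "id": f"{image_service}/full/max/0/default.jpg",
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": width,
                        "height": height,
                        "service": [{
                            "id": image_service,
                            "type": "ImageService3",
                            "profile": "level1",
                        }],
                    },
                    "target": canvas_id,
                }],
            }],
        }],
    }


manifest = build_manifest("demo-1", "Hypothetical photograph", width=4000, height=3000)
print(json.dumps(manifest, indent=2))
```

User annotations following WADM can then target the Canvas `id` rather than the image itself, which keeps annotations independent of how the image is delivered.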

Figure 7.2 presents a synoptic (and asynchronously processed) overview of the PIA technical infrastructure, providing a high-level perspective focusing on APIs and (meta)data integration rather than the various GUI-based prototypes. This view helps to outline the phased implementation of the infrastructure, which will be explored hereafter, starting with (meta)data extraction, followed by an explanation of the updated CAS data model. I will then look at the IIIF Image API and the IIIF Presentation API along with WADM, move on to IIIF Change Discovery, and conclude with an overview of the metadata APIs, including Linked Art. Each of these components plays an important role in the project’s architecture. The systematic deployment of these processes and standards has been strategically designed to build layers of semantic interoperability incrementally.

Figure 7.2: Synoptic View of the PIA Infrastructure: Showcasing its Connection to DSP and the CAS Photo Archive Website

7.2.1

(…)

7.2.2

(…)

7.2.3

(…)

7.2.4

(…)

7.3 Synthesis and Insights

(…)

7.4 Perspectives

(…)

8. Yale’s LUX and LOUD Consistency

(…)

9. Discussion

[One] must give up the idea of syntactic or structural interoperability through the use of a single model, whether for production, storage, or exploitation within one and the same [information system]. (Poupeau, 2018) [269]

This chapter presents a comprehensive discussion where I interpret, analyse and critically examine my findings in relation to the thesis and the wider application of LOUD. Through an in-depth analysis of the design principles of LOUD and their implications for CH, this discussion aims to demonstrate the many challenges and opportunities inherent in this framework. The focus is on achieving community-driven consensus, rather than simply pursuing technological breakthrough.

The following sections are organised to provide a comprehensive review of the empirical findings, an evaluation abstracting LOUD, and a retrospective analysis of the research journey. Firstly, in Section 9.1, I will present a summary of the empirical findings from my research. This will include key themes and insights, structured to reflect the different areas of study and practice within LOUD.

Secondly, in Section 9.2, I will provide an evaluation of LOUD using the LoA approach. This evaluation will focus on the impact of LOUD on the perception of Linked Data within the CH domain and the wider DH field. This will include the key themes and insights that have emerged, structured in a way that reflects four levels of abstraction. I will also explore the dual nature of LOUD implementation, involving both simplicity and complexity, and discuss the various factors that influence such dynamics.

Finally, in Section 9.3, I will offer a retrospective analysis of the research journey. This section will interpret the findings to situate LOUD as fully-fledged actors. It will reflect on the challenges, achievements, and lessons learned throughout the research process, providing a holistic view of the project’s trajectory and its implications for the future of LOUD.

9.1 Empirical Findings

This section summarises the empirical findings of my research and already offers some suggestions. The structure does not follow the exact order of the three empirical chapters but is organised around overarching topics that emerged throughout the study. The seven topics include Community Practices and Standards, Inclusion and Marginalised Groups, Maintenance and Community Engagement, Interoperability and Usability, Future Directions and Sustainability, Digital Materiality and Representation, as well as Challenges of Scaling and Implementation.

Community Practices and Standards

GitHub serves as a vital hub for community involvement, with a core group of active contributors often attending meetings regularly. This platform simplifies decision-making within the community, although it also reflects biases similar to those in FLOSS communities. Behind visible activities like meetings, there is substantial preparatory work managed by co-chairs, editorial boards, or driven by community-generated use cases. This foundational work often determines the direction and outcomes of formal gatherings. The LUX project at Yale, as seen in Chapter 8, has successfully fostered collaboration across various units, bringing together libraries and museums on a unified platform. The technological foundation of LUX, based on open standards, facilitates data integration and cross-collections discovery.

Not only does the deployment of FLOSS tools contribute to these achievements, but it also emphasises the social advantages of working collaboratively. The concept of the Tragedy of the Commons, as described by Hardin (1968), highlights the potential for individual self-interest to deplete shared resources. However, Ostrom (1990) offers a counterpoint by demonstrating how communities can successfully manage common resources through collective action and shared norms. In this context, initiatives like CHAOSS[270] play a significant role by providing metrics that help evaluate the health and sustainability of open source communities. These metrics include contributions, issue resolution times, and community growth, offering valuable insights into how collaborative efforts can be maintained and improved.
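As a small illustration of one such metric, the sketch below computes a median issue resolution time from opening and closing timestamps. It is a simplified, assumed formulation for illustration only: the function name, the input format (pairs of ISO 8601 strings) and the sample values are hypothetical and do not come from CHAOSS tooling or from the communities studied here.

```python
from datetime import datetime
from statistics import median


def median_resolution_days(issues):
    """Median time in days between opening and closing, ignoring open issues.

    `issues` is a list of (opened, closed) ISO 8601 strings; `closed` is None
    for issues that are still open.
    """
    durations = [
        (datetime.fromisoformat(closed) - datetime.fromisoformat(opened)).total_seconds() / 86400
        for opened, closed in issues
        if closed is not None
    ]
    return median(durations) if durations else None


# Hypothetical sample data, not taken from any real repository.
sample_issues = [
    ("2024-01-01T00:00:00", "2024-01-03T00:00:00"),  # closed after 2 days
    ("2024-01-05T00:00:00", "2024-01-09T00:00:00"),  # closed after 4 days
    ("2024-02-01T00:00:00", None),                   # still open, ignored
]
print(median_resolution_days(sample_issues))
```

The median is used here rather than the mean because a few long-running issues would otherwise dominate the figure; this is a design choice of the sketch, not a claim about how the metric is officially defined.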

Reaching consensus is another critical aspect of community practices and standards. While the minutes of meetings are valuable artefacts, they often reflect an Anglo-Saxon approach to decision-making characterised by few substantive points and critical turning points. The formal aspects of conversations captured in minutes do not fully encompass the decision-making process, which frequently involves informal conversations, consensus-building through open dialogue, and subtle cues that influence outcomes. These elements are integral to the English and American approach and hold valuable lessons for an international community. IIIF and Linked Art are international communities, but decisions are made in English and the majority of participants are based in North America and the UK, significantly imprinting this approach. Understanding these nuances can help us improve our collaborative efforts within the IIIF and Linked Art communities. By recognising and appreciating these different facets of decision-making, we can learn from each other and enhance our collective ability to make effective and inclusive decisions.

Some of the challenges associated with these practices include the major demand on resources for community building, the slowness inherent in distributed development, and the difficulty in achieving consensus. Additionally, the concept of social sustainability can be seen as an imaginary construct that papers over differences, as discussed by Addressing these challenges is crucial for the long-term success and effectiveness of the IIIF and Linked Art communities.

Inclusion and Marginalised Groups

The demographic homogeneity in these communities can perpetuate biases and neglect issues relevant to underrepresented or marginalised groups, as seen in . Participation in these standardisation processes is itself a privilege. The assumption that internet access and digital devices are universally available is critically examined, revealing key actors in the digital landscape. This mirrors issues within the IIIF community, where generating IIIF resources presupposes means that may not be accessible to all.

We need clear terms of inclusion, as highlighted by Hoffmann (2021). She argues that effective inclusion requires a critical examination of the frameworks and conditions under which inclusion is offered. The framework should ensure that inclusion initiatives do not merely add diversity to existing power structures but work to transform these structures fundamentally. This involves questioning who defines the terms of inclusion, who benefits from them, and who may be inadvertently excluded. Hoffmann (2021) suggests a participatory approach, where marginalised communities are actively involved in shaping inclusion policies and practices, thus making inclusion an ongoing, reflective process rather than a static goal.

The inclusion of marginalised groups is a necessary step, but it is not sufficient. To truly make a difference, there must be a strategic and concentrated effort to appropriate technologies, as emphasised by Morales (2009, 2017, 2018) and further articulated by Martinez Demarco (2019, 2023). This strategic approach highlights the political significance of challenging dominant neoliberal and consumerist perspectives on technology and individual engagement.

Martinez Demarco (2023) underscores the critical importance of focusing on practices that go beyond mere inclusion. Doing so requires a deep understanding and critical assessment of how technology is intertwined with social, economic, and ideological contexts. It implies a reflective and deliberate process of technology adoption in which individuals creatively tailor technology to their specific needs, beliefs, and interests. Moreover, a key aspect highlighted by Martinez Demarco (2023) is the implicit and explicit critique of a universalist approach to inclusion, which often lends itself to all too easy instrumentalisation. Understanding and studying resistance to inclusion in an oppressive digital transformation context is paramount, particularly given the highly unequal conditions that prevail.

In this light, a comprehensive study of the socio-material and symbolic processes and practices involved in embedding technologies into individuals’ lives is needed. This approach also recognises technology as a catalyst for change. It envisions the use of technology to drive meaningful change across multiple dimensions and realities, whether national, societal, or personal. By focusing on these practices, empowering individuals to navigate and use technology thoughtfully and purposefully becomes a reality, bridging the gap between technological advances and societal progress (Martinez Demarco, 2019).

Maintenance and Community Engagement

The tension between creating advanced specifications and their practical implementation by platforms is evident in the IIIF Cookbook recipes and Linked Art patterns, as discussed in Chapter 6. This ongoing development shows that the community is still finding the best ways to achieve broad adoption and interoperability. The deployment of the Change Discovery API, as illustrated in Chapter 7, demonstrates that establishing such a protocol on top of the IIIF Presentation API is feasible and straightforward. High-level support from leadership, particularly Susan Gibbons as Vice Provost, has been crucial in building trust and ensuring the success of LUX as a valuable discovery layer at Yale. This integration of diverse collections through a unified platform, based on open standards, highlights the potential for transforming teaching, learning, and research by leveraging collaborative efforts. The topic modelling exercise in LUX reveals the intricate actor-networks composed of organisations, individuals, and non-human actors. This analysis underscores the importance of ongoing processes and relationships in maintaining and evolving infrastructure, akin to the concept of ‘infrastructuring’.

As detailed in Chapter 8, following best practices and guidelines such as the SHARED Principles is essential for better involvement, but it is also crucial to uphold these commitments consistently over the long term to ensure meaningful participation. Between the PIA team members, there were sometimes ‘disconnects between different communities who undertake collaborative research’ (Vienni-Baptista et al., 2023). This was something we had to navigate and learn from, which was manageable within the context of a laboratory setting. However, for any follow-up projects or whatever forms the digital infrastructure we built may take, it is imperative that these disconnects are addressed and resolved to ensure cohesive and sustained community engagement.

Interoperability and Usability

Within PIA, different APIs have been progressively deployed to meet various requirements while allowing parallel exploration of data modelling. Each API offers unique advantages, but their collective integration promotes semantic interoperability. For example, the IIIF Image API has been instrumental in rationalising image distribution across prototypes, providing efficient access to high-quality digital surrogates and the ability to resize them for different uses. Adherence to LOUD standards and schemas within LUX has generally been positive, although transitioning between versions of a specification can present challenges, highlighting the need to improve the consistency of compliant resources.
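The resizing capability mentioned above follows from the URI syntax of the IIIF Image API, in which region, size, rotation, quality and format are path segments appended to the image service identifier. The sketch below shows how a prototype might request derivatives of different widths from the same source image. The service URL and function names are hypothetical, and whether arbitrary sizes are honoured depends on the compliance level of the image server.

```python
def iiif_image_url(service_id, region="full", size="max", rotation="0",
                   quality="default", fmt="jpg"):
    """Build an Image API request: {id}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{service_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def scaled_to_width(service_id, width):
    """Request the full image scaled to a given width; the server computes the height."""
    return iiif_image_url(service_id, size=f"{width},")


service = "https://example.org/iiif/image/demo-1"  # hypothetical image service
print(iiif_image_url(service))        # full-size rendition
print(scaled_to_width(service, 800))  # derivative for a gallery view
print(scaled_to_width(service, 200))  # thumbnail
```

Because every prototype derives its images from the same service identifier, no derivative files need to be generated or stored in advance.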

Linked Art, for instance, has the capacity to generate various insights and sources of truth around different entities. However, vocabularies beyond those provided by sources like the Getty, such as Homosaurus, may need to be used in addition or instead. Complementary to Linked Art, using WADM allows for assertions that go beyond purely descriptive narratives, though it may sacrifice some semantic richness. This complexity in managing vocabularies and maintaining semantic richness directly ties into broader usability considerations within the community.
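To illustrate the kind of assertion WADM supports, the sketch below creates a simple commenting annotation with a textual body that targets a region of a IIIF Canvas. It is a minimal example under stated assumptions: all URLs, the comment text and the function name are hypothetical, and the free-text body also shows why such an assertion carries less semantic structure than a Linked Art statement.

```python
def make_comment_annotation(anno_id, target_uri, text, language="en", region=None):
    """Return a W3C Web Annotation with a TextualBody, optionally targeting an xywh region."""
    target = target_uri
    if region is not None:
        # Media fragment selecting a rectangle: x, y, width, height in canvas coordinates
        target = f"{target_uri}#xywh={','.join(str(value) for value in region)}"
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
            "language": language,
        },
        "target": target,
    }


annotation = make_comment_annotation(
    "https://example.org/annotations/1",        # hypothetical annotation URI
    "https://example.org/iiif/demo-1/canvas/1",  # hypothetical Canvas URI
    "A hypothetical user comment about this part of the image.",
    region=(10, 20, 300, 400),
)
print(annotation["target"])
```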

Addressing these usability concerns, Robert Sanderson has suggested focusing on the use of full URIs in Linked Art to ensure computational usability, in contrast to IIIF’s approach of minimising URIs to enhance readability. This difference highlights a fundamental question in usability: balancing readability and computational usability. Understanding developers’ perspectives on these approaches is critical.

As a way forward, I would suggest that the IIIF and Linked Art communities focus on further improving the usability of the specifications. This includes conducting comprehensive usability assessments of APIs to evaluate the experiences of new developers versus existing ones, understanding the steepness of the learning curve associated with each API, and guiding improvements in documentation, on-boarding processes, and overall developer support. Efforts should be made to lower the barriers to entry for new developers by developing more intuitive and user-friendly tutorials, providing example projects, and creating a robust support community. Ensuring that developers can quickly and effectively leverage APIs will foster greater adoption. Addressing the challenges of transitioning between different versions of specifications is critical, and developing tools and guidelines that help maintain consistency across versions will reduce friction and ensure smoother updates.

Future Directions and Sustainability

Survey findings underscore the need for ongoing efforts to develop LOUD standards that foster an inclusive, dynamic digital ecosystem. Future strategies should include creating educational resources and frameworks that support interdisciplinary collaboration and reduce barriers to participation. While the Manifest serves as the fundamental unit within IIIF, the Linked Art protocol can play a similarly central role as a semantic gateway in broader contexts, allowing round-tripping across the APIs. The topic modelling exercise in LUX, detailed in Chapter 8, reveals complex actor-networks of organisations, individuals, and non-human actors, providing insights into the relationships sustaining the LUX initiative.

The next steps for Linked Art might involve forming a new consortium independent of a CIDOC Working Group, which could provide the necessary support to sustain the initiative. Alternatively, integrating Linked Art into IIIF as a new TSG and specification could address the discovery challenges within IIIF, as discussed during the birds of a feather session led by Robert Sanderson (see Raemy, 2024) at the 2024 IIIF Conference in Los Angeles[271]. Design principles that act as bridges across different disciplines, as proposed by Roke and Tillman (2022), are crucial. IIIF has demonstrated that this collaborative approach is feasible, and Linked Art could follow in its footsteps. However, achieving this requires increased dedication from passive members and broader adoption of the model and the API ecosystem in the near future.

Digital Materiality and Representation

As explored in Chapter 7, the detailed digital representation of photographic albums, such as the Kreis Family Collection, demonstrates the need to comprehensively capture the materiality of digital objects. This includes the structure and context of images, which are crucial for maintaining their historical and social significance. The implementation of the IIIF Presentation API in creating a detailed digital replica of the Getty’s Bayard Album shows how digital materiality can be enhanced through thoughtful use of technology, but also highlights the scalability challenges for such detailed representations.

Creating these detailed digital representations can be seen as a ‘boutique’ approach, which, while labour-intensive and resource-demanding, is necessary for preserving the integrity and contextual significance of cultural heritage objects. The challenge lies in developing the appropriate means and methodologies to achieve this level of detail consistently. Future endeavours, whether through research projects or collaborative efforts between GLAM institutions and DH practitioners, should aim to address these challenges and create sustainable practices for digital materiality and representation. As Edwards aptly notes:

‘Presentational forms equally reflect specific intent in the use and value of the photographs they embed, to the extent that the objects that embed photographs are in many cases meaningless without their photographs; for instance, empty frames or albums. These objects are only invigorated when they are again in conjunction with the images with which they have a symbiotic relationship, for display functions not only make the thing itself visible but make it more visible in certain ways’. (Edwards & Hart, 2004, p. 11)

Challenges of Scaling and Implementation

As seen in Chapter 6, the IIIF Cookbook recipes and Linked Art patterns reflect the tension between creating advanced specifications and their practical implementation. This gap between ideation and real-world application underscores the challenges faced by the community in achieving broad adoption and interoperability. In Chapter 7, the exploration of APIs like the IIIF Change Discovery API illustrates the practical challenges and potential of scaling these technologies for wider adoption. The successful implementation in PIA demonstrates viability, but also points to the need for continued development and community engagement to fully realise the benefits.
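To illustrate what consuming a Change Discovery endpoint involves, the sketch below processes the activities of a single hypothetical `OrderedCollectionPage` and derives which resources are currently available. It is a simplified reading of the idea, not the PIA implementation: it assumes the activities are listed from oldest to newest, works on one page only, and uses made-up URLs and a made-up function name.

```python
def current_resources(activities):
    """Return the ids of resources still available after a list of activities.

    Assumes `activities` is in chronological order (oldest first). The list is
    walked from the most recent activity backwards, so the latest activity
    about a resource decides its state and older ones are skipped.
    """
    state = {}
    for activity in reversed(activities):
        obj = activity.get("object")
        if obj is None:  # e.g. an activity that does not concern one resource
            continue
        kind = activity.get("type")
        if kind == "Move":
            # The old location is gone; the resource now lives at the target
            state.setdefault(obj["id"], False)
            state.setdefault(activity["target"]["id"], True)
        else:
            state.setdefault(obj["id"], kind not in ("Delete", "Remove"))
    return {resource_id for resource_id, available in state.items() if available}


# Hypothetical page of activities; URLs and timestamps are made up.
A = {"id": "https://example.org/iiif/a/manifest", "type": "Manifest"}
B = {"id": "https://example.org/iiif/b/manifest", "type": "Manifest"}
page = {
    "@context": "http://iiif.io/api/discovery/1/context.json",
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage",
    "orderedItems": [
        {"type": "Create", "object": A, "endTime": "2023-01-10T09:00:00Z"},
        {"type": "Create", "object": B, "endTime": "2023-01-11T09:00:00Z"},
        {"type": "Update", "object": A, "endTime": "2023-02-01T09:00:00Z"},
        {"type": "Delete", "object": B, "endTime": "2023-03-01T09:00:00Z"},
    ],
}
print(current_resources(page["orderedItems"]))
```

A harvester built along these lines only needs to fetch the Manifests it has not yet seen or that changed since its last visit, which is what makes the approach attractive for aggregation at scale.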

Furthermore, assessing the scalability of IIIF image servers, as discussed by Duin (2022) and exemplified by the firm Q42 with their Edge-based service Micrio[272], highlights the importance of optimising data performance. Erwin Verbruggen aptly noted that ‘optimising data performance in my opinion means sending as little data over as needed’[273], emphasising the need for efficient data handling to enhance scalability. This insight reinforces the necessity of continual refinement in scaling digital infrastructure to support broader use and integration.

Reflecting on these findings, I would like to assert that continuous participation, particularly for institutions that can afford to be part of initiatives like IIIF-C, is essential. Active members should not only focus on their own use cases but also consider the needs and perspectives of other, perhaps marginalised, groups. Achieving the dual goals of making progress within one community, whether it be IIIF or Linked Art, while also engaging in effective outreach and creating a solid baseline, will benefit everyone in the CH sector and beyond. Addressing where LOUD fits in, how people perceive this new concept or paradigm, and understanding how LOUD differs from Linked Data in general are essential. These questions help to clarify the stages at which themes related to one of the LOUD design principles emerge, crystallise, and potentially disappear. My thesis does not fully resolve these queries but offers insights and hints for further exploration.

In conclusion, the empirical findings reveal the richness of the implementation and maintenance of LOUD standards in the CH domain. From the critical role of community practices and standards to the challenges of achieving interoperability and inclusivity, each theme underlines the complex interplay of social, technical and organisational factors. Section 9.2 will look at the evaluation of LOUD and explore its overall impact, including the question of what distinguishes LOUD from Linked Data in general, where my thesis provides pointers but no definitive answers.

9.2 Evaluation: Abstracting LOUD

In this section I will assess the impact of LOUD within the CH domain and the wider DH field, examining its implications for community practices and semantic interoperability, and secondarily whether LOUD has affected the perception of Linked Data.

Referring to Figure 4.2, the following is a descriptive attempt to provide levels of abstraction of LOUD based on my empirical findings, focusing particularly on the deployment of IIIF within PIA and Linked Art within the LUX framework, aside from the data model abstraction level.

The dual simplicity and complexity of implementing LOUD specifications and participating in community-led efforts can be attributed to the need for a reorientation of research projects. It is essential for these projects to actively engage in community processes rather than intermittently presenting their progress and subsequently withdrawing. This ongoing engagement fosters a more robust and collaborative environment, ultimately contributing to the advancement of shared goals and standards. Such a reorientation necessitates a fundamental change in how universities and GLAM institutions operate, extending their involvement beyond the immediate project scope to ensure sustained participation and impact.

Despite the introduction of LOUD, the perception of Linked Data has not evolved significantly. Most software engineers continue to treat resources primarily as JSON, often overlooking the graph structure that underpins Linked Data. For IIIF, this approach is appropriate given its focus on content interoperability and presentation. However, for Linked Art, overlooking the graph structure could be problematic to some extent, as it limits the full realisation of the semantic relationships and rich interconnections that Linked Data can provide. This highlights the need for more focused efforts to integrate semantic web principles, particularly in contexts where these principles can significantly improve the quality of data.
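The difference between reading a record as JSON and reading it as a graph can be sketched as follows. The function below walks a small Linked Art-style record and flattens it into subject, predicate, object statements. This is a deliberate simplification for illustration: it keeps the JSON keys as predicate names instead of expanding them through the JSON-LD context, which a real JSON-LD processor would do, and the record, its URIs and the function name are hypothetical.

```python
import itertools


def to_triples(node, subject=None, triples=None, counter=None):
    """Flatten nested JSON objects into (subject, predicate, object) statements.

    Objects with an 'id' become nodes identified by that URI; objects without
    one receive a generated blank node label such as '_:b1'.
    """
    triples = [] if triples is None else triples
    counter = itertools.count(1) if counter is None else counter
    subject = subject or node.get("id") or f"_:b{next(counter)}"
    for key, value in node.items():
        if key in ("id", "@context"):
            continue
        for item in (value if isinstance(value, list) else [value]):
            if isinstance(item, dict):
                child = item.get("id") or f"_:b{next(counter)}"
                triples.append((subject, key, child))  # edge to another node
                to_triples(item, child, triples, counter)
            else:
                triples.append((subject, key, item))   # literal value
    return triples


# Hypothetical record; the AAT URI is assumed to denote photographs.
record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/object/1",
    "type": "HumanMadeObject",
    "_label": "Hypothetical photograph",
    "classified_as": [{
        "id": "http://vocab.getty.edu/aat/300046300",
        "type": "Type",
        "_label": "Photograph",
    }],
}
for triple in to_triples(record):
    print(triple)
```

A client that only reads `record["_label"]` gets a display string, whereas the statement linking the object to the vocabulary URI is what allows it to be connected to every other resource classified with the same term.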

I have faced challenges in moving many of the models developed within PIA into (beta) production, and the usability requirements of APIs have scarcely been addressed. However, the findings from this thesis should be viewed as starting points rather than conclusive solutions. The unseen aspect of this dissertation is my active involvement in both communities and my attempts to reciprocate this engagement within PIA. Each investigation presented could have warranted a dedicated thesis, indicating the breadth and depth of the topics explored. Ultimately, this work merely scratches the surface of numerous subjects, laying the groundwork for future research and development.

The next section will offer a retrospective on the work accomplished during this PhD thesis. It will reflect on the various milestones achieved, the lessons learned, and the potential directions for future research.

9.3 Retrospective: Trudging like an Ant

In this retrospective[277], I will offer an analysis of the research journey. This section will interpret the findings to situate LOUD as fully-fledged actors within the CH field. It will reflect on the challenges, achievements, and lessons learned throughout the research process, providing a holistic view of the project’s trajectory and its implications for the future of LOUD.

The empirical findings of my research reveal the nuanced interplay between socio-technical practices and implementations, synthesising insights through both thematic and abstract lenses. This dual approach underscores the importance of fostering collaboration and effective decision-making, while addressing biases and promoting inclusivity. The need for ongoing maintenance, interoperability and usability remains paramount, as does the development of educational resources and consortia to sustain initiatives. In addition, capturing digital materiality and addressing scalability challenges are critical to the widespread integration of LOUD standards. These findings lay the groundwork for future research and development aimed at bridging operational applications with more extensive design approaches.

How can LOUD be situated as fully-fledged actors within the CH field? One notion frequently mentioned during the 2024 IIIF Conference is embodied perfectly by LOUD specifications: even if not all embedded patterns of a given API-compliant resource are correctly interpreted or rendered by a client, some of its basic features should still be displayed. This flexibility is crucial for ensuring the broad usability and adaptability of LOUD standards, allowing them to transcend institutional boundaries and serve as robust mediums of knowledge transfer. To paraphrase the quote from (Poupeau, 2018) at the beginning of this chapter, there isn’t a unique model for interoperability, but there are definitely best socio-technical practices to be learned from IIIF and Linked Art. The act of participation prevails over the relatively easy, one-off deployment of specifications for the short term.
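
The tolerant client behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `first_label` and `summarise_manifest`, the `exampleExtension` property, and the example.org identifiers are hypothetical, and the Manifest is deliberately skeletal (its Canvases lack the dimensions and content a complete IIIF Presentation API 3.0 Manifest would carry).

```python
from typing import Any


def first_label(label: Any, fallback: str = "[untitled]") -> str:
    """Return the first string of a IIIF language map, or a fallback if the shape is unexpected."""
    if isinstance(label, dict):
        for values in label.values():
            if isinstance(values, list) and values:
                return str(values[0])
    return fallback


def summarise_manifest(manifest: dict[str, Any]) -> dict[str, Any]:
    """Extract only the basic features this client understands and ignore the rest."""
    items = manifest.get("items", [])
    canvases = [i for i in items if isinstance(i, dict) and i.get("type") == "Canvas"]
    known = {"@context", "id", "type", "label", "items"}
    return {
        "title": first_label(manifest.get("label")),
        "canvas_count": len(canvases),
        # Properties the client does not understand are reported, not treated as errors.
        "ignored": sorted(k for k in manifest if k not in known),
    }


# Skeletal Manifest with one property this hypothetical client has never seen.
manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/album/manifest",
    "type": "Manifest",
    "label": {"en": ["Photo album"]},
    "items": [
        {"id": "https://example.org/iiif/album/canvas/1", "type": "Canvas"},
        {"id": "https://example.org/iiif/album/canvas/2", "type": "Canvas"},
    ],
    "exampleExtension": {"note": "a pattern this client does not understand"},
}

summary = summarise_manifest(manifest)
```

The design choice is that an unknown property never raises an exception: the client still shows a title and a count of Canvases, and simply records what it skipped.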

By using LOUD, CH data can be effectively interlinked with different datasets, resulting in numerous potential benefits. An overriding benefit is the improved discoverability and accessibility of CH resources, facilitating enhanced search and retrieval capabilities. In addition, the adoption of LOUD promotes seamless data sharing and reuse within academic and memory institutions, fostering a culture of collaboration and interdisciplinary knowledge exchange. This approach not only enhances the overall utility and comprehensiveness of CH repositories, but also promotes collective understanding and appreciation of diverse cultural assets and historical narratives.

However, it is essential to critically evaluate the application of LOUD in the context of CH data. While LOUD offers promising prospects for improved data interlinking and accessibility, challenges and concerns persist. The transition to LOUD principles necessitates significant investments in resources, including infrastructure, expertise, and time, which may pose barriers for smaller institutions or those with limited funding. Moreover, ensuring the accuracy, consistency, and quality of Linked Data is a complex task, demanding meticulous attention to detail and ongoing maintenance efforts. Furthermore, potential issues related to data ownership, rights management, and the potential misuse or misinterpretation of interconnected data should be carefully considered. Standardisation across different CH domains, each with unique data structures, formats, and contexts, may present formidable obstacles to seamless integration. These concerns underscore the need for a nuanced and cautious approach to the implementation of LOUD standards, taking into account the complexity and specificity of CH data and its diverse custodians.

This thesis has been a journey of discovering Linked Art and a confirmation that the ethos of IIIF is yet to be fully manifested beyond product implementation. The sense of belonging to a community is an ongoing endeavour, much like the ants in Latour’s metaphor. This dissertation underscores that active participation in community processes is essential to achieving the dual goals of advancing the technological framework for semantic interoperability and fostering an inclusive and collaborative CH ecosystem.

10. Conclusion

For a better understanding of the past,
Our images have to be enhanced,
A new dialogue in three dimensions,
Must have openness at its heart,
For somewhere within the archive
Of our aggregated minds
Are a multitude of questions
And a multitude of answers,
Simply awaiting to be found.
(Mr Gee, 2023)

This chapter brings to a close the journey undertaken since February 2021, aiming to clearly articulate the answers to the research questions, discuss how the research aligns with the objectives, elucidate the significance of the work, outline its shortcomings, and suggest avenues for future research.

I had the privilege of hearing the above poem at EuropeanaTech in The Hague in October 2023. What struck me most, and what I have tried to convey in this thesis, was the powerful dialogue and collective spirit striving to harness the potential of our (digital) heritage. With a sense of conviction after this conference, I approached the next one in Geneva in February 2024 with confidence, believing that I had made a compelling case for the concept of LOUD. When a participant asked how LOUD differed from Linked Data, however, I found myself explaining the socio-technical ethos of IIIF and Linked Art, the richness of the individuals who make them up, the ability to combine these different standards, and the common use cases that emerge from these collaborations. Whether my answer was convincing remains uncertain, but I knew it was too brief. Perhaps it is here, in this conclusion, that my thoughts can find their full expression.

I believe that LOUD should be at the forefront of efforts to improve the accessibility and usability of CH data, an endeavour that is increasingly relevant in a web-centric environment. This paradigm has gained considerable traction, particularly with the advent of Linked Art and the recognition that the IIIF Presentation API has been an inspiration for the LOUD design principles. The development and maintenance of LOUD standards by dedicated communities are characterised by collaboration, consensus building, and transparency. In the interstices of the IIIF and Linked Art communities, frameworks for interoperability are not only exposed, but revealed as profound testaments to the power of transparent collaboration across institutional boundaries. Both communities, it is true, are still very much Anglo-Saxon efforts, where the specifications have mainly been implemented in GLAM and/or DH research projects, or at least in the implementations we are aware of. Each has clear guidelines on how to propose use cases, mostly using GitHub, and hides the sometimes unnecessary RDF complexity behind a set of JSON-LD @context documents. IIIF sits at the presentation layer and can really play its role as a mediator, with the Manifest as its central unit connecting other specifications, including semantic metadata, and preferably with simpatico specifications such as Linked Art.
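
As an illustration of the Manifest acting as a connecting unit, the sketch below builds a skeletal IIIF Presentation API 3.0 Manifest whose `seeAlso` property points to a semantic description of the same object. The function name `build_manifest` and the example.org URLs are placeholders, and using the Linked Art context URL as the `profile` value is an assumption about how a publisher might flag the linked record, not a documented requirement. Canvases are omitted, so the result is not a complete Manifest.

```python
import json
from typing import Any

IIIF_CONTEXT = "http://iiif.io/api/presentation/3/context.json"


def build_manifest(manifest_id: str, label: str, linked_art_id: str) -> dict[str, Any]:
    """Return a skeletal Manifest that links to a machine-readable description via seeAlso."""
    return {
        "@context": IIIF_CONTEXT,
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "seeAlso": [
            {
                "id": linked_art_id,
                "type": "Dataset",
                "format": "application/ld+json",
                # Assumed profile value signalling a Linked Art record.
                "profile": "https://linked.art/ns/v1/linked-art.json",
            }
        ],
        "items": [],  # Canvases omitted in this sketch
    }


manifest = build_manifest(
    manifest_id="https://example.org/iiif/object/1/manifest",
    label="Photograph of a village",
    linked_art_id="https://example.org/linked-art/object/1",
)

# Plain JSON serialisation: the @context carries the mapping to RDF terms.
serialised = json.dumps(manifest, indent=2)
```

A consumer that only cares about presentation can ignore `seeAlso`, while one interested in semantics can follow the link to the richer description.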

An important hypothesis arises from the observation that adherence to the LOUD design principles makes specifications more likely to be adopted. The primary benefit of adopting LOUD standards lies in their grassroots nature. This grassroots approach not only aligns with the core values of openness and collaboration within the DH community but also serves as a common denominator between DH practitioners and CHIs. This unique alignment fosters a sense of shared purpose and common ground. However, it’s essential to acknowledge that while LOUD and its associated standards, including IIIF, hold immense promise, their limited recognition in the wider socio-technical ecosystem may currently hinder their full potential impact beyond the CH domain.

Consideration of socio-technical requirements and the promotion of digital equity are essential to the development of specifications in line with the LOUD design principles. In the context of the IIIF and Linked Art communities, this means both recognising current challenges and building on existing practices. This includes forming alliances that support diverse forms of inclusion at both project and individual levels. For example, organisations should be encouraged to send representatives from diverse professional and personal backgrounds, such as underrepresented groups or non-technical fields. This can be facilitated by initiatives that lower the barriers to participation, such as financial support for travel and participation, flexible participation formats, and targeted outreach efforts. Furthermore, as these standards often align with open government data initiatives, they present opportunities for broader public engagement and institutional transparency.

In the broader context of DH, understanding LOUD involves tracing the historical development of the field and its evolving relationship with technology. The interdisciplinary nature of DH has always integrated diverse scholarly and technical practices. In recent years, DH has seen a notable increase in interest in the use of Linked Data and semantic technologies to improve the discoverability and accessibility of CH collections. LOUD’s emphasis on user-centred design and usability aligns well with these goals. Consequently, the principles of LOUD hold great promise for advancing the integration and use of community-driven APIs and/or Linked Data within DH. This can be seen within PIA, where the benefits of implementing IIIF helped us to streamline machine-generated annotations, integrate different thumbnails into GUI prototypes, model photo albums with different layers from the Kreis Family collection, and enable project members and students to engage in digital storytelling, an important participatory facet that can be seamlessly explored by DH efforts and CHIs with the help of the IIIF Image and Presentation APIs. Data reuse is a key LOUD driver, and it could have been pursued more extensively with a production instance of Linked Art. As for widening participation, this is a strategic and political decision rather than a technical one. That said, LOUD specifications can be embedded through strategic citizen science initiatives.

A recent example that highlights the comprehensive value of Linked Data was presented by (Newbury, 2024) at the CNI Spring 2024 Meeting. He delineated its significance as extending well beyond single entities, such as the Getty Research Institute, to enrich a vast ecosystem. Specifically, he identified three principal areas of value: first, within the ecosystem itself, where the utility of information is amplified through its application in diverse contexts; second, for the audience, by directly addressing user needs and facilitating various conceptual frameworks; and finally, within the community, by enabling wider use and adaptation of data and code. This approach to Linked Data, as articulated by Newbury, not only enhances its utility across these dimensions, but also aligns seamlessly with the LOUD proposition, underscoring a shared vision for a digital space where the interconnectedness and accessibility of (meta)data serve as foundational principles for progress and community engagement.

LUX, as a catalyst for LOUD, exemplifies a practical approach to implementing Linked Data that has garnered significant local engagement and support at Yale. This initiative demonstrates how sound socio-technical practices can be effectively applied within a supportive institutional environment. The consistency of the data within LUX aligns well with IIIF and Linked Art standards, with only a few minor adjustments required for full compliance. These quick fixes are manageable and do not detract from the overall robustness of the initiative. While it may be too early to fully assess the wider impact of using LOUD specifications on the LUX platform within the CH domain, the initiative has already attracted considerable interest in recent months. This growing attention indicates that the LUX approach is resonating with other organisations and points to the potential for wider adoption and impact. The enthusiastic local engagement at Yale provides a strong foundation for LUX and highlights its potential to serve as a model for similar projects aimed at enriching digital heritage through effective collaboration and agreed-upon standards.

In carrying out this thesis, I have adhered to the five main objectives set out at the beginning of the PhD. These objectives have been accomplished to a high degree, reflecting a substantial and well-executed project. Furthermore, most of the outputs – such as data models and scripts – from this work are available on GitHub, providing open access to the wider community. In addition, I have published several papers, both individually and collaboratively, further disseminating the findings and contributions of this research.

Additionally, this thesis is relevant because it sheds light on communities and implementations that can be celebrated not only for their standards but also for their operating ethos; IIIF and Linked Art present models ripe for emulation beyond their immediate digital confines. Here, agency and authority are most typically granted to the collective over the isolated, with each actor – be it an individual, an institution or an interface – intricately interconnected. Yale’s LUX initiative also embodies this ethos, demonstrating how collaborative efforts can lead to innovative solutions and wider impact. It is to be hoped, then, that these practices of openness and multiple partnerships will not be seen as limited to their origins in digital representation. At the very least, I hope that these socio-technical approaches can serve as exemplars or sources of inspiration in broader arenas, where the principles of mutual visibility and concerted action can point the way towards cohesive and adaptive collaborative architectures.

Despite its contribution, this thesis is far from perfect and certainly contains several shortcomings. I will name here three significant ones. First, the visualisations included and the use of FOL are primarily designed to support my own self-reflection and may be more beneficial to me than to the broader academic community. While they provide insights into my research process and findings, their applicability and usefulness to others might be limited. Second, the theoretical framework I employed, while instrumental to my research, may not serve as a universally applicable toolbox. Nevertheless, I urge readers to pay close attention to STS methodologies and practices. The works of Bruno Latour, Donna Haraway, and Susan Leigh Star have been invaluable companions throughout this dissertation. Additionally, for those involved in conceptualising semantic information, I recommend exploring Floridi’s PI, which offers profound insights into the nature and dynamics of information. These readings have greatly influenced my approach and understanding, and I believe they can offer valuable perspectives to others as well. Third, while the thesis aims to address both community practices and semantic interoperability, it leans more heavily towards the former. This emphasis on community practices may overshadow the broader discussion of semantic interoperability, potentially limiting the appeal of the thesis to those primarily interested in the technical aspects.

Other shortcomings include the broad scope of the thesis, with three empirical chapters exploring different avenues. While this comprehensive approach provides a broad understanding of the research topic, it has also resulted in a rather lengthy thesis. This may be a challenge for the reader, as a topic of interest in one chapter may not be as compelling in another. The diversity of empirical focus, while enriching the research, may dilute the coherence for some readers, making it more difficult to maintain a consistent engagement throughout the dissertation. Despite these limitations, I hope that the different perspectives and findings contribute to a richer, more nuanced understanding of LOUD for CH.

Avenues for future research are numerous and promising. One interesting area to explore is the comparative benefits experienced by early adopters of IIIF and Linked Art specifications versus those who implemented these standards later. Early adopters have the advantage of having their use cases discussed and resolved within the community, and it would be insightful to analyse the long-term impacts on their projects. Such a study is already feasible for early adopters of IIIF, and a comparable study of Linked Art implementations will become possible within a few years. Furthermore, future exploration could focus on the full implementation of Linked Art within PIA or similar efforts, as well as more performance-oriented testing with the deployed LOUD APIs. These efforts should further validate the robustness and scalability of these specifications. Another important area for future investigation is the participation of institutions and individuals from diverse regions beyond North America and Western Europe[26:3][278] in both the IIIF and Linked Art communities. It is crucial to explore how we can better support their uptake of these specifications and encourage their active involvement in these initiatives to ensure a more inclusive and globally representative environment.

As I reflect on the journey of this thesis, I am reminded of the powerful dialogue and collective effort that has been at its heart. Mr Gee’s poem resonates deeply with my own aspirations for this work: to enhance our understanding of the past through openness and collaboration, as can be seen in IIIF and Linked Art. As I bring this dissertation to a close, I am filled with a sense of accomplishment and a renewed commitment to promoting sound socio-technical practices. It is my hope that the insights and methodologies presented here will inspire others to engage in this ongoing dialogue, continually asking and answering the many questions that arise as we collectively explore our cultural heritage landscapes.


  1. Throughout this dissertation, British English spelling conventions are predominantly observed. However, American English spelling appears in direct quotations from sources and in the names of institutions, standards, or concepts. ↩︎

  2. SNSF Data Portal - Grant number 193788: https://data.snf.ch/grants/grant/193788 ↩︎

  3. Seminar für Kulturwissenschaft und Europäische Ethnologie: https://kulturwissenschaft.philhist.unibas.ch/ ↩︎

  4. DHLab: https://dhlab.philhist.unibas.ch/ ↩︎

  5. HKB: https://www.hkb.bfh.ch/ ↩︎

  6. The considerable size of the ASV collection, which includes over 90,000 analogue objects, reflects not just the work of the main authors but also the contributions from numerous explorers and additional material beyond the maps and primary publications. ↩︎

  7. Max Frischknecht’s PhD: https://phd.maxfrischknecht.ch/ ↩︎

  8. PIA project website: https://about.participatory-archives.ch/ ↩︎

  9. The vision of the PIA project was first written in German and then translated into English and French. ↩︎

  10. In our joint paper, we wrote ‘man-made’, corrected here, which makes me think of the transition within the CIDOC-CRM for the Entity E22 Human-Made Object from version 6.2.7 onward. ↩︎

  11. Knora Base Ontology: https://docs.dasch.swiss/2023.07.01/DSP-API/02-dsp-ontologies/knora-base/ ↩︎

  12. SIPI documentation: https://sipi.io/ ↩︎

  13. IIIF Working Groups Meeting, The Hague, 2016: https://iiif.io/event/2016/thehague/ ↩︎

  14. Van Gogh, Vincent. (1889). Irises [Oil on canvas]. Getty Museum, Los Angeles, CA, USA. https://www.getty.edu/art/collection/object/103JNH ↩︎

  15. Giacometti, Alberto. (1956). L’homme qui marche I [Sculpture]. Carnegie Museum of Art, Pittsburgh, PA, USA. https://www.wikidata.org/entity/Q706964 ↩︎

  16. UNESCO World Heritage List: https://whc.unesco.org/en/list/ ↩︎

  17. Blue Shield International: https://theblueshield.org/ ↩︎

  18. The ICBS was founded by the ICA, ICOM, ICOMOS, and IFLA. ↩︎

  19. Guro. (1900-1950). Male Face Mask (Zamble) [Wood and pigment]. Art Institute of Chicago, Chicago, IL, USA. https://www.artic.edu/artworks/239464 ↩︎

  20. I have opted for the term ‘affordance’ and not ‘representation’ as my intention is to maintain a comprehensive scope that encompasses various modalities such as modelling endeavours. ↩︎

  21. To some degree, parallels can be drawn between the distinction of cultural and digital heritage and that drawn between the humanities and DH. ↩︎

  22. Inicio - Museos Comunitarios de América: https://www.museoscomunitarios.org/ ↩︎

  23. The descriptions of each of these nine dimensions are selected excerpts from (Star, 1999). ↩︎

  24. A PID is a long-lasting reference to a digital resource. It usually has two components: a unique identifier and a service that locates the resource over time, even if its location changes. The first helps to ensure the provenance of a digital resource (that it is what it purports to be), whilst the second will ensure that the identifier resolves to the correct current location (Digital Preservation Coalition, 2017) ↩︎

  25. Rijksmuseum: https://www.rijksmuseum.nl/ ↩︎

  26. In the original version, these instances contained typographical or factual errors. They have been struck through and corrected here. ↩︎ ↩︎ ↩︎ ↩︎ ↩︎

  27. (Zeng & Qin, 2022) [p. 11] articulate that ‘as with “data”, metadata can be either singular or plural. It is used as singular in the sense of a kind of data; however, in plural form, the term refers to things one can count’. In the context of this thesis, I have chosen to favour the plural form of (meta)data. However, I acknowledge that I may occasionally use the singular form when referring to the overarching concepts or when quoting references verbatim. ↩︎

  28. The snapshot of this bibliographic record was taken from https://swisscovery.slsp.ch/permalink/41SLSP_UBS/11jfr6m/alma991170746542405501. ↩︎

  29. Seeing Standards: A Visualization of the Metadata Universe. 2009-2010. Jenn Riley. https://jennriley.com/metadatamap/seeingstandards.pdf ↩︎

  30. A widespread example in the CH domain is the serialisation of metadata in XML, a W3C standard. ↩︎

  31. It is noteworthy that the diversity of metadata standards in the heritage domain, characterised primarily by a common emphasis on descriptive attributes, is not counter-intuitive. This variation reflects the diverse nature of CH resources and the nuanced needs of GLAMs. ↩︎

  32. MARC Standards: https://www.loc.gov/marc/ ↩︎

  33. RDA: https://www.loc.gov/aba/rda/ ↩︎

  34. If RDA was initially envisioned as the third edition of AACR, it faces the challenge of striking a delicate balance between preserving the AACR tradition and embracing the shifts required for a successful and relevant future for library catalogues that can easily be interconnected with standards from archives, museums, and other communities (see Coyle & Hillmann, 2007). ↩︎

  35. MODS: https://www.loc.gov/standards/mods/ ↩︎

  36. METS: https://www.loc.gov/standards/mets/ ↩︎

  37. People might even argue that FRBR is only interesting as an ‘intellectual exercise’ (Žumer, 2007, p. 27). ↩︎

  38. LRMer: https://www.iflastandards.info/lrm/lrmer ↩︎

  39. BibFrame: https://www.loc.gov/bibframe/ ↩︎

  40. EAD: https://www.loc.gov/ead/ ↩︎

  41. ISAD(G): General International Standard Archival Description - Second edition: https://www.ica.org/en/isadg-general-international-standard-archival-description-second-edition ↩︎

  42. PREMIS: https://www.loc.gov/standards/premis/ ↩︎

  43. RiC Conceptual Model: https://www.ica.org/en/records-in-contexts-conceptual-model ↩︎

  44. RiC-O: https://www.ica.org/standards/RiC/ontology ↩︎

  45. CDWA: https://www.getty.edu/research/publications/electronic_publications/cdwa/ ↩︎

  46. CCO: https://www.vraweb.org/cco ↩︎

  47. VRA: https://www.vraweb.org/ ↩︎

  48. VRA Core 4.0 and CCO have a symbiotic relationship, with CCO providing data content guidelines and incorporating the VRA Core 4.0 methodology. The latter has also been leveraged in other contexts to form the basis for more granular Linked Data vocabularies (see Mixter, 2014). ↩︎

  49. In French, the original language used for this acronym, CIDOC stands for Comité international pour la documentation du Conseil international des musées. ↩︎

  50. LIDO: https://cidoc.mini.icom.museum/working-groups/lido/ ↩︎

  51. CIDOC Working Groups: https://cidoc.mini.icom.museum/working-groups/ ↩︎

  52. CIDOC-CRM: https://cidoc-crm.org/ ↩︎

  53. CRM-SIG Meetings: https://www.cidoc-crm.org/meetings_all ↩︎

  54. CIDOC-CRM V7.1.2: https://www.cidoc-crm.org/html/cidoc_crm_v7.1.2.html ↩︎

  55. For a quick overview of the classes and properties of CIDOC-CRM, I recommend visiting the dynamic periodic table created by Remo Grillo (Digital Humanities Research Associate at I Tatti, Harvard University Center for Italian Renaissance Studies): https://remogrillo.github.io/cidoc-crm_periodic_table/ ↩︎

  56. CIDOC-CRM compatible models and collaborations: https://www.cidoc-crm.org/collaborations ↩︎

  57. At the time of writing none of these CIDOC-CRM extensions have been formally approved by CRM-SIG. It is also worth mentioning that other extensions based on CIDOC-CRM have been developed by the wider community, such as Bio CRM, a data model for representing biographical data for prosopographical research (see Tuominen et al., 2017) or ArchOnto, which is a model created for archives (see Koch et al., 2020). ↩︎

  58. CRMact: https://www.cidoc-crm.org/crmact/ ↩︎

  59. CRMarchaeo: https://cidoc-crm.org/crmarchaeo/ ↩︎

  60. CRMba: https://www.cidoc-crm.org/crmba/ ↩︎

  61. CRMdig: https://www.cidoc-crm.org/crmdig/ ↩︎

  62. CRMgeo: https://www.cidoc-crm.org/crmgeo/ ↩︎

  63. CRMinf: https://www.cidoc-crm.org/crminf/ ↩︎

  64. CRMsci: https://www.cidoc-crm.org/crmsci/ ↩︎

  65. CRMsoc: https://www.cidoc-crm.org/crmsoc/ ↩︎

  66. CRMtex: https://www.cidoc-crm.org/crmtex/ ↩︎

  67. FRBRoo: https://www.cidoc-crm.org/frbroo/ ↩︎

  68. PRESSoo: https://www.cidoc-crm.org/pressoo/ ↩︎

  69. Linked Art: https://linked.art ↩︎

  70. DCMI Metadata Terms: https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ ↩︎

  71. Getty Vocabularies: https://www.getty.edu/research/tools/vocabularies/ ↩︎

  72. Mastodon: https://joinmastodon.org/ ↩︎

  73. Homosaurus: https://homosaurus.org/ ↩︎

  74. DOLCE: www.loa.istc.cnr.it/dolce/overview.html ↩︎

  75. It must be noted though that the use of DLs in KR predates the emergence of ontological modelling in the context of the Web, with its origins going back to the creation of the first DL modelling languages in the mid-1980s (Krötzsch et al., 2013). ↩︎

  76. LinkedDataGPT: https://ld.gpt.liip.ch/ ↩︎

  77. Neo4j: https://neo4j.com/ ↩︎

  78. GB, TB, and PB are units of digital information storage capacity. 1 GB is equal to 1,000,000,000 ($10^{9}$) bytes, 1 TB is equal to 1,000,000,000,000 ($10^{12}$) bytes, and 1 PB is equal to 1,000,000,000,000,000 ($10^{15}$) bytes. If a standard high-definition movie is around 4-5 GB, then 1 PB could store roughly 200,000 to 250,000 movies. In 2011, (Gomes et al., 2011) [p. 414] reported that the Internet Archive held 150,000 million contents of archived websites – crawled through the Wayback Machine – or approximately 5.5 PB. As of December 2021, it was about 57 PB of archived websites and a total used storage of 212 PB, see https://archive.org/web/petabox.php. ↩︎

  79. In this context, UX is understood as an umbrella term encompassing both user and/or customer service, emphasising that the focus is on individuals who need or use a given service, regardless of their categorisation as users or customers. ↩︎

  80. According to (Nargesian et al., 2019) [p. 1986], a data lake is a vast collection of datasets that has four characteristics. It can be stored in different storage systems, exhibit varying formats, may lack useful metadata or use differing metadata formats, and can change autonomously over time. ↩︎

  81. An interesting initiative in this area is the use of RAIL, which empower developers to restrict the use of AI on the software they develop to prevent irresponsible and harmful applications: https://www.licenses.ai/ ↩︎

  82. Common Objects in Context: https://cocodataset.org/ ↩︎

  83. Viscounth – A Large Dataset for Visual Question Answering for Cultural Heritage: https://github.com/misaelmongiovi/IDEHAdataset ↩︎

  84. Artificial Intelligence for Libraries, Archives & Museums: https://sites.google.com/view/ai4lam ↩︎

  85. AEOLIAN Network: https://www.aeolian-network.net/ ↩︎

  86. Newspaper Navigator: https://news-navigator.labs.loc.gov/ ↩︎

  87. (Perrigo, 2023) reported that Kenyan workers made less than USD 2 an hour to identify and filter out harmful content for ChatGPT. ↩︎

  88. FOSTER Plus (Fostering the practical implementation of Open Science in Horizon 2020 and beyond) was a 2-year EU-funded project initiated in 2017 with 11 partners across 6 countries. Its main goal was to promote a lasting shift in European researchers’ behaviour towards Open Science becoming the norm. ↩︎

  89. The Open Knowledge Foundation, a non-profit network established in 2004 in the U.K. that aims to promote the idea of open knowledge, sets out some principles around the concept of openness and defines it as follows: ‘Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)’. https://opendefinition.org/ ↩︎

  90. Phronesis[26:4] in philosophy is related to ‘practical understanding; wisdom, prudence; sound judgement’ (Oxford English Dictionary, 2023). ↩︎

  91. Zooniverse: https://www.zooniverse.org/ ↩︎

  92. FromThePage: https://fromthepage.com/ ↩︎

  93. FAIR Principles: https://www.go-fair.org/fair-principles/ ↩︎

  94. FAIR Signposting: https://signposting.org/FAIR/. Signposting focuses on expressing the topology of digital objects on the web with a view to increasing the FAIRness of scholarly objects in a distributed manner (Van de Sompel, 2023). ↩︎

  95. CARE Principles for Indigenous Data Governance: https://www.gida-global.org/care ↩︎

  96. The Santa Barbara Statement on Collections as Data: https://collectionsasdata.github.io/statement/ ↩︎

  97. British Library’s Research Repository: https://bl.iro.bl.uk/ ↩︎

  98. Data Foundry – Data collections from the National Library of Scotland: https://data.nls.uk/ ↩︎

  99. LoC Labs Data Sandbox: https://data.labs.loc.gov/ ↩︎

  100. Royal Danish Library’s Mediestream: https://www2.statsbiblioteket.dk/mediestream/avis ↩︎

  101. Meemoo’s Art in Flanders: https://artinflanders.be/ ↩︎

  102. BVMC Labs: https://data.cervantesvirtual.com/ ↩︎

  103. DATA-KBR-BE: https://www.kbr.be/en/projects/data-kbr-be/ ↩︎

  104. A Checklist to Publish Collections as Data in GLAM Institutions: https://glamlabs.io/checklist/ ↩︎

  105. The birth of the Web: https://home.cern/science/computing/birth-web ↩︎

  106. All general-purpose servers must support the methods GET and HEAD. All other methods are optional. ↩︎

  107. Schema.org: https://schema.org/ ↩︎

  108. (Wood et al., 2014, p. 35) ↩︎

  109. FOL, also known as first-order predicate logic or first-order predicate calculus, is a formal system of symbolic logic used in mathematics, philosophy, and computer science. It is a logical framework for expressing and reasoning about statements involving objects and their properties and relationships. In FOL, statements are represented using variables, constants, functions, and predicates. It allows for the quantification of variables and the formulation of statements such as ∀ (for all) and ∃ (there exists), which enable the expression of universal and existential quantification. As such, FOL can express facts concerning some or all of the objects in the universe. Its epistemological commitment, i.e. what an agent believes about facts, is concentrated on what is true, false, or unknown (see Russell & Norvig, 2010, p. 285 ff.). ↩︎

  110. It must be noted that DL, a subset of FOL briefly introduced in 3.2.4.4, has a more restricted syntax and semantics tailored for ontology modelling. ↩︎

  111. IRI is an extension of URI that allows for the use of international characters and symbols in web addresses. ↩︎

  112. JSON-LD will be discussed through examples in . ↩︎

  113. 5-star Open Data: https://5stardata.info/ ↩︎

  114. It is worth mentioning a related initiative to illustrate all datasets available as LOD in a so-called LOD cloud, which attempts to show the extent of LOD by domain or type on the web, but the website (https://lod-cloud.net/) is sometimes inaccessible. The sequence of diagrams, from 2007 up to 2020, can still be accessed on Wikidata: https://www.wikidata.org/entity/Q43984865 ↩︎

  115. Pelagios network: https://pelagios.org ↩︎

  116. Europeana: https://www.europeana.eu/ ↩︎

  117. BnF Data: https://data.bnf.fr/ ↩︎

  118. Swedish National Heritage Board: https://www.raa.se/ ↩︎

  119. Dutch Digital Heritage Network: https://netwerkdigitaalerfgoed.nl ↩︎

  120. LOD4Culture: https://lod4culture.gsic.uva.es/ ↩︎

  121. ‘Gartner’s Hype Cycle, introduced in 1995, characterizes the typical progression of an emerging technology from overenthusiasm through a period of disillusionment to an eventual understanding of the technology’s relevance and role in a market or domain’ (Linden & Fenn, 2003, p. 5). The x axis is concerned with maturity and the y axis with visibility. The five phases of the curve, which follow one another along the x axis, are as follows: technology trigger, peak of inflated expectations, trough of disillusionment, slope of enlightenment, and plateau of productivity. ↩︎

  122. Design Principles of LOUD: https://linked.art/loud/ ↩︎

  123. Initially, a snowballing method was considered to find additional references (see Wohlin, 2014), but it yielded limited results. Most papers referred to only one or two references related to LOUD, primarily citing Sanderson (2018) and the design principles on the Linked Art website. ↩︎

  124. ∨ stands for OR and ∧ for AND. ↩︎

  125. Google Scholar: https://scholar.google.com/ ↩︎

  126. Semantic Scholar: https://semanticscholar.org/ ↩︎

  127. Zenodo: https://zenodo.org/ ↩︎

  128. lobid: https://lobid.org ↩︎

  129. CAA Data Dragons: https://datadragon.link/ ↩︎

  130. Project at the Royal Museum for Central Africa: https://fine-arts-museum.be/en/research/research-projects/loud ↩︎

  131. A good example of a project combining different specifications without explicitly mentioning LOUD is the O’Keeffe Museum, where IIIF and Linked Art have been applied for cross-collection exploration (see Degler et al., 2020). ↩︎

  132. IIIF: https://iiif.io – the acronym is pronounced ‘triple eye eff’. ↩︎

  133. No, it wasn’t a napkin. ↩︎

  134. The conundrum refers to the challenge of information and effort being compartmentalised within organisations or repositories. ↩︎

  135. Shared Canvas Data Model: https://iiif.io/api/model/shared-canvas/1.0/ ↩︎

  136. OAI-ORE: https://www.openarchives.org/ore/ ↩︎

  137. IIIF Slack: https://iiif.slack.com ↩︎

  138. OpenURL: https://developers.exlibrisgroup.com/sfx/apis/web_services/openurl/ ↩︎

  139. Grants allocated to Stanford University by the Mellon Foundation to support the development of IIIF: https://www.mellon.org/grant-details/international-image-interoperability-framework-9268 and https://www.mellon.org/grant-details/lam-convergence-via-iiif-shared-canvas-devconx-and-bumblebee-10319 ↩︎

  140. See https://training.iiif.io/time_machine/iiif_intro/timeline.html ↩︎

  141. Mirador: https://projectmirador.org/ ↩︎

  142. UV: https://universalviewer.io/ ↩︎

  143. OSD: https://openseadragon.github.io/ ↩︎

  144. DICOM: https://www.dicomstandard.org/ ↩︎

  145. IIIF Groups: https://iiif.io/community/groups/ ↩︎

  146. IIIF Groups Framework: https://iiif.io/community/groups/framework/ ↩︎

  147. Past IIIF-Hosted Events: https://iiif.io/event/ ↩︎

  148. IIIF-C: https://iiif.io/community/consortium/ ↩︎

  149. IIIF Editorial Committee: https://iiif.io/community/editors/ ↩︎

  150. CoCo: https://iiif.io/community/coordinating-committee/ ↩︎

  151. TRC: https://iiif.io/community/trc/ ↩︎

  152. IIIF Specifications: https://iiif.io/api/ ↩︎

  153. i18n: https://jsonforms.io/docs/i18n/ ‘The term internationalisation is often represented as i18n, where 18 is the number of letters between the word’s opening “i” and closing “n”’ – (Sheldon, 2023). ↩︎

  154. For instance, although all IIIF APIs try to avoid dependencies on technology, per the fourth IIIF design principle, individuals and institutions from the wider community have still researched the current and best options in terms of image formats (JPEG2000 versus pyramidal TIFF), types of conversion, storage, and optimal software (see Cossu, 2019; Gomez et al., 2020; Robson et al., 2023; Rosenthaler, 2023). ↩︎

  155. IIIF Editorial Process: https://iiif.io/community/policy/editorial/ ↩︎

  156. IIIF-Discuss (mailing list): https://groups.google.com/g/iiif-discuss ↩︎

  157. IIIF on GitHub: https://github.com/iiif ↩︎

  158. IIIF Image API Compliance, Version 3: https://iiif.io/api/image/3.0/compliance/ ↩︎

  159. IIIF Image API Validator: https://iiif.io/api/image/validator/ ↩︎

  160. However, to provide such semantic metadata, IIIF adopters can leverage the rdfs:seeAlso property and point to structured metadata for aggregation and discovery purposes (Raemy, 2020, pp. 15–16). A registry of profiles is available for this specific use case: https://iiif.io/api/registry/profiles/. ↩︎

  161. IIIF Cookbook: https://iiif.io/api/cookbook/ ↩︎

  162. GeoJSON-LD: https://geojson.org/geojson-ld/ ↩︎

  163. IIIF Presentation API Validator: https://presentation-validator.iiif.io/ ↩︎

  164. IIIF Registry: https://registry.iiif.io/ ↩︎

  165. Cologny, Fondation Martin Bodmer, Cod. Bodmer 600a: The Life of Buddha, first book (Shaka no Honji, jō). https://doi.org/10.5076/e-codices-fmb-cb-0600a ↩︎

  166. Durham University’s Jalava: https://iiif.durham.ac.uk/jalava/universe.html ↩︎

  167. IIPImage: https://iipimage.sourceforge.io/ ↩︎

  168. Cantaloupe: https://cantaloupe-project.github.io/ ↩︎

  169. SIPI: https://sipi.io/ ↩︎

  170. Leaflet-IIIF: https://github.com/mejackreed/Leaflet-IIIF ↩︎

  171. Annona: https://ncsu-libraries.github.io/annona/multistoryboard/ ↩︎

  172. Clover: https://samvera-labs.github.io/clover-iiif/ ↩︎

  173. Ramp: https://iiif-react-media-player.netlify.app/ ↩︎

  174. IIIF 3.0 Viewer Matrix: https://iiif.io/api/cookbook/recipe/matrix/ ↩︎

  175. Storiiies: https://www.cogapp.com/storiiies ↩︎

  176. Exhibit: https://www.exhibit.so/ ↩︎

  177. Awesome IIIF: https://github.com/IIIF/awesome-iiif ↩︎

  178. e-codices: https://www.e-codices.ch/ ↩︎

  179. Guides to finding IIIF resources: https://iiif.io/guides/finding_resources/ ↩︎

  180. Cultural Japan: https://cultural.jp/ ↩︎

  181. WADM: https://www.w3.org/TR/annotation-model/ ↩︎

  182. OAC specified the Open Annotation Data Model, a precursor of WADM: https://web.archive.org/web/20230328030315/http://www.openannotation.org/spec/core/ ↩︎

  183. Wikibase: https://wikiba.se/ ↩︎

  184. The idea for this formulation is not mine and can be credited to Kembellec (2023), in his tweet about Bruno Bachimont and IIIF. I have revisited this thought to discuss Web Annotations. ↩︎

  185. The Web Annotation Working Group was chartered from August 2014 to October 2016, and was extended to February 2017 when WADM was officially vetted as a standard. https://www.w3.org/groups/wg/annotation/. ↩︎

  186. Linked Art: https://linked.art ↩︎

  187. Linked Art on GitHub: https://github.com/linked-art/linked.art ↩︎

  188. Linked Art Call, 8am Pacific, 11am Eastern on 2019-01-30 – Message on the Linked Art Google Group by Robert Sanderson: https://groups.google.com/g/linked-art/c/OCoz0lEdpYY ↩︎

  189. Kress Foundation: https://www.kressfoundation.org/ ↩︎

  190. AHRC: https://www.ukri.org/councils/ahrc/ ↩︎

  191. AAC: https://americanart.si.edu/about/american-art-collaborative ↩︎

  192. Pharos: http://pharosartresearch.org/ ↩︎

  193. ResearchSpace: https://researchspace.org/ ↩︎

  194. Linked Art V1.0 was finally released on 19 February 2025: https://linked.art/about/1.0/ ↩︎

  195. Linked Art Data Model: https://linked.art/model/ ↩︎

  196. Linked Art Profile of CIDOC-CRM: https://linked.art/model/profile/ ↩︎

  197. Notes on colours for data visualisation purposes: brown is used for physical objects, purple for digital objects, green for place, yellow and orange for conceptual entities, blue for activity, teal for timespan, and red for agents. ↩︎

  198. Linked Art Baseline Patterns: https://linked.art/model/base/ ↩︎

  199. Van Rijn, Rembrandt (1642). De Nachtwacht [Oil on canvas]. Amsterdam Museum on permanent loan to Rijksmuseum, Amsterdam, The Netherlands. http://hdl.handle.net/10934/RM0001.COLLECT.5216 ↩︎

  200. Vocabulary Terms in Linked Art: https://linked.art/model/vocab/ ↩︎

  201. The CIDOC-CRM namespaces from classes and properties, their prefix numbers, inconsistencies (is_, was_, etc.), as well as most dashes are removed in Linked Art. CamelCase is also used as common practice for classes. ↩︎

  202. Linked Art API Design Principles and Requirements: https://linked.art/api/1.0/principles/ ↩︎

  203. Linked Art API Endpoints: https://linked.art/api/1.0/endpoint/ ↩︎

  204. Linked Art Schema Definitions: https://linked.art/api/1.0/schema_docs/ ↩︎

  205. Linked Art HAL Links: https://linked.art/api/rels/1/ ↩︎

  206. Linked Art Community: https://linked.art/community/ ↩︎

  207. Oxford e-Research Centre: https://oerc.ox.ac.uk/ ↩︎

  208. First AHRC grant associated with Linked Art: https://linked.art/community/projects/researchnetwork/ ↩︎

  209. Second AHRC grant associated with Linked Art: https://linked.art/community/projects/linkedartii/ ↩︎

  210. Pre-Raphaelites Online: https://preraphaelitesonline.org/ ↩︎

  211. Linked Conservation Data: https://www.ligatus.org.uk/project/linked-conservation-data ↩︎

  212. LD4 Art & Design Affinity Group: https://ld4.github.io/art-design/ ↩︎

  213. EODEM: https://cidoc.mini.icom.museum/working-groups/documentation-standards/eodem-home/ ↩︎

  214. SARI Documentation: https://docs.swissartresearch.net/ ↩︎

  215. Arches: https://www.archesproject.org/ ↩︎

  216. Getty Conservation Institute: https://www.getty.edu/conservation/ ↩︎

  217. World Monuments Fund: https://www.wmf.org/ ↩︎

  218. LINCS: https://lincsproject.ca/ ↩︎

  219. HIMSS: https://www.himss.org ↩︎

  220. It is worth noting that although archives and museums have standardised their metadata practices later than libraries, they seem to have identified core models that can be more easily implemented to meet Linked Data principles. ↩︎

  221. ‘Praxiography is a concept that describes the particular methodology of practice theory–driven research.’ (Bueger, 2019) ↩︎

  222. NetworkX: https://networkx.org/ ↩︎

  223. Gephi: https://gephi.org/ ↩︎

  224. pandas – Python Data Analysis Library: https://pandas.pydata.org/ ↩︎

  225. RStudio: https://posit.co/download/rstudio-desktop/ ↩︎

  226. The LOUD Social Fabrics: https://github.com/julsraemy/loud-socialfabrics ↩︎

  227. Salsah: https://web.archive.org/web/20240229093041/https://www.salsah.org/ ↩︎

  228. Linked Art Collection Data Workflow: https://github.com/tgra/Linked-Art-Collection-Data-Workflow ↩︎

  229. Whisper: https://github.com/openai/whisper ↩︎

  230. IIIF Presentation API Validator: https://presentation-validator.iiif.io/ ↩︎

  231. LOUD Consistency: https://github.com/julsraemy/loud-consistency ↩︎

  232. Bourdieu’s theory of fields conceptualises social life as a series of distinct but interrelated arenas or ‘fields’, each with its own rules, structures and forms of capital. Actors within these fields compete for dominance, status and resources, guided by both their ‘habitus’ and the specific capital they possess. ↩︎

  233. The LOUD Social Fabrics: https://github.com/julsraemy/loud-socialfabrics/ ↩︎

  234. In the earlier days of the IIIF community, there was much more fluidity in terms of roles and responsibilities, especially within the Editorial Board, something Linked Art is still experiencing. ↩︎

  235. Do we have a documented example for expressing image rights?: https://github.com/linked-art/linked.art/issues/311 ↩︎

  236. API Displacements and/or Transclusions: https://github.com/linked-art/linked.art/issues/411 ↩︎

  237. Inject classes/properties into the CRM hierarchy?: https://github.com/linked-art/linked.art/issues/479 ↩︎

  238. IIIF Discovery TSG: https://iiif.io/community/groups/discovery/ ↩︎

  239. Cf. The LOUD Social Fabrics – Meeting Minutes: https://github.com/julsraemy/loud-socialfabrics/tree/main/01_consensus-advocacy/meeting-minutes ↩︎

  240. Registry of Profiles: https://iiif.io/api/registry/profiles/ ↩︎

  241. Thinking about IIIF and SEO: https://guides.iiif.io/IIIF-and-SEO/ ↩︎

  242. For one of these face-to-face meetings, documentation was produced daily over four days, resulting in four distinct meeting records. Therefore, while only five face-to-face gatherings occurred, they account for eight entries in the total of 115 meetings documented. ↩︎

  243. Release Image API and Presentation API 3.0: https://github.com/IIIF/trc/issues/37 ↩︎

  244. Further details can be found in the official announcement of the final release of the third version of the core IIIF specifications, available at https://iiif.io/news/2020/06/04/IIIF-C-Announces-Final-Release-of-3.0-Specifications/ ↩︎

  245. The TRC includes a distinct group of ex officio members alongside IIIF-C representatives and up to five community representatives. As of May 2020, IIIF-C comprised 58 member institutions. Despite the significance of the issue at hand, less than half of the eligible voting members, totalling 30, participated in this decision-making process, which can take place asynchronously and as late as two weeks after the meeting. ↩︎

  246. An example of deviation from the IIIF specifications can be seen in the Digitaler Lesesaal of the archives of Basel-Stadt and St. Gallen. Despite their efforts to implement IIIF within their systems (see Kansy & Lüthi, 2022), there are technical challenges that could hinder the accessibility of IIIF Manifests beyond their portals. Key issues include non-compliant IIIF Manifests, problematic CORS header configurations, and the way in which URIs are minted, which prevent their use in external IIIF clients. Such practices not only challenge the core tenets of interoperability, but also suggest the need for greater adherence to the IIIF design principles and the exploration of alternatives that would still ensure compliance. ↩︎

  247. A few months after this investigation, Aviary, the Glycerine Viewer, Theseus, and the Curation Viewer have been added to the support matrix. ↩︎

  248. More information can be found at https://github.com/julsraemy/loud-socialfabrics/tree/main/01_consensus-advocacy/guidelines. ↩︎

  249. Embedding HTML in descriptive properties: https://iiif.io/api/cookbook/recipe/0007-string-formats/ ↩︎

  250. Rights statement: https://iiif.io/api/cookbook/recipe/0008-rights/ ↩︎

  251. Support Deep Viewing with Basic Use of a IIIF Image API Service: https://iiif.io/api/cookbook/recipe/0005-image-service/ ↩︎

  252. Image in Annotations: https://iiif.io/api/cookbook/recipe/0377-image-in-annotation/ ↩︎

  253. Image of the Gänseliesel in Göttingen taken at the 2019 IIIF Conference: https://iiif.io/api/image/3.0/example/reference/918ecd18c2592080851777620de9bcb5-gottingen/full/max/0/default.jpg ↩︎

  254. Mahler Symphony Audio: https://fixtures.iiif.io/audio/indiana/mahler-symphony-3/CD1/medium/128Kbps.mp4 ↩︎

  255. Lunchroom Manners Video: https://fixtures.iiif.io/video/indiana/lunchroom_manners/high/lunchroom_manners_1024kb.mp4 ↩︎

  256. It must be said that each time a new pattern is created or modified on the Linked Art website, it is likely that existing URIs of JSON-LD serialisation may also change as they have been dynamically minted. ↩︎

  257. Baseline Patterns: https://linked.art/model/base/ ↩︎

  258. More information can be found at https://github.com/julsraemy/loud-socialfabrics/blob/main/01_consensus-advocacy/guidelines/patterns/linked_art_patterns_similarity.csv ↩︎

  259. The survey was promoted primarily on the Slack workspaces of both communities, on their respective mailing lists, and on Mastodon (see Raemy, 2023). ↩︎

  260. Among the survey respondents, 16 individuals reported involvement in both the IIIF and Linked Art communities, demonstrating a notable cross-community engagement. In addition, 38 participants were active in only one of the two communities: a large majority, 36 respondents, aligned themselves with the IIIF community, whereas only 2 respondents indicated that they were exclusively involved with Linked Art. Interestingly, a significant proportion, 25 individuals, indicated no affiliation with either community, highlighting the diverse levels of engagement and interest within the CH field. ↩︎

  261. IIIF Ambassadors: https://iiif.io/community/ambassadors/ ↩︎

  262. Codeberg: https://codeberg.org/ ↩︎

  263. IIIF Commons: https://github.com/IIIF-Commons ↩︎

  264. The contrasting models of support between Mirador and UV illustrate the diversity of needs and approaches within the IIIF ecosystem. The goal is to balance the development of versatile, multi-functional tools with the creation of specialised, task-driven applications. ↩︎

  265. Geonames: https://www.geonames.org/ ↩︎

  266. The legacy data model in JSON-LD can be viewed at https://raw.githubusercontent.com/Participatory-Image-Archives/pia-data-model/main/ontology/SGV-EKWS/salsah/sgv_old_ontology.json. ↩︎

  267. Cf. https://en.wikipedia.org/wiki/Camel_case ↩︎

  268. Within PIA, a dedicated tool was created so that the conservator-restorers could write notes about their work on objects from the three collections: https://pia-restoration.dhlab.unibas.ch/. ↩︎

  269. Author’s translation: ‘We need to give up on the idea of syntactic or structural interoperability through the use of a single model, whether for producing, storing or managing data within an information system’. ↩︎

  270. CHAOSS: https://chaoss.community/ ↩︎

  271. IIIF Annual Conference and Showcase - Los Angeles, CA, USA - June 4-7, 2024: https://iiif.io/event/2024/los-angeles/ ↩︎

  272. Micrio: https://micr.io/ ↩︎

  273. Message written on the IIIF Slack Workspace on 28 October 2022. ↩︎

  274. mirador-image-tools: https://github.com/ProjectMirador/mirador-image-tools ↩︎

  275. For instance, this user interface view of Claude Monet (1840-1926): https://lux.collections.yale.edu/view/person/642a0152-1567-4fbe-93f3-66f11c5cab9a and its Linked Art counterpart: https://lux.collections.yale.edu/data/person/642a0152-1567-4fbe-93f3-66f11c5cab9a ↩︎

  276. QLever: https://github.com/ad-freiburg/qlever ↩︎

  277. The title of the section is an homage to Bruno Latour and a passage found in his book ‘We have never been modern’. ↩︎

  278. The term ‘Global South’ has been replaced as it overgeneralises vastly different economies, cultures, and development stages across Africa, Asia, Latin America, and Oceania. It creates a problematic binary that masks the complex and diverse realities of these regions, while implying a homogeneity that doesn’t exist. More specific regional descriptors better acknowledge the unique contexts and contributions of institutions from these varied regions. ↩︎
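
As an illustration of note 253, the image request URL cited there follows the IIIF Image API pattern of `{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The following is a minimal Python sketch that splits such a URL into its parameters. The function name `parse_iiif_image_url` is hypothetical, and the sketch assumes a well-formed image request URL whose identifier contains no unencoded slashes.

```python
from urllib.parse import urlparse


def parse_iiif_image_url(url: str) -> dict:
    """Split a IIIF Image API image request URL into its parameters.

    Hypothetical helper for illustration only; it does not validate
    the parameter values against the specification.
    """
    path = urlparse(url).path
    # The last four path segments are region, size, rotation and quality.format.
    base, region, size, rotation, last = path.rsplit("/", 4)
    quality, fmt = last.rsplit(".", 1)
    # The identifier is the segment just before the region; anything
    # preceding it is treated as the server prefix.
    prefix, identifier = base.rsplit("/", 1)
    return {
        "prefix": prefix,
        "identifier": identifier,
        "region": region,
        "size": size,
        "rotation": rotation,
        "quality": quality,
        "format": fmt,
    }


# URL taken from note 253 (image of the Gänseliesel in Göttingen).
GOTTINGEN = (
    "https://iiif.io/api/image/3.0/example/reference/"
    "918ecd18c2592080851777620de9bcb5-gottingen/full/max/0/default.jpg"
)

params = parse_iiif_image_url(GOTTINGEN)
for name, value in params.items():
    print(f"{name}: {value}")
```

Read this way, the URL in note 253 requests the full region at the maximum available size, without rotation, in the default quality and as a JPEG.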

Bibliography

Adamou, Alessandro, Picca, Davide, Hou, Yumeng, & Loreto Granados-García, Paula. (2023). The Facets of Intangible Heritage in Southern Chinese Martial Arts: Applying a Knowledge-driven Cultural Contact Detection Approach. Journal on Computing and Cultural Heritage, 16(3), 63:1-63:27. https://doi.org/10.1145/3606702
Adamou, Alessandro. (2022). Shout LOUD on a road trip to FAIRness: Experience with integrating open research data at the Bibliotheca Hertziana. Journal of Art Historiography, 2022(27s). https://arthistoriography.wordpress.com/27s-dec22/
Ahmad, Yahaya. (2006). The Scope and Definitions of Heritage: From Tangible to Intangible. International Journal of Heritage Studies, 12(3), 292–300. https://doi.org/10.1080/13527250600604639
Akhtar, Nadeem. (2014). Social Network Analysis Tools. 2014 Fourth International Conference on Communication Systems and Network Technologies, 388–392. https://doi.org/10.1109/CSNT.2014.83
Akrich, Madeleine, Callon, Michel, & Latour, Bruno (Eds.). (2006). Sociologie de la traduction: Textes fondateurs. Presses des Mines. https://doi.org/10.4000/books.pressesmines.1181
Alexiev, Vladimir. (2018). Museum Linked Open Data: Ontologies, Datasets, Projects. Digital Presentation and Preservation of Cultural and Scientific Heritage, 8, 19–50. https://doi.org/10.55630/dipp.2018.8.1
Allen, Laurie. (2023). Why Experiment: Machine Learning at the Library of Congress. In The Library of Congress. https://blogs.loc.gov/thesignal/2023/11/why-experiment-machine-learning-at-the-library-of-congress/
Alter, George, Rizzolo, Flavio, & Schleidt, Kathi. (2023). View points on data points: A shared vocabulary for cross-domain conversations on data and metadata. IASSIST Quarterly, 47(1), 1–39. https://doi.org/10.29173/iq1051
Anderson, Michael L. (2014). After Phrenology: Neural Reuse and the Interactive Brain. The MIT Press. https://doi.org/10.7551/mitpress/10111.001.0001
Andresen, S. L. (2002). John McCarthy: Father of AI. IEEE Intelligent Systems, 17(5), 84–85. https://doi.org/10.1109/MIS.2002.1039837
Andresoo, Jane. (2018). The hundred-year-old national library’s message to future generations. Bosniaca, 90–94. https://doi.org/10.37083/bosn.2018.23.90
Angius, Nicola, Primiero, Giuseppe, & Turner, Raymond. (2021). The Philosophy of Computer Science. In Edward N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Spring 2021). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/spr2021/entries/computer-science/
Appleby, Michael, Childress, Dawn, Crane, Tom, Mixter, Jeff, Sanderson, Robert, & Warner, Simeon. (2022). IIIF Content Search API 2.0. In International Image Interoperability Framework. https://iiif.io/api/search/2.0/
Appleby, Michael, Childress, Dawn, Crane, Tom, Mixter, Jeff, Sanderson, Robert, & Warner, Simeon. (2023). IIIF Authorization Flow API 2.0. In International Image Interoperability Framework. https://iiif.io/api/auth/2.0/
Appleby, Michael, Childress, Dawn, Crane, Tom, Mixter, Jeff, Sanderson, Robert, Warner, Simeon, & Whitaker, Maria. (2022). IIIF Content State API 1.0. In International Image Interoperability Framework. https://iiif.io/api/content-state/1.0/
Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, & Warner, Simeon. (2018). IIIF Design Principles. In International Image Interoperability Framework. https://iiif.io/api/annex/notes/design_principles/
Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, & Warner, Simeon. (2020). IIIF Image API 3.0. In International Image Interoperability Framework. https://iiif.io/api/image/3.0/
Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, & Warner, Simeon. (2020). IIIF Presentation API 3.0. In International Image Interoperability Framework. https://iiif.io/api/presentation/3.0/
Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, & Warner, Simeon. (2021). IIIF Change Discovery API 1.0.0. In International Image Interoperability Framework. https://iiif.io/api/discovery/1.0/
Aslan, Zaki. (1997). Protective Structures for the Conservation and Presentation of Archaeological Sites. Journal of Conservation and Museum Studies, 3(0), 16. https://doi.org/10.5334/jcms.3974
Avram, Henriette D. (1968). The MARC Pilot Project. Final Report (ED029663; p. 173). Library of Congress. https://eric.ed.gov/?id=ED029663
Azzopardi, Elaine, Kenter, Jasper O., Young, Juliette, Leakey, Chris, O’Connor, Seb, Martino, Simone, Flannery, Wesley, Sousa, Lisa P., Mylona, Dimitra, Frangoudes, Katia, Béguier, Irène, Pafi, Maria, Silva, Arturo Rey da, Ainscough, Jacob, Koutrakis, Manos, Silva, Margarida Ferreira da, & Pita, Cristina. (2023). What are heritage values? Integrating natural and cultural heritage into environmental valuation. People and Nature, 5(2), 368–383. https://doi.org/10.1002/pan3.10386
Baader, Franz, & Lutz, Carsten. (2007). 13 Description logic. In Patrick Blackburn, Johan Van Benthem, & Frank Wolter (Eds.), Studies in Logic and Practical Reasoning (Vol. 3, pp. 757–819). Elsevier. https://doi.org/10.1016/S1570-2464(07)80016-4
Baca, Murtha, & Harpring, Patricia. (2017). Categories for the description of works of art [Report]. Getty Research Institute. https://apo.org.au/node/14985
Bachimont, Bruno. (2021). Archive et mémoire : Le numérique et les mnémophores. Signata. Annales Des Sémiotiques / Annals of Semiotics, 1(12). https://doi.org/10.4000/signata.2980
Bainbridge, Phil. (2021). Extract text from multiple Google Docs into a Sheet. In The Gift of Script. https://www.pbainbridge.co.uk/2021/07/extract-text-from-multiple-google-docs.html
Banerjee, Kyle. (2020). The Linked Data Myth. In Library Journal. https://www.libraryjournal.com/story/the-linked-data-myth
Barrile, Vincenzo, & Bernardo, Ernesto. (2022). Big Data and Cultural Heritage. In Francesco Calabrò, Lucia Della Spina, & María José Piñeira Mantiñán (Eds.), New Metropolitan Perspectives (Vol. 482, pp. 2708–2716). Springer International Publishing. https://doi.org/10.1007/978-3-031-06825-6_259
Bastian, Mathieu, Heymann, Sebastien, & Jacomy, Mathieu. (2009). Gephi: An Open Source Software for Exploring and Manipulating Networks. Proceedings of the International AAAI Conference on Web and Social Media, 3(1), 361–362. https://doi.org/10.1609/icwsm.v3i1.13937
Battle, Robert, & Benson, Edward. (2008). Bridging the semantic Web and Web 2.0 with Representational State Transfer (REST). Journal of Web Semantics, 6(1), 61–69. https://doi.org/10.1016/j.websem.2007.11.002
Bauer, Florian, & Kaltenböck, Martin. (2012). Linked Open Data: The Essentials. edition mono.
Becattini, Federico, Bongini, Pietro, Bulla, Luana, Bimbo, Alberto Del, Marinucci, Ludovica, Mongiovì, Misael, & Presutti, Valentina. (2023). VISCOUNTH: A Large-scale Multilingual Visual Question Answering Dataset for Cultural Heritage. ACM Transactions on Multimedia Computing, Communications, and Applications, 19(6), 193:1-193:20. https://doi.org/10.1145/3590773
Bechhofer, Sean, Buchan, Iain, De Roure, David, Missier, Paolo, Ainsworth, John, Bhagat, Jiten, Couch, Philip, Cruickshank, Don, Delderfield, Mark, Dunlop, Ian, Gamble, Matthew, Michaelides, Danius, Owen, Stuart, Newman, David, Sufi, Shoaib, & Goble, Carole. (2013). Why linked data is not enough for scientists. Future Generation Computer Systems, 29(2), 599–611. https://doi.org/10.1016/j.future.2011.08.004
Beckett, Dave. (2004). RDF/XML Syntax Specification (Revised). In W3C. https://www.w3.org/TR/REC-rdf-syntax/
Beckett, David, Berners-Lee, Tim, Prud’hommeaux, Eric, & Carothers, Gavin. (2014). RDF 1.1 Turtle. In W3C. https://www.w3.org/TR/turtle/
Bekiari, Chryssoula, Bruseker, George, Canning, Erin, Doerr, Martin, Michon, Philippe, Ore, Christian-Emil, Stead, Stephen, & Velios, Athanasios. (2024). CIDOC Conceptual Reference Model 7.1.3. https://www.cidoc-crm.org/Version/version-7.1.3
Bekiari, Chryssoula, Bruseker, George, Doerr, Martin, Ore, Christian-Emil, Stead, Stephen, & Velios, Athanasios. (2021). CIDOC Conceptual Reference Model 7.1.1. https://doi.org/10.26225/FDZH-X261
Benabdelkrim, Mohamed, Levallois, Clément, Savinien, Jean, & Robardet, Céline. (2020). Opening Fields: A Methodological Contribution to the Identification of Heterogeneous Actors in Unbounded Relational Orders. M@n@gement, 23(1), 4–18. https://doi.org/10.3917/mana.231.0004
Beretta, Francesco. (2021). A challenge for historical research: Making data FAIR using a collaborative ontology management environment (OntoME). Semantic Web, 12(2), 279–294. https://doi.org/10.3233/SW-200416
Beretta, Francesco. (2022). Interopérabilité des données de la recherche et ontologies fondationnelles : Un éco-système d’extensions du CIDOC CRM pour les sciences humaines et sociales. In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 2–22). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5281/zenodo.7014341
Bermès, Emmanuelle. (2023). Modélisons un peu : Le choix d’un type de bases de données. In Figoblog. https://figoblog.org/2023/12/13/modelisons-un-peu-le-choix-dun-type-de-bases-de-donnees/
Bernasconi, Eleonora, Ceriani, Miguel, & Mecella, Massimo. (2023). Linked Data interfaces: A survey. In Alessia Bardi, Alex Falcon, Stefano Ferilli, Stefano Marchesin, & Domenico Redavid (Eds.), Proceedings of the 19th Conference on Information and Research Science Connecting to Digital and Library Science (Vol. 3365, pp. 1–16). CEUR. https://ceur-ws.org/Vol-3365/#paper1
Berners-Lee, Tim, Cailliau, Robert, Luotonen, Ari, Nielsen, Henrik Frystyk, & Secret, Arthur. (1994). The World-Wide Web. Communications of the ACM, 37(8), 76–82. https://doi.org/10.1145/179606.179671
Berners-Lee, Tim, Fielding, Roy T., & Masinter, Larry M. (2005). Uniform Resource Identifier (URI): Generic Syntax (Request for Comments RFC 3986). Internet Engineering Task Force. https://doi.org/10.17487/RFC3986
Berners-Lee, Tim, Hendler, James, & Lassila, Ora. (2001). The Semantic Web. Scientific American, 284(5), 34–43. https://www.jstor.org/stable/26059207
Berners-Lee, Tim. (1991). WorldWideWeb - executive summary. In archive.md. https://archive.md/Lfopj
Berners-Lee, Tim. (1999). Realising the Full Potential of the Web. Technical Communication, 46(1), 79–82. https://www.jstor.org/stable/43088605
Berners-Lee, Tim. (2006). Linked Data. In W3C. https://www.w3.org/DesignIssues/LinkedData.html
Berners-Lee, Tim. (2010). Long Live the Web. Scientific American, 303(6), 80–85. https://www.jstor.org/stable/26002308
Berressem, Hanjo. (2015). Déjà Vu: Serres after Latour, Deleuze after Harman, ‘Nature Writing’ after ‘Network Theory’. Amerikastudien / American Studies, 60(1), 59–79. https://www.jstor.org/stable/44071895
Bezjak, Sonja, Conzett, Philipp, Fernandes, Pedro L., Görögh, Edit, Helbig, Kerstin, Kramer, Bianca, Labastida, Ignasi, Niemeyer, Kyle, Psomopoulos, Fotis, Ross-Hellauer, Tony, Schneider, René, Tennant, Jon, Verbakel, Ellen, & Clyburne-Sherin, April. (2019). The Open Science Training Handbook. FOSTER. https://doi.org/10.5281/zenodo.2587951
Bhawsar, Praphulla Ms, Bremer, Erich, Duggan, Máire A., Chanock, Stephen, Garcia-Closas, Montserrat, Saltz, Joel, & Almeida, Jonas S. (2023). ImageBox3: No-Server Tile Serving to Traverse Whole Slide Images on the Web [Preprint]. Research Square. https://doi.org/10.21203/rs.3.rs-2864977/v1
Binding, Ceri, Evans, Tim, Gilham, Jo, Tudhope, Douglas, & Wright, Holly. (2022). Linked Data for the Historic Environment. Internet Archaeology, 2022(59). https://doi.org/10.11141/ia.59.7
Bizer, Christian, Heath, Tom, & Berners-Lee, Tim. (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems (IJSWIS), 5(3), 1–22. https://doi.org/10.4018/jswis.2009081901
Bizer, Christian, Heath, Tom, Idehen, Kingsley, & Berners-Lee, Tim. (2008). Linked data on the web (LDOW2008). Proceedings of the 17th International Conference on World Wide Web, 1265–1266. https://doi.org/10.1145/1367497.1367760
Blue Shield. (2016). Blue Shield Statutes (Articles of Association) (p. 16). https://web.archive.org/web/20230802104458/https://theblueshield.org/wp-content/uploads/2021/12/statute-Amendments_BSI_2016.pdf
Bolter, J. David, & Grusin, Richard A. (1999). Remediation: Understanding new media. MIT Press.
Bonney, Rick. (1996). Citizen Science: A Lab Tradition. Living Bird, 15(4), 7–15. https://www.biodiversitylibrary.org/page/59717773
Borgo, Stefano, Ferrario, Roberta, Gangemi, Aldo, Guarino, Nicola, Masolo, Claudio, Porello, Daniele, Sanfilippo, Emilio M., & Vieu, Laure. (2022). DOLCE: A descriptive ontology for linguistic and cognitive engineering. Applied Ontology, 17(1), 45–69. https://doi.org/10.3233/AO-210259
Boullier, Dominique. (2018). Médialab stories: How to align actor network theory and digital methods. Big Data & Society, 5(2), 2053951718816722. https://doi.org/10.1177/2053951718816722
Bourdieu, Pierre. (1988). Homo Academicus. Stanford University Press.
Bowman, Blythe A. (2008). Transnational Crimes Against Culture: Looting at Archaeological Sites and the “Grey” Market in Antiquities. Journal of Contemporary Criminal Justice, 24(3), 225–242. https://doi.org/10.1177/1043986208318210
Bradner, Scott O. (1997). Key words for use in RFCs to Indicate Requirement Levels (Request for Comments RFC 2119). Internet Engineering Task Force. https://doi.org/10.17487/RFC2119
Brickley, Dan, & Guha, R. V. (2014). RDF Schema 1.1. In W3C. https://www.w3.org/TR/rdf11-schema/
Brown, Karen, Cummins, Alissandra, & González Rueda, Ana S. (Eds.). (2023). Communities and Museums in the 21st Century: Shared Histories and Climate Action (1st ed.). Routledge. https://doi.org/10.4324/9781003288138
Brown, Susan, & Martin, Kimberley. (2020, June). Towards a Linked Infrastructure for Networked Cultural Scholarship. CSDH-SCHN 2020. https://doi.org/10.17613/nwnq-tm04
Brown, Susan, Canning, Erin, Martin, Kim, Roger, Sarah, & Schoenberger, Zach. (2021). Linking Communities of Practice. CSDH-SCHN 2021. https://doi.org/10.17613/w6r2-n763
Bruns, Sasha, Tietz, Tabea, & Sack, Harald. (2023). 2.6 Logical Inference with RDF(S) [Lecture]. https://open.hpi.de/courses/knowledgegraphs2023/items/1bKJLlFAZfOcNGm3ZTjlVk
Bruseker, George, Carboni, Nicola, & Guillem, Anaïs. (2017). Cultural Heritage Data Management: The Role of Formal Ontology and CIDOC CRM. In Matthew L. Vincent, Víctor Manuel López-Menchero Bendicho, Marinos Ioannides, & Thomas E. Levy (Eds.), Heritage and Archaeology in the Digital Age: Acquisition, Curation, and Dissemination of Spatial Cultural Heritage Data (pp. 93–131). Springer International Publishing. https://doi.org/10.1007/978-3-319-65370-9_6
Bueger, Christian. (2019). Praxiography. In Paul Atkinson, Sara Delamont, Alexandru Cernat, Joseph W. Sakshaug, & Richard A. Williams (Eds.), SAGE Research Methods Foundations. SAGE Publications Ltd. https://doi.org/10.4135/9781526421036807154
Burchardt, Jørgen. (2014). Researchers Outside APC-Financed Open Access: Implications for Scholars Without a Paying Institution. Sage Open, 4(4), 2158244014551714. https://doi.org/10.1177/2158244014551714
Callon, Michel. (1999). Actor-Network Theory—The Market Test. The Sociological Review, 47(1_suppl), 181–195. https://doi.org/10.1111/j.1467-954X.1999.tb03488.x
Callon, Michel. (2001). Actor Network Theory. In Neil J. Smelser & Paul B. Baltes (Eds.), International Encyclopedia of the Social & Behavioral Sciences (pp. 62–66). Pergamon. https://doi.org/10.1016/B0-08-043076-7/03168-5
Cameron, Fiona. (2007). Beyond the Cult of the Replicant: Museums and Historical Digital Objects—Traditional Concerns, New Discourses. In Fiona Cameron & Sarah Kenderdine (Eds.), Theorizing Digital Cultural Heritage: A Critical Discourse. The MIT Press. https://doi.org/10.7551/mitpress/9780262033534.003.0004
Camus, Alexandre, Paneva-Marinova, Dessislava, & Luchev, Detelin. (2013). Digital Humanities: Challenges of the Transformation of Tools and Objects of Knowledge in Contemporary Humanities. Digital Presentation and Preservation of Cultural and Scientific Heritage, 3, 109–118. https://doi.org/10.55630/dipp.2013.3.11
Candela, Gustavo, Escobar, Pilar, Carrasco, Rafael C., & Marco-Such, Manuel. (2018). Migration of a library catalogue into RDA linked open data. Semantic Web, 9(4), 481–491. https://doi.org/10.3233/SW-170274
Candela, Gustavo, Gabriëls, Nele, Chambers, Sally, Dobreva, Milena, Ames, Sarah, Ferriter, Meghan, Fitzgerald, Neil, Harbo, Victor, Hofmann, Katrine, Holownia, Olga, Irollo, Alba, Mahey, Mahendra, Manchester, Eileen, Pham, Thuy-An, Potter, Abigail, & Van Keer, Ellen. (2023). A checklist to publish collections as data in GLAM institutions. Global Knowledge, Memory and Communication, ahead-of-print. https://doi.org/10.1108/GKMC-06-2023-0195
Candela, Gustavo, Pereda, Javier, Sáez, Dolores, Escobar, Pilar, Sánchez, Alexander, Torres, Andrés Villa, Palacios, Albert A., McDonough, Kelly, & Murrieta-Flores, Patricia. (2023). An ontological approach for unlocking the Colonial Archive. Journal on Computing and Cultural Heritage, 3594727. https://doi.org/10.1145/3594727
Canning, Erin, Brown, Susan, Roger, Sarah, & Martin, Kimberley. (2022). The Power to Structure: Making Meaning from Metadata Through Ontologies. KULA: Knowledge Creation, Dissemination, and Preservation Studies, 6(3), 1–15. https://doi.org/10.18357/kula.169
Cantara, Linda. (2005). METS: The Metadata Encoding and Transmission Standard. Cataloging & Classification Quarterly, 40(3–4), 237–253. https://doi.org/10.1300/J104v40n03_11
Capadisli, Sarven. (2020). Linked Research on the Decentralised Web [PhD Thesis, Rheinischen Friedrich-Wilhelms-Universität Bonn]. https://hdl.handle.net/20.500.11811/8352
Capadisli, Sarven. (2023). Social and Technical Decisions — Inherently Social, Decentralised, and for Everyone. https://csarven.ca/presentations/inherently-social-decentralised-and-for-everyone
Caplan, Priscilla, & Guenther, Rebecca S. (2005). Practical Preservation: The PREMIS Experience. Library Trends, 54(1), 111–124. https://muse.jhu.edu/pub/1/article/193223
Capurro, Carlotta, & Plets, Gertjan. (2020). Europeana, EDM, and the Europeanisation of Cultural Heritage Institutions. Digital Culture & Society, 6(2), 163–190. https://doi.org/10.14361/dcs-2020-0209
Caraffa, Costanza, & Serena, Tiziana. (2015). Introduction: Photography, Archives and the Discourse of Nation. In Costanza Caraffa & Tiziana Serena (Eds.), Photo Archives and the Idea of Nation (pp. 3–16). De Gruyter. https://doi.org/10.1515/9783110331837
Carman, John. (2009). Where the Value Lies: The importance of materiality to the immaterial aspects of heritage. In Emma Waterton & Laurajane Smith (Eds.), Taking Archaeology Out of Heritage (pp. 192–208). Cambridge Scholars Publishing.
Carroll, Stephanie Russo, Garba, Ibrahim, Figueroa-Rodríguez, Oscar L., Holbrook, Jarita, Lovett, Raymond, Materechera, Simeon, Parsons, Mark, Raseroka, Kay, Rodriguez-Lonebear, Desi, Rowe, Robyn, Sara, Rodrigo, Walker, Jennifer D., Anderson, Jane, & Hudson, Maui. (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19(1), 43. https://doi.org/10.5334/dsj-2020-043
Carroll, Stephanie Russo, Herczog, Edit, Hudson, Maui, Russell, Keith, & Stall, Shelley. (2021). Operationalizing the CARE and FAIR Principles for Indigenous data futures. Scientific Data, 8(1), 108. https://doi.org/10.1038/s41597-021-00892-0
Cerf, V., & Kahn, R. (1974). A Protocol for Packet Network Intercommunication. IEEE Transactions on Communications, 22(5), 637–648. https://doi.org/10.1109/TCOM.1974.1092259
Chambers, Sally, Lemmers, Frédéric, Pham, Thuy-An, Birkholz, Julie M., Ducatteeuw, Vincent, Jacquet, Antoine, Dillen, Wout, Ali, Dilawar, Milleville, Kenzo, & Verstockt, Steven. (2021). Collections as Data: Interdisciplinary experiments with KBR’s digitised historical newspapers: A Belgian case study. DH Benelux 2021, Abstracts. http://hdl.handle.net/1854/LU-8712404
Chang, Liang, Sattler, Uli, & Gu, Tianlong. (2014). An ABox Revision Algorithm for the Description Logic EL⊥. In Meghyn Bienvenu, Magdalena Ortiz, Riccardo Rosati, & Mantas Simkus (Eds.), Informal Proceedings of the 27th International Workshop on Description Logics (Vol. 1193, pp. 459–470). CEUR. https://ceur-ws.org/Vol-1193/#paper_64
Chardonnens, Anne. (2020). La gestion des données d’autorité archivistiques dans le cadre du Web de données [Doctoral thesis, Université libre de Bruxelles]. https://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/315804
Charles, David. (2017). Aristotle on Agency. In Oxford Handbooks Editorial Board (Ed.), The Oxford Handbook of Topics in Philosophy. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199935314.013.6
Charles, Valentine, & Isaac, Antoine. (2015). Enhancing the Europeana Data Model (EDM) (p. 21) [White paper]. Europeana Foundation. http://pro.europeana.eu/files/Europeana_Professional/Publications/EDM_WhitePaper_17062015.pdf
Chiquet, Vera, Felsing, Ulrike, & Fornaro, Peter. (2023). A Participatory Interface for a Photo Archives. Archiving Conference, 20, 109–111. https://doi.org/10.2352/issn.2168-3204.2023.20.1.23
Chiquet, Vera. (2023). How to digitally preserve UNESCO intangible cultural heritage? A web-archive for ephemeral events at the Basler Carnival. Archiving Conference, 20, 105–108. https://doi.org/10.2352/issn.2168-3204.2023.20.1.22
Chute, Ryan, & Van De Sompel, Herbert. (2008). Introducing djatoka: A Reuse Friendly, Open Source JPEG 2000 Image Server. D-Lib Magazine, 14(9/10). https://doi.org/10.1045/september2008-chute
Ciula, Arianna, & Eide, Øyvind. (2016). Modelling in digital humanities: Signs in context. Digital Scholarship in the Humanities, fqw045. https://doi.org/10.1093/llc/fqw045
Clavaud, Florence, & Wildi, Tobias. (2021). ICA Records in Contexts-Ontology (RiC-O): A Semantic Framework for Describing Archival Resources. Linked Archives International Workshop 2021, 3019, 79–92. https://enc.hal.science/hal-03965776
Cobb, Joan. (2015). The Journey to Linked Open Data: The Getty Vocabularies. Journal of Library Metadata, 15(3–4), 142–156. https://doi.org/10.1080/19386389.2015.1103081
Coburn, Erin, Lanzi, Elisa, O’Keefe, Elizabeth, Stein, Regine, & Whiteside, Ann. (2010). The Cataloging Cultural Objects experience: Codifying practice for the cultural heritage community. IFLA Journal, 36(1), 16–29. https://doi.org/10.1177/0340035209359561
Coburn, Erin, Light, Richard, McKenna, Gordon, Stein, Regine, & Vitzthum, Axel. (2010). LIDO - Lightweight Information Describing Objects Version 1.0. https://lido-schema.org/schema/v1.0/lido-v1.0-specification.pdf
Coleman, Catherine Nicole. (2020). Managing Bias When Library Collections Become Data. International Journal of Librarianship, 5(1), 8–19. https://doi.org/10.23974/ijol.2020.vol5.1.162
Commock, Tracy, & Newell, Dionne. (2023). Connecting Museums through Citizen Science: Jamaica/US partnership in environmental preservation. In Karen Brown, Alissandra Cummins, & Ana S. González Rueda (Eds.), Communities and Museums in the 21st Century (pp. 137–159). Routledge. https://doi.org/10.4324/9781003288138-9
Constantopoulos, Panos, & Dallas, Costis. (2008). Aspects of a digital curation agenda for cultural heritage. 2008 IEEE International Conference on Distributed Human-Machine Systems. Athens, Greece: IEEE, 1–6.
Conway, Paul. (2015). Digital transformations and the archival nature of surrogates. Archival Science, 15(1), 51–69. https://doi.org/10.1007/s10502-014-9219-z
Cornut, Murielle, Raemy, Julien Antoine, & Spiess, Florian. (2023). Annotations as Knowledge Practices in Image Archives: Application of Linked Open Usable Data and Machine Learning. Journal on Computing and Cultural Heritage, 16(4), 1–19. https://doi.org/10.1145/3625301
Cornut, Murielle. (2023). Open, edit, save: Über die performative Materialität privater Fotoalben. In Ulrich Hägele (Ed.), Kuratierte Erinnerungen: Das Fotoalbum (pp. 157–170). Waxmann.
Corpas, Manuel, Kovalevskaya, Nadezda V., McMurray, Amanda, & Nielsen, Fiona G. G. (2018). A FAIR guide for data providers to maximise sharing of human genomic data. PLOS Computational Biology, 14(3), e1005873. https://doi.org/10.1371/journal.pcbi.1005873
Cossham, Amanda Frances. (2017). Models of the bibliographic universe [PhD Thesis, Monash University]. https://doi.org/10.4225/03/596e9bc6c1d09
Cossu, Stefano. (2019). Getty Common Image Service (p. 50) [Research & Design Report]. J. Paul Getty Trust. https://www.getty.edu/project-files/iiif_getty_research_report.pdf
Cossu, Stefano. (2019). Labours of Love and Convenience: Dealing with Community-Supported Knowledge in Museums. Publications, 7(1), 19. https://doi.org/10.3390/publications7010019
Coyle, Karen, & Hillmann, Diane. (2007). Resource Description and Access (RDA): Cataloging Rules for the 20th Century. D-Lib Magazine, 13(1/2). https://doi.org/10.1045/january2007-coyle
Cramer, Dave. (2016). W3C and IDPF: Better Together? In Medium. https://medium.com/@dauwhe/w3c-and-idpf-better-together-c92988674444
Cramer, Tom. (2015). IIIF Consortium Formed. In International Image Interoperability Framework. https://iiif.io/news/2015/06/17/iiif-consortium/
Cramer, Tom. (2019). IIIF: A Brief History & Looking Forward. https://youtu.be/J_NgQhlFwA4
Crane, Tom. (2020). What is IIIF Content State? In Medium. https://tom-crane.medium.com/what-is-iiif-content-state-dd15a543939f
Crystal-Ornelas, Robert, Varadharajan, Charuleka, O’Ryan, Dylan, Beilsmith, Kathleen, Bond-Lamberty, Benjamin, Boye, Kristin, Burrus, Madison, Cholia, Shreyas, Christianson, Danielle S., Crow, Michael, Damerow, Joan, Ely, Kim S., Goldman, Amy E., Heinz, Susan L., Hendrix, Valerie C., Kakalia, Zarine, Mathes, Kayla, O’Brien, Fianna, Pennington, Stephanie C., … Agarwal, Deborah A. (2022). Enabling FAIR data in Earth and environmental science with community-centric (meta)data reporting formats. Scientific Data, 9(1), 700. https://doi.org/10.1038/s41597-022-01606-w
Cyganiak, Richard, Wood, David, & Lanthaler, Markus. (2014). RDF 1.1 Concepts and Abstract Syntax. In W3C. https://www.w3.org/TR/rdf11-concepts/
Czahajda, Radoslaw, Čairović, Neda, & Černko, Mitja. (2022). Live Online Education Efficiency Mediators From the Actor Network Theory Perspective. Frontiers in Education, 7. https://www.frontiersin.org/articles/10.3389/feduc.2022.859783
Daga, Enrico, Asprino, Luigi, Damiano, Rossana, Daquino, Marilena, Agudo, Belen Diaz, Gangemi, Aldo, Kuflik, Tsvi, Lieto, Antonio, Maguire, Mark, Marras, Anna Maria, Pandiani, Delfina Martinez, Mulholland, Paul, Peroni, Silvio, Pescarin, Sofia, & Wecker, Alan. (2022). Integrating Citizen Experiences in Cultural Heritage Archives: Requirements, State of the Art, and Challenges. Journal on Computing and Cultural Heritage, 15(1), 11:1-11:35. https://doi.org/10.1145/3477599
Dahlgren, Anna, & Hansson, Karin. (2020). The Diversity Paradox: Conflicting Demands on Metadata Production in Cultural Heritage Collections. Digital Culture & Society, 6(2), 239–256. https://doi.org/10.14361/dcs-2020-0212
Darmont, Jérôme, Favre, Cécile, Loudcher, Sabine, & Noûs, Camille. (2020). Data lakes for digital humanities. Proceedings of the 2nd International Conference on Digital Tools & Uses Congress, 1–4. https://doi.org/10.1145/3423603.3424004
Das, Souripriya, Sundara, Seema, & Cyganiak, Richard. (2012). R2RML: RDB to RDF Mapping Language. In W3C. https://www.w3.org/TR/r2rml/
Davis, Edie, & Heravi, Bahareh. (2021). Linked Data and Cultural Heritage: A Systematic Review of Participation, Collaboration, and Motivation. Journal on Computing and Cultural Heritage, 14(2), 21:1-21:18. https://doi.org/10.1145/3429458
De Muynke, Julien, Baltazar, Marie, Monferran, Martin, Voisenat, Claudie, & Katz, Brian F. G. (2022). Ears of the past, an inquiry into the sonic memory of the acoustics of Notre-Dame before the fire of 2019. Journal of Cultural Heritage. https://doi.org/10.1016/j.culher.2022.09.006
Debattista, Jeremy, Lange, Christoph, Scerri, Simon, & Auer, Sören. (2015). Linked ‘Big’ Data: Towards a Manifold Increase in Big Data Value and Veracity. 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), 92–98. https://doi.org/10.1109/BDC.2015.34
Degler, Duane, Butcosk, Charlie, & Chan, Rana. (2020). The Georgia O’Keeffe Museum Collections Website. In Design for Context. https://www.designforcontext.com/insights/okeeffe-museum-collections-browser
Delmas-Glass, Emmanuelle, & Sanderson, Robert. (2020). Fostering a community of PHAROS scholars through the adoption of open standards. Art Libraries Journal, 45(1), 19–23. https://doi.org/10.1017/alj.2019.32
Delmas-Glass, Emmanuelle. (2021). Lessons learned from digital collaborations: Standards and descriptive practices. In Lorraine A. Stuart, Thomas F. R. Clareson, & Joyce Ray (Eds.), Economic Considerations for Libraries, Archives and Museums (pp. 61–76). Routledge.
Demenchonok, Edward. (2018). Michel Foucault’s Theory of Practices of the Self and the Quest for a New Philosophical Anthropology. In Peace, Culture, and Violence (pp. 218–247). Brill. https://doi.org/10.1163/9789004361911_013
Denton, William. (2006). Functional Requirements for Bibliographic Records (FRBR): Hype or Cure-All? Portal: Libraries and the Academy, 6(2), 231–232. https://doi.org/10.1353/pla.2006.0018
Digital Preservation Coalition. (2017). Persistent Identifiers. In Digital Preservation Handbook. DPC. https://www.dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers
Dijkshoorn, Chris. (2023, October). Building Collection Data Infrastructure at the Rijksmuseum. EuropeanaTech 2023.
Dimou, Anastasia, & Vander Sande, Miel. (2022). RDF Mapping Language (RML). In RML. https://rml.io/specs/rml/
Doerr, Martin. (2003). The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata. AI Magazine, 24(3), 75–92. https://doi.org/10.1609/aimag.v24i3.1720
Drakopoulos, Georgios, Spyrou, Evaggelos, Voutos, Yorghos, & Mylonas, Phivos. (2019). A semantically annotated JSON metadata structure for open linked cultural data in Neo4j. Proceedings of the 23rd Pan-Hellenic Conference on Informatics, 81–88. https://doi.org/10.1145/3368640.3368659
Duin, Marcel. (2022). WebAssembly: Beyond the Browser. In Q42 Engineering. https://engineering.q42.nl/webassembly-beyond-the-browser/
Dunning, Alastair, Smaele, Madeleine de, & Böhmer, Jasmin. (2017). Are the FAIR Data Principles fair? International Journal of Digital Curation, 12(2), 177–195. https://doi.org/10.2218/ijdc.v12i2.567
Edmunds, Jeff. (2023). BIBFRAME Must Die. ScholarSphere, 1–7. https://doi.org/10.26207/V18M-0G05
Edwards, Elizabeth, & Hart, Janice (Eds.). (2004). Photographs Objects Histories (1st Edition). Routledge. https://doi.org/10.4324/9780203506493
Edwards, Elizabeth. (2011). Photographs: Material Form and the Dynamic Archive. In Costanza Caraffa, Courtauld Institute of Art, & Kunsthistorisches Institut in Florenz (Eds.), Photo archives and the photographic memory of art history (pp. 47–56). Deutscher Kunstverlag.
Eggmann, Sabine, & Bischoff, Christine. (2010). Interview : Zwischen ‘think big’ und pragmatischem Realismus - zur konzeptuellen Ausrichtung der SGV. Schweizer Volkskunde : Korrespondenzblatt Der Schweizerischen Gesellschaft Für Volkskunde = Folklore Suisse : Bulletin de La Société Suisse Des Traditions Populaires = Folclore Svizzero : Bollettino Della Società Svizzera per Le Tradizioni Popolari, 100(4), 157–160. https://doi.org/10.5169/SEALS-1003866
Ehrlinger, Lisa, & Wöß, Wolfram. (2016). Towards a Definition of Knowledge Graphs. In Michael Martin, Martí Cuquet, & Erwin Folmer (Eds.), Joint Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems - SEMANTiCS2016 and the 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS’16) (Vol. 1695). CEUR. https://ceur-ws.org/Vol-1695/#paper4
Emanuel, Jeffrey P. (2018). Chapter 9 - Stitching Together Technology for the Digital Humanities With the International Image Interoperability Framework (IIIF). In Robin Kear & Kate Joranson (Eds.), Digital Humanities, Libraries, and Partnerships (pp. 125–135). Chandos Publishing. https://doi.org/10.1016/B978-0-08-102023-4.00009-4
Emmanuel, Isitor, & Stanier, Clare. (2016). Defining Big Data. Proceedings of the International Conference on Big Data and Advanced Wireless Technologies, 1–6. https://doi.org/10.1145/3010089.3010090
Endres, Bill. (2019). Digitizing Medieval Manuscripts: The St. Chad Gospels, Materiality, Recoveries, and Representation in 2D & 3D. In Digitizing Medieval Manuscripts. ARC, Amsterdam University Press. https://doi.org/10.1515/9781942401803
Escamilla, Emily, Salsabil, Lamia, Klein, Martin, Wu, Jian, Weigle, Michele C., & Nelson, Michael L. (2023). It’s Not Just GitHub: Identifying Data and Software Sources Included in Publications. In Omar Alonso, Helena Cousijn, Gianmaria Silvello, Mónica Marrero, Carla Teixeira Lopes, & Stefano Marchesin (Eds.), Linking Theory and Practice of Digital Libraries (pp. 195–206). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-43849-3_17
Etemad, Elika J., & Rivoal, Florian. (2023). W3C Process Document. In W3C. https://www.w3.org/2023/Process-20230612/
European Commission. Directorate General for Research and Innovation. (2018). Turning FAIR into reality: Final report and action plan from the European Commission expert group on FAIR data. Publications Office. https://data.europa.eu/doi/10.2777/1524
Fafalios, Pavlos, Marketakis, Yannis, Axaridou, Anastasia, Tzitzikas, Yannis, & Doerr, Martin. (2023). A workflow model for holistic data management and semantic interoperability in quantitative archival research. Digital Scholarship in the Humanities, 38(3), 1049–1066. https://doi.org/10.1093/llc/fqad018
Felsing, Ulrike, & Cornut, Murielle. (2024). Re-Imagining the Collection of the Kreis Family. Research in Arts and Education, 2024(1), 41–53. https://doi.org/10.54916/rae.142567
Felsing, Ulrike, & Frischknecht, Max. (2021). Critical Map Visualizations. In Christine Schranz (Ed.), Shifts in Mapping (pp. 95–124). transcript Verlag. https://doi.org/10.1515/9783839460412-008
Felsing, Ulrike, Fornaro, Peter, Frischknecht, Max, & Raemy, Julien Antoine. (2023). Community and Interoperability at the Core of Sustaining Image Archives. Digital Humanities in the Nordic and Baltic Countries Publications, 5(1), 40–54. https://doi.org/10.5617/dhnbpub.10649
Ferrazzi, Sabrina. (2021). The Notion of “Cultural Heritage” in the International Field: Behind Origin and Evolution of a Concept. International Journal for the Semiotics of Law - Revue Internationale de Sémiotique Juridique, 34(3), 743–768. https://doi.org/10.1007/s11196-020-09739-0
Fielding, R., Nottingham, M., & Reschke, J. (2022). HTTP Semantics (Request for Comments RFC 9110). Internet Engineering Task Force. https://doi.org/10.17487/RFC9110
Fielding, Roy Thomas. (2000). Architectural Styles and the Design of Network-based Software Architectures [PhD Thesis, University of California]. https://ics.uci.edu/~fielding/pubs/dissertation/top.htm
Fink, Eleanor E. (2018). American Art Collaborative (AAC) Linked Open Data (LOD) Initiative (p. 81) [Overview and Recommendations for Good Practices]. American Art Collaborative. https://hdl.handle.net/10088/106410
Fiorentino, Sara, & Chinni, Tania. (2023). The Persistence of Memory. Exploring the Significance of Glass from Materiality to Intangible Values. Heritage, 6(6), 4834–4842. https://doi.org/10.3390/heritage6060257
Fišer, Darja, & Wuttke, Ulrike. (2018). Boost your eHumanities and eHeritage research with Research Infrastructures (PARTHENOS Webinar). https://hal.science/cel-01784097
Fitzpatrick, Kathleen. (2010). Reporting from the Digital Humanities 2010 Conference. In The Chronicle of Higher Education. https://web.archive.org/web/20190829004943/https://www.chronicle.com/blogs/profhacker/reporting-from-the-digital-humanities-2010-conference/25473
Floridi, Luciano. (2005). Is Semantic Information Meaningful Data? Philosophy and Phenomenological Research, 70(2), 351–370. https://www.jstor.org/stable/40040796
Floridi, Luciano. (2008). The Method of Levels of Abstraction. Minds and Machines, 18(3), 303–329. https://doi.org/10.1007/s11023-008-9113-7
Floridi, Luciano. (2010). Information: A very short introduction. Oxford University Press.
Floridi, Luciano. (2011). The philosophy of information. Oxford University Press.
Floridi, Luciano. (2023). On good and evil, the mistaken idea that technology is ever neutral, and the importance of the double-charge thesis [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4551487
Force, Donald C., & Smith, Randy. (2021). Context Lost: Digital Surrogates, Their Physical Counterparts, and the Metadata that Is Keeping Them Apart. The American Archivist, 84(1), 91–118. https://doi.org/10.17723/0360-9081-84.1.91
Fornaro, Peter, & Chiquet, Vera. (2020). Das Digital Humanities Lab der Universität Basel und die interdisziplinäre Zusammenarbeit mit der Kunstgeschichte und der Archäologie. Zeitschrift Für Schweizerische Archäologie Und Kunstgeschichte (ZAK), 77(2–3), 111–124. https://doi.org/10.5169/SEALS-882469
FOSTER. (2019). Open Science. In Foster Taxonomy. Facilitate Open Science Training for European Research. https://www.fosteropenscience.eu/taxonomy/term/100
Foucault, Michel. (1982). The Subject and Power. Critical Inquiry, 8(4), 777–795. https://www.jstor.org/stable/1343197
France, Fenella G., & Forsberg, Andrew. (2019). Linked Open and Annotated Science and Heritage Data. Archiving Conference, 16(1), 151–155. https://doi.org/10.2352/issn.2168-3204.2019.1.0.35
France, Fenella G., & Forsberg, Andrew. (2021). Addressing the Challenges of Interoperability and Cultural Heritage Data. Archiving Conference, 18, 33–37. https://doi.org/10.2352/issn.2168-3204.2021.1.0.8
France, Fenella G., & Toth, Michael B. (2014). Integrating science and art: The scriptospatial visualization interface. IFLA WLIC 2014 Proceedings, 1–12. https://library.ifla.org/id/eprint/763/
Freire, Nuno, & Isaac, Antoine. (2019). Technical Usability of Wikidata’s Linked Data. In Witold Abramowicz & Rafael Corchuelo (Eds.), Business Information Systems Workshops (pp. 556–567). Springer International Publishing. https://doi.org/10.1007/978-3-030-36691-9_47
Freire, Nuno, Calado, Pável, & Martins, Bruno. (2018). Availability of Cultural Heritage Structured Metadata in the World Wide Web. In Leslie Chan & Pierre Mounier (Eds.), ELPUB 2018. https://doi.org/10.4000/proceedings.elpub.2018.20
Freire, Nuno, Isaac, Antoine, Robson, Glen, Brooks, John, & Manguinhas, Hugo. (2017). A survey of Web technology for metadata aggregation in cultural heritage. Information Services & Use, 37(4), 425–436. https://doi.org/10.3233/ISU-170859
Freire, Nuno, Meijers, Enno, Valk, Sjors de, Raemy, Julien A., & Isaac, Antoine. (2021). Metadata Aggregation via Linked Data: Results of the Europeana Common Culture Project. In Emmanouel Garoufallou & María-Antonia Ovalle-Perandones (Eds.), Metadata and Semantic Research (pp. 383–394). Springer International Publishing. https://doi.org/10.1007/978-3-030-71903-6_35
Freire, Nuno, Meijers, Enno, Valk, Sjors de, Voorburg, René, Isaac, Antoine, & Cornelissen, Roland. (2018). Aggregation of Linked Data: A case study in the cultural heritage domain. 2018 IEEE International Conference on Big Data (Big Data), 522–527. https://doi.org/10.1109/BigData.2018.8622348
Freire, Nuno, Robson, Glen, Howard, John B., Manguinhas, Hugo, & Isaac, Antoine. (2018). Cultural heritage metadata aggregation using web technologies: IIIF, Sitemaps and Schema.org. International Journal on Digital Libraries. https://doi.org/10.1007/s00799-018-0259-5
Fresa, Antonella. (2013). A Data Infrastructure for Digital Cultural Heritage: Characteristics, Requirements and Priority Services. International Journal of Humanities and Arts Computing, 7(supplement), 29–46. https://doi.org/10.3366/ijhac.2013.0058
Frischknecht, Max. (2022). Generating Perspectives: Applying Generative Design to critically explore the Atlas of Swiss Folklore. DARIAH-CH Study Day 2022 Posters. https://doi.org/10.24451/arbor.17911
Fritschi, Ramona. (2017). E-codices–Virtual Manuscript Library of Switzerland: Great Achievements and Even Greater Challenges to Come. Digital Philology: A Journal of Medieval Cultures, 6(2), 244–256. https://doi.org/10.1353/dph.2017.0013
Ganascia, Jean-Gabriel. (2015). Abstraction of levels of abstraction. Journal of Experimental & Theoretical Artificial Intelligence, 27(1), 23–35. https://doi.org/10.1080/0952813X.2014.940685
Gandon, Fabien, & Schreiber, Guus. (2014). RDF 1.1 XML Syntax. In W3C. https://www.w3.org/TR/rdf-syntax-grammar/
Gandon, Fabien. (2017). Pour tout le monde : Tim Berners-Lee, lauréat du prix Turing 2016 pour avoir inventé… le Web. Bulletin 1024, 2017(11), 129–154. https://doi.org/10.48556/SIF.1024.11.129
Gandon, Fabien. (2019). The Web We Mix: Benevolent AIs for a Resilient Web. Proceedings of the 10th ACM Conference on Web Science, 115–116. https://doi.org/10.1145/3292522.3329406
Gandon, Fabien. (2019). Web Science, Artificial Intelligence and Intelligence Augmentation (Seminar Dagstuhl Perspectives Workshop 18262. 10 Years of Web Science: Closing The Loop; pp. 10–13). Schloss Dagstuhl — Leibniz-Zentrum für Informatik. https://inria.hal.science/hal-01976768
Gautschy, Rita. (2022). Swiss National Data and Service Center for the Humanities (DaSCH). https://doi.org/10.5281/zenodo.7251759
Georgopoulos, Andreas. (2018). CIPA’s Perspectives on Cultural Heritage. In Sander Münster, Kristina Friedrichs, Florian Niebling, & Agnieszka Seidel-Grzesińska (Eds.), Digital Research and Education in Architectural Heritage (pp. 215–245). Springer International Publishing. https://doi.org/10.1007/978-3-319-76992-9_13
Giacomo, Giuseppe De, & Lenzerini, Maurizio. (1996). TBox and ABox reasoning in expressive description logics. Proceedings of the Fifth International Conference on Principles of Knowledge Representation and Reasoning, 316–327. https://dl.acm.org/doi/10.5555/3087368.3087406
Gilliland, Anne J. (2016). Setting the Stage. In Murtha Baca (Ed.), Introduction to metadata (Third edition). Getty Research Institute. https://www.getty.edu/publications/intrometadata/setting-the-stage/
Ginhoven, Sandra van, & Rasterhoff, Claartje. (2019). Art Markets and Digital Histories. Arts, 8(3), 105. https://doi.org/10.3390/arts8030105
GO FAIR. (2016). FAIRification Process. In GO FAIR. https://www.go-fair.org/fair-principles/fairification-process/
Gobbo, Federico, & Benini, Marco. (2016). What Can We Know of Computational Information? Measuring, Quantity, and Quality at Work in Programmable Artifacts. Topoi, 35(1), 203–212. https://doi.org/10.1007/s11245-014-9248-5
Gomes, Daniel, Miranda, João, & Costa, Miguel. (2011). A Survey on Web Archiving Initiatives. In Stefan Gradmann, Francesca Borri, Carlo Meghini, & Heiko Schuldt (Eds.), Research and Advanced Technology for Digital Libraries (Vol. 6966, pp. 408–420). Springer. https://doi.org/10.1007/978-3-642-24469-8_41
Gomez, Joshua, Clarke, Kevin S., & Vuong, Anthony. (2020). IIIF by the Numbers. The Code4Lib Journal, 2020(48). https://journal.code4lib.org/articles/15217
Gough, David. (2007). Weight of Evidence: A framework for the appraisal of the quality and relevance of evidence. Research Papers in Education, 22(2), 213–228. https://doi.org/10.1080/02671520701296189
Goyal, Yash, Khot, Tejas, Summers-Stay, Douglas, Batra, Dhruv, & Parikh, Devi. (2017). Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. arXiv. https://doi.org/10.48550/arXiv.1612.00837
Graf, Stephan, Huber, Birgit, Özvegyi, Aila, & Peduzzi, Nicole. (2019). Das Fotoarchiv der Schweizerischen Gesellschaft für Volkskunde: Ein Ort der Forschung, der Begegnungen und Affekte. Rundbrief Fotografie, 26(1), 31–40.
Greenberg, Jane. (2005). Understanding Metadata and Metadata Schemes. Cataloging & Classification Quarterly, 40(3–4), 17–36. https://doi.org/10.1300/J104v40n03_02
Gruber, Thomas R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2), 199–220. https://doi.org/10.1006/knac.1993.1008
Gualandi, Bianca, Pareschi, Luca, & Peroni, Silvio. (2022). What do we mean by “data”? A proposed classification of data types in the arts and humanities. Journal of Documentation, 79(7), 51–71. https://doi.org/10.1108/JD-07-2022-0146
Guenther, Rebecca S. (2003). MODS: The Metadata Object Description Schema. Portal: Libraries and the Academy, 3(1), 137–150. https://doi.org/10.1353/pla.2003.0006
Guillem, Anaïs, Gros, Antoine, & De Luca, Livio. (2023, June). Faire parler les claveaux effondrés de la cathédrale Notre-Dame de Paris. Recueil Des Communications Du 4e Colloque Humanistica. https://hal.science/hal-04106101
Guillem, Anaïs, Gros, Antoine, Reby, Kevin, Abergel, Violette, & De Luca, Livio. (2023). RCC8 for CIDOC CRM: Semantic Modeling of Mereological and Topological Spatial Relations in Notre-Dame de Paris. In Antonis Bikakis, Roberta Ferrario, Stéphane Jean, Béatrice Markhoff, Alessandro Mosca, & Marianna Nicolosi Asmundo (Eds.), Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage (Vol. 3540). CEUR. https://ceur-ws.org/Vol-3540/#paper2
Hacıgüzeller, Piraye, Taylor, James Stuart, & Perry, Sara. (2021). On the Emerging Supremacy of Structured Digital Data in Archaeology: A Preliminary Assessment of Information, Knowledge and Wisdom Left Behind. Open Archaeology, 7(1), 1709–1730. https://doi.org/10.1515/opar-2020-0220
Hadro, Josh. (2019). Introduction to IIIF. https://youtu.be/l8kc8nH5f8I
Hadro, Josh. (2022). Newcomers Session: Introduction to the Community and Consortium. https://youtu.be/ojOy9fWlBRk
Hagberg, Aric, Swart, Pieter J., & Schult, Daniel A. (2008). Exploring network structure, dynamics, and function using NetworkX (LA-UR-08-05495; LA-UR-08-5495). Los Alamos National Laboratory (LANL), Los Alamos, NM (United States). https://www.osti.gov/biblio/960616
Hahnel, Mark, & Valen, Dan. (2020). How to (Easily) Extend the FAIRness of Existing Repositories. Data Intelligence, 2(1–2), 192–198. https://doi.org/10.1162/dint_a_00041
Haklay, Muki. (2013). Citizen Science and Volunteered Geographic Information: Overview and Typology of Participation. In Daniel Sui, Sarah Elwood, & Michael Goodchild (Eds.), Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice (pp. 105–122). Springer Netherlands. https://doi.org/10.1007/978-94-007-4587-2_7
Hammond, Michael. (2017). What is an online community? A new definition based around commitment, connection, reciprocity, interaction, agency, and consequences. International Journal of Web Based Communities, 13(1), 118–136. https://doi.org/10.1504/IJWBC.2017.082717
Haraway, Donna Jeanne. (2003). The companion species manifesto: Dogs, people, and significant otherness. Prickly Paradigm Press.
Haraway, Donna Jeanne. (2016). Staying with the trouble: Making kin in the Chthulucene. Duke University Press.
Haraway, Donna. (1988). Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective. Feminist Studies, 14(3), 575–599. https://doi.org/10.2307/3178066
Haraway, Donna. (2008). Encounters with Companion Species: Entangling Dogs, Baboons, Philosophers, and Biologists. Configurations, 14(1), 97–114. https://doi.org/10.1353/con.0.0002
Haraway, Donna. (2016). Tentacular Thinking: Anthropocene, Capitalocene, Chthulucene. E-Flux, 2016(75). https://www.e-flux.com/journal/75/67125/tentacular-thinking-anthropocene-capitalocene-chthulucene/
Hardesty, Juliet, & Nolan, Allison. (2021). Mitigating Bias in Metadata: A Use Case Using Homosaurus Linked Data. Information Technology and Libraries, 40(3). https://doi.org/10.6017/ital.v40i3.13053
Hardin, Garrett. (1968). The Tragedy of the Commons. Science, 162(3859), 1243–1248. https://doi.org/10.1126/science.162.3859.1243
Harpring, Patricia. (2010). Development of the Getty Vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation: Journal of the Art Libraries Society of North America, 29(1), 67–72. https://doi.org/10.1086/adx.29.1.27949541
Harpring, Patricia. (2018). Linking the Getty Vocabularies: The Content Perspective, Including an Update on CONA. 2018 Pacific Neighborhood Consortium Annual Conference and Joint Meetings (PNC), 1–8. https://doi.org/10.23919/PNC.2018.8579460
Hartig, Olaf, Champin, Pierre-Antoine, & Kellogg, Gregg. (2023). RDF 1.2 Concepts and Abstract Syntax. In W3C. https://www.w3.org/TR/rdf12-concepts/
Haslhofer, Bernhard, Simon, Rainer, Sanderson, Robert, & Sompel, Herbert Van de. (2011). The Open Annotation Collaboration (OAC) Model. 2011 Workshop on Multimedia on the Web, 5–9. https://doi.org/10.1109/MMWeb.2011.21
Hasnain, Ali, & Rebholz-Schuhmann, Dietrich. (2018). Assessing FAIR Data Principles Against the 5-Star Open Data Principles. In Aldo Gangemi, Anna Lisa Gentile, Andrea Giovanni Nuzzolese, Sebastian Rudolph, Maria Maleshkova, Heiko Paulheim, Jeff Z. Pan, & Mehwish Alam (Eds.), The Semantic Web: ESWC 2018 Satellite Events (pp. 469–477). Springer International Publishing. https://doi.org/10.1007/978-3-319-98192-5_60
Haugaard, Mark. (2022). Foucault and Power: A Critique and Retheorization. Critical Review, 34(3–4), 341–371. https://doi.org/10.1080/08913811.2022.2133803
Haxaire, Claudie. (2009). The Power of Ambiguity: The Nature and Efficacy of the Zamble Masks Revealed by “Disease Masks” Among the Gouro People (Côte d’Ivoire). Africa, 79(4), 543–569. https://doi.org/10.3366/E0001972009001065
Haynes, Ronald, Silverton, Edward, & Winchester, Julie. (2023). IIIF and 3D: The IIIF 3D Technical Specification Group. https://ark.dasch.swiss/ark:/72163/1/0810/frGZxfL_QgW9fGt8VUuOIwG
Haynes, Ronald. (2023). Evolving Standards in Digital Cultural Heritage – Developing a IIIF 3D Technical Specification. In Marinos Ioannides & Petros Patias (Eds.), 3D Research Challenges in Cultural Heritage III: Complexity and Quality in Digitisation (pp. 50–64). Springer International Publishing. https://doi.org/10.1007/978-3-031-35593-6_3
He, Y., Ma, Y. H., & Zhang, X. R. (2017). “Digital Heritage” Theory and Innovative Practice. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W5, 335–342. https://doi.org/10.5194/isprs-archives-XLII-2-W5-335-2017
Heath, Tom, & Bizer, Christian. (2011). Principles of Linked Data. In Tom Heath & Christian Bizer (Eds.), Linked Data: Evolving the Web into a Global Data Space (pp. 7–27). Springer International Publishing. https://doi.org/10.1007/978-3-031-79432-2_2
Hertz, Ellen, Graezer Bideau, Florence, Leimgruber, Walter, & Munz, Hervé. (2018). Politiques de la tradition. Le patrimoine culturel immatériel (Vol. 131). Presses polytechniques et universitaires romandes. https://edoc.unibas.ch/68569/
Hill, Linda, Buchel, Olha, Janée, Greg, & Zeng, Marcia Lei. (2002). Integration of Knowledge Organization Systems into Digital Library Architectures: Position Paper for 13th ASIST SIGICR Workshop, Reconceptualizing Classification Research. Advances in Classification Research Online, 13(1), 46–52. https://doi.org/10.7152/acro.v13i1.13835
Hillmann, Diane I., Marker, Rhonda, & Brady, Chris. (2008). Metadata Standards and Applications. The Serials Librarian, 54(1–2), 7–21. https://doi.org/10.1080/03615260801973364
HIMSS. (2020). Interoperability in Healthcare. In Healthcare Information and Management Systems Society. https://www.himss.org/resources/interoperability-healthcare
Hipler, Günter, Prongué, Nicolas, & Schneider, René. (2018). Swissbib und linked.swissbib.ch: Leistung und Potenziale einer offenen Plattform für Schweizer Bibliotheksdaten. In Alice Keller & Susanne Uhl (Eds.), Bibliotheken der Schweiz: Innovation durch Kooperation (pp. 160–172). De Gruyter. https://doi.org/10.1515/9783110553796-008
Hodge, Gail M. (2000). Systems of knowledge organization for digital libraries: Beyond traditional authority files. Digital Library Federation, Council on Library and Information Resources.
Hoffmann, Anna Lauren. (2021). Terms of inclusion: Data, discourse, violence. New Media & Society, 23(12), 3539–3556. https://doi.org/10.1177/1461444820958725
Hofmann, Kerstin P., Grunwald, Susanne, Lang, Franziska, Peter, Ulrike, Rösler, Katja, Rokohl, Louise, Schreiber, Stefan, Tolle, Karsten, & Wigg-Wolf, David. (2019). Ding-Editionen. Vom archäologischen (Be-)Fund übers Corpus ins Netz. E-Forschungsberichte Des DAI, 1–12. https://doi.org/10.34780/s7a5-71aj
Hopkins, Andrew. (2022). Glossary of acronyms used in Digital Humanities. Journal of Art Historiography, 2022(27s). https://doi.org/10.48352/UOBXJAH.00004194
Hou, Yumeng, & Kenderdine, Sarah. (2024). Ontology-based knowledge representation for traditional martial arts. Digital Scholarship in the Humanities, 1–18. https://doi.org/10.1093/llc/fqae005
Hou, Yumeng, Kenderdine, Sarah, Picca, Davide, Egloff, Mattia, & Adamou, Alessandro. (2022). Digitizing Intangible Cultural Heritage Embodied: State of the Art. Journal on Computing and Cultural Heritage, 15(3), 55:1–55:20. https://doi.org/10.1145/3494837
Huber, Birgit, & Frischknecht, Max. (2024). Digitalisierung und (De-)Konstruktion. Überlegungen zur Entwicklung eines Prototyps für die digitale Zugänglichmachung des «Atlas der Schweizerischen Volkskunde». In Sabine Eggmann & Konrad J. Kuhn (Eds.), Schweizerisches Archiv für Volkskunde / Archives suisses des traditions populaires (Vol. 2024/1, pp. 27–52). Chronos. https://doi.org/10.33057/CHRONOS.1785/27-51
Huber, Birgit. (2023). Die Entdeckung der «Brünig-Napf-Reuss-Linie». In Blog zur Schweizer Geschichte - Schweizerisches Nationalmuseum. https://blog.nationalmuseum.ch/2023/10/die-entdeckung-der-bruenig-napf-reuss-linie/
Hyvönen, Eero, Tuominen, Jouni, Alonen, Miika, & Mäkelä, Eetu. (2014). Linked Data Finland: A 7-star Model and Platform for Publishing and Re-using Linked Datasets. In Valentina Presutti, Eva Blomqvist, Raphael Troncy, Harald Sack, Ioannis Papadakis, & Anna Tordai (Eds.), The Semantic Web: ESWC 2014 Satellite Events (Vol. 8798, pp. 226–230). Springer International Publishing. https://doi.org/10.1007/978-3-319-11955-7_24
Hyvönen, Eero. (2012). Cultural Heritage on the Semantic Web. In Publishing and Using Cultural Heritage Linked Data on the Semantic Web (pp. 1–11). Springer International Publishing. https://doi.org/10.1007/978-3-031-79438-4_1
Hyvönen, Eero. (2020). Using the Semantic Web in digital humanities: Shift from data publishing to data-analysis and serendipitous knowledge discovery. Semantic Web, 11(1), 187–193. https://doi.org/10.3233/SW-190386
ICA Expert Group on Archival Description. (2023). Records in Context Conceptual Model 1.0. https://www.ica.org/sites/default/files/ric-cm-1.0_0.pdf
Idehen, Kingsley Uyi. (2017). Semantic Web Layer Cake Tweak, Explained. In OpenLink Software Blog. https://medium.com/openlink-software-blog/semantic-web-layer-cake-tweak-explained-6ba5c6ac3fab
Ioannides, Marinos, & Davies, Robert. (2019). Towards a Holistic Documentation and Wider Use of Digital Cultural Heritage. In Emmanouel Garoufallou, Fabio Sartori, Rania Siatri, & Marios Zervas (Eds.), Metadata and Semantic Research (pp. 76–88). Springer International Publishing. https://doi.org/10.1007/978-3-030-14401-2_7
Irish, Kathryn, & Saba, Jessica. (2023). Bots are the new fraud: A post-hoc exploration of statistical methods to identify bot-generated responses in a corrupt data set. Personality and Individual Differences, 213. https://doi.org/10.1016/j.paid.2023.112289
Irwin, Alan. (1995). Citizen Science: A Study of People, Expertise, and Sustainable Development. Routledge.
Isaksen, Leif, Simon, Rainer, Barker, Elton T. E., & Soto Cañamares, Pau de. (2014). Pelagios and the emerging graph of ancient world data. Proceedings of the 2014 ACM Conference on Web Science, 197–201. https://doi.org/10.1145/2615569.2615693
Izu, Benjamin Obeghare. (2022). The Sociocultural Significance of the Emedjo (Masquerade) Dance Among the Abraka People in Delta State, Nigeria. E-Journal of Humanities, Arts and Social Sciences, 413–423. https://doi.org/10.38159/ehass.2022394
Jackson, Steven J., Edwards, Paul N., Bowker, Geoffrey C., & Knobel, Cory P. (2007). Understanding infrastructure: History, heuristics and cyberinfrastructure policy. First Monday, 12(6). https://doi.org/10.5210/fm.v12i6.1904
Jacobs, Ian, & Walsh, Norman. (2004). Architecture of the World Wide Web, Volume One. In W3C. https://www.w3.org/TR/webarch/
Jacobsen, Annika, Miranda Azevedo, Ricardo de, Juty, Nick, Batista, Dominique, Coles, Simon, Cornet, Ronald, Courtot, Mélanie, Crosas, Mercè, Dumontier, Michel, Evelo, Chris T., Goble, Carole, Guizzardi, Giancarlo, Hansen, Karsten Kryger, Hasnain, Ali, Hettne, Kristina, Heringa, Jaap, Hooft, Rob W. W., Imming, Melanie, Jeffery, Keith G., … Schultes, Erik. (2020). FAIR Principles: Interpretations and Implementation Considerations. Data Intelligence, 2(1–2), 10–29. https://doi.org/10.1162/dint_r_00024
Jaillant, Lise, & Caputo, Annalina. (2022). Unlocking digital archives: Cross-disciplinary perspectives on AI and born-digital data. AI & SOCIETY, 37(3), 823–835. https://doi.org/10.1007/s00146-021-01367-x
Jaillant, Lise, & Rees, Arran. (2023). Applying AI to digital archives: Trust, collaboration and shared professional ethics. Digital Scholarship in the Humanities, 38(2), 571–585. https://doi.org/10.1093/llc/fqac073
Jansma, Sikke R., Dijkstra, Anne M., & Jong, Menno D. T. de. (2022). Co-creation in support of responsible research and innovation: An analysis of three stakeholder workshops on nanotechnology for health. Journal of Responsible Innovation, 9(1), 28–48. https://doi.org/10.1080/23299460.2021.1994195
Jaton, Florian. (2017). We get the algorithms of our ground truths: Designing referential databases in digital image processing. Social Studies of Science, 47(6), 811–840. https://doi.org/10.1177/0306312717730428
Jaton, Florian. (2021). Assessing biases, relaxing moralism: On ground-truthing practices in machine learning design and application. Big Data & Society, 8(1), 1–15. https://doi.org/10.1177/20539517211013569
Jaton, Florian. (2023). Groundwork for AI: Enforcing a benchmark for neoantigen prediction in personalized cancer immunotherapy. Social Studies of Science, 53(5), 787–810. https://doi.org/10.1177/03063127231192857
Jewkes, Rachel, & Murcott, Anne. (1996). Meanings of community. Social Science & Medicine, 43(4), 555–563. https://doi.org/10.1016/0277-9536(95)00439-4
Jodogne, Sébastien. (2023, October). On the Use of DICOM as a Storage Layer for IIIF. DaSCHCon 2023.2. http://hdl.handle.net/2078.1/279235
Josefsson, Simon. (2006). The Base16, Base32, and Base64 Data Encodings (Request for Comments RFC 4648). Internet Engineering Task Force. https://doi.org/10.17487/RFC4648
Junginger, Pauline, & Dörk, Marian. (2021). Categorizing Queer Identities: An Analysis of Archival Practices Using the Concept of Boundary Objects. Journal of Feminist Scholarship, 19(19). https://doi.org/10.23860/jfs.2021.19.05
Kansy, Lambert, & Lüthi, Martin. (2022). Going digital – Ein digitaler Lesesaal für die Staatsarchive Basel-Stadt und St.Gallen. ABI Technik, 42(3), 144–156. https://doi.org/10.1515/abitech-2022-0028
Katz, Brian F. G. (2023, October). Digitally exploring the acoustic history of Notre-Dame Cathedral. EuropeanaTech 2023. https://youtu.be/JDcNV_X54oQ
Kelly, Mike. (2023). JSON Hypertext Application Language (Internet Draft draft-kelly-json-hal-10). Internet Engineering Task Force. https://datatracker.ietf.org/doc/draft-kelly-json-hal-10
Kembellec, Gérald. (2023). Je viens d’avoir une fulgurance et relisant un texte de B. Bachimont et en le croisant avec les recherches d’@azaroth42 sur #IIIF. L’"Hypotypose sémantique" qui décrirait finement tout ou partie d’une œuvre d’art par les schémas et les autorités. Reste le navigateur à créer... https://t.co/Aj1BzQb0x8 [Tweet]. In Twitter. https://web.archive.org/web/20230411062913/https://twitter.com/kembellec/status/1643244527124135936
Kesteren, Anne van. (2023). Fetch Living Standard. In Web Hypertext Application Technology Working Group. https://fetch.spec.whatwg.org/
Kiley, Robert, & Crane, Tom. (2016). Publishing scientific images using the IIIF. In eLife. https://elifesciences.org/labs/aabe94cd/publishing-scientific-images-using-the-iiif
Klammt, Anne. (2019). LOUD Discourse: ›Linked Open Usable Data‹ (LOUD)-basierte Dokumentation zur Herstellung und Entwicklung typologischer Ordnungen am Beispiel archäologischer Keramikforschung / Documentation basée sur les « linked Open Usable Data » (LOUD) pour l’établissement et le développement de classements typologiques à partir de l’exemple de la céramologie. Deutsches Forum Für Kunstgeschichte (DFK Paris): Jahresbericht / Rapport Annuel, 77–79. https://doi.org/10.11588/dfkjb.2019.1.79405
Klic, Lukas. (2019). Digital publishing and research infrastructure for cultural heritage: An institutional roadmap [PhD Thesis, Università Ca’ Foscari Venezia]. http://dspace.unive.it/handle/10579/15587
Knoblock, Craig A., Szekely, Pedro, Fink, Eleanor, Degler, Duane, Newbury, David, Sanderson, Robert, Blanch, Kate, Snyder, Sara, Chheda, Nilay, Jain, Nimesh, Raju Krishna, Ravi, Begur Sreekanth, Nikhila, & Yao, Yixiang. (2017). Lessons Learned in Building Linked Data for the American Art Collaborative. In Claudia d’Amato, Miriam Fernandez, Valentina Tamma, Freddy Lecue, Philippe Cudré-Mauroux, Juan Sequeda, Christoph Lange, & Jeff Heflin (Eds.), The Semantic Web – ISWC 2017 (Vol. 10588, pp. 263–279). Springer International Publishing. https://doi.org/10.1007/978-3-319-68204-4_26
Knöchelmann, Marcel. (2019). Open Science in the Humanities, or: Open Humanities? Publications, 7(4), 65. https://doi.org/10.3390/publications7040065
Knublauch, Holger, & Kontokostas, Dimitris. (2017). Shapes Constraint Language (SHACL). In W3C. https://www.w3.org/TR/shacl/
Koch, Inês, Ribeiro, Cristina, & Teixeira Lopes, Carla. (2020). ArchOnto, a CIDOC-CRM-Based Linked Data Model for the Portuguese Archives. In Mark Hall, Tanja Merčun, Thomas Risse, & Fabien Duchateau (Eds.), Digital Libraries for Open Knowledge (Vol. 12246, pp. 133–146). Springer International Publishing. https://doi.org/10.1007/978-3-030-54956-5_10
Krötzsch, Markus, Simancik, Frantisek, & Horrocks, Ian. (2013). A Description Logic Primer. arXiv. https://doi.org/10.48550/arXiv.1201.4089
Kuhn, Thomas S. (1994). The structure of scientific revolutions (2. ed., enlarged, 21. print). Univ. of Chicago Press.
Lagoze, Carl, Van de Sompel, Herbert, Nelson, Michael, & Warner, Simeon. (2002). The Open Archives Initiative Protocol for Metadata Harvesting - v.2.0. In Open Archives Initiative. http://www.openarchives.org/OAI/openarchivesprotocol.html
Lagoze, Carl, Van de Sompel, Herbert, Nelson, Michael, Warner, Simeon, Sanderson, Robert, & Johnston, Pete. (2012). A Web-based resource model for scholarship 2.0: Object reuse & exchange. Concurrency and Computation: Practice and Experience, 24(18), 2221–2240. https://doi.org/10.1002/cpe.1594
Laney, Doug. (2001). 3D data management: Controlling data volume, velocity and variety. META Group Research Note, 6(70), 1.
Laskey, Kathryn B., & Laskey, Kenneth. (2009). Service oriented architecture. WIREs Computational Statistics, 1(1), 101–105. https://doi.org/10.1002/wics.8
Lassila, Ora, & Swick, Ralph R. (1999). Resource Description Framework (RDF) Model and Syntax Specification. In W3C. https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
Latour, Bruno, & Woolgar, Steve. (1986). Laboratory Life: The Construction of Scientific Facts. Princeton University Press. https://doi.org/10.2307/j.ctt32bbxc
Latour, Bruno, Jensen, Pablo, Venturini, Tommaso, Grauwin, Sébastian, & Boullier, Dominique. (2012). “The whole is always smaller than its parts” - a digital test of Gabriel Tardes’ monads. The British Journal of Sociology, 63(4), 590–615. https://doi.org/10.1111/j.1468-4446.2012.01428.x
Latour, Bruno. (1990). Postmodern? No, simply amodern! Steps towards an anthropology of science. Studies in History and Philosophy of Science Part A, 21(1), 145–171. https://doi.org/10.1016/0039-3681(90)90018-4
Latour, Bruno. (1993). We have never been modern. Harvard University Press.
Latour, Bruno. (1996). On actor-network theory: A few clarifications. Soziale Welt, 47(4), 369–381. https://www.jstor.org/stable/40878163
Latour, Bruno. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press.
Latour, Bruno. (2022). Habiter la Terre : Entretiens avec Nicolas Truong. Éditions Les Liens qui libèrent ; Arte éditions.
Lave, Jean, & Wenger, Etienne. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.
Lee, Christopher A. (2009). Open Archival Information System (OAIS) Reference Model. In Marcia J. Bates & Mary Niles Maack (Eds.), Encyclopedia of Library and Information Sciences, Third Edition (3rd ed., pp. 4020–4030). CRC Press. https://doi.org/10.1081/E-ELIS3-120044377
Lee, Yong-Ju, & Kim, Chang-Su. (2011). A Learning Ontology Method for RESTful Semantic Web Services. 2011 IEEE International Conference on Web Services, 251–258. https://doi.org/10.1109/ICWS.2011.59
Leeuwen, Jan van. (2014). On Floridi’s Method of Levels of Abstraction. Minds and Machines, 24(1), 5–17. https://doi.org/10.1007/s11023-013-9321-7
Leigh Star, Susan, Bowker, Geoffrey C., & Neumann, Laura J. (2003). Transparency beyond the Individual Level of Scale: Convergence between Information Artifacts and Communities of Practice. In Ann P. Bishop, Nancy A. Van House, & Barbara Pfeil Buttenfield (Eds.), Digital library use: Social practice in design and evaluation (pp. 241–269). MIT Press.
Leimgruber, Walter, Andris, Silke, & Bischoff, Christine. (2011). Von der Volkskunde zur Kulturwissenschaft. Schweizerisches Archiv Für Volkskunde = Archives Suisses Des Traditions Populaires, 107(1), 77–88. https://doi.org/10.5169/SEALS-154340
Leimgruber, Walter. (2008). Was ist immaterielles Kulturerbe? Bulletin / Schweizerische Akademie Der Geistes- Und Sozialwissenschaften, 2008, H. 2, 24–25. http://edoc.unibas.ch/dok/A5251330
Leimgruber, Walter. (2010). Switzerland and the UNESCO Convention on Intangible Cultural Heritage. Journal of Folklore Research, 47(1–2), 161–196. https://doi.org/10.2979/JFR.2010.47.1-2.161
Leiner, Barry M., Cerf, Vinton G., Clark, David D., Kahn, Robert E., Kleinrock, Leonard, Lynch, Daniel C., Postel, Jon, Roberts, Lawrence G., & Wolff, Stephen S. (1997). The past and future history of the Internet. Communications of the ACM, 40(2), 102–108. https://doi.org/10.1145/253671.253741
Lemmer-Webber, Christine, & Tallon, Jessica. (2018). ActivityPub. In W3C. https://www.w3.org/TR/activitypub/
Lemos, Daniela Lucas Da Silva, Martins, Dalton Lopes, Sá, Asla Medeiros E., Martins, Luciana Conrado, & Carmo, Danielle Do. (2022). A Proposal in Creating a Semantic Repository for Digital 3D Replicas: The Case of Modernist Sculptures in Public Spaces of Rio De Janeiro. Knowledge Organization, 49(3), 151–171. https://doi.org/10.5771/0943-7444-2022-3-151
Lenzerini, Federico. (2011). Intangible Cultural Heritage: The Living Culture of Peoples. European Journal of International Law, 22(1), 101–120. https://doi.org/10.1093/ejil/chr006
Lewenstein, Bruce V. (2022). Is Citizen Science a Remedy for Inequality? The ANNALS of the American Academy of Political and Social Science, 700(1), 183–194. https://doi.org/10.1177/00027162221092697
Li, Linda C., Grimshaw, Jeremy M., Nielsen, Camilla, Judd, Maria, Coyte, Peter C., & Graham, Ian D. (2009). Evolution of Wenger’s concept of community of practice. Implementation Science, 4(1), 11. https://doi.org/10.1186/1748-5908-4-11
Li, Xigao, Azad, Babak Amin, Rahmati, Amir, & Nikiforakis, Nick. (2021). Good Bot, Bad Bot: Characterizing Automated Browsing Activity. 2021 IEEE Symposium on Security and Privacy (SP), 1589–1605. https://doi.org/10.1109/SP40001.2021.00079
Lim, Shirley, & Li Liew, Chern. (2011). Metadata quality and interoperability of GLAM digital images. Aslib Proceedings, 63(5), 484–498. https://doi.org/10.1108/00012531111164978
Lin, Tsung-Yi, Maire, Michael, Belongie, Serge, Hays, James, Perona, Pietro, Ramanan, Deva, Dollár, Piotr, & Zitnick, C. Lawrence. (2014). Microsoft COCO: Common Objects in Context. In David Fleet, Tomas Pajdla, Bernt Schiele, & Tinne Tuytelaars (Eds.), Computer Vision – ECCV 2014 (Vol. 8693, pp. 740–755). Springer International Publishing. https://doi.org/10.1007/978-3-319-10602-1_48
Linden, Alexander, & Fenn, Jackie. (2003). Understanding Gartner’s hype cycles (Strategic Analysis Report R-20-1971; p. 12). Gartner. https://web.archive.org/web/20231108132356/http://ask-force.org/web/Discourse/Linden-HypeCycle-2003.pdf
Linden, Alexander. (2015). Hype Cycle for Advanced Analytics and Data Science, 2015 [Strategic Analysis Report]. Gartner. https://www.gartner.com/en/documents/3087721
Lindenthal, Jutta, Meiners, Hanna-Lena, & Balzer, Detlev. (2023). LIDO Primer. In LIDO. https://lido-schema.org/documents/primer/latest/lido-primer.html
Lit, L. W. C. van. (2020). The Digital Materiality of Digitized Manuscripts. In Among Digitized Manuscripts. Philology, Codicology, Paleography in a Digital World (pp. 51–72). Brill. https://www.jstor.org/stable/10.1163/j.ctv2gjwzrd.6
Llewellyn, Clare, Sanderson, Robert, Page, Kevin R., Bhaugeerutty, Aruna, Shapland, Andrew, Davis, Kelly, Delmas-Glass, Emmanuelle, & Tyler, Bonnet. (2023). Enriching Exhibition Scholarship. In Anne Baillot, Walter Scholger, Toma Tasovac, & Georg Vogeler (Eds.), Digital Humanities 2023 Book of Abstracts (Vol. 2023, pp. 516–517). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5281/zenodo.8107930
Lodi, Giorgia, Asprino, Luigi, Nuzzolese, Andrea Giovanni, Presutti, Valentina, Gangemi, Aldo, Recupero, Diego Reforgiato, Veninata, Chiara, & Orsini, Annarita. (2017). Semantic Web for Cultural Heritage Valorisation. In Shalin Hai-Jew (Ed.), Data Analytics in Digital Humanities (pp. 3–37). Springer International Publishing. https://doi.org/10.1007/978-3-319-54499-1_1
Loulanski, Tolina. (2006). Revising the Concept for Cultural Heritage: The Argument for a Functional Approach. International Journal of Cultural Property, 13(2), 207–233. https://doi.org/10.1017/S0940739106060085
Lowenthal, David. (2005). Natural and cultural heritage. International Journal of Heritage Studies, 11(1), 81–92. https://doi.org/10.1080/13527250500037088
Lowry, Christopher S., & Stepenuck, Kristine F. (2021). Is Citizen Science Dead? Environmental Science & Technology, 55(8), 4194–4196. https://doi.org/10.1021/acs.est.0c07873
Ma, Yue, & Hitzler, Pascal. (2009). Paraconsistent Reasoning for OWL 2. In Axel Polleres & Terrance Swift (Eds.), Web Reasoning and Rule Systems (pp. 197–211). Springer. https://doi.org/10.1007/978-3-642-05082-4_14
Machiya, Daichi, Takahashi, Michiko, & Koguchi, Yuka. (2023). Improvement of metadata interoperability for promoting distribution and utilization of Japanese digital cultural resources. Proceedings of the International Conference on Dublin Core and Metadata Applications, 1–3. https://doi.org/10.23106/dcmi.953168702
Mahony, Simon. (2018). Cultural Diversity and the Digital Humanities. Fudan Journal of the Humanities and Social Sciences, 11(3), 371–388. https://doi.org/10.1007/s40647-018-0216-0
Manguinhas, Hugo, Matas, Ariadna, & Škrinjar, Maša. (2023). Enrichments policy for the common European data space for cultural heritage (Enrichments Policy CNECT/LUX/2021/OP/0070; p. 9). Europeana. https://pro.europeana.eu/post/enrichments-policy-for-the-common-european-data-space-for-cultural-heritage
Manovich, Lev. (2017). Cultural Analytics, Social Computing and Digital Humanities. In Mirko Tobias Schäfer & Karin van Es (Eds.), The Datafied Society (pp. 55–68). Amsterdam University Press. https://www.jstor.org/stable/j.ctt1v2xsqn.8
Manz, Marian Clemens, Raemy, Julien Antoine, & Fornaro, Peter. (2023). Recommended 3D Workflow for Digital Heritage Practices. Archiving Conference, 20, 23–28. https://doi.org/10.2352/issn.2168-3204.2023.20.1.5
Marcondes, Carlos Henrique. (2021). Integrated classification schemas to interlink cultural heritage collections over the web using LOD technologies. International Journal of Metadata, Semantics and Ontologies, 15(3), 170. https://doi.org/10.1504/IJMSO.2021.123040
Martinez Demarco, Sol. (2019). Empowering women through digital skills in Argentina: A tale of two stories. TATuP - Zeitschrift Für Technikfolgenabschätzung in Theorie Und Praxis, 28(2), 23–28. https://doi.org/10.14512/tatup.28.2.s23
Martinez Demarco, Sol. (2023). From digital inclusion to IT appropriation: Gendered aspects of appropriation imaginary and practices. GENDER – Zeitschrift Für Geschlecht, Kultur Und Gesellschaft, 15(1), 72–86. https://doi.org/10.3224/gender.v15i1.06
Masolo, Claudio, Borgo, Stefano, Gangemi, Aldo, Guarino, Nicola, & Oltramari, Alessandro. (2003). Wonder Web Deliverable D18: Ontology Library (Ontology Infrastructure for the Semantic Web Del 18; p. 343). Laboratory For Applied Ontology - ISTC-CNR. http://www.loa.istc.cnr.it/old/Papers/D18.pdf
Massari, Arcangelo, Peroni, Silvio, Tomasi, Francesca, & Heibi, Ivan. (2023, June). Representing provenance and track changes of cultural heritage metadata in RDF: A survey of existing approaches. DH2023 Book of Abstracts. https://doi.org/10.5281/zenodo.8108101
Mathieualexhache. (2021). OAIS Functional Model. https://commons.wikimedia.org/wiki/File:OAIS_Functional_Model_(en).svg
Mazzocchi, Fulvio. (2018). Knowledge organization system (KOS). Knowledge Organization, 45(1), 54–78. https://doi.org/10.5771/0943-7444-2018-1-54
McCarty, Willard. (2023). Pursuing a combinatorial habit of mind and machine. In Julianne Nyhan, Geoffrey Rockwell, Stéfan Sinclair, & Alexandra Ortolja-Baird (Eds.), On Making in the Digital Humanities (pp. 251–266). UCL Press. https://doi.org/10.14324/111.9781800084209
McGillivray, Barbara, Poibeau, Thierry, & Fabo, Pablo Ruiz. (2020). Digital Humanities and Natural Language Processing: Je t’aime... Moi non plus. Digital Humanities Quarterly, 14(2). https://www.digitalhumanities.org/dhq/vol/14/2/000454/000454.html
McMillan, David W., & Chavis, David M. (1986). Sense of community: A definition and theory. Journal of Community Psychology, 14(1), 6–23. https://doi.org/10.1002/1520-6629(198601)14:1<6::AID-JCOP2290140103>3.0.CO;2-I
Mehlenbacher, Ashley Rose. (2022). On Expertise: Cultivating Character, Goodwill, and Practical Wisdom. In On Expertise. Penn State University Press. https://doi.org/10.1515/9780271093130
Metz, Allison, Boaz, Annette, & Robert, Glenn. (2019). Co-creative approaches to knowledge production: What next for bridging the research to practice gap? Evidence & Policy, 15(3), 331–337. https://doi.org/10.1332/174426419X15623193264226
Meunier, Jean-Guy. (2017). Humanités numériques et modélisation scientifique. Questions de Communication, 2017(31), 19–48. https://doi.org/10.4000/questionsdecommunication.11040
Micle, Dorel. (2014). Archaeological Heritage Between Natural Hazard and Anthropic Destruction: The Negative Impact of Social Non-involvement in the Protection of Archaeological Sites. Procedia - Social and Behavioral Sciences, 163, 269–278. https://doi.org/10.1016/j.sbspro.2014.12.316
Middle, Sarah. (2022). Investigating Linked Data Usability for Ancient World Research [PhD Thesis, The Open University]. https://doi.org/10.21954/ou.ro.00014b1f
Mikhaylova, Daria, & Metilli, Daniele. (2023). Extending RiC-O to Model Historical Architectural Archives: The ITDT Ontology. Journal on Computing and Cultural Heritage, 16(4), 67:1–67:15. https://doi.org/10.1145/3606706
Mixter, Jeff. (2014). Using a Common Model: Mapping VRA Core 4.0 Into an RDF Ontology. Journal of Library Metadata, 14(1), 1–23. https://doi.org/10.1080/19386389.2014.891890
Mol, Annemarie. (2002). The Body Multiple: Ontology in Medical Practice. Duke University Press.
Morales, Susana. (2009). La apropiación de TIC: Una perspectiva. In Susana Morales & M. I. Loyola (Eds.), Los jóvenes y las TIC. Apropiación y uso en educación (pp. 99–120). Edición de las autoras.
Morales, Susana. (2017). Imaginación y software: Aportes para la construcción del paradigma de la apropiación. Del Gato Gris. http://hdl.handle.net/11086/27405
Morales, Susana. (2018). La apropiación de tecnologías. Ideas para un paradigma en construcción. In Acerca de la apropiación de tecnologías. Teoría, estudios y debates (pp. 23–33). Del Gato Gris.
Morrison, Robbie. (2021). Redrawn slide from presentation of Ana Persic, Division of Science Policy and Capacity-Building (SC/PCB), UNESCO (France) presentation to Open Science Conference 2021, ZBW — Leibniz Information Centre for Economics, Germany. https://commons.wikimedia.org/wiki/File:Osc2021-unesco-open-science-no-gray.png
Moutsatsos, Ioannis. (2017). DevOps and Life Sciences Informatics: Librarians and Triple-Eye-Eff to the rescue of scientific imaging. In DevOps and Life Sciences Informatics. https://imoutsatsos.blogspot.com/2017/04/librarians-and-triple-eye-eff-to-rescue.html
Mr Gee. (2023, October). Day 2 Closing – A multitude of tools. EuropeanaTech 2023. https://youtu.be/pOX9CrvAG7I
Müller, Katja. (2018). Digitale Objekte - subjektive Materie. Zur Materialität digitalisierter Objekte in Museum und Archiv. In Hans Peter Hahn & Friedemann Neumann (Eds.), Edition Kulturwissenschaft (1st ed., Vol. 182, pp. 49–66). transcript Verlag. https://doi.org/10.14361/9783839445136-004
Munjeri, Dawson. (2004). Tangible and Intangible Heritage: From difference to convergence. Museum International, 56(1–2), 12–20. https://doi.org/10.1111/j.1350-0775.2004.00453.x
Münster, S., Apollonio, F. I., Bell, P., Kuroczynski, P., Di Lenardo, I., Rinaudo, F., & Tamborrino, R. (2019). Digital Cultural Heritage meets Digital Humanities. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W15, 813–820. https://doi.org/10.5194/isprs-archives-XLII-2-W15-813-2019
Münster, Sander, Utescher, Ronja, & Ulutas Aydogan, Selda. (2021). Digital topics on cultural heritage investigated: How can data-driven and data-guided methods support to identify current topics and trends in digital heritage? Built Heritage, 5(1), 25. https://doi.org/10.1186/s43238-021-00045-7
Myers, Brad A., & Stylos, Jeffrey. (2016). Improving API usability. Communications of the ACM, 59(6), 62–69. https://doi.org/10.1145/2896587
Nakamura, Satoru. (2019, January). Open Data Initiatives at the University of Tokyo’s Digital Archives Construction Project, Kyushu University Collection, and Kyushu University Library. Open Data and Universities. https://hdl.handle.net/2324/2197530
Nargesian, Fatemeh, Zhu, Erkang, Miller, Renée J., Pu, Ken Q., & Arocena, Patricia C. (2019). Data lake management: Challenges and opportunities. Proceedings of the VLDB Endowment, 12(12), 1986–1989. https://doi.org/10.14778/3352063.3352116
Nasarek, Robert. (2020). Virtuelle Forschungsumgebungen und Sammlungsräume: Objekte digital modellieren und miteinander vernetzen. In Udo Andraschke & Sarah Wagner (Eds.), Digitale Gesellschaft (1st ed., Vol. 33, pp. 131–146). transcript Verlag. https://doi.org/10.14361/9783839455715-010
Nelson, Michael L., & Van de Sompel, Herbert. (2020). A 25 Year Retrospective on D-Lib Magazine. arXiv. https://doi.org/10.48550/arXiv.2008.11680
Nelson, Michael L., & Van de Sompel, Herbert. (2022). D-Lib Magazine pioneered Web-based Scholarly Communication. Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries, 1–12. https://doi.org/10.1145/3529372.3530929
Nelson, Peter A. (2021). The Role of GPR in Community-Driven Compliance Archaeology with Tribal and Non-tribal Communities in Central California. Advances in Archaeological Practice, 9(3), 215–225. https://doi.org/10.1017/aap.2021.14
Neudecker, Clemens. (2022). Cultural Heritage as Data: Digital Curation and Artificial Intelligence in Libraries. In Adrian Paschke, Georg Rehm, Clemens Neudecker, & Lydia Pintscher (Eds.), Proceedings of the Third Conference on Digital Curation Technologies (Qurator 2022) (Vol. 3234). CEUR. https://ceur-ws.org/Vol-3234/#paper2
Newbury, David. (2018). LOUD: Linked Open Usable Data and linked.art. 2018 CIDOC Conference, 1–11. https://cidoc.mini.icom.museum/wp-content/uploads/sites/6/2021/03/CIDOC2018_paper_153.pdf
Newbury, David. (2024). Linked Data in Production: Moving Beyond Ontologies. https://www.slideshare.net/slideshow/linked-data-in-production-moving-beyond-ontologies/266976602
Nielsen, Erland Kolding. (2008). Digitisation of Library Material in Europe: Problems, Obstacles and Perspectives anno 2007. LIBER Quarterly: The Journal of the Association of European Research Libraries, 18(1), 20–27. https://doi.org/10.18352/lq.7901
NISO. (2010). Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies (American National Standard ANSI/NISO Z39.19-2005 (R2010)). National Information Standards Organization. https://groups.niso.org/higherlogic/ws/public/download/12591/z39-19-2005r2010.pdf
Nowviskie, Bethany. (2015). Digital Humanities in the Anthropocene. Digital Scholarship in the Humanities, 30(suppl_1), i4–i15. https://doi.org/10.1093/llc/fqv015
O’Toole, Simon, & Tocknell, James. (2022). FAIR standards for astronomical data. arXiv. https://doi.org/10.48550/arXiv.2203.10710
Ogden, Charles Kay, & Richards, Ivor Armstrong. (1930). The Meaning Of Meaning. Kegan Paul, Trench, Trubner & Co. https://n2t.net/ark:/13960/t14n48t6b
Oluwatosin, Haroon Shakirat. (2014). Client-Server Model. IOSR Journal of Computer Engineering, 16(1), 67–71. https://doi.org/10.9790/0661-16195771
Ostrom, Elinor. (1990). Governing the Commons: The Evolution of Institutions for Collective Action (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511807763
Owens, Trevor. (2011). Defining Data for Humanists: Text, Artifact, Information or Evidence? Journal of Digital Humanities, 1(1). https://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/
Owens, Trevor. (2013). Digital Cultural Heritage and the Crowd. Curator: The Museum Journal, 56(1), 121–130. https://doi.org/10.1111/cura.12012
Oxford English Dictionary. (2023). Agency. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/1249589150
Oxford English Dictionary. (2023). Artificial Intelligence. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/3194963277
Oxford English Dictionary. (2023). Citizen Science. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/6784505301
Oxford English Dictionary. (2023). Community. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/1005093760
Oxford English Dictionary. (2023). Phronesis. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/8835195828
Oxford English Dictionary. (2023). Practice. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/7409132020
Oxford English Dictionary. (2023). Semantics. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/2881032460
Padfield, Joseph, Bolland, Charlotte, Fitzgerald, Neil, McLaughlin, Anne, Robson, Glen, & Terras, Melissa. (2022). Practical applications of IIIF as a building block towards a digital National Collection [Arts and Humanities Research Council Final Report]. Towards a National Collection. https://doi.org/10.5281/zenodo.6884885
Padilla, Thomas, Allen, Laurie, Frost, Hannah, Potvin, Sarah, Russey Roke, Elisabeth, & Varner, Stewart. (2017). Always Already Computational: Collections as Data. Collections as Data. https://doi.org/10.17605/OSF.IO/MX6UK
Padilla, Thomas, Scates Kettler, Hannah, & Shorish, Yasmeen. (2023). Collections as Data: Part to Whole (p. 19) [Final Report]. Always Already Computational - Collections as Data. https://doi.org/10.5281/zenodo.10161976
Padilla, Thomas, Scates Kettler, Hannah, Varner, Stewart, & Shorish, Yasmeen. (2023). Vancouver Statement on Collections as Data [White paper]. Internet Archive Canada. https://doi.org/10.5281/zenodo.8341519
Page, Kevin R., Delmas-Glass, Emmanuelle, Beaudet, David, Norling, Samantha, Rother, Lynn, & Hänsli, Thomas. (2020). Linked Art: Networking Digital Collections and Scholarship. Digital Humanities 2020 Book of Abstracts, 504–509. https://dh2020.adho.org/wp-content/uploads/2020/07/139_LinkedArtNetworkingDigitalCollectionsandScholarship.html
Pagenstecher, Cord. (2009). Private Fotoalben als historische Quelle. Zeithistorische Forschungen/Studies in Contemporary History, 6(3), 449–463. https://doi.org/10.14765/ZZF.DOK-1803
Pan, Jeff Z., Razniewski, Simon, Kalo, Jan-Christoph, Singhania, Sneha, Chen, Jiaoyan, Dietze, Stefan, Jabeen, Hajira, Omeliyanenko, Janna, Zhang, Wen, Lissandrini, Matteo, Biswas, Russa, Melo, Gerard de, Bonifati, Angela, Vakaj, Edlira, Dragoni, Mauro, & Graux, Damien. (2023). Large Language Models and Knowledge Graphs: Opportunities and Challenges. arXiv. https://doi.org/10.48550/arXiv.2308.06374
Papadakis, Ioannis, Kyprianos, Konstantinos, & Stefanidakis, Michalis. (2015). Linked Data URIs and Libraries: The Story So Far. D-Lib Magazine, 21(5/6). https://doi.org/10.1045/may2015-papadakis
Paquet, Anna P. (2020). Linked Data and Linked Open Data Projects for Libraries, Archives and Museums: Constructing Pathways to Information Discovery and Cultural Heritage Sector Collaboration. Johns Hopkins Libraries. http://jhir.library.jhu.edu/handle/1774.2/63875
Patrón, Pedro, Miguelañez, Emilio, & Petillot, Yvan R. (2011). Embedded Knowledge and Autonomous Planning: The Path Towards Permanent Presence of Underwater Networks. In Autonomous Underwater Vehicles (pp. 199–224). IntechOpen. https://doi.org/10.5772/24649
Pelacho, Maite, Rodríguez, Hannot, Broncano, Fernando, Kubus, Renata, García, Francisco Sanz, Gavete, Beatriz, & Lafuente, Antonio. (2021). Science as a Commons: Improving the Governance of Knowledge Through Citizen Science. In Katrin Vohland, Anne Land-Zandstra, Luigi Ceccaroni, Rob Lemmens, Josep Perelló, Marisa Ponti, Roeland Samson, & Katherin Wagenknecht (Eds.), The Science of Citizen Science (pp. 57–78). Springer International Publishing. https://doi.org/10.1007/978-3-030-58278-4_4
Perrigo, Billy. (2023). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/
Persic, Ana. (2021). Building a Global Consensus on Open Science – the future UNESCO Recommendation on Open Science. https://doi.org/10.5446/53434
Peterhans, Simon, Sauter, Loris, Spiess, Florian, & Schuldt, Heiko. (2022). Automatic Generation of Coherent Image Galleries in Virtual Reality. In Gianmaria Silvello, Oscar Corcho, Paolo Manghi, Giorgio Maria Di Nunzio, Koraljka Golub, Nicola Ferro, & Antonella Poggi (Eds.), Linking Theory and Practice of Digital Libraries (Vol. 13541, pp. 282–288). Springer International Publishing. https://doi.org/10.1007/978-3-031-16802-4_23
Petz, Georg. (2023). Linked Open Data. Zukunftsweisende Strategien. Bibliothek Forschung Und Praxis. https://doi.org/10.1515/bfp-2023-0006
Pfrunder, Peter. (1995). Ernst Brunner: Photographien, 1937-1962 (2. Aufl.). Schweizerische Gesellschaft für Volkskunde; Offizin.
Philip, Kavita. (2021). The Internet Will Be Decolonized. In Thomas S. Mullaney, Benjamin Peters, Mar Hicks, & Kavita Philip (Eds.), Your Computer Is on Fire (pp. 91–116). The MIT Press. https://doi.org/10.7551/mitpress/10993.003.0002
Pippin, Robert B. (1991). Idealism and Agency in Kant and Hegel. The Journal of Philosophy, 88(10), 532–541. https://www.jstor.org/stable/2027098
Pirgova-Morgan, Luba. (2023). Looking towards a brighter future: The potentiality of AI and digital transformations to library spaces (p. 111) [Digital Futures Research Report]. University of Leeds Libraries. https://library.leeds.ac.uk/downloads/download/196/artificial-intelligence-ai-in-libraries
Pitti, Daniel V. (1999). Encoded archival description: An introduction and overview. New Review of Information Networking, 5(1), 61–69. https://doi.org/10.1080/13614579909516936
Plato. (n.d.). The Republic (Benjamin Jowett, Trans.). Internet Classics Archive. https://classics.mit.edu/Plato/republic.html
Pohl, Adrian, Steeg, Fabian, & Christoph, Pascal. (2018). Lobid – Dateninfrastruktur für Bibliotheken. Informationspraxis, 4(1). https://doi.org/10.11588/ip.2018.1.52445
Pohl, Adrian. (2021). How We Built a Spatial Subject Classification Based on Wikidata. The Code4Lib Journal, 2021(51). https://journal.code4lib.org/articles/15875
Portalés, Cristina, Rodrigues, João M. F., Rodrigues Gonçalves, Alexandra, Alba, Ester, & Sebastián, Jorge. (2018). Digital Cultural Heritage. Multimodal Technologies and Interaction, 2(3), 58. https://doi.org/10.3390/mti2030058
Poulopoulos, Vassilis, & Wallace, Manolis. (2022). Digital Technologies and the Role of Data in Cultural Heritage: The Past, the Present, and the Future. Big Data and Cognitive Computing, 6(3), 181–199. https://doi.org/10.3390/bdcc6030073
Poupeau, Gautier. (2018). Réflexions et questions autour du Web sémantique. In Les petites cases. https://web.archive.org/web/20240813032044/https://www.lespetitescases.net/reflexions-et-questions-autour-du-web-semantique
Prongué, Nicolas, & Schneider, René. (2015). Modelling library linked data in practice: Three Swiss case studies. In Franjo Pehar, Christian Schlögl, & Christian Wolff (Eds.), Re:inventing Information Science in the Networked Society: Proceedings of the 14th International Symposium on Information Science (Vol. 66, pp. 118–128). Verlag Werner Hülsbusch. https://arodes.hes-so.ch/record/988
Purday, Jon. (2009). Think culture: Europeana.eu from concept to construction. Bibliothek Forschung Und Praxis, 33(2), 170–180. https://doi.org/10.1515/bfup.2009.018
Rabun, Sheila. (2016). Scoping the “IIIF universe”: First Steps to Discovery (p. 21) [Research Proposal]. Information School, University of Washington.
Raemy, Julien Antoine, & Gautschy, Rita. (2023). Élaboration d’un processus pour les images 3D reposant sur IIIF. In Simon Gabay, Elina Leblanc, Nicola Carboni, Radu Suciu, Gabriella Lini, Marie Barras, & Fatiha Idmhand (Eds.), Recueil des communications du 4e colloque Humanistica (pp. 1–3). Humanistica, l’association francophone des humanités numériques/digitales. https://doi.org/10.5451/unibas-ep94862
Raemy, Julien Antoine, & Sanderson, Robert. (2023). Analysis of the Usability of Automatically Enriched Cultural Heritage Data. arXiv. https://doi.org/10.48550/arXiv.2309.16635
Raemy, Julien Antoine, & Sanderson, Robert. (2024). Analysis of the Usability of Automatically Enriched Cultural Heritage Data. In Fernando Moral-Andrés, Elena Merino-Gómez, & Pedro Reviriego (Eds.), Decoding Cultural Heritage: A Critical Dissection and Taxonomy of Human Creativity through Digital Tools (pp. 69–93). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-57675-1_4
Raemy, Julien Antoine, & Schneider, René. (2019). Suggested measures for deploying IIIF in Swiss cultural heritage institutions [White paper]. Haute école de gestion de Genève. https://doi.org/10.5281/zenodo.2640416
Raemy, Julien Antoine, Gray, Tanya, Collinson, Alwyn, & Page, Kevin R. (2023, July). Enabling Participatory Data Perspectives for Image Archives through a Linked Art Workflow (Poster). Digital Humanities 2023 Posters. https://doi.org/10.5281/zenodo.7878358
Raemy, Julien Antoine, Gray, Tanya, Collinson, Alwyn, & Page, Kevin R. (2023). Enabling Participatory Data Perspectives for Image Archives through a Linked Art Workflow. In Anne Baillot, Walter Scholger, Toma Tasovac, & Georg Vogeler (Eds.), Digital Humanities 2023 Book of Abstracts (Vol. 2023, pp. 515–516). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5451/unibas-ep95099
Raemy, Julien Antoine. (2017). The International Image Interoperability Framework (IIIF): Raising awareness of the user benefits for scholarly editions [Bachelor’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/314853
Raemy, Julien Antoine. (2020). Enabling better aggregation and discovery of cultural heritage content for Europeana and its partner institutions [Master’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/315109
Raemy, Julien Antoine. (2021). Applying Effective Data Modelling Approaches for the Creation of a Participatory Archive Platform. In Yumeng Hou (Ed.), Human Factors in Digital Humanities (pp. 1–5). Institut des humanités digitales. https://doi.org/10.5451/unibas-ep87517
Raemy, Julien Antoine. (2022, October). Back and Forth from Boundary Objects to IIIF Resources: The Recipes of a Community-driven Initiative Specifying Standards. DARIAH-CH Study Day 2022 Posters. https://doi.org/10.5281/zenodo.7015256
Raemy, Julien Antoine. (2022). Améliorer la valorisation des données du patrimoine culturel grâce au Linked Open Usable Data (LOUD). In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 132–149). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5451/unibas-ep89725
Raemy, Julien Antoine. (2023). Characterising the IIIF and Linked Art Communities: Survey report (p. 29) [Report]. University of Basel. https://hal.science/hal-04162572
Raemy, Julien Antoine. (2023). DaSCHCon 2023.2 Wrap-up. https://ark.dasch.swiss/ark:/72163/1/0810/J3fJY80lSsWkooqot76n8w_
Raemy, Julien Antoine. (2023). Dear Mastodon, #IIIF and #LinkedArt enthusiasts or newcomers (please boost!), Are you involved or have you already been in contact with the… [Mastodon post]. In Mastodon. https://hcommons.social/@julsraemy/110077871549376350
Raemy, Julien Antoine. (2023). Pseudonymised Dataset of the Characterising the IIIF and Linked Art communities survey [Survey results]. Zenodo. https://doi.org/10.5281/zenodo.8143828
Raemy, Julien Antoine. (2024). Interlinking Cultural Heritage Data with Community-driven Principles and Standards. https://julsraemy.ch/prezi/pia-ringvorlesung-2024.html
Raemy, Julien Antoine. (2024). Some notes from the 2024 IIIF Conference held in Los Angeles. In Thoughts and discombobulations of Julien A. Raemy. https://julsraemy.ch/posts/2024/06/26/iiif-conference-la/
Ram, Sudha, & Liu, Jun. (2009). A New Perspective on Semantics of Data Provenance. Proceedings of the First International Conference on Semantic Web in Provenance Management, 526, 35–40. https://ceur-ws.org/Vol-526/InvitedPaper_1.pdf
Rautenberg, Michel. (1998). L’émergence patrimoniale de l’ethnologie : Entre mémoire et politiques publiques. In Patrimoine et modernité (pp. 279–289). L’Harmattan.
Raza, Zahid, Mahmood, Khalid, & Warraich, Nosheen Fatima. (2019). Application of linked data technologies in digital libraries: A review of literature. Library Hi Tech News, 36(3), 9–12. https://doi.org/10.1108/LHTN-10-2018-0067
Respaldiza Hidalgo, María Aránzazu, Wachowicz, Monica, & Vázquez Hoehne, Antonio. (2011). Metadata Visualization of Cultural Heritage Information within a Collaborative Environment. Proceedings of XXIIIrd International CIPA Symposium. https://oa.upm.es/11636/
Ribes, David, & Lee, Charlotte P. (2010). Sociotechnical Studies of Cyberinfrastructure and e-Research: Current Themes and Future Trajectories. Computer Supported Cooperative Work (CSCW), 19(3), 231–244. https://doi.org/10.1007/s10606-010-9120-0
Ridge, Mia (Ed.). (2017). Crowdsourcing our cultural heritage (First issued in paperback). Routledge.
Ridge, Mia, Blickhan, Samantha, Ferriter, Meghan, Mast, Austin, Brumfield, Ben, Wilkins, Brendon, Cybulska, Daria, Burgher, Denise, Casey, Jim, Luther, Kurt, Goldman, Michael Haley, White, Nick, Willcox, Pip, Brumfield, Sara Carlstead, Coleman, Sonya J., & Prytz, Ylva Berglund. (2021). 12. Connecting with communities. In The Collective Wisdom Handbook: Perspectives on Crowdsourcing in Cultural Heritage - community review version (1st edition). British Library. https://doi.org/10.21428/a5d7554f.1b80974b
Ridge, Mia, Blickhan, Samantha, Ferriter, Meghan, Mast, Austin, Brumfield, Ben, Wilkins, Brendon, Cybulska, Daria, Burgher, Denise, Casey, Jim, Luther, Kurt, Goldman, Michael Haley, White, Nick, Willcox, Pip, Brumfield, Sara Carlstead, Coleman, Sonya J., & Prytz, Ylva Berglund. (2021). 5. Designing cultural heritage crowdsourcing projects. In The Collective Wisdom Handbook: Perspectives on Crowdsourcing in Cultural Heritage - community review version (1st edition). British Library. https://doi.org/10.21428/a5d7554f.1b80974b
Ridge, Mia, Ferriter, Meghan, & Blickhan, Samantha. (2023). Recommendations, Challenges and Opportunities for the Future of Crowdsourcing in Cultural Heritage: A White Paper. British Library. https://doi.org/10.21428/a5d7554f.2a84f94b
Ridge, Mia. (2023, October). Enriching lives: Connecting communities and culture with the help of machines. EuropeanaTech 2023. https://doi.org/10.5281/zenodo.8429858
Riley, Jenn. (2009). Seeing Standards: A Visualization of the Metadata Universe. In Jenn Riley. https://jennriley.com/metadatamap/
Riley, Jenn. (2017). Understanding metadata. What is metadata, and what is it for? National Information Standards Organization (NISO).
Riva, Pat, Le Boeuf, Patrick, & Žumer, Maja. (2017). IFLA Library Reference Model: A Conceptual Model for Bibliographic Information [IFLA-LRM]. International Federation of Library Associations and Institutions. https://repository.ifla.org/handle/123456789/40
Roberts, Rebecca C., Jabbar, Junaid Abdul, Jones, Huw, Orengo, Hector A., Madella, Marco, & Petrie, Cameron A. (2021). Paper and pixels: Historic maps as a multifaceted resource. Abstracts of the ICA, 3, 1–2. https://doi.org/10.5194/ica-abs-3-250-2021
Robineau, Régis. (2019). Introduction aux protocoles IIIF. https://doi.org/10.5281/zenodo.3760306
Robinson, Cathy J., Kong, Taryn, Coates, Rebecca, Watson, Ian, Stokes, Chris, Pert, Petina, McConnell, Andrew, & Chen, Caron. (2021). Caring for Indigenous Data to Evaluate the Benefits of Indigenous Environmental Programs. Environmental Management, 68(2), 160–169. https://doi.org/10.1007/s00267-021-01485-8
Robson, Glen, Cossu, Stefano, Pillay, Ruven, & Smith, Michael D. (2023). Evaluating HTJ2K as a Drop-In Replacement for JPEG2000 with IIIF. The Code4Lib Journal, 2023(57). https://journal.code4lib.org/articles/17596
Robson, Glen. (2021). BL Training - IIIF Image API. https://youtu.be/28tkpIdnh4g
Rodighiero, Dario. (2021). Mapping Affinities: Democratizing Data Visualization. Métis Presses. https://dash.harvard.edu/handle/1/37368046
Roke, Elizabeth Russey, & Tillman, Ruth Kitchin. (2022). Pragmatic Principles for Archival Linked Data. The American Archivist, 85(1), 173–201. https://doi.org/10.17723/2327-9702-85.1.173
Romein, C. Annemieke, Kemman, Max, Birkholz, Julie M., Baker, James, De Gruijter, Michel, Meroño-Peñuela, Albert, Ries, Thorsten, Ros, Ruben, & Scagliola, Stefania. (2020). State of the Field: Digital History. History, 105(365), 291–312. https://doi.org/10.1111/1468-229X.12969
Rosenthaler, Lukas, & Fornaro, Peter. (2016). The ‘International Image Interoperability Framework’ and its Implication to Preservation. Archiving Conference, 13(1), 95–99. https://doi.org/10.2352/issn.2168-3204.2016.1.0.95
Rosenthaler, Lukas, Fornaro, Peter, & Clivaz, Claire. (2015). DASCH: Data and Service Center for the Humanities. Digital Scholarship in the Humanities, 30(suppl_1), i43–i49. https://doi.org/10.1093/llc/fqv051
Rosenthaler, Lukas. (2023). Long-Term Archiving of Digital Assets in Arts and Humanities: IIIF and Beyond. https://ark.dasch.swiss/ark:/72163/1/0810/hEOJNSNeQi6GzuumLps6Lgm
Rossenova, Lozana, & Di Franco, Karen. (2022). Iterative Pasts and Linked Futures: A Feminist Approach to Modeling Data in Archives and Collections of Artists’ Publishing. Perspectives on Data, 2. https://doi.org/10.53269/9780865593152/05
Rossenova, Lozana, Duchesne, Paul, & Blümel, Ina. (2022). Wikidata and Wikibase as complementary research data management services for cultural heritage data. In Lucie-Aimée Kaffee, Simon Razniewski, Gabriel Amaral, & Kholoud Saad Alghamdi (Eds.), Proceedings of the 3rd Wikidata Workshop 2022 (Vol. 3262). CEUR. https://ceur-ws.org/Vol-3262/#paper15
Rossenova, Lozana. (2023). IIIF for 3D – making web interoperability multi-dimensional. In Europeana PRO. https://pro.europeana.eu/post/iiif-for-3d-making-web-interoperability-multi-dimensional
Rousi, Maria, Meditskos, Georgios, Vrochidis, Stefanos, & Kompatsiaris, Ioannis. (2021). Supporting the Discovery and Reuse of Digital Content in Creative Industries using Linked Data. 2021 IEEE 15th International Conference on Semantic Computing (ICSC), 100–103. https://doi.org/10.1109/ICSC50631.2021.00025
Russell, Stuart J., & Norvig, Peter. (2010). Artificial intelligence: A modern approach (3rd ed.). Prentice Hall.
SAA Dictionary. (2023). Taxonomy. In Dictionary of Archives Terminology. Society of American Archivists. https://dictionary.archivists.org/entry/taxonomy.html
Sabharwal, Arjun. (2015). 2 - Archives and special collections in the digital humanities. In Arjun Sabharwal (Ed.), Digital Curation in the Digital Humanities (pp. 27–47). Chandos Publishing. https://doi.org/10.1016/B978-0-08-100143-1.00002-7
Sacramento, Eveline R., Sardo, Susana, Miguel, Ana Flávia, Caixinha, Hélder J. M., & Cortês, Cristina. (2022). Considering the Question of the interoperability. Páginas A&b Arquivos & Bibliotecas, 3(18), 120–133. https://doi.org/10.21747/21836671/pag18a7
Saha, Barna, & Srivastava, Divesh. (2014). Data quality: The other face of Big Data. 2014 IEEE 30th International Conference on Data Engineering, 1294–1297. https://doi.org/10.1109/ICDE.2014.6816764
Sanderson, Robert, & Albritton, Benjamin. (2013). Shared Canvas Data Model 1.0. In International Image Interoperability Framework. https://iiif.io/api/model/shared-canvas/1.0/
Sanderson, Robert, Albritton, Benjamin, Schwemmer, Rafael, & Van de Sompel, Herbert. (2011). SharedCanvas: A collaborative model for medieval manuscript layout dissemination. Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, 175–184. https://doi.org/10.1145/1998076.1998111
Sanderson, Robert, Ciccarese, Paolo, & Van de Sompel, Herbert. (2013). Designing the W3C open annotation data model. Proceedings of the 5th Annual ACM Web Science Conference, 366–375. https://doi.org/10.1145/2464464.2464474
Sanderson, Robert, Ciccarese, Paolo, & Young, Benjamin. (2017). Web Annotation Data Model. In W3C. https://www.w3.org/TR/annotation-model/
Sanderson, Robert. (2003). Linking Past and Future: An Application of Dynamic HTML for Medieval Manuscript Editions [PhD Thesis]. University of Liverpool.
Sanderson, Robert. (2013). RDF: Resource Description Failures and Linked Data Letdowns. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/rdf-resource-description-failures-and-linked-data-letdowns/
Sanderson, Robert. (2015). Linked Data Best Practices and BibFrame. https://www.slideshare.net/azaroth42/linked-data-best-practices-and-bibframe
Sanderson, Robert. (2016). Community Challenges for Practical Linked Open Data. https://www.slideshare.net/azaroth42/community-challenges-for-practical-linked-open-data-linked-pasts-keynote
Sanderson, Robert. (2018). Shout it Out: LOUD. https://www.slideshare.net/Europeana/shout-it-out-loud-by-rob-sanderson-europeanatech-conference-2018
Sanderson, Robert. (2019). Introduction to Linked Art Model. https://www.slideshare.net/azaroth42/introduction-to-linked-art-model
Sanderson, Robert. (2019). Keynote: Standards and Communities: Connected People, Consistent Data, Usable Applications. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), 28. https://doi.org/10.1109/JCDL.2019.00009
Sanderson, Robert. (2020). Cultural Heritage Research Data Ecosystem. https://www.slideshare.net/azaroth42/sanderson-cni-2020-keynote-cultural-heritage-research-data-ecosystem
Sanderson, Robert. (2020). The Importance of being LOUD. https://www.slideshare.net/azaroth42/the-importance-of-being-loud
Sanderson, Robert. (2021). Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data. https://www.slideshare.net/azaroth42/introduction-to-linked-art-model
Sanderson, Robert. (2023). Semantic Completeness vs Data Usability in Cultural Heritage Collections: Lessons Learnt from LUX, Linked Art and IIIF. https://youtu.be/87Igks5t82U
Sanderson, Robert. (2023). Understanding Linked Art. https://www.slideshare.net/azaroth42/understanding-linked-art
Scharnhorst, Andrea, Flohr, Pascal, Tykhonov, Vyacheslav, De Vries, Jerry, Hollander, Hella, Touber, Jetze, Hugo, Wim, Smiraglia, Richard, Le Franc, Yann, Siebes, Ronald, & Meijers, Enno. (2023). Knowledge Organisation Systems in the Humanities—Semantic Interoperability in Practice. Proceedings of the Association for Information Science and Technology, 60(1), 1113–1115. https://doi.org/10.1002/pra2.962
Schier, Flint. (1986). Hume and the Aesthetics of Agency. Proceedings of the Aristotelian Society, 87, 121–135. https://www.jstor.org/stable/4545059
Schmidt, Sophie C., & Thiery, Florian. (2020). SPARQLing Ogham Stones: New Options for Analyzing Analog Editions by Digitization in Wikidata. In Tara Andrews, Franziska Diehr, Thomas Efer, Andreas Kuczera, & Joris van Zundert (Eds.), Graph Technologies in the Humanities - Proceedings 2020 (Vol. 3110, p. 211). CEUR. https://ceur-ws.org/Vol-3110/#paper11
Schmidt, Sophie C., Thiery, Florian, & Trognitz, Martina. (2022). Practices of Linked Open Data in Archaeology and Their Realisation in Wikidata. Digital, 2(3), 333–364. https://doi.org/10.3390/digital2030019
Schmoll, Friedemann. (2009a). Die Vermessung der Kultur: Der ‘Atlas der deutschen Volkskunde’ und die Deutsche Forschungsgemeinschaft, 1928-1980. Steiner.
Schmoll, Friedemann. (2009b). Richard Weiss: Skizzen zum internationalen Wirken des Schweizer Volkskundlers. Schweizerisches Archiv Für Volkskunde / Archives Suisses Des Traditions Populaires, 2009(105), 15–32. https://doi.org/10.5169/SEALS-118266
Schöch, Christof. (2013). Big? Smart? Clean? Messy? Data in the Humanities. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/
Schultes, Erik, & Wittenburg, Peter. (2019). FAIR Principles and Digital Objects: Accelerating Convergence on a Data Infrastructure. In Yannis Manolopoulos & Sergey Stupnikov (Eds.), Data Analytics and Management in Data Intensive Domains (pp. 3–16). Springer International Publishing. https://doi.org/10.1007/978-3-030-23584-0_1
Scroggie, Kymberley R., Burrell-Sander, Klementine J., Rutledge, Peter J., & Motion, Alice. (2023). GitHub as an open electronic laboratory notebook for real-time sharing of knowledge and collaboration. Digital Discovery, 2(4). https://doi.org/10.1039/D3DD00032J
Selbst, Andrew D., Boyd, Danah, Friedler, Sorelle A., Venkatasubramanian, Suresh, & Vertesi, Janet. (2019). Fairness and Abstraction in Sociotechnical Systems. Proceedings of the Conference on Fairness, Accountability, and Transparency, 59–68. https://doi.org/10.1145/3287560.3287598
Semeraro, Concetta, Lezoche, Mario, Panetto, Hervé, & Dassisti, Michele. (2021). Digital twin paradigm: A systematic literature review. Computers in Industry, 130, 103469. https://doi.org/10.1016/j.compind.2021.103469
Serres, Michel. (2014). Le parasite. Pluriel.
Shao, Guodong, & Kibira, Deogratias. (2018). Digital Manufacturing: Requirements and Challenges for Implementing Digital Surrogates. 2018 Winter Simulation Conference (WSC), 1226–1237. https://doi.org/10.1109/WSC.2018.8632242
Sheldon, Robert. (2023). What is internationalization (i18n)? In WhatIs.com. https://www.techtarget.com/whatis/definition/internationalization-I18N
Shepherd, Elizabeth, & Smith, Charlotte. (2000). The Application of ISAD(G) to the Description of Archival Datasets. Journal of the Society of Archivists, 21(1), 55–86. https://doi.org/10.1080/00379810050006911
Siemens, Ray. (2016). Communities of practice, the methodological commons, and digital self-determination in the Humanities. Digital Studies / Le Champ Numérique, 5(3). https://doi.org/10.16995/dscn.31
Simandiraki-Grimshaw, Anna. (2023). What is a museum object according to a museum database? In TETRARCHs. https://www.tetrarchs.org/index.php/2023/09/19/what-is-a-museum-object-according-to-a-museum-database/
Simon, Agnès, Wenz, Romain, Michel, Vincent, & Di Mascio, Adrien. (2013). Publishing Bibliographic Records on the Web of Data: Opportunities for the BnF (French National Library). In Philipp Cimiano, Oscar Corcho, Valentina Presutti, Laura Hollink, & Sebastian Rudolph (Eds.), The Semantic Web: Semantics and Big Data (pp. 563–577). Springer. https://doi.org/10.1007/978-3-642-38288-8_38
Simpson, Robert, Page, Kevin R., & De Roure, David. (2014). Zooniverse: Observing the world’s largest citizen science platform. Proceedings of the 23rd International Conference on World Wide Web, 1049–1054. https://doi.org/10.1145/2567948.2579215
Smith, Marcus. (2021). Linked open data and aggregation infrastructure in the cultural heritage sector: A case study of SOCH, a linked data aggregator for Swedish open cultural heritage. In Information and Knowledge Organisation in Digital Humanities (pp. 64–85). Routledge. https://doi.org/10.4324/9781003131816-4
Snell, James M., & Prodromou, Evan. (2017). Activity Streams 2.0. In W3C. https://www.w3.org/TR/activitystreams-core/
Snydman, Stuart, Sanderson, Robert, & Cramer, Tom. (2015). The International Image Interoperability Framework (IIIF): A community & technology approach for web-based images. Archiving Conference, 12, 16–21. https://doi.org/10.2352/issn.2168-3204.2015.12.1.art00005
Soiland-Reyes, Stian, Castro, Leyla Jael, Garijo, Daniel, Portier, Marc, Goble, Carole, & Groth, Paul. (2022). Updating Linked Data practices for FAIR Digital Object principles. Research Ideas and Outcomes, 8, e94501. https://doi.org/10.3897/rio.8.e94501
Spiess, Florian, & Schuldt, Heiko. (2022). Multimodal Interactive Lifelog Retrieval with vitrivr-VR. Proceedings of the 5th Annual on Lifelog Search Challenge, 38–42. https://doi.org/10.1145/3512729.3533008
Spiess, Florian, & Stauffiger, Markus. (2023). Forschung und Archive: Erschliessung und Zugänglichkeit neu gedacht. Arbido, 2023(1). https://arbido.ch/de/ausgaben-artikel/2023/archiv-der-zukunft/forschung-und-archive-erschliessung-und-zugaenglichkeit-neu-gedacht
Spiess, Florian, Rossetto, Luca, & Schuldt, Heiko. (2024). Exploring Multimedia Vector Spaces with vitrivr-VR. In Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Thor Jónsson, Bei Liu, & Yoko Yamakata (Eds.), MultiMedia Modeling (Vol. 14557, pp. 317–323). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-53302-0_27
Sporleder, Caroline. (2010). Natural Language Processing for Cultural Heritage Domains. Language and Linguistics Compass, 4(9), 750–768. https://doi.org/10.1111/j.1749-818X.2010.00230.x
Sprochi, Amanda. (2016). Where Are We Headed? Resource Description and Access, Bibliographic Framework, and the Functional Requirements for Bibliographic Records Library Reference Model. The International Information & Library Review, 48(2), 129–136. https://doi.org/10.1080/10572317.2016.1176455
Star, Susan Leigh, & Griesemer, James R. (1989). Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3), 387–420. https://www.jstor.org/stable/285080
Star, Susan Leigh, & Ruhleder, Karen. (1994). Steps towards an ecology of infrastructure: Complex problems in design and access for large-scale collaborative systems. Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, 253–264. https://doi.org/10.1145/192844.193021
Star, Susan Leigh. (1999). The Ethnography of Infrastructure. American Behavioral Scientist, 43(3), 377–391. https://doi.org/10.1177/00027649921955326
Star, Susan Leigh. (2010). This is Not a Boundary Object: Reflections on the Origin of a Concept. Science, Technology, & Human Values, 35(5), 601–617. https://doi.org/10.1177/0162243910377624
Stein, Regine, & Balandi, Oguzhan. (2019). Using LIDO for Evolving Object Documentation into CIDOC CRM. Heritage, 2(1), 1023–1031. https://doi.org/10.3390/heritage2010066
Stengers, Isabelle, & Muecke, Stephen. (2018). Another science is possible: A manifesto for slow science. Polity.
Stocker, Christian. (2023). Use LinkedDataGPT to query Open Linked Data from the City of Zurich. In Liip. https://www.liip.ch/en/blog/use-linkeddatagpt-to-query-open-linked-data-from-the-city-of-zurich
Strasser, Bruno J., & Haklay, Muki. (2018). Citizen Science: Expertise, Demokratie und öffentliche Partizipation. Empfehlungen des Schweizerischen Wissenschaftsrates SWR. In Schweizerischer Wissenschaftsrat. https://www.swir.ch/images/stories/pdf/de/Policy_Analysis_SSC_1_2018_Citizen_Science_WEB.pdf
Strien, Daniel van, Bell, Mark, McGregor, Nora Rose, & Trizna, Michael. (2022). An Introduction to AI for GLAM. Proceedings of the Second Teaching Machine Learning and Artificial Intelligence Workshop, 20–24. https://proceedings.mlr.press/v170/strien22a.html
Susnjak, Teo. (2023). Applying BERT and ChatGPT for Sentiment Analysis of Lyme Disease in Scientific Literature. arXiv. https://doi.org/10.48550/arXiv.2302.06474
Tarde, Gabriel. (2000). Social Laws: An Outline of Sociology. Batoche Books, Kitchener.
Target, Sinclair. (2018). Whatever Happened to the Semantic Web? In Two-Bit History. https://twobithistory.org/2018/05/27/semantic-web.html
Tasovac, Toma, Chambers, Sally, & Tóth-Czifra, Erzsébet. (2020). Cultural Heritage Data from a Humanities Research Perspective: A DARIAH Position Paper. DARIAH-EU. https://hal.science/hal-02961317
Tennant, Jonathan, Agarwal, Ritwik, Baždarić, Ksenija, Brassard, David, Crick, Tom, Dunleavy, Daniel J., Evans, Thomas Rhys, Gardner, Nicholas, Gonzalez-Marquez, Monica, Graziotin, Daniel, Greshake Tzovaras, Bastian, Gunnarsson, Daniel, Havemann, Johanna, Hosseini, Mohammad, Katz, Daniel S., Knöchelmann, Marcel, Madan, Christopher R., Manghi, Paolo, Marocchino, Alberto, … Yarkoni, Tal. (2020). A tale of two ‘opens’: Intersections between Free and Open Source Software and Open Scholarship [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/2kxq8
Tennant, Roy. (2002). MARC must die. Library Journal, 127(17), 26–27. http://soiscompsfall2007.pbworks.com/f/marc%20must%20die.pdf
Terras, Melissa, Coleman, Stephen, Drost, Steven, Elsden, Chris, Helgason, Ingi, Lechelt, Susan, Osborne, Nicola, Panneels, Inge, Pegado, Briana, Schafer, Burkhard, Smyth, Michael, Thornton, Pip, & Speed, Chris. (2021). The value of mass-digitised cultural heritage content in creative contexts. Big Data & Society, 8(1), 20539517211006165. https://doi.org/10.1177/20539517211006165
Thalhath, Nishad, Nagamori, Mitsuharu, Sakaguchi, Tetsuo, & Sugimoto, Shigeo. (2021). Wikidata Centric Vocabularies and URIs for Linking Data in Semantic Web Driven Digital Curation. In Emmanouel Garoufallou & María-Antonia Ovalle-Perandones (Eds.), Metadata and Semantic Research (pp. 336–344). Springer International Publishing. https://doi.org/10.1007/978-3-030-71903-6_31
Thiery, Florian, Homburg, Timo, Schmidt, Sophie Charlotte, Trognitz, Martina, & Przybilla, Monika. (2020). SPARQLing Geodesy for Cultural Heritage – New Opportunities for Publishing and Analysing Volunteered Linked (Geo-)Data. FIG Working Week 2020 Proceedings. https://doi.org/10.5281/zenodo.3751770
Thiery, Florian, Schmidt, Sophie Charlotte, & Homburg, Timo. (2020). SPARQLing Publication of Irish – Ogham Stones as LOD. In Julian Bogdani, Riccardo Montalbano, & Paolo Rosati (Eds.), ArcheoFOSS 2020: Proceedings of the 14th International Conference, 15–17 October 2020 (pp. 119–127). Archaeopress.
Thiery, Florian. (2019). Sphere 7 Data: LOUD and FAIR Data for the Research Community [Working Paper]. Römisch-Germanisches Zentralmuseum. https://doi.org/10.5281/zenodo.2643469
Thiery, Florian. (2020). Linked COVID-19 Data – Semantische Modellierung von Linked Geodata. Zfv – Zeitschrift Für Geodäsie, Geoinformation Und Landmanagement, 2020(4), 198–204. https://doi.org/10.12902/zfv-0312-2020
Tilkov, Stefan. (2017). A Brief Introduction to REST. In InfoQ. https://www.infoq.com/articles/rest-introduction/
Tóth-Czifra, Erzsébet, Błaszczyńska, Marta, Gelati, Francesco, Admiraal, Femmy, Blümm, Mirjam, Buelinckx, Erik, Chiquet, Vera, Gautschy, Rita, Gietz, Peter, Király, Péter, Vivas-Romero, Maria, Scholger, Walter, Szleszyński, Bartłomiej, & Wuttke, Ulrike. (2023). Research Data Management for Arts and Humanities: Integrating Voices of the Community [Report]. DARIAH-EU. https://doi.org/10.5281/zenodo.8059626
Tuominen, Jouni, Hyvönen, Eero, & Leskinen, Petri. (2017). Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research. In Antske Fokkens, Serge ter Braake, Ronald Sluijter, Paul Arthur, & Eveline Wandl-Vogt (Eds.), Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (Vol. 2119, pp. 59–66). CEUR. https://ceur-ws.org/Vol-2119/#paper10
Tweed, Christopher, & Sutherland, Margaret. (2007). Built cultural heritage and sustainable urban development. Landscape and Urban Planning, 83(1), 62–69. https://doi.org/10.1016/j.landurbplan.2007.05.008
UNESCO Institute for Statistics. (2009). UNESCO Framework for Cultural Statistics (FCS). United Nations Educational, Scientific and Cultural Organization. https://doi.org/10.15220/978-92-9189-075-0-en
UNESCO. (2009). Charter on the Preservation of the Digital Heritage (Circular Letter CL/3865). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000179529
UNESCO. (2019). Preliminary study of the technical, financial and legal aspects of the desirability of a UNESCO recommendation on Open Science (Programme and Meeting Document 40 C/63; p. 24). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000370291
UNESCO. (2021). Implementation of the UNESCO Recommendation on Open Science (Programme and Meeting Document SC-PCB-SPP/2021/OS/UROS; p. 36). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000379949
UNESCO. (2022). Basic texts of the 2003 Convention for the Safeguarding of the Intangible Cultural Heritage (Programme and Meeting Document CLT-2022/WS/3). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000383762
UNESCO. Culture for Development Indicators. (2014). Methodology Manual. United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000229608
Van de Sompel, Herbert, & Nelson, Michael L. (2015). Reminiscing About 15 Years of Interoperability Efforts. D-Lib Magazine, 21(11/12). https://doi.org/10.1045/november2015-vandesompel
Van de Sompel, Herbert. (2023). FAIR Digital Objects and FAIR Signposting. https://doi.org/10.5281/zenodo.7977333
Van der Auwera, Sigrid. (2013). UNESCO and the protection of cultural property during armed conflict. International Journal of Cultural Policy, 19(1), 1–19. https://doi.org/10.1080/10286632.2011.625415
Vandenhende, Lise, & Van Hoorick, Geert. (2017). The management of cultural heritage and nature: Complementary or conflicting regulations? EELF Annual Conference, 5th, Abstracts. http://hdl.handle.net/1854/LU-8722614
Vecco, Marilena. (2010). A definition of cultural heritage: From the tangible to the intangible. Journal of Cultural Heritage, 11(3), 321–324. https://doi.org/10.1016/j.culher.2010.01.006
Venturini, Tommaso, & Latour, Bruno. (2009, May). The Social Fabric. Actes Futur En Seine 2009. https://hal-sciencespo.archives-ouvertes.fr/hal-01293394
Venturini, Tommaso, Bounegru, Liliana, Gray, Jonathan, & Rogers, Richard. (2018). A reality check(list) for digital methods. New Media & Society, 20(11), 4195–4217. https://doi.org/10.1177/1461444818769236
Venturini, Tommaso, Cardon, Dominique, & Cointet, Jean-Philippe. (2015). Méthodes digitales: Approches quali/quanti des données numériques - Curation and Presentation of the Special Issue. Réseaux : Communication, Technologie, Société, 188(6). https://doi.org/10.3917/res.188.0009
Vicente-Saez, Ruben, & Martinez-Fuentes, Clara. (2018). Open Science now: A systematic literature review for an integrated definition. Journal of Business Research, 88, 428–436. https://doi.org/10.1016/j.jbusres.2017.12.043
Vienni-Baptista, Bianca, Fletcher, Isabel, & Lyall, Catherine (Eds.). (2023). Foundations of Interdisciplinary and Transdisciplinary Research: A Reader. Bristol University Press. https://doi.org/10.56687/9781529235012
Vinck, Dominique. (2019). Les métiers de l’ombre de la Fête des Vignerons. Editions Antipodes. https://doi.org/10.33056/antipodes.1711
Vitale, Valeria, & Simon, Rainer. (2023). Linked Data Networks: How, Why and When to Apply Network Analysis to LOD. In Tom Brughmans, Barbara J. Mills, Jessica Munson, & Matthew A. Peeples (Eds.), The Oxford Handbook of Archaeological Network Research (pp. 378–389). Oxford University Press.
W3C OWL Working Group. (2012). OWL 2 Web Ontology Language Document Overview (Second Edition). In W3C. https://www.w3.org/TR/owl2-overview/
Walsham, G. (1997). Actor-Network Theory and IS Research: Current Status and Future Prospects. In Allen S. Lee, Jonathan Liebenau, & Janice I. DeGross (Eds.), Information Systems and Qualitative Research (pp. 466–480). Springer US. https://doi.org/10.1007/978-0-387-35309-8_23
Watson, Gary. (1975). Free Agency. The Journal of Philosophy, 72(8), 205–220. https://doi.org/10.2307/2024703
Webber, Jim. (2012). A programmatic introduction to Neo4j. Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, 217–218. https://doi.org/10.1145/2384716.2384777
Wei, Jie, Xu, Zhuoming, & Xia, Wenze. (2016). DQAF: Towards DQV-Based Dataset Quality Annotation Using the Web Annotation Data Model. 2016 13th Web Information Systems and Applications Conference (WISA), 24–27. https://doi.org/10.1109/WISA.2016.15
Weibel, Stuart L., & Koch, Traugott. (2000). The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions. D-Lib Magazine, 6(12). https://doi.org/10.1045/december2000-weibel
Weigl, David M., Goebl, Werner, Baker, David J., Crawford, Tim, Zubani, Federico, Gkiokas, Aggelos, Gutierrez, Nicolas F., Porter, Alastair, & Santos, Patricia. (2021). Notes on the Music: A social data infrastructure for music annotation. Proceedings of the 8th International Conference on Digital Libraries for Musicology, 23–31. https://doi.org/10.1145/3469013.3469017
Weil, Marie O. (1996). Community Building: Building Community Practice. Social Work, 41(5), 481–499. https://doi.org/10.1093/sw/41.5.481
Weinthal, Dianne, & Childress, Dawn. (2019). IIIF for Open Access. https://escholarship.org/uc/item/260616w7
Weiss, Richard. (1940). Atlas der schweizerischen Volkskunde: Die bisherigen Erfahrungen der Exploratoren. Schweizerisches Archiv Für Volkskunde / Archives Suisses Des Traditions Populaires, 38(1), 105–118. https://doi.org/10.5169/SEALS-113634
Wenger, Etienne. (2011). Communities of practice: A brief introduction. National Science Foundation, 1–7. http://hdl.handle.net/1794/11736
Wessman, Anna, Thomas, Suzie, Rohiola, Ville, Kuitunen, Jutta, Ikkala, Esko, Tuominen, Jouni, Koho, Mikko, & Hyvönen, Eero. (2019, May). A Citizen Science Approach to Archaeology: Finnish Archaeological Finds Recording Linked Open Database (SuALT). Proceedings of the DHN 2019 Conference. https://hdl.handle.net/10138/303221
White, Jon. (2023). 10 Lessons Learned From Creating My First Mirador Plugin. In cogapp. https://blog.cogapp.com/10-lessons-learned-from-creating-my-first-mirador-plugin-d13b011a35eb
Wietschorke, Jens. (2014). Historische Kulturanalyse. In Christine Bischoff, Karoline Oehme-Jüngling, & Walter Leimgruber (Eds.), Methoden der Kulturanthropologie (1. Auflage, pp. 160–176). Haupt Verlag.
Wigg-Wolf, David, Hofmann, Kerstin P., Tolle, Karsten, Rösler, Katja, Möller, Markus, Deligio, Chrisowalandis, Tietz, Julia, & Nicolai, Caroline von. (2022). Frankfurt am Main, Deutschland. ClaReNet. Klassifikation und Repräsentation keltischer Münzprägungen im Netz. Das Projekt von 2021 bis 2024. E-Forschungsberichte Des DAI, 1-21 (§). https://doi.org/10.34780/9rgb-or3d
Wijegunaratne, Indrajit, & Fernandez, George. (1998). The Three-Tier Application Architecture. In Indrajit Wijegunaratne & George Fernandez (Eds.), Distributed Applications Engineering: Building New Applications and Managing Legacy Applications with Distributed Technologies (pp. 41–78). Springer. https://doi.org/10.1007/978-1-4471-1550-2_3
Wilkinson, Mark D., Dumontier, Michel, Aalbersberg, IJsbrand Jan, Appleton, Gabrielle, Axton, Myles, Baak, Arie, Blomberg, Niklas, Boiten, Jan-Willem, Silva Santos, Luiz Bonino da, Bourne, Philip E., Bouwman, Jildau, Brookes, Anthony J., Clark, Tim, Crosas, Mercè, Dillo, Ingrid, Dumon, Olivier, Edmunds, Scott, Evelo, Chris T., Finkers, Richard, … Mons, Barend. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18
Windhager, Florian, Federico, Paolo, Schreder, Gunther, Glinka, Katrin, Dörk, Marian, Miksch, Silvia, & Mayr, Eva. (2019). Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges. IEEE Transactions on Visualization and Computer Graphics, 25(6), 2311–2330. https://doi.org/10.1109/TVCG.2018.2830759
Wohlin, Claes. (2014). Guidelines for snowballing in systematic literature studies and a replication in software engineering. Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, 1–10. https://doi.org/10.1145/2601248.2601268
Wood, David, Zaidman, Marsha, Ruth, Luke, & Hausenblas, Michael. (2014). Linked data: Structured data on the Web. Manning.
Zeng, Marcia Lei, & Qin, Jian. (2022). Metadata (Third edition). ALA Neal-Schuman.
Zeng, Marcia Lei. (2008). Knowledge Organization Systems (KOS). Knowledge Organization, 35(2–3), 160–182. https://doi.org/10.5771/0943-7444-2008-2-3-160
Zeng, Marcia Lei. (2019). Semantic enrichment for enhancing LAM data and supporting digital humanities. Review article. El Profesional de La Información, 28(1). https://doi.org/10.3145/epi.2019.ene.03
Zou, Xiaozhu, Xiong, Siyi, Li, Zhi, & Jiang, Ping. (2018). Constructing Metadata Schema of Scientific and Technical Report Based on FRBR. Computer and Information Science, 11(2), 34–39. https://doi.org/10.5539/cis.v11n2p34
Zourou, Katerina, & Ziku, Mariana. (2022). Citizen Enhanced Open Science in Cultural Heritage - Review and analysis of practices in Higher Education (Study 2020-1-BE02-KA203-07427). Erasmus+: EU Programme for Education, Training, Youth and Sport. https://www.citizenheritage.eu
Žumer, Maja. (2007). Functional requirements for bibliographic records: FRBR: The end of the road or a new beginning. Bulletin of the American Society for Information Science and Technology, 33(6), 27–29. https://doi.org/10.1002/bult.2007.1720330608
Zundert, Joris van. (2018). On Not Writing a Review about Mirador: Mirador, IIIF, and the Epistemological Gains of Distributed Digital Scholarly Resources. Digital Medievalist, 11(1), 1–48. https://doi.org/10.16995/dm.78