Resources
Alongside our core activities, CDCH develops software and tools to support researchers and the University. Read about our two current tools:
FREE: Automatic Information Extraction from Documents
Humanities researchers often work intensively with textual sources such as books, journal articles, and field reports, many of which are available as PDFs. Locating, comparing, and reusing information across these materials can be both time-consuming and repetitive.
FREE is designed to support this research practice by automating parts of the extraction process, while keeping scholarly judgement central. Researchers decide which content is relevant and how it should be structured for their work.
FREE fits naturally into a humanities research workflow. Upload your PDFs and navigate them as you normally would. As you read, FREE automatically highlights passages, tables, and numerical values that may be relevant for data extraction. You then select the highlighted elements that matter for your research and build a custom extraction template from them, without writing any code. This lets you apply your domain expertise to extract meaningful data quickly and flexibly.
Behind the scenes, FREE gathers the selected information and turns it into structured data. Every extracted value remains linked to its original location in the PDF, making it easy to check, verify, and cite your sources. This ensures transparency and trust in the extracted data, which is essential for academic work.
Once the extraction is complete, you can download your results in a format that fits your workflow, such as Excel or JSON files for analysis, or structured data files for loading into a database. FREE also supports processing many documents at once, so you can work with entire collections rather than individual files.
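To make the idea of source-linked structured data concrete, here is a minimal Python sketch of what one extracted record and its JSON export could look like. FREE's actual data model and export schema are not described on this page, so the class name `ExtractedValue`, its fields, the helper `to_json`, and the sample values are all hypothetical and for illustration only.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExtractedValue:
    """One extracted value plus a pointer back to its place in the PDF.

    All field names are assumptions made for this sketch, not FREE's schema.
    """
    field: str       # name of the template field chosen by the researcher
    value: str       # extracted text or number, kept as a string
    document: str    # file name of the source PDF
    page: int        # 1-based page number where the value was found
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) on that page


def to_json(values: list[ExtractedValue]) -> str:
    """Serialize records to JSON, keeping the location fields with each value."""
    return json.dumps([asdict(v) for v in values], indent=2, ensure_ascii=False)


# Made-up sample record, used only to show the shape of the output.
records = [
    ExtractedValue(
        field="population",
        value="1,240",
        document="field_report_1923.pdf",
        page=12,
        bbox=(72.0, 410.5, 210.0, 424.0),
    )
]

print(to_json(records))
```

Because the document name, page, and bounding box travel with every value, a reader of the exported file can go back to the exact spot in the PDF to check or cite it.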
By combining familiar reading practices with automated extraction, FREE allows humanities researchers to spend less time on manual data entry and more time on interpretation, comparison, and analysis.
Relay: Reliable AI Generation from Collections
Relay is an AI assistant designed to generate highly reliable answers from existing collections. It rests on two pillars: full on-premises deployment within an organization's private infrastructure, and a high level of source verification in generated answers.
The system is customizable and uses internal infrastructure to offer a private environment where your data remains under your full control. By adopting best-practice approaches to reduce hallucinations and enforce rigorous citation verification, Relay ensures that each generated answer can be traced back to the original documents, allowing users to easily inspect and verify the underlying sources.
Relay serves as a foundational platform for applications that require reliable information retrieval. Use cases include humanists working with textual collections, students conducting verifiable literature reviews, and any user who needs clearly traceable connections between claims and their source material.
Features
- Your data, under your control: Relay is built to integrate any data source securely.
- Model agnostic: Relay works with any large language model, running locally or in the cloud.
- Advanced search: Relay captures a variety of signals from your question when it searches the collection.
- Multilingual support: Are your collections in multiple languages? Then Relay can interpret the sources and respond in your language.
- Source verification: Relay will automatically refer back to the relevant sources for the statements it generates so you can verify the answers.
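
As an illustration of the source verification feature, the sketch below shows one very simple kind of check: confirming that each passage quoted in an answer really appears in the source it cites. This page does not describe how Relay verifies citations, so this is not Relay's method. The `Citation` class, the `normalize` and `unverified_citations` functions, and the sample data are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Citation:
    """A quoted passage and the ID of the source it is attributed to (hypothetical)."""
    source_id: str
    quote: str


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so line breaks do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def unverified_citations(citations: list[Citation],
                         sources: dict[str, str]) -> list[Citation]:
    """Return citations whose quote cannot be found in the cited source text."""
    failed = []
    for citation in citations:
        source_text = sources.get(citation.source_id)
        quote = normalize(citation.quote)
        # Fail on an unknown source, an empty quote, or a quote that is absent.
        if source_text is None or not quote or quote not in normalize(source_text):
            failed.append(citation)
    return failed


# Made-up collection and citations, for illustration only.
sources = {"report-1": "The survey was carried out\nin 1923 across three villages."}
citations = [
    Citation("report-1", "carried out in 1923"),   # present in the source
    Citation("report-1", "carried out in 1932"),   # wrong year, not in the source
    Citation("report-2", "three villages"),        # cites a source that is unknown
]

failed = unverified_citations(citations, sources)
for citation in failed:
    print(f"Could not verify: {citation.quote!r} in {citation.source_id}")
```

Verbatim matching only catches misquoted or misattributed passages. It cannot tell whether a paraphrased claim is supported, which requires stronger techniques than this sketch shows.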