
Developing the Database

As noted in my last blog post, a fundamental part of developing Dickens Search is our commitment to open research methods and to making as much of our work as possible transparent. This blog post summarises how I created the database and some of the rationale behind our choice of platform and our approach to metadata. In my next blog post, I will provide a list of the plugins we’re using.

The website has been built using Omeka Classic, a free-to-use content management system developed specifically for digital editions and online collections. The website domain was bought from IONOS because of their periodic offers (e.g. one year free). Hosting is provided by Reclaim Hosting, one of the few providers (if not the only one) to offer Omeka as a one-click installation, so those less technically minded can get it up and running quickly.

Omeka admin view

I chose Omeka for several reasons. Firstly, it enables us to use an out-of-the-box solution which already adheres to high archival standards. The default Omeka installation allows you to add items to your database by inputting data into fields for each item in Dublin Core, an archival metadata standard used by many major collections. (If you’re not familiar with metadata, all this means is that Dublin Core offers specific fields such as Title, Date, Source and Rights that are common to digital collections, so that the data is not only stored in a well-designed structure but will also be easier to use for digital transformation later.) Various Omeka plugins can be pointed to specific fields, like Date or Text fields, and will then draw on those fields across the collection to facilitate search, analysis and visualisation. Further inspiration for this choice came from projects like the George Eliot Archive, edited by Beverley Park Rilett and also built in Omeka, which has shown how useful the platform can be for creating a digital archive of this kind.
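To make the idea of plugins "drawing on a field across the collection" more concrete, here is a minimal sketch in Python. It is not how Omeka or its plugins are implemented (Omeka is written in PHP). The item records and the helper names `values_for_field` and `count_by_field` are hypothetical and exist only for illustration; the field names Title, Date and Type are taken from Dublin Core.

```python
from collections import Counter

# Hypothetical items: the keys are Dublin Core field names,
# the values are placeholders rather than real records from the database.
items = [
    {"Title": "Example poem", "Date": "1841", "Type": "Poem"},
    {"Title": "Example speech", "Date": "1858", "Type": "Speech"},
    {"Title": "Another example poem", "Date": "1841", "Type": "Poem"},
    {"Title": "Undated example", "Type": "Poem"},  # no Date recorded
]


def values_for_field(collection, field):
    """Collect every non-empty value of one field across the collection."""
    return [item[field] for item in collection if item.get(field)]


def count_by_field(collection, field):
    """Count how many items share each value of the given field."""
    return Counter(values_for_field(collection, field))


if __name__ == "__main__":
    # A date-based visualisation could start from a tally like this one.
    print(count_by_field(items, "Date"))
```

Because every item records its date in the same field, a tool only needs to know the field name to work across the whole collection, and items with the field left empty are simply skipped.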

Secondly, although Omeka does not have on-call technical support (as a platform like WordPress does, for a hefty fee), it has very active forums for problem-solving and advice. Because Omeka has a very specific purpose (i.e. digital editions), many people are asking and answering questions which will very likely apply to you, in contrast to the online discussions for something like WordPress, which is used much more broadly. This active community is also engaged in developing and refining plugins, which are free to install; they are often developed for a specific project and made available on GitHub in case they’re of use to others. With some basic knowledge of HTML and PHP, it’s been possible not only to add functionality to the website using these plugins but also to tweak some of them to make our website more user-friendly (such as hiding some elements or data from view while retaining them for the files in the database, changing text colour and the arrangement of information, and even translating one plugin from French to English).

In addition to the default Dublin Core metadata set, you can define item-specific metadata, which gives you different fields for different kinds of item, such as poems, short stories and novels (e.g. for speeches we have a Venue field, which is not relevant to other kinds of database item). These custom fields are used in addition to the Dublin Core fields, so that the fundamental details about each item (in the case of Dickens Search, each text) are recorded in the same way. Both kinds of metadata are then downloadable in several different formats (atom, dc-rdf, dcmes-xml, json and omeka-xml).
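As a rough illustration of how those download formats might be requested, the sketch below builds export URLs in Python. It assumes the usual Omeka Classic convention of an `items/show/<id>` page with an `output` query parameter; that route, the item ID `1` and the helper name `export_url` are assumptions for illustration, so check the links offered on the site itself before relying on them.

```python
from urllib.parse import urlencode

# The output formats named in this post.
OUTPUT_FORMATS = ("atom", "dc-rdf", "dcmes-xml", "json", "omeka-xml")


def export_url(base_url, item_id, output_format):
    """Build a download URL for one item in one output format.

    Assumes an Omeka Classic style 'items/show/<id>?output=<format>' route.
    """
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"Unknown output format: {output_format}")
    query = urlencode({"output": output_format})
    return f"{base_url.rstrip('/')}/items/show/{item_id}?{query}"


if __name__ == "__main__":
    # Item ID 1 is a placeholder, not a reference to a specific text.
    for fmt in OUTPUT_FORMATS:
        print(export_url("https://dickenssearch.com", 1, fmt))
```

The helper only constructs the address and does not fetch anything, so it can be adapted to whichever routes and formats a given installation actually exposes.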

Much of the early decision-making of the project has revolved around which fields to use and how to use them. We have a style guide, a living document that sets out how to fill in the fields. It supplements the information about each field given within the Omeka CMS itself (only visible to admin users). Some of the issues we’ve faced concern the genres of Dickens’s works themselves, e.g. whether to classify all verse as poems (rather than sub-dividing into types such as songs), or whether sketches are short stories, journalism or a third category. For each collection, we will publish a blog post explaining these kinds of decisions, in addition to providing a short introduction more focused on what is present in or absent from each collection. Ultimately, creating this database has been a learning process and we anticipate making changes as it develops. Different types of writing have presented different challenges, and by far the biggest task ahead of us is transcribing and uploading Dickens’s novels.

In my next blog post, I will provide an overview of the plugins currently installed on the site. If anyone is interested in further information about the development of the database, please contact us.

How to Cite:

Bell, Emily. ‘Developing the Database.’ Dickens Search. 9 July 2021. Accessed [date]. https://dickenssearch.com/developing-the-database.