Versioning in the Digital Edition of Fernando Pessoa


Ulrike Henny-Krahmer (University of Rostock)

International workshop
"Pessoa digital: development and sustainability measures
for the Digital Edition of Fernando Pessoa"
July 3, 2023


Slides at: https://hennyu.github.io/pessoa_23/

Overview

  1. Versioning in digital scholarly editions
  2. Current practice in "Pessoa digital"
  3. Proposals for new approaches

1. Versioning in digital scholarly editions

Digital scholarly editions can change...

  • correction of errors
  • addition of new materials
  • new scholarly findings
"The fact that such changes can occur on an ongoing basis is both one of the great potentials and one of the great terrors of digital scholarly resources."
(Broyles 2020, para. 1)

Why can changes be a problem?

  • "Standards for scholarly citation, which solidified around print resources, take advantage of [...] objectual stability."
  • "changes may not merely be possible but required in order to keep [the digital resource] operational"

(Broyles 2020, para. 1, 2)

Why can changes be a problem?

  • link rot: "online resources linked in references simply disappear from the internet"
  • context drift: "links function but the content on the website has changed since it was referenced"

(Broyles 2020, para. 3)

What to do?

Broyles (2020, para. 4): "in order to make sure such editions are citeable and their history is intelligible, their creators and publishers must assign version numbers in tandem with any changes made to edition content"

What to do?

  • version numbers as a simple and practical method
  • to identify a state of a resource
  • to communicate the history of a resource
  • to communicate the relationship between different states

(Broyles 2020, para. 6)

But...

no consensus practice in digital scholarly editing:

  • how should different versions be identified?
  • what should version numbers communicate?

What is the current practice for versioning in digital scholarly editions (DSE)?

Bürgermeister 2023:
65 out of 257 examined editions (roughly a quarter) have some form of versioning

Example humboldt digital:

https://edition-humboldt.de/

  • overviews of different versions, statistics
  • description of the new contents and developments in words for each version

Example Der Sturm:

https://sturm-edition.de/

Existing strategies for versioning

  1. description of changes
  2. revision description (directly in data)
  3. systems of version control
  4. versioning of the whole system

(Bürgermeister 2023)

2. Current practice in "Pessoa digital"

Versioning from a content perspective

  • Internal:
    project-internal work from the start (since October 2014)
  • Beta:
    website online, but work in progress (since 2016)
  • Version 1.0:
    completion of the poetry and prose published during Pessoa's lifetime and of the editorial projects 1913-35 (2021)
  • Version 2.0:
    addition of more texts published during Pessoa's lifetime (poetry and prose), modernized spelling (2022)

Description of version 1.0

"The website presents, in its 1.0 version, the edition of Fernando Pessoa’s editorial projects, elaborated between 1913 and 1935, as well as the poetry published by the author in journals and magazines, from 1914 onwards, and the prose published in journals and literary magazines, after 1912. This edition includes 248 documents of the literary estate, 72 publications of poetry and 91 of prose."

Description of version 2.0

"The website presents, in its 2.0 version, the edition of Fernando Pessoa’s editorial projects, elaborated between 1913 and 1935, as well as the poetry published by the author, from 1914 onwards, and the prose published after 1912, independently or in books. This edition includes 248 documents of the literary estate, 78 publications of poetry and 133 of prose. All texts published by the author are presented in two versions, with original and modernized spelling."

Citing of versions

Citation suggestions contain version number, e.g.:

whole edition:
Sepúlveda, Pedro, Ulrike Henny-Krahmer, and Jorge Uribe (eds). Digital Edition of Fernando Pessoa. Projects and Publications. Lisbon and Cologne: IELT, New University of Lisbon and CCeH, University of Cologne 2017-2022. Version 2.0. <http://www.pessoadigital.pt>. DOI: 10.18716/cceh/pessoa.

single document:
Sepúlveda, Pedro, Ulrike Henny-Krahmer, and Jorge Uribe (eds). "BNP/E3 8-3v." Digital Edition of Fernando Pessoa. Projects and Publications. Lisbon/Cologne: IELT/CCeH, University of Cologne 2017-2022. Version 2.0. <http://www.pessoadigital.pt/doc/BNP_E3_8-3v/diplomatic-transcription> DOI: 10.18716/cceh/pessoa

Versioning from a technical perspective

From the beginning, the edition was developed in a public Git repository: https://github.com/cceh/pessoa

Additions and deletions over time

Commits over time

Contributors over time

3. Proposals for new approaches

How to improve versioning in "Pessoa digital"?

  • Split content (XML-TEI)
    and application (eXist-db App)
  • Create new Git repository for content
  • "pessoa-data" vs. "pessoa-app"

How to improve versioning in "Pessoa digital"?

Why and how?

  • Data can exist without the app.
  • Data can be archived and re-used without the app.
  • App depends on data in a specific version.
  • App can refer to dataset.
  • Both data and app can be developed further independently of each other.
  • E.g.: Version A2.0-D2.1 (app version 2.0, data version 2.1)
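
A combined label such as A2.0-D2.1 can be read as "app version 2.0 running on data version 2.1". The following is a minimal sketch of how such a label might be parsed and written, assuming the format A<major>.<minor>-D<major>.<minor>. The format is inferred from the single example above; the names EditionVersion, parse_edition_version and format_edition_version are hypothetical and not part of the edition's code.

```python
import re
from typing import NamedTuple


class EditionVersion(NamedTuple):
    """App and data versions of the edition, each as (major, minor)."""
    app: tuple[int, int]
    data: tuple[int, int]


# Assumed label format: "A<major>.<minor>-D<major>.<minor>", e.g. "A2.0-D2.1".
_LABEL = re.compile(r"^A(\d+)\.(\d+)-D(\d+)\.(\d+)$")


def parse_edition_version(label: str) -> EditionVersion:
    """Split a combined label into its app and data parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a combined version label: {label!r}")
    a_major, a_minor, d_major, d_minor = (int(g) for g in match.groups())
    return EditionVersion(app=(a_major, a_minor), data=(d_major, d_minor))


def format_edition_version(version: EditionVersion) -> str:
    """Build the combined label from app and data versions."""
    return "A{}.{}-D{}.{}".format(*version.app, *version.data)


version = parse_edition_version("A2.0-D2.1")
print(version.app, version.data)
```

Keeping the two parts separate lets the app state which data version it was built against, while the dataset can still be cited on its own.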

How to improve versioning in "Pessoa digital"?

... on the level of contents & data ...

Describe changes in TEI files!
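
One way to describe changes directly in the data is the TEI <revisionDesc> element in the header, with one <change> entry per revision. The sketch below adds such an entry with Python's standard library. It is an illustration under assumptions: the helper add_change, the sample document, the pointer "#editor" and the wording of the note are invented, and the sample header is shortened and not a complete, valid TEI header.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def add_change(tei_root, when, who, note):
    """Record one revision as a <change> inside <revisionDesc>."""
    header = tei_root.find(f"{{{TEI_NS}}}teiHeader")
    if header is None:
        raise ValueError("TEI document has no teiHeader")
    revision_desc = header.find(f"{{{TEI_NS}}}revisionDesc")
    if revision_desc is None:
        # revisionDesc is appended as the last child of teiHeader.
        revision_desc = ET.SubElement(header, f"{{{TEI_NS}}}revisionDesc")
    change = ET.Element(f"{{{TEI_NS}}}change", {"when": when, "who": who})
    change.text = note
    revision_desc.insert(0, change)  # newest entry first
    return change


# Shortened sample document (hypothetical, not taken from the edition).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc><titleStmt><title>Sample document</title></titleStmt></fileDesc>
  </teiHeader>
  <text><body><p>...</p></body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)
add_change(root, "2023-07-03", "#editor", "Corrected a transcription error.")
print(ET.tostring(root, encoding="unicode"))
```

Because the change log travels with each file, the history of a single document stays readable even when the file is archived or re-used without the Git repository.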

How to improve versioning in "Pessoa digital"?

... on the code level ...

  • Create technical releases of both repositories.
  • Number the releases appropriately
    (e.g., with "semantic versioning").
  • Document and describe the releases appropriately in words
    (what was done?).
  • Archive these releases
    (e.g., on Zenodo).

What are releases?

You can create a release to package software, along with release notes and links to binary files, for other people to use.

Releases are deployable software iterations you can package and make available for a wider audience to download and use.

Releases are based on Git tags, which mark a specific point in your repository's history. A tag date may be different than a release date since they can be created at different times.

Source: https://docs.github.com/en/repositories/releasing-projects-on-github/about-releases
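
Since releases are based on Git tags, a first step towards a release could be an annotated tag that is pushed to the remote repository. The sketch below only assembles the two Git commands and prints them by default. The helper names, the "v" prefix for tag names, the remote name "origin" and the example release note are assumptions, not project conventions.

```python
import subprocess


def tag_commands(version, notes, remote="origin"):
    """Return the Git commands that create and publish an annotated tag."""
    tag = f"v{version}"  # the "v" prefix is an assumed naming convention
    return [
        ["git", "tag", "-a", tag, "-m", notes],
        ["git", "push", remote, tag],
    ]


def create_release_tag(version, notes, repo_dir=".", dry_run=True):
    """Print the commands (dry run) or execute them in repo_dir."""
    for command in tag_commands(version, notes):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, cwd=repo_dir, check=True)


# Dry run: shows what would be executed without touching a repository.
create_release_tag("2.1.0", "Add modernized spelling for prose texts")
```

Once the tag exists on GitHub, a release with notes in words can be created from it in the web interface.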

What is semantic versioning?

Source: Wikipedia
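
In semantic versioning, a version number has the form MAJOR.MINOR.PATCH: the major number is raised for incompatible changes, the minor number for backwards-compatible additions, and the patch number for backwards-compatible fixes. A minimal sketch of this numbering follows. The function name bump is hypothetical, and the suggested mapping to edition work in the comments (new texts as minor, corrections as patch) is an assumption, not an established rule for editions.

```python
def bump(version, change):
    """Return the next MAJOR.MINOR.PATCH number for a kind of change.

    Assumed mapping for an edition's data (illustrative only):
      "major": incompatible change, e.g. a restructured encoding model
      "minor": compatible addition, e.g. newly added texts
      "patch": compatible fix, e.g. a corrected transcription error
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown kind of change: {change!r}")


print(bump("2.0.0", "minor"))
print(bump("2.1.0", "patch"))
print(bump("2.1.1", "major"))
```

Raising a number resets all numbers to its right to zero, so the label alone shows how far two states of the resource are apart.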

What is Zenodo?

Source: Wikipedia

Summarized:

  • recognize that content and code are different layers that evolve together in a digital edition, but also independently of each other
  • relate the development cycles of both layers to each other
  • define and establish meaningful development units

Conclusions - Some essential things

  • documentation of versions
    (explanatory, for content and technical level)
  • fixing of certain states of a digital edition
    (archiving of versions, accessibility of versions)
  • specification of version numbers in citation proposals
  • awareness of the processual and changeable nature of digital editions

References

Thank you!

Slides at: https://hennyu.github.io/pessoa_23/

CC-BY 4.0