Update on Genealogies of Knowledge Project and Research Network

13 May 2020

Dear colleagues,

We hope this finds you safe and well in these strange times.

After four years of intense interdisciplinary collaboration and exchange, the Genealogies of Knowledge project has now formally come to the end of its funded period. We would like to take this opportunity to thank you for your interest in following the development of the project, and to update you on some exciting new plans the project team has put in place for the future.


Although our corpus-building activities have slowed down for the time being, we continue to add texts to the Modern and Internet English corpora, and the resources we have been developing during the Genealogies project remain freely available for download from our website. These corpora include:

  • An Ancient Greek corpus (totalling 3.3 million tokens) featuring a selection of scientific, philosophical and political treatises and commentaries, written between the 5th century BC and the 2nd/3rd century AD by key authors in the history of ideas such as Plato, Aristotle, Galen, Hippocrates and Isocrates;
  • A Medieval Arabic corpus (3.3 million tokens) consisting of translations of and commentaries on ancient Greek texts produced between the 8th and 13th centuries AD by prolific translators such as Hunayn Ibn Ishaq, in addition to a number of original texts by important authors such as Al-Farabi, Averroes and Avicenna;
  • A Latin corpus (1.5 million tokens) comprising the works of classical Roman thinkers such as Cicero (106-43 BC), alongside a number of Latin (re)translations and commentaries produced by figures such as Thomas Aquinas and Robert Grosseteste, who played a central role in the development and dissemination of Greek philosophy in Europe from the 13th century onwards;
  • A Modern English corpus (21+ million tokens) made up of original texts, translations and retranslations of political and scientific texts published in the UK and US during the 19th, 20th and 21st centuries. This corpus features not only multiple retranslations of works by classical Greek and Roman authors such as Plato, Thucydides and Plutarch, but also translations of works by more modern writers such as Ludwig Wittgenstein, René Descartes, Karl Popper, Karl Marx, Michel Foucault, Étienne Balibar and G.W.F. Hegel. Non-translations include works by authors such as John Stuart Mill and Jodi Dean;
  • An Internet English corpus (4.2+ million tokens) which includes a diversity of blog posts, articles and opinion pieces published online by alternative media and news outlets across the political spectrum, from Indymedia and Discover Society on the political left, to Newsmax and Amerika.org on the right.

These resources can be accessed via a dedicated concordance browser and specially designed suite of text analysis and visualisation tools. Full documentation explaining how to access and use the software tools can be found here. We warmly encourage colleagues to explore these resources and use them in their research.


To gain a sense of the kinds of research opened up by these resources and the software tools developed by the Genealogies project, we invite you to consult the full list of the project’s publications. All of these journal articles are available as open access publications. We would like to draw particular attention to a special issue of the humanities journal Palgrave Communications, in which five contributions by members of the project team are already available.

Research Network

Going forward, the team continues to extend its activities in this promising new area of enquiry through the creation of a dedicated research network. The Network aims to support the development of further corpora, software tools and analytical methodologies relevant to research conducted by the Genealogies of Knowledge project. Coordinated by Mona Baker, Jan Buts and Henry Jones, it seeks to connect scholars across the humanities in order to promote greater interdisciplinary collaboration of mutual benefit to fields as diverse as translation studies, classics, cultural studies, linguistics, intellectual history, digital culture and computer science. The Network is already collaborating with colleagues at Hamad Bin Khalifa University in Doha, Qatar, who are working to develop corpora of translations into and out of Modern Standard Arabic in order to examine Arab discourses on women’s position in the polity and the evolving role of women in Arab societies. We are also working closely with a group at the Centre for Advanced Studies in Oslo who specialise in the medical humanities. This collaboration centres on an investigation of the historical development of the discursive regime of evidence-based medicine, the central paradigm in contemporary health science.

The Genealogies of Knowledge Research Network will be organising a further series of events at different locations to promote this new strand of corpus-based translation studies (CTS) research among local and international audiences. These will include not only conferences and seminars, but also hands-on workshops intended to introduce postgraduate and early career researchers to a new generation of CTS and to equip them with the knowledge and tools needed to develop their own research projects. All Network activities are publicised through our website and via our Twitter account.

And finally, a note on software developments

Those who have accessed the GoK corpora recently may have noticed that the way the GoK corpus browser is started has changed. From now on, users must download the tool and run it directly from their computer, rather than access it through the website (Java WebStart). We changed the way the GoK tool is run to avoid having to purchase a “code signing certificate”, which would create a dependency between the project and an external certification agency. Furthermore, the most widely used implementation of Java WebStart, which is non-free (non-open-source) software, now blocks self-signed applications from running. We therefore felt that moving to this different mode of delivery would be more consistent with the Free/Libre Software ethos of the Genealogies project, while allowing us greater flexibility in developing and deploying our software.


With warm regards,

The Genealogies of Knowledge Project Team