Guidelines

I. METHODOLOGICAL ISSUES¹

A. Bias related to the extension and quality of the corpus of references

1. Extension and quality of the patristic corpus

Ultimately, BiblIndex will allow research in all patristic literature, on reliable and comparable data. For now, a number of difficulties remain.

First, the corpus is still neither exhaustive nor homogeneous. The available data combines the material inherited from the CADP (see "Story of the project"), namely:

270,000 verified and homogeneous references for Philo of Alexandria;
all of the Greek and Latin patristic works from the first three centuries and part of those from the 4th century;
100,000 other entries, digitized but not yet published, which one can assume have undergone the same validation process as the previous ones, mainly concerning Athanasius of Alexandria, Jerome, John Chrysostom and Procopius of Gaza;
and approximately 400,000 other references, unverified, typed up from the CADP paper archives, as well as references from various corpora, written between the 4th and 11th centuries.

As for the major authors of the 4th century, we believe that we are approaching exhaustiveness. For the 5th century, Cyril of Alexandria and Theodoret of Cyrus are fully covered, which is also the case for later authors such as Gregory the Great (6th century) and Maximus the Confessor (7th century). Many Catenae have been analysed: as far as possible, each fragment has been attributed to its alleged author.

The corpus is constantly evolving as new references are entered. In any case, even once it has more or less stabilized, the inability to access citations from lost works will always distort calculated proportions: only orders of magnitude can be reliably ascertained. Whenever you work on a specific corpus, you should check whether the works you need are available online (see "Patristic corpus").

2. Heterogeneity of citation survey methods

Several factors that skew the results, stemming from the way the data itself was analysed, must also be taken into account.

First, linguistic variety: because the biblical text differs from one version to another, the same verse number can cover different content from one language to another. BiblIndex will correct this automatically, thanks to the fine-grained correspondence established between all the verses or sub-verses of the ancient Bibles.

Second, the corpus is at present weighted in favour of Greek texts (see "Future prospects").

Third, the work has to be done from heterogeneous data. The data has undergone various checking processes; moreover, it rests on varying conceptions of what a biblical reference is, a bias that even the precise CADP guidelines did not fully eradicate: because the material does not consist only of verbatim quotations, each analyst has his or her own conception of what counts as an allusion or a quotation. Other methodological divergences may also be encountered. In the paper archives, some analysts recorded intra-biblical parallels as new references; for continuous verse ranges, some counted one reference per verse, others one per range. Lemmas used in biblical commentaries were generally noted, but not always. One of the most irksome problems concerns multiple repetitions of the same text within a paragraph: the researcher will tend to see only a single, developed quotation, while the systematic list counts each new occurrence of the quoted text.
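To see how much the counting convention alone can shift the totals, consider a minimal sketch in Python. The records, the tuple layout and the function names are all hypothetical and invented for illustration; they are not BiblIndex data, nor its actual schema.

```python
from collections import Counter

# Hypothetical reference records: (book, chapter, first_verse, last_verse).
# These values are invented for illustration only.
references = [
    ("Gen", 1, 1, 3),    # a continuous range recorded once
    ("Gen", 1, 26, 26),  # a single verse
    ("Gen", 1, 26, 27),  # a range overlapping the previous verse
]


def count_per_range(refs):
    """Convention A: one reference per recorded range."""
    return len(refs)


def count_per_verse(refs):
    """Convention B: one reference per verse inside each range."""
    return sum(last - first + 1 for _, _, first, last in refs)


def verse_frequencies(refs):
    """Expand every range and count how often each verse is touched."""
    counts = Counter()
    for book, chapter, first, last in refs:
        for verse in range(first, last + 1):
            counts[(book, chapter, verse)] += 1
    return counts


print(count_per_range(references))   # 3
print(count_per_verse(references))   # 6
print(verse_frequencies(references)[("Gen", 1, 26)])  # 2
```

The same three records yield a total of 3 or 6 depending on the convention, which is why raw counts drawn from differently analysed sources should only be compared as orders of magnitude.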

3. Deficiencies in the basic typology

As the distinction between quotation and allusion (when this information is available in the CADP data) cannot be objective, it remains an insufficient sorting criterion.

To assess the real importance of how often a verse occurs, one would have to weight the raw numbers through a careful analysis of the role of the verse in the text: is it a term cited in passing as a verbal echo, a crucial link in the course of an argument, a lemma, or a verse discussed in isolation? Nor can we do without an understanding of the contexts and methods by which the verse is introduced. In short, the details recorded in our TEI encoding are essential to refine the statistical studies.

B. Bias related to biblical references

1. Impact of splitting into verses

The division into verses, made after the works were composed, risks projecting modern questions back onto the works themselves. When users query BiblIndex, their requests generally take the following form: where, in a particular patristic corpus, do we find a given verse number, or a given verse number together with another verse number?

Often, however, this ‘verse’ unit only approximates the text actually quoted by an author. In the case of long continuous quotations, the approximation is close, because the data contains ranges of verses. Difficulties arise, however, when several extracts from discontinuous verses are used, or simply parts of verses, which is overwhelmingly the most common practice among the Fathers. Thus, it can be said that Bernard quotes all the books of the Old Testament, but he quotes only 1/15 of its words. He quotes Malachi 4:2 thirty-three times, but he only uses the expression sol iustitiae.

2. Impact of biblical granularity choices

Depending on the scope of the biblical reference chosen by the researcher, the results may give rise to different interpretations. We can see a particularly clear example with the occurrences of Genesis in Irenaeus’ works. If one wants to conduct thematic studies from a specific expression in specific verses, the results by verse number do not eliminate the ‘noise’ caused by other portions of the verse.

The methodological reservations to be enumerated are therefore numerous. They should be mentioned in each paper using BiblIndex data. Readers should keep in mind that all the results given are only approximations, which will have to be constantly refined. Statistics are extremely valuable for giving general views and points of comparison, for directing research towards such-and-such areas that may be more meaningful and intriguing. They cannot, however, give a sufficient and fair account of what the Biblical text was to a particular author.

II. THE BIBLICAL REFERENCE DOCUMENTS USED BY BIBLINDEX²

To enlarge its corpus as quickly as possible, Biblia Patristica used a single biblical reference framework, a composite Bible made up of the following:

• the books of the Hebrew Bible as given in modern editions of the Hebrew Old Testament and the Greek New Testament;

• the books of the Septuagint in Rahlfs’ edition for seven other texts that do not belong to the Hebrew corpus: 1 and 2 Maccabees, Wisdom of Solomon, Ecclesiasticus (Sirach), Tobit, Judith, Baruch (1-6)³ and the Greek additions to Esther and Daniel.

The Vulgate and the Vetus Latina were not taken into account. The verse numbering of the Jerusalem Bible was always used.

A. Multilingualism

The terms of reference were thus defined not from the objects to be dealt with but from the modern researcher’s constraints: while the canon accepted by the Fathers was varied and fluid, the division of the Hebrew Bible results from a consensus of modern scholars. It was more practical to use a single, albeit inadequate, scheme. Nonetheless, quotations in different languages have to refer to different ‘Bibles’. Even in the same author’s work, references to multiple sources may be found, as in the case of Jerome with his use of Hebrew, Greek and various Latin texts.

BiblIndex, like Biblia Patristica, requires biblical systems of reference so that analysts can give chapter and verse numbers that modern Bible readers can easily identify. However, the same problem arises as for the Thesaurus Linguae Latinae when it uses texts reconstructed in the editing process of the Vetus Latina: a reference edition is normally a form of the text unknown to the Fathers who quoted the Bible. These referentials are in no way prescriptive and have no more real existence for the Fathers than the Jerusalem Bible, even if they are a better approximation. There is, of course, no question of our ‘referring’ Cyprian’s text to the Vulgate. Rather, when reading the biblical text of Cyprian, we may note what sets it apart from the Vulgate. In the absence of the biblical text which was actually used by the Fathers, the only function of the biblical reference documents is provisionally to relate any patristic text to a shared point of comparison.

As the text of the patristic quotations is gradually integrated into the database, we will eventually be able to reconstruct the Bible of a particular author and, if necessary, set new referentials, depending on the research interest. One could, for example, compare the Bibles of two authors, or one author’s Bible at different periods of his life, or several authors’ Bibles in a specific geographic area over a given period, and so on. However, this ideal phase will only be reached when a substantial number of texts have been analysed and integrated into the corpus of BiblIndex. It must be added that the use of biblical referentials allows, in this first phase of the project, the integration of both texts analysed anew and reference lists reconstructed from biblical apparatus or the available archives. This is a sine qua non for a significant increase of the corpus.

Initially, BiblIndex proposes five biblical reference documents, listed below. Translations into modern languages (the new TOB in French, and the New Revised Standard Version (NRSV) with the so-called deuterocanonical books/Apocrypha in English) are also provided, but will be seen only in the user interface.

the Biblia Hebraica Stuttgartiensia;
Rahlfs’ Septuagint and the Greek New Testament (NA 27);
the Weber–Gryson Vulgate for the Latin texts;
the Peshitta and the Syriac New Testament;
the Zohrab Bible (Venice, 1805, recently reprinted) for texts written directly in Armenian. For Armenian versions of Greek and Syriac texts, the Bible of the original source (i.e. Septuagint or Peshitta) is taken as the reference.

(see "Text credits")

B. Verse numbering

In addition, there are various reasons no single Bible can be used as the point of reference:

Some books, whether canonical or not, are not included in some versions: Jdt, Tob, 1-4Macc, 1 or 4 Ezra, Bar, 1 Enoch, Jubilees, Odes, Ps 151-153, 3 Cor, etc.
Some books and chapters are only included in some versions.
Several verses are only included in some versions: for example, several verses of the Masoretic Text are not in the LXX text, such as 1 Kings 6, 11-13; conversely, Dan 3, 24-90 (LXX) is not in the MT, etc.
The order, names and numbers may differ for books, chapters and even verses. Specifically, a verse in one version may relate to 2 or 3 verses in another: Exod 39, 41-43 (MT) = Exod 39, 18.22.23 (LXX); Num 26, 15-47 (MT) = Num 26, 24-27.15-23.32-47.28-31 (LXX), etc.

We provide a data model allowing automatic switching between verse numbering in the different Bibles. A full correspondence table, verse by verse, and when necessary verse part by verse part, has been included in the database. Each analyst and user can thus choose a biblical reference document without needing to indicate the various relationships between different versions.
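As a rough illustration of what such a correspondence table makes possible, here is a minimal sketch in Python. It is not the BiblIndex data model: the structure, the names `CORRESPONDENCES` and `convert`, and the fallback to identical numbering are assumptions made for this example. Only the two passages cited above (Exod 39 and 1 Kings 6) are encoded, and they are stored as groups of verses, because the verse-by-verse alignment inside each group is not given here. Differences in book names between versions are ignored.

```python
# Hypothetical correspondence table: each entry links a group of verses in one
# version to the matching group in another. None means "absent in that version".
CORRESPONDENCES = [
    {"MT": ("Exod", 39, [41, 42, 43]), "LXX": ("Exod", 39, [18, 22, 23])},
    {"MT": ("1 Kgs", 6, [11, 12, 13]), "LXX": None},
]


def convert(book, chapter, verse, source, target):
    """Return the (book, chapter, verses) group in `target` that corresponds
    to a verse given in `source`, or None if the passage is absent there.

    Assumption: a verse not listed in the table has the same number in both
    versions. A complete table would make this fallback unnecessary.
    """
    for entry in CORRESPONDENCES:
        src = entry.get(source)
        if src is None:
            continue
        src_book, src_chapter, src_verses = src
        if (book, chapter) == (src_book, src_chapter) and verse in src_verses:
            return entry.get(target)
    return (book, chapter, [verse])


print(convert("Exod", 39, 42, "MT", "LXX"))   # ('Exod', 39, [18, 22, 23])
print(convert("1 Kgs", 6, 12, "MT", "LXX"))   # None
print(convert("Exod", 39, 22, "LXX", "MT"))   # ('Exod', 39, [41, 42, 43])
```

Returning a group rather than a single verse reflects the fact that one verse in one version may correspond to two or three verses, or to a part of a verse, in another.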

For convenience, the verse is kept as the basic unit: one will always be able to search by verse number in BiblIndex. Later on, the corpus will be keyword searchable.

C. Deuterocanonical and Apocryphal Books

According to its guidelines, Biblia Patristica should have listed quotations of the most frequently used deuterocanonical writings or Apocrypha. In practice, only 1 and 2 Maccabees, Wisdom, Sirach, Tobit, Judith, Baruch 1-5 (and 6?), and the Greek additions to Esther and Daniel were taken into account.

For BiblIndex, the selection criterion is as follows: from the moment we know that a text was considered part of the Scriptures by a Church Father at a given time, its occurrences should be noted. Our compilation of biblical reference documents will therefore include all possible books quoted in a specific language, independently from the selections made in modern Bibles; the lists of books will vary depending on the reference language. Other apocryphal texts, which were seen as belonging to Scriptures in different Christian communities in antiquity, will be included.

In addition to Psalm 151, the Prayer of Manasseh, 3 and 4 Ezra, and the Prologue of Sirach, which were not taken into account by Biblia Patristica although they belong to the modern editions of the Septuagint or Vulgate, the following texts have already been added to our reference documents (not an exhaustive list):

the Psalms of Solomon (Greek, Syriac);
the Odes (Greek, Syriac);
the Letter to the Laodiceans (Latin);
Psalms 152-155 (Syriac);
2 Baruch, the Letter of Baruch and the Revelation of Baruch, the Syriac Revelation of Ezra (Syriac);
the 3rd Letter to the Corinthians (Armenian).

Integrating apocryphal texts in BiblIndex is a question that must be very carefully considered: where should we stop? At first, the reference documents will be limited to the list described above, but other apocryphal books might be added if they have been quoted as part of the Scriptures by the Fathers (e.g. the Shepherd of Hermas). This decision to enlarge the field of biblical texts included in the search tool has not yet been activated retrospectively, but will be put in place in future analyses and will be progressively applied to the texts already treated.

Explicit quotations that cannot be identified, such as the agrapha (words of Jesus which are not preserved in any known biblical book), will be identified in BiblIndex by the acronym UBO (Unidentified Biblical Object) and set aside for further study.
