New Testament Greek Linguistic Databases

File this under “Rant.”

We have a good dozen different Greek morphology databases/annotations. We have (currently) two New Testament Greek syntax databases.

Of all of them, only one cites its sources and provides extended analysis and description of why it says what it says.


That’s right.

How in the world is that acceptable???

I’ve talked about this before. I’ll admit that I haven’t tried to contact everyone, but I’ve tried a couple — I’ve tried to contact GRAMCORD twice. Nothing.

But that’s not the point. If this work in Greek morphology and syntax is supposed to be scholarly, where was peer review? Where was the analysis? Where was the argumentation? I shouldn’t need to ask about it after the fact. This documentation should have been available when the database was created. If you were doing papyrology and provided the editio princeps for a newly discovered papyrus, could you get away with just saying, “Okay guys, here it is. This is my reconstruction. This is my translation,” and not defend or argue a single point?

This is unacceptable and inadequate for scholarship.

And yet,

we have accepted it.

We have treated it as adequate.

In what other field can you produce a massive project, give it to the world as significant and valuable for scholarly use…


Do you think it’s acceptable to go and do such a project, present a couple of papers on it at an academic conference or two, and assume that’s enough? All of you have been put on report. I want documentation. I want defense of your analysis. I want at least a bibliography.

Would your thesis or dissertation have been approved had you not cited anything?

UPDATE: We’re now at 2, surprisingly, as of checking my e-mail this morning: PROIEL has completed the Gospels and is currently working on the rest of the NT. There is limited documentation (a 112-page PDF). I say “limited,” because again, there are no sources, no bibliography, no references. This is a step in the right direction, but continues to be inadequate. Again, if this 112-page document were a dissertation, it wouldn’t be approved. Even still, it’s slightly embarrassing that I should write such a post the night before a new database is announced. Well, I’m happy for it and hope more will document their work in a similar manner (except with a bibliography and references).

25 thoughts on “New Testament Greek Linguistic Databases”

  1. Interesting rant, er, point.

    As someone designing a similar project, I can say that the plethora of sources has gone into grant proposals. I never thought about revising that material for a general audience, but it could easily be done. Would this type of thing satisfy your raving desire?

    1. It probably would.

      Let me word it this way:

      Years ago when I was taking my very first grammatical analysis class, if any student tried to get away with only using the formalism in their work with little or no prose description of the grammar problem at hand, they failed. A formalism and analysis without a prose description is almost useless. In the library, those are the grammars that aren’t used in 50 years because nobody can understand them.

      You can’t do analysis without prose.

      1. I sympathize, but I think where there should be some room for understanding is that in such projects there is very little tradition for this. These databases are not thought of as scholarly arguments (even if each analysis is an “argument” about that particular constituent’s role). Indeed, they are not given the same credit in terms of promotion, tenure, etc. Up to this point, many have been viewed (and many view them) as ‘non-scholarly’ tools, like a translation (I quote someone who will remain unnamed).

        Perhaps instead of a ‘rant’, you should present this (in the future) more positively as a sound argument for presenting the databases in line with other scholarly endeavors.

        I would agree.

        1. “Up to this point, many have been viewed (and many view them) as ‘non-scholarly’ tools, like a translation (I quote someone who will remain unnamed).”

          I might suggest that’s *because* there is a lack of documentation.

          “Perhaps instead of a ‘rant’, you should present this (in the future) more positively as a sound argument for presenting the databases in line with other scholarly endeavors.”

          That’s fair. I will have to do that. After my presentation & paper on Saturday, I’ll have the ammunition for it as well. One of the central things I want documentation for is so that when a database provides some sort of wacky analysis (e.g. claiming that prepositions are modifiers rather than heads of prepositional phrases or conflating syntactic concepts like Subject and Object with semantic concepts such as Predicate), I have something to go back to and check their reasoning on that point.

  2. “I might suggest that’s *because* there is a lack of documentation.”

    Not in the case I was referring to, since that person was in charge of a project. Rather, I submit that it has to do with the view of computer tools and the field of biblical studies/ANE studies. Only recently have they come to be viewed as serious, scholarly research tools.

    1. Okay, I see what you’re saying.

      And the situation reflects that: the only morphological database to document its work is the Friberg database, which was done not by Biblical scholars, but by linguists and translators.

      In any case, I do hope that the situation changes. Thanks for your comments, Dr. Holmstedt.

      1. Mike,

        It’s been my pleasure. Your blog is the most linguistically astute I’m aware of in the area of biblical studies. I could only wish for a whole crop of young scholars with similar interests and background (with perhaps a few of the generative strain, minimalist or not).😉

        On the topic — your comments here are important and I’ll start planning not only a “searching guide” (which I had already intended), but an extended discussion of the principles in our work.

  3. I agree with the necessity of providing at least some level of analysis, if not scholarly. Documentation of this type would be almost analogous to what Metzger’s Textual Commentary is to the UBS: an explanation of decisions.

    In effect, a grammatical/morphological/syntactical database is almost a collation of the analyses of relevant comments on grammar/morphology/syntax one would find in a traditional technical print commentary (e.g., SIL’s SSA, NIGTC, etc.), but without the comments, and exhaustive (covering every word, clause, etc., hence, database).

    For anyone using grammatical/morphological/syntactical databases for research purposes, it would be necessary to be informed of the decisions that were made that underlie the statistical counts from queries.

    This could be the beginning of a new type of commentary–a truly digital commentary for the digital age:

    Tagged OT + Tagging Notes
    Tagged NT + Tagging Notes

    We could possibly even see these forms of commentaries done for a corpus, not even the whole OT/NT, such as, Pauline epistles, Johannine writings, etc.

    Tagged Pauline Epistles + Tagging notes
    Tagged Johannine Writings + Tagging notes

    So one could purchase the tagged Pauline writings which were tagged by a Pauline scholar/linguist, that includes a separate module containing the tagging notes.

    I definitely think that your suggestion is a step in the right direction.

    1. Darryl, I like what you’re thinking and it would be great to see such a work developed. What I had in mind is something far more abstract:

      Something that outlined the implemented structure of the clause generally, with theoretical explanation as to why a given analysis was chosen (e.g. why a flat clause structure was used rather than a configurational hierarchical structure). A good example of what I’m looking for is seen in Opentext’s analysis of prepositional phrases. Prepositions are treated as nominal modifiers rather than phrasal heads. I know of no linguistic theory or framework that treats them in this way. Prepositions are always the heads of prepositional phrases — hence the name. So what I’d be looking for is an explanation of why Opentext did what they did at this point, with citations, analysis, and arguments.

      So more abstract than a phrase by phrase or clause by clause commentary, but still dealing with related structural issues.

  4. “Rant” is probably the correct description!🙂

    Before you demand too much of databases such as Gramcord, you need to realize that they are simply morphological databases. They identify the form of every word. There is *relatively* little to document or justify at that level. If the text has, say, λεγομεν, then it gets tagged as 1PPAI λεγω. What would you like documented? Yes, there are a few (relatively speaking) places where one must decide whether a neuter word is nom or acc. Or is a 2P form indic or impv? In most such cases, however, there is little debate as to the correct decision. There are also some things like how to tag crasis forms.

    There is also quite a large collection of related documents that go with the Gramcord database. They are sometimes on the disk/s–they were in the old days (20+ yrs ago) when we got Gramcord for DOS (!) on 5.25″ floppies!–but probably not too many people read them. It included, if I remember right, a discussion of how crasis forms were handled. There was also a large printed manual in a 3-ring binder that discussed a lot of this stuff. I don’t know what is distributed these days by those using the Gramcord database. I assume Logos still includes that database–you might inquire there as to where the accompanying docs are. If they are not included, then your gripe may be with Logos, not Gramcord.

    Gramcord originated as, I think, the first such computerized database ever, so it’s probably not justified to criticize them for not doing something that no one else had ever done or had ever asked for.

    Most of the Boyer articles don’t relate to tagging for the Gramcord database. Rather they build on that database and provide an analysis of results from using it. There was, I think, one article in GTJ that described the Gramcord project that might be relevant.

    Now when you begin to deal with *syntactic* databases such as OpenText or Cascadia, etc., then there may be greater justification for a greater level of documentation. One of the reasons that the Friberg database, though supposedly morphological, needs more documentation is that they did not tag strictly by form but included functional categories–in formal slots! (If I remember right, one example was tagging imperatival futures as imperatives rather than future indicatives.) That is a very different approach than, say, Gramcord.

    There are apparently several other syntactical databases in preparation, not all of them announced yet. At this point, of the two syntactical ones that I know of, we apparently have a 50% documentation situation–which is a little different than saying “only 1!”

    1. There is actually quite a bit of information in morphological databases that could (and should!) be documented. Probably most significant is the representation of Greek’s Parts-of-Speech system. Many assignments, particularly of demonstratives, quantifiers, and various particles, are treated in quite an arbitrary manner in most.

      I would say that the amount of documentation in Friberg has far less to do with its categories and more with the people who made it: linguists & Bible translators rather than Biblical scholars. GRAMCORD’s categories are far, far more subjective (going quite far into issues and questions of exegesis well beyond morphology — consider, for example, all of their categories for subordinate conjunctions) with no *currently* available explanation. I’d be interested in the contents of those disks that you’re talking about. Maybe I should e-mail them for a third time.

      I’m familiar with five NT Greek databases either completed or in preparation (six if I include my own). Currently one of them has some documentation, but it’s descriptive and gives little or no justification for either its theoretical framework or its decisions (see the updated section of the post).

  5. If you’ve carefully cataloged all the differences between the tagging of all those databases, then you know more about it than I do. Of the basic morphology ones like Gramcord, I suspect that you would find that the terminology used was fairly standard grammar-book terminology for the 1960s-70s, before most of the more common descriptive systems of today were in existence or at least widely known. Let’s not be too hasty with anachronistic demands or criticisms before reading the documents that they did provide. It doesn’t take too much work for someone who knows the areas where there are options to figure out the classifications they used even in such things as, say, the several different uses of αὐτος, or what is called a particle. Any of the software systems that implement these categories are almost self-documenting in that regard. And I suspect there is more formal documentation than you’ve seen. Could any of them be better? Sure. Be glad for what we’ve got.

    1. You’re always so level headed when I’m trying to be dramatic.

      I’ll accept your points — I probably could have written a few of them. I haven’t cataloged all of the differences, but I have done quite a few — probably a third — maybe half.

      And I’ll admit that a few of the seemingly arbitrary decisions in certain databases have provided for serendipitous discovery.

      Even still, I wish that more formal documentation was accessible. It would be significantly easier than trying to go backwards and reconstruct it — which I’ve had to do on a number of occasions (including this past week).

      The tone of my post was an attention getter. It seems to have worked🙂

    2. I’d also say that the terminology is still what is used in 99% of Greek textbooks and classes (both in Classics and in biblical studies). Before we complain about anachronistic grammatical tagging we’ll have to produce good enough grammars (and enough agreement about new terminology) that students are actually learning the terminology we’re going to use.

  6. Hi Mike,

    some thoughts from a corpus creator.🙂

    First of all, there is always a trade-off between making things happen and documenting them, especially for a small project like PROIEL.

    Second, our guidelines originated not as a dissertation but as guidelines for the annotators, and I think this is fairly typical for such guidelines. In annotation, consistency is the most important thing: you want to make the annotators do what you tell them, not to tell them about other ways you could have done it.

    Third, there are deplorably few formal analyses of ancient Greek that could have been cited for or against specific analyses. We hope that will change with the advent of syntactic corpora!

    That said, we are trying to document our work in different ways. On our project page you will find links to papers in Journal of Greek Linguistics, Traitement automatique des langues and the proceedings of LREC 2008, where we try to justify our annotation schemes. On the web site there are also some (sadly unpublished) slides about the relationship between our formalism and Lexical Functional Grammar.

    1. Hi Dag,

      I was a little surprised by your announcement yesterday. It’s caused a bit of a stir for me. I’m giving a paper examining available syntax databases for the GNT and now wish I had time to include a discussion of your work.

      I, again, want to make it clear that the rather dramatic tone of my post was merely for getting people’s attention (it worked quite well).

      The central point I wanted to make wasn’t about formal analysis of ancient Greek which you could cite, but more specifically the broader linguistic literature. I’ll be looking through the articles and papers you’ve referred to and slides eagerly. And I’d encourage you to develop some sort of theoretical description of your model that’s more unified in one place rather than scattered out across different journals and conferences. Accessibility is key here.

      Or…well, perhaps it’s a good thing for you that there’s only one person like me on the internet…who wants this sort of stuff.😉

  7. I think there’s some misunderstanding here about how textual scholarship works. It may arise in part from what I think is a destructive obsession with “documentation” in biblical scholarship, as if documentation legitimizes what one says. Footnotes and bibliographies are only useful for certain kinds of statements, and for the most part these databases don’t fall into those categories.

    Look at published texts of the NT (e.g., NA 27). There is no long bibliography, no footnotes, etc. Nor is there discussion about how decisions were reached (even Metzger’s Textual Commentary is by no means complete). These volumes are also never “peer reviewed” in the same way that academic monographs are. It is simply impossible for any reviewer to do more than a spot-check. The “peer-review” of such works happens in reviews and in the course of their use. That’s the way it has always been for practical reasons. It’s also why these works are now produced by committees–the committee functions to provide peer review in process.

    So how does this relate to databases? To the extent that databases simply reproduce existing manuscripts and eclectic texts, all we need are explanations of what eclectic text was used and maybe the rationale for which alternate readings were selected (if there’s any apparatus). I don’t think a long bibliography of sources (for what?) would serve any purpose. For morphology the only real question is how accurately the work has been done, but this is not rocket science. I don’t see what “sources” one would cite. In the course of use people will soon discover how accurate a database is. For syntax databases, what interests me is an explanation of method. But here too I don’t see “sources” as particularly important. I’m more interested in the methodology actually used and then in reading academic reviews of that methodology.



    1. “So how does this relate to databases?”

      I’d say it doesn’t.

      My reference to manuscripts and the creation of editio princeps in particular was only an analogy. Fundamentally, what is done in Biblical Studies on linguistic databases stands completely at odds with all linguistic scholarship.

      For example, Russian has some pretty crazy stuff going on with cardinal numbers and case. Now, one way to deal with the language’s challenge in that regard is to posit a Quantifier Phrase (QP) where the number functions as the head of the phrase and the noun functions as a modifier. It works well, but fundamentally, if I’m going to make such a claim, I need to defend it, argue for it, and provide evidence that it is the best analysis for dealing with the data.

      The Syntactically Analyzed Greek New Testament makes the rather outrageous claim that nouns are the head word of prepositional phrases. It’s a rather remarkable analysis. *QUALITY* linguistic work necessitates that if Porter & co. want to make such a claim, they need to argue the point. They need to give evidence for proposing the analysis they do. Anything less is not good enough. Linguistic work *requires* documentation. The moment a scholar stops providing it, they’ve left the field.

      “But here too I don’t see ‘sources’ as particularly important. I’m more interested in the methodology actually used and then in reading academic reviews of that methodology.”

      Those are the same thing. Porter can say “prepositional phrases have nouns as their heads” all he wants. He can then argue for it and refer to the relevant literature that affirms what he said, whether it be methodological framework literature or typological literature about how various languages work in this regard. But Porter & Co. do neither. We’re just left with “this is how it is.” That’s not good enough. Either cite something that shows this is a framework-specific issue for dealing with PPs or argue for your interpretation from the structure of Greek and the structure of language as a whole, citing the relevant languages which also have the supposed phenomenon you’re representing.

      Fundamentally, the point is that creating a grammar database of any kind ought to be viewed as the equivalent of creating a grammar itself. Opentext has analyzed the entire New Testament linguistically. Where’s the grammar to go with it?
