A CCG-Based Version of the DisCoCat Framework


Over the last decade, the DisCoCat model has proved to be a valuable tool for studying compositional aspects of language. However, the model's strong dependency on a specific grammar formalism, namely pregroup grammars, gives rise to two important problems, one theoretical and one practical. On the theoretical side, pregroup grammars are context-free, which means their generative capacity is inadequate to cover every aspect of natural language. On the practical side, because the formalism is still relatively little known, there are few if any statistical parsers or other tools for producing grammatical derivations at a large scale. We solve these problems by reformulating DisCoCat as a passage from Combinatory Categorial Grammar (CCG) to a category of semantics. As a proof of concept for our method, we convert “Alice in Wonderland” into DisCoCat form and make the resulting corpus available to the community.

Richie Yeung and Dimitri Kartsaklis

