Data patterns in multiple botanical descriptions: Implications for automatic processing of legacy data

An analysis of conventional paper-based botanical descriptions from Floras was undertaken as part of the development of MultiFlora, a system for the automatic production of a queryable database from such legacy data. The descriptions of five species of Ranunculus L. (buttercups and crowfoots) in six different English-language Floras, from Europe and North America, show a surprising lack of uniformity in the suite of properties described. There is also considerable variation in the way property states are recorded. These findings have implications for the automatic production of taxonomic databases. This study is a proof-of-concept exercise, in which the taxa used are of negligible importance in themselves.