Currently, over half a million GeneRIFs have been created for genes from almost 1000 different species.
GeneRIFs are often extracted directly from the document that is identified by the PubMed ID, very frequently from its title or from its final sentence.
The latter case is implemented via records in Gene with the symbol NEWENTRY.
Note the wide variability with respect to the presence or absence of punctuation and of sentence-initial capital letters.
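As a rough illustration of how one might check whether a GeneRIF was copied from a title or from the final sentence of an abstract despite this variability, the following minimal sketch normalizes case and punctuation before comparing. The function names (`normalize`, `split_sentences`, `generif_source`), the naive sentence splitter, and the toy strings in the usage note are all assumptions made for illustration; they are not part of any NCBI tool or of the method described here.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def split_sentences(abstract: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def generif_source(generif: str, title: str, abstract: str) -> str:
    """Return 'title', 'final sentence', or 'other' for a GeneRIF string."""
    target = normalize(generif)
    if target == normalize(title):
        return "title"
    sentences = split_sentences(abstract)
    if sentences and target == normalize(sentences[-1]):
        return "final sentence"
    return "other"
```

With invented inputs, `generif_source("gene X is required for growth", "A study of gene X.", "We studied gene X. Gene X is required for growth.")` classifies the GeneRIF as coming from the final sentence even though it lacks the capital letter and the closing period. A real analysis would need a proper biomedical sentence splitter and some tolerance for paraphrase, neither of which this sketch attempts.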
GeneRIFs are an unusual textual genre, and they have recently been the subject of several articles from the natural language processing community.