Since I started using Lucene heavily at work about a year ago, I’ve been watching the Lucene mailing list out of the corner of my eye for tips and tricks. Today I saw an email go by that referenced a recent patch that lazily creates SegmentMergeInfo.docMap objects. The point isn’t so much what the object is; simply creating it lazily yielded some pretty impressive performance gains:

Performance Results: A simple single field index with 555,555 documents, and 1000 random deletions was queried 1000 times with a PrefixQuery matching a single document.

Performance Before Patch: indexing time = 121,656 ms, querying time = 58,812 ms

Performance After Patch: indexing time = 121,000 ms, querying time = 598 ms

A 100 fold increase in query performance!

Umm, a 100-fold increase in query performance. That’s quite a patch!
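
For anyone curious what "lazily creating" an object looks like, here is a minimal sketch of the general pattern in Java. It is not the actual Lucene patch: the class name LazyDocMapSketch, its fields, and the assumption that the map renumbers documents around deletions (with -1 marking a deleted document) are all mine, for illustration only. The idea is just that the expensive array is built the first time someone asks for it, rather than up front in the constructor.

```java
import java.util.BitSet;

// Hypothetical illustration of lazy initialization; not Lucene's real code.
class LazyDocMapSketch {
    private final int maxDoc;       // total number of document slots
    private final BitSet deleted;   // bit i set => document i is deleted
    private int[] docMap;           // stays null until first requested
    private int buildCount = 0;     // counts how often the map was built

    LazyDocMapSketch(int maxDoc, BitSet deleted) {
        this.maxDoc = maxDoc;
        this.deleted = deleted;
        // Note: no O(maxDoc) work happens here.
    }

    // Builds the map on first call and reuses it afterwards.
    // Not thread-safe; a real implementation would need to consider that.
    int[] getDocMap() {
        if (docMap == null) {
            buildCount++;
            docMap = new int[maxDoc];
            int next = 0;
            for (int i = 0; i < maxDoc; i++) {
                // Assumed convention: deleted docs map to -1,
                // live docs are renumbered consecutively.
                docMap[i] = deleted.get(i) ? -1 : next++;
            }
        }
        return docMap;
    }

    int getBuildCount() {
        return buildCount;
    }
}
```

With this shape, code paths that never call getDocMap() never pay for building the array, which would fit the quoted numbers if the queries in that benchmark did not need the map at all.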