Lucene note: build your index all at once
It looks like [[Lucene]] doesn’t like searching through indices that have been built in lots of little chunks. I get proper results if I build an index of 300 blog posts like this:
- open index
- for post in posts:
  - add post to index
- close index
… but if I do it like this, it seems to stop indexing them after the first couple of hundred:
- for post in posts:
  - open index
  - add post to index
  - close index
Update: It looks like the reason for this behaviour was that I was passing a true value as the create parameter to IndexWriter’s constructor in the test. I’m seeing the same behaviour in my application code, though, which sets it to false. Odd.
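To make the effect of that flag concrete, here is a toy in-memory model. It is not Lucene's API: `ToyIndexWriter`, `index_one_at_a_time` and the list standing in for the index directory are all made up for illustration. The only thing it mimics is that opening a writer with a true create value starts a fresh index and discards whatever was there before. In this toy only the most recently added post survives, so it is not meant to reproduce the exact symptom described above.

```python
class ToyIndexWriter:
    """Hypothetical stand-in for an index writer; this is NOT Lucene's API."""

    def __init__(self, store, create):
        self.store = store
        if create:
            # Mimic create=true: start from an empty index, dropping old docs.
            self.store.clear()

    def add_document(self, doc):
        self.store.append(doc)

    def close(self):
        # Nothing to flush in this in-memory toy.
        pass


def index_one_at_a_time(posts, create):
    """Open, add one post, close; repeated for every post."""
    store = []  # stands in for the on-disk index directory
    for post in posts:
        writer = ToyIndexWriter(store, create=create)
        writer.add_document(post)
        writer.close()
    return store


posts = [f"post {i}" for i in range(300)]
wiped = index_one_at_a_time(posts, create=True)   # index is reset on every open
kept = index_one_at_a_time(posts, create=False)   # index is appended to
print(len(wiped), len(kept))
```

With create set to false the chunked loop keeps every post in this model, which fits the observation that the flag alone did not explain the application's behaviour.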
Update 2 (2003-06-12): Figured the application out now (finally)! The problem was a combination of the use of different analyzers in indexing and search and a broken custom query object that was blowing away some of the hits.
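The analyzer half of that problem can be sketched with another toy, again not using Lucene itself. The two analyzer functions, `build_index` and `search` are hypothetical helpers, and the assumption is simply that one analyzer lowercases terms while the other leaves them alone. Terms that are stored one way and looked up another way never match.

```python
def lowercase_analyzer(text):
    """Split on whitespace and lowercase each term (toy analyzer)."""
    return [term.lower() for term in text.split()]


def whitespace_analyzer(text):
    """Split on whitespace only, keeping the original case (toy analyzer)."""
    return text.split()


def build_index(docs, analyzer):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Return ids of documents matching any analyzed query term."""
    hits = set()
    for term in analyzer(query):
        hits |= index.get(term, set())
    return hits


docs = ["Lucene indexing notes", "Blog post about Python"]
index = build_index(docs, lowercase_analyzer)

# Different analyzer at search time: "Lucene" is looked up, "lucene" is stored.
mismatched = search(index, "Lucene", whitespace_analyzer)
# Same analyzer at search time: the query term is lowercased and matches.
matched = search(index, "Lucene", lowercase_analyzer)
print(mismatched, matched)
```

Using the same analyzer on both sides is the simplest way to rule this cause out before looking at anything else, such as a custom query object.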