Hi,
As of Lucene 3.1, StandardAnalyzer uses a new version of StandardTokenizer that implements the word-break rules of Unicode Standard Annex #29. The old version of StandardTokenizer is now called ClassicTokenizer.
ClassicTokenizer has always treated tokens containing numbers differently. Its documentation says, e.g.: "Splits words at hyphens, unless there's a number in the token, in which case the whole token is interpreted as a product number and is not split." I would assume that's the behavior you are seeing.
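To see the difference side by side, here is a minimal sketch against the Lucene 3.1 API that runs the same text through both analyzers. The class name, the field name "f", and the sample strings "AB-1234" and "wi-fi" are made-up for illustration:

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.ClassicAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenizerComparison {

    // Print every token the analyzer produces for the given text.
    static void printTokens(Analyzer analyzer, String text) throws Exception {
        TokenStream stream = analyzer.tokenStream("f", new StringReader(text));
        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        stream.reset();
        while (stream.incrementToken()) {
            System.out.print("[" + term.toString() + "] ");
        }
        stream.end();
        stream.close();
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        String text = "AB-1234 wi-fi";

        // ClassicAnalyzer (the old StandardTokenizer): "AB-1234" contains
        // a digit, so it should survive as one "product number" token,
        // while "wi-fi" is split at the hyphen.
        printTokens(new ClassicAnalyzer(Version.LUCENE_31), text);

        // StandardAnalyzer (UAX #29 rules): the hyphen is a word boundary
        // in both cases, so "AB-1234" is split into "ab" and "1234" too.
        printTokens(new StandardAnalyzer(Version.LUCENE_31), text);
    }
}

The first call should print something like [ab-1234] [wi] [fi], the second [ab] [1234] [wi] [fi] (both analyzers lowercase their output).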
As a solution you can always create your own tokenizer, e.g. by starting from the ClassicTokenizer source; a sketch of a simpler variant follows below.
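If you only want the old tokenization back (rather than modified grammar rules), it may be enough to wrap ClassicTokenizer in your own Analyzer instead of copying its source. A minimal sketch against the Lucene 3.1 API; the class name ProductNumberAnalyzer and the filter chain are just placeholders you would adapt:

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.ClassicTokenizer;
import org.apache.lucene.util.Version;

// Hypothetical analyzer that keeps the pre-3.1 tokenization by
// building on ClassicTokenizer directly.
public class ProductNumberAnalyzer extends Analyzer {
    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new ClassicTokenizer(Version.LUCENE_31, reader);
        stream = new LowerCaseFilter(Version.LUCENE_31, stream);
        return stream;
    }
}

Only if you need to change the grammar itself (e.g. what counts as a product number) would you have to copy and edit the ClassicTokenizer JFlex grammar.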
Have a look at this thread as well:
http://lucene.472066.n3.nabble.com/Inco ... 34767.html

--
Hardy