How to avoid filtering stop words like "IS" in StandardAnalyzer

On Fri, 27 Jan 2012 23:40:32 -0500, Cheng <...@gmail.com>

Hi,

I don't want the StandardAnalyzer to filter out certain stop words. Can
I do that?

Ideally, I would like to have a customized StandardAnalyzer.

Thanks.



On Sat, 28 Jan 2012 09:48:52 -0200, Pedro Lacerda <...@gmail.com>

Hi Cheng,

You can provide your own set of stop words as the second argument of
the StandardAnalyzer constructor.

new StandardAnalyzer(version, new HashSet());

Pedro Lacerda

On Sat, 28 Jan 2012 12:51:45 +0100, "Uwe Schindler" <...@thetaphi.de>

Right, but Collections.emptySet() should be used :-)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uw...@thetaphi.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java...@lucene.apache.org
For additional commands, e-mail: java...@lucene.apache.org

On Sat, 28 Jan 2012 12:56:44 +0100, "Uwe Schindler" <...@thetaphi.de>

Or even better: CharArraySet.EMPTY_SET. Sorry for the noise.

On Sat, 28 Jan 2012 21:32:36 -0500, Cheng <...@gmail.com>

Pedro's suggestion seems to work fine. Not sure where I should use
CharArraySet.EMPTY_SET.

On Sun, 29 Jan 2012 10:57:20 +0100, "Uwe Schindler" <...@thetaphi.de>

Hi,

If you want to disable *all* stop words, then CharArraySet.EMPTY_SET is the
right choice. For performance reasons you should also use CharArraySet for
non-empty stop word sets instead of a plain HashSet<String>.

Uwe
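
To illustrate the difference between disabling all stop words and keeping only certain ones (such as "is"), here is a minimal plain-Java sketch. It is not Lucene code: the class StopWordSketch, its filter and defaultsWithout helpers, and the four-word default list are all hypothetical stand-ins for the analyzer's behavior. With Lucene itself you would pass the resulting set (ideally as a CharArraySet) as the second constructor argument, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for a stop-word filter; this is NOT Lucene's implementation.
class StopWordSketch {

    // Hypothetical default list for illustration; Lucene's real default set is larger.
    static final Set<String> DEFAULT_STOP_WORDS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "is", "of", "the")));

    // Lower-cases the text, splits it on whitespace, and drops tokens found in stopWords.
    static List<String> filter(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Returns a copy of the defaults with the given words removed,
    // so that those words survive filtering.
    static Set<String> defaultsWithout(String... wordsToKeep) {
        Set<String> custom = new HashSet<>(DEFAULT_STOP_WORDS);
        custom.removeAll(Arrays.asList(wordsToKeep));
        return custom;
    }

    public static void main(String[] args) {
        String text = "The sky IS blue";
        // Default set: "the" and "is" are both dropped.
        System.out.println(filter(text, DEFAULT_STOP_WORDS));
        // Empty set: nothing is dropped.
        System.out.println(filter(text, Collections.<String>emptySet()));
        // Defaults minus "is": only "the" is dropped.
        System.out.println(filter(text, defaultsWithout("is")));
    }
}
```

An empty set turns stop-word filtering off entirely, while a copy of the default set with "is" removed keeps the rest of the filtering in place.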

On Mon, 30 Jan 2012 08:22:44 -0200, Pedro Lacerda <...@gmail.com>

I didn't know about CharArraySet.EMPTY_SET, thanks.

Pedro Lacerda