public class TrecDocMaker extends BasicDocMaker
Config properties:

| Modifier and Type | Field and Description |
|---|---|
| protected File | dataDir |
| protected ThreadLocal | dateFormat |
| protected ArrayList | inputFiles |
| protected int | iteration |
| protected int | nextFile |
| protected BufferedReader | reader |

Fields inherited from class BasicDocMaker:
BODY_FIELD, BYTES_FIELD, config, DATE_FIELD, forever, ID_FIELD, indexVal, NAME_FIELD, storeVal, termVecVal, TITLE_FIELD

| Constructor and Description |
|---|
| TrecDocMaker() |
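
The fields inputFiles, nextFile and iteration, together with the inherited forever flag and the NoMoreDataException declared by openNextFile(), suggest a wrap-around scheme for walking the input files. The following is a minimal, self-contained sketch of that idea and is not the class's actual code: FileCycleSketch, NoMoreDataSketchException and nextFileName() are hypothetical names, plain strings stand in for File objects, and the exact meaning of forever is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FileCycleSketch {
    // Hypothetical stand-in for the benchmark's NoMoreDataException.
    static class NoMoreDataSketchException extends Exception {
    }

    private final List<String> inputFiles = new ArrayList<>(); // stand-in for the list of input files
    private final boolean forever; // assumed meaning: start over when the list is exhausted
    private int nextFile = 0;      // index of the next file to hand out
    private int iteration = 0;     // number of completed passes over inputFiles

    FileCycleSketch(List<String> files, boolean forever) {
        this.inputFiles.addAll(files);
        this.forever = forever;
    }

    // Returns the next file name; at the end of the list it either wraps around or fails.
    synchronized String nextFileName() throws NoMoreDataSketchException {
        if (nextFile >= inputFiles.size()) {
            if (!forever || inputFiles.isEmpty()) {
                throw new NoMoreDataSketchException();
            }
            nextFile = 0;
            iteration++;
        }
        return inputFiles.get(nextFile++);
    }

    synchronized int iteration() {
        return iteration;
    }

    public static void main(String[] args) throws Exception {
        FileCycleSketch files = new FileCycleSketch(Arrays.asList("a.gz", "b.gz"), true);
        for (int i = 0; i < 5; i++) {
            String name = files.nextFileName();
            System.out.println(name + " (iteration " + files.iteration() + ")");
        }
    }
}
```

The method is synchronized in this sketch because several benchmark threads could plausibly share one document maker; whether the real class guards these fields the same way is not stated here.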

| Modifier and Type | Method and Description |
|---|---|
| protected void | closeInputs() |
| protected DateFormat | getDateFormat(int n) |
| protected DocData | getNextDocData(): Return the data of the next document. |
| int | numUniqueTexts(): Return how many real unique texts are available, 0 if not applicable. |
| protected void | openNextFile() |
| protected Date | parseDate(String dateStr) |
| protected StringBuffer | read(String prefix, StringBuffer sb, boolean collectMatchLine, boolean collectAll) |
| void | resetInputs(): Reset inputs so that the test run would behave, input-wise, as if it just started. |
| void | setConfig(Config config): Set the properties. |
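
The signature of read(String prefix, StringBuffer sb, boolean collectMatchLine, boolean collectAll) hints at a line scanner that advances to the next line starting with a given prefix. The sketch below is one plausible reading of those parameter names, not the documented behavior: ReadSketch is a hypothetical class, the reader is passed in explicitly instead of using the reader field, the sample markup is illustrative, and reaching the end of the input simply stops the scan.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ReadSketch {
    // Scans lines until one starts with prefix (assumed semantics).
    // collectMatchLine: append the matching line itself to the buffer.
    // collectAll: append every non-matching line seen before the match.
    static StringBuffer read(BufferedReader reader, String prefix, StringBuffer sb,
                             boolean collectMatchLine, boolean collectAll) throws IOException {
        StringBuffer out = (sb == null) ? new StringBuffer() : sb;
        String line;
        while ((line = reader.readLine()) != null) {
            boolean match = line.startsWith(prefix);
            if (match ? collectMatchLine : collectAll) {
                if (out.length() > 0) {
                    out.append('\n'); // separate collected lines
                }
                out.append(line);
            }
            if (match) {
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String input = "<DOC>\n<DOCNO>doc-1</DOCNO>\nbody line 1\nbody line 2\n</DOC>\n";
        BufferedReader r = new BufferedReader(new StringReader(input));
        read(r, "<DOC>", null, false, false);                      // skip to the document start
        StringBuffer id = read(r, "<DOCNO>", null, true, false);   // keep only the matching line
        StringBuffer body = read(r, "</DOC>", null, false, true);  // keep everything before the end tag
        System.out.println(id);
        System.out.println(body);
    }
}
```

Passing null for sb starts a fresh buffer, while passing an existing buffer lets several calls accumulate into one result.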

Methods inherited from class BasicDocMaker:
addBytes, addUniqueBytes, collectFiles, getByteCount, getCount, getHtmlParser, makeDocument, makeDocument, numUniqueBytes, printDocStatistics, resetUniqueBytes, setHTMLParser

Field details:
protected ThreadLocal dateFormat
protected File dataDir
protected ArrayList inputFiles
protected int nextFile
protected int iteration
protected BufferedReader reader
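
The dateFormat field is a ThreadLocal, which fits the fact that java.text.SimpleDateFormat is not thread-safe and is commonly kept per thread. The sketch below shows that general pattern, with an indexed lookup loosely modeled on getDateFormat(int n) and a parseDate helper that tries each pattern in turn. DateFormatSketch, its pattern list, and the choice to return null on failure are all assumptions; the patterns the real class accepts are not listed here.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormatSketch {
    // Illustrative patterns only (assumption).
    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss z",
        "yyyy-MM-dd",
    };

    // One array of formatters per thread, created lazily on first use.
    private static final ThreadLocal<DateFormat[]> FORMATS =
            ThreadLocal.withInitial(DateFormatSketch::createFormats);

    private static DateFormat[] createFormats() {
        DateFormat[] formats = new DateFormat[PATTERNS.length];
        for (int i = 0; i < formats.length; i++) {
            SimpleDateFormat format = new SimpleDateFormat(PATTERNS[i], Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            formats[i] = format;
        }
        return formats;
    }

    // Returns the calling thread's formatter for pattern n.
    static DateFormat getDateFormat(int n) {
        return FORMATS.get()[n];
    }

    // Tries each pattern in order; returns null if none of them parses the input.
    static Date parseDate(String dateStr) {
        String trimmed = dateStr.trim();
        for (int i = 0; i < PATTERNS.length; i++) {
            try {
                return getDateFormat(i).parse(trimmed);
            } catch (ParseException e) {
                // fall through to the next pattern
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseDate("Wed, 01 May 2013 00:00:00 GMT"));
        System.out.println(parseDate("2013-05-01"));
        System.out.println(parseDate("not a date"));
    }
}
```

Keeping the formatters in a ThreadLocal avoids both synchronizing on a shared formatter and allocating a new one for every document.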
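
Method details:

public void setConfig(Config config)
Description copied from interface DocMaker: Set the properties.
Specified by: setConfig in interface DocMaker
Overrides: setConfig in class BasicDocMaker

protected void openNextFile() throws NoMoreDataException, Exception
Throws: NoMoreDataException
Throws: Exception

protected void closeInputs()

protected StringBuffer read(String prefix, StringBuffer sb, boolean collectMatchLine, boolean collectAll) throws Exception
Throws: Exception

protected DocData getNextDocData() throws NoMoreDataException, Exception
Description copied from class BasicDocMaker: Return the data of the next document.
Specified by: getNextDocData in class BasicDocMaker
Throws: NoMoreDataException - if data is exhausted (and 'forever' set to false).
Throws: Exception

protected DateFormat getDateFormat(int n)

public void resetInputs()
Description copied from interface DocMaker: Reset inputs so that the test run would behave, input-wise, as if it just started.
Specified by: resetInputs in interface DocMaker
Overrides: resetInputs in class BasicDocMaker

public int numUniqueTexts()
Description copied from interface DocMaker: Return how many real unique texts are available, 0 if not applicable.

Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.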