Whether you're analyzing your blog comments or looking at a single sentence, one of the best ways to discover the meaning of a piece of text is to perform stemming and stop word removal, leaving just the bones of the text. This post is a list of those stop words in an array string format. Yay.
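To make the idea concrete, here is a minimal sketch of stop word removal in Java. The class name `StopWordFilter`, the method `removeStopWords`, and the six-word sample set are all made up for illustration; in practice you would load the full list. Stemming is a separate step and is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the original post.
final class StopWordFilter {

    // A tiny sample only; the real list is much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "is", "of", "the", "this", "to"));

    /** Returns the lower-cased tokens of text that are not stop words. */
    static List<String> removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        // Split on anything that is not a letter or an apostrophe, so "don't" stays whole.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints [meaning, piece, text]
        System.out.println(removeStopWords("This is the meaning of a piece of text"));
    }
}
```

A `HashSet` keeps each lookup cheap, which matters once the set holds several hundred words.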
["a","able","about","above","abst","accordance","according","accordingly","across","act","actually", "added","adj","adopted","affected","affecting","affects","after","afterwards","again","against","ah","all", "almost","alone","along","already","also","although","always","am","among","amongst","an","and","announce", "another","any","anybody","anyhow","anymore","anyone","anything","anyway","anyways","anywhere", "apparently","approximately","are","aren","arent","arise","around","as","aside","ask","asking","at","auth", "available","away","awfully","b","back","be","became","because","become","becomes","becoming","been", "before","beforehand","begin","beginning","beginnings","begins","behind","being","believe","below","beside", "besides","between","beyond","biol","both","brief","briefly","but","by","c","ca","came","can","cannot","can't", "cause","causes","certain","certainly","co","com","come","comes","contain","containing","contains","could", "couldnt","d","date","did","didn't","different","do","does","doesn't","doing","done","don't","down","downwards", "due","during","e","each","ed","edu","effect","eg","eight","eighty","either","else","elsewhere","end","ending", "enough","especially","et","et-al","etc","even","ever","every","everybody","everyone","everything","everywhere","ex","except","f", "far","few","ff","fifth","first","five","fix","followed","following","follows","for","former","formerly","forth","found", "four","from","further","furthermore","g","gave","get","gets","getting","give","given","gives","giving","go","goes", "gone","got","gotten","h","had","happens","hardly","has","hasn't","have","haven't","having","he","hed","hence", "her","here","hereafter","hereby","herein","heres","hereupon","hers","herself","hes","hi","hid","him","himself","his", "hither","home","how","howbeit","however","hundred","i","id","ie","if","i'll","im","immediate","immediately", 
"importance","important","in","inc","indeed","index","information","instead","into","invention","inward","is","isn't", "it","itd","it'll","its","itself","i've","j","just","k","keep"," keeps","kept","keys","kg","km","know","known","knows","l","largely","last","lately","later","latter","latterly", "least","less","lest","let","lets","like","liked","likely","line","little","'ll","look","looking","looks","ltd","m","made", "mainly","make","makes","many","may","maybe","me","mean","means","meantime","meanwhile","merely","mg", "might","million","miss","ml","more","moreover","most","mostly","mr","mrs","much","mug","must","my","myself","n", "na","name","namely","nay","nd","near","nearly","necessarily","necessary","need","needs","neither","never", "nevertheless","new","next","nine","ninety","no","nobody","non","none","nonetheless","noone","nor","normally", "nos","not","noted","nothing","now","nowhere","o","obtain","obtained","obviously","of","off","often","oh","ok", "okay","old","omitted","on","once","one","ones","only","onto","or","ord","other","others","otherwise","ought", "our","ours","ourselves","out","outside","over","overall","owing","own","p","page","pages","part","particular", "particularly","past","per","perhaps","placed","please","plus","poorly","possible","possibly","potentially","pp", "predominantly","present","previously","primarily","probably","promptly","proud","provides","put","q","que", "quickly","quite","qv","r","ran","rather","rd","re","readily","really","recent","recently","ref","refs","regarding", "regardless","regards","related","relatively","research","respectively","resulted","resulting","results","right","run", "s","said","same","saw","say","saying","says","sec","section","see","seeing","seem","seemed","seeming","seems", "seen","self","selves","sent","seven","several","shall","she","shed","she'll","shes","should","shouldn't","show", "showed","shown","showns","shows","significant","significantly","similar","similarly","since","six","slightly","so", 
"some","somebody","somehow","someone","somethan","something","sometime","sometimes","somewhat", "somewhere","soon","sorry","specifically","specified","specify","specifying","state","states","still","stop", "strongly","sub","substantially","successfully","such","sufficiently","suggest","sup","sure"," t","take","taken","taking","tell","tends","th","than","thank","thanks","thanx","that","that'll","thats","that've", "the","their","theirs","them","themselves","then","thence","there","thereafter","thereby","thered","therefore", "therein","there'll","thereof","therere","theres","thereto","thereupon","there've","these","they","theyd","they'll", "theyre","they've","think","this","those","thou","though","thoughh","thousand","throug","through","throughout", "thru","thus","til","tip","to","together","too","took","toward","towards","tried","tries","truly","try","trying","ts", "twice","two","u","un","under","unfortunately","unless","unlike","unlikely","until","unto","up","upon","ups","us", "use","used","useful","usefully","usefulness","uses","using","usually","v","value","various","'ve","very","via","viz", "vol","vols","vs","w","want","wants","was","wasn't","way","we","wed","welcome","we'll","went","were","weren't", "we've","what","whatever","what'll","whats","when","whence","whenever","where","whereafter","whereas", "whereby","wherein","wheres","whereupon","wherever","whether","which","while","whim","whither","who", "whod","whoever","whole","who'll","whom","whomever","whos","whose","why","widely","willing","wish","with", "within","without","won't","words","world","would","wouldn't","www","x","y","yes","yet","you","youd","you'll", "your","youre","yours","yourself","yourselves","you've","z","zero"]
The following is the simple program that I used to convert a text file of these stop words into the array string above.
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package arrayify;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

/**
 * Converts StopWords.txt (one word per line) into a quoted, comma-separated list.
 *
 * @author Siriquelle
 */
public class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Main.class.getResourceAsStream("StopWords.txt")))) {
            String str;
            StringBuilder sb = new StringBuilder();
            // Wrap each line in double quotes: "a""able""about"...
            while ((str = in.readLine()) != null) {
                sb.append("\"").append(str).append("\"");
            }
            // Turn each adjacent pair of quotes into "," to separate the words
            str = sb.toString().replaceAll("\"\"", "\",\"");
            System.out.print(str);
        } catch (IOException e) {
            System.out.print(e.toString());
        }
    }
}
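The program above assumes a clean input file: a blank line or a stray space in StopWords.txt passes straight through into the output, and the surrounding square brackets are not printed. As a possible variation (a sketch, not the program I originally ran), the version below trims each line, skips blank ones, and joins the words with `String.join`. The class name `Arrayify` and the method `toArrayString` are made up for this example, and it assumes no word contains a double quote or a backslash.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical variation on the program above.
final class Arrayify {

    /** Reads one word per line and returns a string such as ["a","able","about"]. */
    static String toArrayString(Reader source) throws IOException {
        List<String> quoted = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();      // drop stray leading/trailing spaces
                if (!word.isEmpty()) {          // skip blank lines
                    quoted.add("\"" + word + "\"");
                }
            }
        }
        return "[" + String.join(",", quoted) + "]";
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stand-in for StopWords.txt
        System.out.println(toArrayString(new StringReader("a\nable\n\n about\n")));
    }
}
```

Taking a `Reader` rather than a file name means the same method works with a classpath resource, a file, or a test string.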
Hope this saves you some time. Many thanks to http://www.ranks.nl/resources/stopwords.html for the list of words.







