quanteda-package | An R package for the quantitative analysis of textual data |
as.character.corpus | get or assign corpus texts |
as.character.tokens | coercion and checking functions for tokens objects |
as.corpus.corpuszip | coerce a compressed corpus to a standard corpus |
as.data.frame.dfm | coerce a dfm to a matrix or data.frame |
as.dfm | coercion and checking functions for dfm objects |
as.kwic | locate keywords-in-context |
as.list.dist | coerce a dist object into a list |
as.list.tokens | coercion and checking functions for tokens objects |
as.matrix.dfm | coerce a dfm to a matrix or data.frame |
as.tokens | coercion and checking functions for tokens objects |
as.tokens.list | coercion and checking functions for tokens objects |
char_ngrams | create ngrams and skipgrams from tokens |
char_segment | segment texts into component elements |
char_tolower | convert the case of character objects |
char_toupper | convert the case of character objects |
char_wordstem | stem the terms in an object |
collocations | detect collocations from text |
convert | convert a dfm to a non-quanteda format |
corpus | construct a corpus object |
corpus_reshape | change the document units of a corpus |
corpus_sample | randomly sample documents from a corpus |
corpus_segment | segment texts into component elements |
corpus_subset | extract a subset of a corpus |
data_char_inaugural | US presidential inaugural address texts |
data_char_mobydick | text of Herman Melville's Moby Dick |
data_char_sampletext | a paragraph of text for testing various text-based functions |
data_char_ukimmig2010 | immigration-related sections of 2010 UK party manifestos |
data_corpus_inaugural | US presidential inaugural address texts |
data_corpus_irishbudget2010 | Irish budget speeches from 2010 |
data_dfm_LBGexample | dfm from data in Table 1 of Laver, Benoit, and Garry (2003) |
dfm | create a document-feature matrix |
dfm_compress | compress a dfm or fcm by combining identical dimension elements |
dfm_lookup | apply a dictionary to a dfm |
dfm_remove | select features from a dfm or fcm |
dfm_sample | randomly sample documents or features from a dfm |
dfm_select | select features from a dfm or fcm |
dfm_smooth | weight the feature frequencies in a dfm |
dfm_sort | sort a dfm by frequency of one or more margins |
dfm_tolower | convert the case of the features of a dfm and combine |
dfm_toupper | convert the case of the features of a dfm and combine |
dfm_trim | trim a dfm using frequency threshold-based feature selection |
dfm_weight | weight the feature frequencies in a dfm |
dfm_wordstem | stem the terms in an object |
dictionary | create a dictionary |
docnames | get or set document names |
docnames<- | get or set document names |
docvars | get or set document-level variables |
docvars<- | get or set document-level variables |
fcm | create a feature co-occurrence matrix |
fcm_compress | compress a dfm or fcm by combining identical dimension elements |
fcm_remove | select features from a dfm or fcm |
fcm_select | select features from a dfm or fcm |
fcm_sort | sort an fcm in alphabetical order of the features |
fcm_tolower | convert the case of the features of a dfm and combine |
fcm_toupper | convert the case of the features of a dfm and combine |
featnames | get the feature labels from a dfm |
head.dfm | return the first or last part of a dfm |
is.collocations | check if an object is collocations type |
is.dfm | coercion and checking functions for dfm objects |
is.dictionary | check if an object is a dictionary |
is.fcm | create a feature co-occurrence matrix |
is.kwic | locate keywords-in-context |
is.tokens | coercion and checking functions for tokens objects |
kwic | locate keywords-in-context |
metacorpus | get or set corpus metadata |
metacorpus<- | get or set corpus metadata |
metadoc | get or set document-level metadata |
metadoc<- | get or set document-level metadata |
ndoc | count the number of documents or features |
nfeature | count the number of documents or features |
nscrabble | count the Scrabble letter values of text |
nsentence | count the number of sentences |
nsyllable | count syllables in a text |
ntoken | count the number of tokens or types |
ntype | count the number of tokens or types |
quanteda | An R package for the quantitative analysis of textual data |
sequences | find variable-length collocations with filtering |
sparsity | compute the sparsity of a document-feature matrix |
stopwords | access built-in stopwords |
tail.dfm | return the first or last part of a dfm |
textmodel | fit a text model |
textmodel-method | fit a text model |
textmodel_ca | correspondence analysis of a document-feature matrix |
textmodel_NB | Naive Bayes classifier for texts |
textmodel_wordfish | wordfish text model |
textmodel_wordscores | Wordscores text model |
textmodel_wordshoal | wordshoal text model |
textplot_scale1d | plot a fitted wordfish model |
textplot_wordcloud | plot features as a wordcloud |
textplot_xray | plot the dispersion of keyword(s) |
texts | get or assign corpus texts |
texts<- | get or assign corpus texts |
textstat_dist | distance matrix between documents and/or features |
textstat_keyness | calculate keyness statistics |
textstat_lexdiv | calculate lexical diversity |
textstat_readability | calculate readability |
textstat_simil | distance matrix between documents and/or features |
tokens | tokenize a set of texts |
tokens_compound | convert token sequences into compound tokens |
tokens_lookup | apply a dictionary to a tokens object |
tokens_ngrams | create ngrams and skipgrams from tokens |
tokens_remove | select or remove tokens from a tokens object |
tokens_select | select or remove tokens from a tokens object |
tokens_skipgrams | create ngrams and skipgrams from tokens |
tokens_tolower | convert the case of tokens |
tokens_toupper | convert the case of tokens |
tokens_wordstem | stem the terms in an object |
topfeatures | list the most frequent features |