Package index • jiebaRS

Workers

Create and configure jiebaRS workers.

worker(): Initialize a jiebaRS worker

Segmentation

Segment Chinese text into tokens.

segment(): Segment text with a jiebaRS worker
segment_batch(): Segment a batch of strings

Part-of-Speech Tagging

Tag tokens with part-of-speech tags.

tagging(): Tag text with a jiebaRS worker
tagging_batch(): Tag a batch of strings

Keyword Extraction

Extract keywords via TF-IDF and TextRank.

keywords(): Extract keywords from text
keywords_df(): Extract keywords as a data frame
textrank(): Extract TextRank keywords from text
textrank_df(): Extract TextRank keywords as a data frame

Word Frequency and N-grams

Count word frequencies and n-grams from segmented tokens.

freq(): Count word frequencies
count_ngrams(): Count n-grams from segmented text
get_tuple(): Compatibility wrapper for jiebaR::get_tuple()

Utilities

Filter tokens, manage user dictionaries, and generate IDF dictionaries.

filter_segment(): Filter segmentation results
new_user_word(), add_word(): Add a user word
get_idf(): Generate an IDF dictionary