Workers

Create and configure jiebaRS workers.

worker()
Initialize a jiebaRS worker

Segmentation

Segment Chinese text into tokens.

segment()
Segment text with a jiebaRS worker
segment_batch()
Segment a batch of strings

Part-of-Speech Tagging

Tag tokens with part-of-speech tags.

tagging()
Tag text with a jiebaRS worker
tagging_batch()
Tag a batch of strings

Keyword Extraction

Extract keywords via TF-IDF and TextRank.

keywords()
Extract keywords from text
keywords_df()
Extract keywords as a data frame
textrank()
Extract TextRank keywords from text
textrank_df()
Extract TextRank keywords as a data frame

Word Frequency and N-grams

Count word frequencies and n-grams from segmented tokens.

freq()
Count word frequencies
count_ngrams()
Count n-grams from segmented text
get_tuple()
Compatibility wrapper for jiebaR::get_tuple()

Utilities

Filter tokens and manage user dictionaries.

filter_segment()
Filter segmentation results
new_user_word() add_word()
Add a user word
get_idf()
Generate an IDF dictionary