
ac_build() compiles a character vector of patterns into a reusable automaton backed by the Rust aho-corasick crate.

Usage

ac_build(
  patterns,
  match_kind = c("standard", "leftmost_first", "leftmost_longest"),
  implementation = c("auto", "noncontiguous_nfa", "contiguous_nfa", "dfa"),
  ascii_case_insensitive = FALSE,
  duplicate = c("keep", "error", "deduplicate")
)

Arguments

patterns

A character vector of non-empty patterns.

match_kind

Matching semantics:

  • "standard" (the default) supports overlapping search.

  • "leftmost_first" returns leftmost non-overlapping matches, breaking ties by pattern order.

  • "leftmost_longest" returns leftmost non-overlapping matches, breaking ties by longest match.
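The two leftmost tie-breaking rules can be illustrated with a small base-R model. This is only a sketch of the documented semantics for the first match in a string, not how the package or the Rust crate searches; `leftmost_match()` and the example patterns are hypothetical and not part of the package.

```r
# Illustrative base-R model of the two leftmost tie-breaking rules.
# It mirrors the documented semantics only; it is not the package's algorithm.
leftmost_match <- function(patterns, haystack, rule = c("first", "longest")) {
  rule <- match.arg(rule)
  # Start position of the first occurrence of each pattern (-1 if absent).
  starts <- vapply(
    patterns,
    function(p) as.integer(regexpr(p, haystack, fixed = TRUE)),
    integer(1),
    USE.NAMES = FALSE
  )
  found <- which(starts > 0)
  if (length(found) == 0) return(NA_character_)
  # Keep only the patterns whose match begins at the leftmost position.
  candidates <- found[starts[found] == min(starts[found])]
  if (rule == "first") {
    # Tie broken by pattern order: the earliest pattern in the vector wins.
    patterns[candidates[1]]
  } else {
    # Tie broken by length: the longest pattern wins.
    patterns[candidates[which.max(nchar(patterns[candidates]))]]
  }
}

patterns <- c("Sam", "Samwise")
first_result   <- leftmost_match(patterns, "Samwise", "first")    # "Sam"
longest_result <- leftmost_match(patterns, "Samwise", "longest")  # "Samwise"
```

Both patterns match at position 1, so the result depends only on the tie-breaking rule: pattern order selects "Sam", while longest match selects "Samwise".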

implementation

Which Rust automaton implementation to use: "noncontiguous_nfa", "contiguous_nfa", or "dfa". "auto" lets the crate choose.

ascii_case_insensitive

Use ASCII-only case-insensitive matching. Default is FALSE.
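To show what "ASCII-only" implies, the following base-R sketch folds only the letters A to Z before comparing strings. `ascii_lower()` is a hypothetical helper written for this illustration, not a function of the package, and the package may implement case folding differently internally.

```r
# Fold only ASCII letters A-Z to lower case; every other character is untouched.
ascii_lower <- function(x) {
  chartr(paste(LETTERS, collapse = ""), paste(letters, collapse = ""), x)
}

# ASCII letters compare equal after folding.
ascii_equal <- ascii_lower("HELLO") == ascii_lower("hello")              # TRUE

# "\u00c9" (E with acute accent, upper case) is not an ASCII letter, so it is
# not folded and does not equal its lower-case form "\u00e9".
non_ascii_equal <- ascii_lower("\u00c9COLE") == ascii_lower("\u00e9cole") # FALSE
```

Under ASCII-only folding, patterns containing accented or other non-ASCII letters would therefore still be matched case-sensitively for those characters.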

duplicate

How duplicate patterns are handled:

  • "keep" preserves duplicates in their original order.

  • "error" fails if patterns contains duplicates.

  • "deduplicate" keeps the first occurrence of each pattern and drops later duplicates.
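In base-R terms, the "error" and "deduplicate" behaviours described above correspond roughly to the following checks. This is a sketch of the documented semantics with made-up patterns; whether the package uses these functions internally is an assumption.

```r
patterns <- c("foo", "bar", "foo")

# "error": duplicates can be detected up front with anyDuplicated(),
# which returns the index of the first duplicate (0 if there is none).
has_duplicates <- anyDuplicated(patterns) > 0

# "deduplicate": keep the first occurrence of each pattern, in original order.
deduplicated <- patterns[!duplicated(patterns)]

has_duplicates  # TRUE
deduplicated    # "foo" "bar"
```

With "keep", by contrast, all three patterns would be retained in their original order.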

Value

An immutable <ac_automaton> object.

Examples

ac <- ac_build(c("hello", "world"))
length(ac)
#> [1] 2
ac_info(ac)
#> $patterns_len
#> [1] 2
#> 
#> $min_pattern_len
#> [1] 5
#> 
#> $max_pattern_len
#> [1] 5
#> 
#> $match_kind
#> [1] "standard"
#> 
#> $implementation
#> [1] "dfa"
#> 
#> $ascii_case_insensitive
#> [1] FALSE
#> 
#> $memory_usage
#> [1] 960
#>