Match Regular Expression Function

Owning Palette: String Functions

Requires: Base Development System

Searches for a regular expression in the input string beginning at the offset you enter. If the function finds a match, it splits the string into three substrings and any number of submatches. Resize the function to view any submatches found in the string.

multiline? specifies whether to treat the text in input string as a multiple-line string. This setting affects how the ^ and $ characters handle matches. If you set multiline? to FALSE (default), when you enter ^ at the beginning of a regular expression, the expression matches only the beginning of the string in input string. When you enter $ at the end of a regular expression, the expression matches only the end of the string in input string. If you set multiline? to TRUE, ^ matches the beginning of any line in input string and $ matches the end of any line in input string.

Note  The ^ character anchors the match to the beginning of a string when used as the first character of a pattern. If you add ^ to the beginning of a character class immediately after an open square bracket, the expression matches any character not in a given character class.
ignore case? specifies whether the string search is case sensitive. If FALSE (default), the string search is case sensitive. If TRUE, the search ignores case.
input string specifies the input string the function searches. This string cannot contain null characters.
regular expression specifies the pattern you want to search for in input string. Place any substrings you want to capture in parentheses. The function returns the substrings it finds for these groups in the submatch outputs. If the function does not find a match, whole match and after match contain empty strings, before match contains the entire input string, offset past match returns –1, and all submatch outputs return empty strings. This string cannot contain null characters.
offset specifies the number of characters into input string at which the function starts searching for a match to regular expression.
error in describes error conditions that occur before this node runs. This input provides standard error in functionality.
before match returns all the characters before the match.
whole match returns all the characters that match the expression entered in regular expression. Any substring matches the function finds appear in the submatch outputs.
after match returns all the characters after the match.
offset past match returns the index in input string of the first character after the match. If the function does not find a match, offset past match returns –1.
error out contains error information. This output provides standard error out functionality.
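
The relationship between the inputs and outputs listed above can be sketched outside LabVIEW. The following Python example is only an analogy, not LabVIEW code: it uses Python's re module rather than the PCRE library, and the function name match_regular_expression and the dictionary keys are hypothetical names chosen to mirror the terminals described above. Details such as error handling and null-character checks are left out.

```python
import re

def match_regular_expression(input_string, regular_expression, offset=0,
                             multiline=False, ignore_case=False):
    """Hypothetical Python analogue of the inputs and outputs listed above."""
    flags = 0
    if multiline:
        flags |= re.MULTILINE   # ^ and $ also match at line boundaries
    if ignore_case:
        flags |= re.IGNORECASE  # make the search case insensitive
    pattern = re.compile(regular_expression, flags)
    match = pattern.search(input_string, offset)
    if match is None:
        # Mirror the documented no-match outputs.
        return {
            "before match": input_string,
            "whole match": "",
            "after match": "",
            "offset past match": -1,
            "submatches": [""] * pattern.groups,
        }
    return {
        "before match": input_string[:match.start()],
        "whole match": match.group(0),
        "after match": input_string[match.end():],
        "offset past match": match.end(),
        # Groups that did not participate are reported as empty strings.
        "submatches": list(match.groups(default="")),
    }

result = match_regular_expression("Hello LabVIEW!", "(el.)..(L..)")
print(result)
```

In this sketch, calling the function with "one\ntwo" and ^two finds no match unless multiline is set to True, which parallels the behavior described for multiline?.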

Match Regular Expression Details

Note  The Match Regular Expression function does not support null characters in strings. If you include null characters in strings you wire to this function, LabVIEW returns an error and the function may return unexpected results.

Regular expression support is provided by the PCRE library package. Refer to the <National Instruments>\_Legal Information directory for more information about the license under which the PCRE library package is redistributed.

Refer to the PCRE website at www.pcre.org for more information about Perl Compatible Regular Expressions.

The Match Regular Expression function gives you more options for matching strings but performs more slowly than the Match Pattern function.

Use regular expressions in this function to refine searches.

Avoiding Stack Overflow

Certain regular expressions that use repeated grouped expressions, such as (.|\s)* or (a*)*, require significant resources to process when applied to large input strings. Such expressions may recurse repeatedly while attempting to match, which can eventually overflow the stack. For example, the regular expression (.|\n)*A and a large input string may cause LabVIEW to crash. To avoid recursion, you can rewrite the regular expression (.|\n)*A as (?s).*A. The (?s) notation indicates that a period matches new lines. You also can rewrite the expression as [^A]*A, although this form ends the match at the first A in the string rather than the last.
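
The rewrites described above can be compared on a small string. The following sketch uses Python's re module as a stand-in for PCRE, so it does not reproduce the stack behavior of LabVIEW. It only shows which text each form of the expression matches. The sample text and variable names are illustrative assumptions.

```python
import re

text = "line one\nline two A\nline three A tail"

grouped = re.compile(r"(.|\n)*A")    # repeated group the section warns about
dotall = re.compile(r"(?s).*A")      # (?s): a period also matches new lines
negated = re.compile(r"[^A]*A")      # character class that excludes A

grouped_match = grouped.search(text).group(0)
dotall_match = dotall.search(text).group(0)
negated_match = negated.search(text).group(0)

# The first two forms match the same text, up to the last A.
print(grouped_match == dotall_match)
# The negated class stops at the first A.
print(repr(negated_match))
```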

Grouping Patterns for Submatches

You can capture submatches by placing parentheses ( ) around a portion of a regular expression that you want the function to return as a submatch. For example, the regular expression (el.)..(L..) returns two submatches in the input string Hello LabVIEW!: ell and Lab. Each submatch corresponds to a character group in the order that the character group appears in the regular expression. In this example, submatch 1 is ell and submatch 2 is Lab.

If you nest a character group within another character group, the regular expression creates a submatch for the outer group before the inner group. For example, the regular expression (.(el.).).(L..) returns three submatches in the input Hello LabVIEW!: Hello, ell, and Lab. In this example, submatch 1 is Hello because the regular expression matches the outer character group before the inner group.
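
The two examples above can be reproduced with a short sketch. It uses Python's re module, which is an assumption made for illustration only. Python numbers capture groups by the position of their opening parenthesis, which gives the same outer-before-inner ordering described here.

```python
import re

text = "Hello LabVIEW!"

# Two groups side by side: submatch 1 and submatch 2.
flat = re.search(r"(el.)..(L..)", text)
print(flat.groups())    # ('ell', 'Lab')

# A nested group: the outer group is numbered before the inner one.
nested = re.search(r"(.(el.).).(L..)", text)
print(nested.groups())  # ('Hello', 'ell', 'Lab')
```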

Examples of Regular Expressions

The following table shows examples of regular expressions you can use with the Match Regular Expression function.

Characters to Find                                                Regular Expression
VOLTS                                                             VOLTS
A plus sign or a minus sign                                       [+-]
A sequence of one or more digits                                  [0-9]+
Zero or more spaces                                               \s* or  * (that is, a space followed by an asterisk)
One or more spaces, tabs, new lines, or carriage returns          [\t \r \n \s]+
One or more characters other than digits                          [^0-9]+
The word Level only if it appears at the beginning of the string  ^Level
The word Volts only if it appears at the end of the string        Volts$
The longest string within parentheses                             \(.*\)
The first string within parentheses but not containing any        \([^()]*\)
  parentheses within it
A left bracket                                                    \[
A right bracket                                                   \]
cat, cag, cot, cog, dat, dag, dot, and dog                        [cd][ao][tg]
cat or dog                                                        cat|dog
dog, cat dog, cat cat dog, cat cat cat dog, and so on             ((cat )*dog)
One or more of the letter a followed by a space and the same      (a+) \1
  number of the letter a, that is, a a, aa aa, aaa aaa, and so on
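
Several of the expressions in the table can be tried against sample strings. The following sketch uses Python's re module as a stand-in for the PCRE library, and the sample strings are assumptions made for illustration. Each entry holds a regular expression from the table, a sample input, and the text the expression is expected to match first.

```python
import re

# (regular expression, sample input, expected whole match)
examples = [
    (r"[+-]", "gain -3 dB", "-"),
    (r"[0-9]+", "Level 125 mV", "125"),
    (r"[^0-9]+", "Level 125 mV", "Level "),
    (r"^Level", "Level 125 mV", "Level"),
    (r"Volts$", "12 Volts", "Volts"),
    (r"\(.*\)", "f(a(b)c) and (d)", "(a(b)c) and (d)"),
    (r"\([^()]*\)", "f(a(b)c) and (d)", "(b)"),
    (r"[cd][ao][tg]", "hot dog", "dog"),
    (r"cat|dog", "a cat and a dog", "cat"),
    (r"((cat )*dog)", "see cat cat dog run", "cat cat dog"),
    (r"(a+) \1", "b aa aa b", "aa aa"),
]

# Collect the first match of each expression in its sample input.
results = [re.search(pattern, sample).group(0)
           for pattern, sample, _ in examples]

for (pattern, sample, expected), found in zip(examples, results):
    print(f"{pattern!r} in {sample!r} -> {found!r}")
```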