====== Boolean Text Search ======
Synchronet's text-search prompts (message scan, mail scan, file search,
and the less-style file pager) accept a small boolean query language:
combine search terms with AND / OR / NOT operators, group them with
parentheses, and quote terms to require a whole-word match. The language
is compatible with the syntax that //PCBoard// and //Wildcat!// used for
the same purpose, with a few deliberate extensions.
===== Where it works =====
The boolean parser is used at every //Text to search for// prompt. In
the stock command shell (''exec/default.js''):
^ Where ^ Trigger ^
| Message base scan (main menu) | ''F'' — //Find Text in Messages// (''/F'' to scan all sub-boards) |
| Reading messages on a sub | ''F'' — //Find text// re-prompt inside the read loop |
| Private mail (//Reading E-mail//) | ''/'' (slash) at the mail-read prompt |
| File listings (file menu) | ''F'' — //Find Text in File Descriptions// (''/F'' for all dirs) |
| File pager (''P_SEEK'', less-style)| ''/'' (slash) while viewing a file, bulletin, or log — ''n'' for next |
Note: the file menu's ''S'' (//Search for Filename(s)//) is a separate
wildcard-pattern filename match (''*.zip'', ''WILD*.EXE'', etc.) and does
//not// use the boolean parser. Sysops with [[custom:command_shell|custom shells]]
may bind these commands differently — the parser engages whenever the
underlying scan function receives the ''SCAN_FIND'' (messages) or
''FL_FIND'' (files) mode flag.
Programmatic callers (e.g. ''bbs.scan_posts(sub, mode, find)'' in
JavaScript) inherit the same syntax — the ''find'' string they pass is
parsed identically.
===== Inline help =====
At any boolean search prompt — including the file pager's ''/'' search —
entering a lone ''?'' displays a one-screen quick-reference help file
([[custom:menu_files|''text/menu/textsrch.msg'']]) and re-prompts. The
prompt itself includes the hint ''(?=help)'' so the feature is
discoverable without reading the docs first. In the pager case, the help
text scrolls onto the screen the same way the pager's main ''?'' help
already does; press Home or PgUp to return to the original content if
needed.
===== Quick reference =====
^ You type ^ Matches when… ^
| ''monitor'' | the haystack contains ''monitor'' (case-insensitive substring) |
| ''VGA monitor'' | the haystack contains the literal phrase ''VGA monitor'' |
| ''text & edit'' | both ''text'' //and// ''edit'' appear (in any order) |
| ''text and edit'' | same as above (''AND'' keyword form) |
| ''hard disk %%|%% hard drive'' | either ''hard disk'' //or// ''hard drive'' appears |
| ''hard disk or hard drive'' | same as above (''OR'' keyword form) |
| ''! 320x200'' | the haystack does **not** contain ''320x200'' |
| ''not 320x200'' | same as above (''NOT'' keyword form) |
| ''1024x768 &! swim'' | contains ''1024x768'' and not ''swim'' |
| ''(windows %%|%% dos) & modem !os/2'' | (''windows'' or ''dos'') and ''modem'', but not ''os/2'' |
| ''%%"TEST"%%'' | the **word** ''TEST'' (won't match ''TESTING'' or ''BACKTEST'') |
| ''%%"SMITH & JONES"%%'' | the literal phrase ''SMITH & JONES'', including the ''&'' |
===== Syntax =====
A query is a boolean expression of search terms, operators, and groups.
==== Search terms ====
A bare term is any run of characters that doesn't include an operator
character (''&'', ''%%|%%'', ''!'', ''('', '')'', ''%%"%%''). Embedded
whitespace is part of the term — it does **not** mean implicit AND. So
''VGA monitor'' is one phrase that matches whenever ''VGA'' appears
immediately followed by '' monitor''.
TEST -> contains 'TEST' anywhere (substring, case-insensitive)
VGA monitor -> contains the phrase 'VGA monitor'
no-such-string -> contains the literal 'no-such-string' (hyphens are not special)
==== Operators ====
^ Symbol ^ Keyword ^ Meaning ^
| ''&'' | ''AND'' | both operands present |
| ''%%|%%'' | ''OR'' | at least one operand present |
| ''!'' | ''NOT'' | operand absent (unary, binds tightest) |
Keyword forms are case-insensitive and only recognized as operators when
surrounded by whitespace, parens, or operator characters — so ''BANDIT''
is one phrase, while ''BAND AND IT'' is ''BAND'' AND ''IT''.
Precedence: **NOT > AND > OR**. So ''A %%|%% B & C'' means ''A OR (B AND
C)''. Use parentheses if you want a different grouping.
You can omit the ''&'' when it would precede a ''!'' or ''NOT'' — the
AND is implied. So ''dog !cat'' is the same as ''dog & !cat'' or ''dog
AND NOT cat''.
==== Grouping ====
Parentheses control evaluation order:
(windows | dos) & modem -- (windows or dos) and modem
windows | dos & modem -- windows or (dos and modem)
==== Quoting (whole-word match) ====
Wrapping a term in double quotes turns on **word-boundary matching**:
the match must be preceded and followed by a non-word character (or the
start / end of the haystack). This is useful when a short search term
would otherwise match inside a longer word.
TEST -> matches TEST, TESTING, BACKTEST, ... (substring)
"TEST" -> matches TEST, (TEST), 'TEST'. but NOT TESTING, BACKTEST
A //word character// is any letter, digit, or underscore. Everything else
(spaces, punctuation, brackets, etc.) is a boundary.
Quoted phrases preserve any internal whitespace and operator characters
as literal content, with word-boundary checks applied to the outer ends
of the phrase. This is what makes ''%%"SMITH & JONES"%%'' searchable as a
literal phrase including the ''&''.
"hard disk" -> matches 'the hard disk here', not 'hard disks' or 'hard diskette'
"new york" -> matches 'New York City'
"smith & jones" -> matches 'Smith & Jones report' (the & is literal, not AND)
=== Whitespace inside quotes opts out of the boundary check on that side ===
A leading space inside the quotes disables the word-boundary check on
the left edge; a trailing space disables it on the right edge. Padding
both sides turns a quoted term into a pure substring search.
^ You type ^ Behavior ^
| ''%%"WORD"%%'' | whole-word match — both edges bounded |
| ''%%" WORD"%%'' | trailing edge bounded only |
| ''%%"WORD "%%'' | leading edge bounded only |
| ''%%" WORD "%%'' | no boundary check — pure substring match |
This is what lets you, for instance, search for ''%%"co"%%'' to match the
abbreviation ''co.'' but not ''cocoa'' or ''coffee'', while ''%%" co "%%''
still matches ''co'' anywhere including inside ''cocoa''.
===== Common idioms =====
hello & world -- both present
hello | hi | greetings -- any of these
"PCBoard" !pirate -- whole word PCBoard, exclude messages with 'pirate'
( spam | scam ) & ! "yahoo" -- spam or scam, but not the word 'yahoo'
"SMITH & JONES" -- literal phrase with an ampersand
foo bar -- the phrase 'foo bar' (NOT implicit AND)
foo & bar -- the words 'foo' and 'bar' (in any order)
===== Error handling =====
If the parser can't understand your query, the BBS prints
''%%Invalid search expression: %%'' and returns you to the
prompt (showing the inline help). Common reasons:
* ''unterminated quoted string'' — a ''%%"%%'' opened a quoted phrase but no closing ''%%"%%'' was found
* ''%%expected ')'%%'' — a ''('' opened a group but the matching '')'' is missing
* ''expected search term'' — an operator was followed by another operator or end of input
* ''%%unexpected '' at offset N%%'' — a stray '')'' or some other syntactically out-of-place character
===== Compatibility notes =====
The dialect is intentionally compatible with //PCBoard// and //Wildcat!//
conventions where they agree, and a strict superset elsewhere:
* All published //PCBoard// and //Wildcat!// example queries parse and evaluate the same way they did in those systems.
* Operator **keywords** (''AND'' / ''OR'' / ''NOT'') are accepted in addition to the symbolic forms — //PCBoard// only used the symbols; //Wildcat!// supported both.
* **Standard precedence** (NOT > AND > OR) is used instead of //PCBoard//'s documented strict left-to-right, but the two only differ on hand-crafted examples that no published manual used. Add parentheses if you depend on a specific grouping.
* An implicit AND is inserted before ''!'' / ''NOT'' (as in //Wildcat!//'s ''(windows %%|%% DOS) & (modem %%|%% comm) !OS/2'' example). //PCBoard// required the ''&'' to be explicit; the implicit form is more permissive but never changes the meaning of an already-valid //PCBoard// query.
* //PCBoard// treated ''(text)'' (parens around a single word with no operator inside) as a search for the literal string ''(text)'' — including the parens. Synchronet always treats ''(...)'' as grouping, so ''(text)'' is equivalent to ''text''. If you want to search for parens literally, quote them: ''%%"(text)"%%''.
===== See Also =====
* [[user:msgbase|Message Bases]] — where ''F''ind on a sub-board uses this syntax
* [[user:mail|Electronic Mail]] — where ''/'' at the mail prompt uses this syntax
* [[user:files|File Areas]] — where ''F''ind in descriptions uses this syntax
* [[user:textfiles|Text Files]] — where the file pager's ''/'' uses this syntax
* [[custom:menu_files|Menu Files]] — ''textsrch.msg'' is the inline-help display file
{{tag>search messages mail files boolean}}