UNICODE::WORDBREAK
Section: Courier Unicode Library (3) Updated: 07/29/2015
NAME
unicode::wordbreak_callback_base, unicode::wordbreakscan - Unicode word-breaking rules
SYNOPSIS
#include <courier-unicode.h>
class wordbreak : public unicode::wordbreak_callback_base {
public:
    using unicode::wordbreak_callback_base::operator<<;
    using unicode::wordbreak_callback_base::operator();
    int callback(bool flag)
    {
        // ...
    }
};
unicode_char c;
std::vector<unicode_char> buf;
wordbreak compute_wordbreak;
compute_wordbreak << c;
compute_wordbreak(buf);
compute_wordbreak(buf.begin(), buf.end());
compute_wordbreak.finish();
// ...
unicode::wordbreakscan scan;
scan << c;
size_t nchars=scan.finish();
DESCRIPTION
unicode::wordbreak_callback_base is a C++ binding for the Unicode word-breaking rule implementation described in unicode_word_break(3).
Subclass unicode::wordbreak_callback_base and implement callback(), a virtual method inherited from unicode::wordbreak_callback_base. callback() receives the output of the word-breaking algorithm: a bool indicating whether a word break exists before the corresponding Unicode character in the input sequence. callback() should return 0. A non-zero return value reports an error and stops the word-breaking algorithm. See unicode_word_break(3) for more information.
The input Unicode characters are provided to the word-breaking algorithm with the << operator, one character at a time, or with the () operator, which takes either a container or a pair of beginning and ending iterators for a sequence of Unicode characters. finish() indicates the end of the Unicode character sequence.
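The subclass-and-feed pattern described above can be sketched as follows. This is only an illustration under stated assumptions: so that it compiles without the Courier Unicode Library installed, it uses a hypothetical stand-in base class in a made-up namespace sketch, with char32_t in place of unicode_char and a toy break rule (a break before the first character and wherever the input switches between space and non-space). The real library applies the Unicode word-breaking rules, and its results will differ. The names collect_breaks, break_offsets and same_result_all_styles are likewise invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// ASSUMPTION: hypothetical stand-in for unicode::wordbreak_callback_base.
// It mimics only the interface described in this page (<<, (), finish(),
// callback()). Its break rule is a toy, not the Unicode algorithm.
namespace sketch {

class wordbreak_callback_base {
public:
    virtual ~wordbreak_callback_base() = default;

    // Feed one character and report its break flag to callback().
    wordbreak_callback_base &operator<<(char32_t c)
    {
        const bool is_space = (c == U' ');
        const bool brk = first_ || is_space != prev_space_;
        first_ = false;
        prev_space_ = is_space;
        // A non-zero return from callback() stops further callbacks.
        if (!failed_ && callback(brk) != 0)
            failed_ = true;
        return *this;
    }

    // Feed a sequence given by a pair of iterators.
    template<typename Iter>
    wordbreak_callback_base &operator()(Iter beg, Iter end)
    {
        for (; beg != end; ++beg)
            *this << *beg;
        return *this;
    }

    // Feed a whole container.
    template<typename Container>
    wordbreak_callback_base &operator()(const Container &c)
    {
        return (*this)(c.begin(), c.end());
    }

    // End of input. The toy rule buffers nothing, so this only resets state.
    void finish() { first_ = true; prev_space_ = false; failed_ = false; }

private:
    virtual int callback(bool flag) = 0;

    bool first_ = true;
    bool prev_space_ = false;
    bool failed_ = false;
};

} // namespace sketch

// Subclass in the same shape as the SYNOPSIS: record one flag per character.
class collect_breaks : public sketch::wordbreak_callback_base {
public:
    using sketch::wordbreak_callback_base::operator<<;
    using sketch::wordbreak_callback_base::operator();

    std::vector<bool> flags;

    int callback(bool flag) override
    {
        flags.push_back(flag);
        return 0; // 0 means "no error, keep going"
    }
};

// Offsets of the characters that have a break reported before them.
std::vector<std::size_t> break_offsets(const std::u32string &text)
{
    collect_breaks wb;
    wb(text);      // container form of operator()
    wb.finish();

    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < wb.flags.size(); ++i)
        if (wb.flags[i])
            out.push_back(i);
    return out;
}

// The three input styles should deliver the same flags to callback().
bool same_result_all_styles(const std::u32string &text)
{
    collect_breaks a, b, c;
    for (char32_t ch : text)
        a << ch;                     // one character at a time
    a.finish();
    b(text);                         // whole container
    b.finish();
    c(text.begin(), text.end());     // iterator pair
    c.finish();
    return a.flags == b.flags && b.flags == c.flags;
}
```

Against the real library, the subclass would presumably derive from unicode::wordbreak_callback_base and include <courier-unicode.h>, as in the SYNOPSIS. Only the stand-in base class would be dropped.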
unicode::wordbreakscan is a C++ binding for the unicode_wbscan_init(), unicode_wbscan_next() and unicode_wbscan_end() functions described in unicode_word_break(3). Its << operator feeds in Unicode characters one at a time, and finish() returns the number of characters before the first Unicode word break. The << operator returns a bool indicating that the first word break has already been found, so further calls are not necessary.
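The scanning loop described above, which stops feeding characters once << reports that the first break was found, might look like the sketch below. As an assumption made purely so the example compiles on its own, it uses a hypothetical stand-in class sketch::wordbreakscan with char32_t input and a toy rule (the first switch between space and non-space counts as the break). It is not the library's implementation, and first_word_length is an invented helper name.

```cpp
#include <cstddef>
#include <string>

// ASSUMPTION: hypothetical stand-in for unicode::wordbreakscan, mimicking
// only the << and finish() members described in this page. The break rule
// is a toy, not the Unicode algorithm.
namespace sketch {

class wordbreakscan {
public:
    // Returns true once the first break has been found.
    bool operator<<(char32_t c)
    {
        if (done_)
            return true;
        const bool is_space = (c == U' ');
        if (count_ > 0 && is_space != prev_space_) {
            done_ = true;   // break found before this character
            return true;
        }
        prev_space_ = is_space;
        ++count_;
        return false;
    }

    // Number of characters seen before the first break.
    std::size_t finish() { return count_; }

private:
    std::size_t count_ = 0;
    bool prev_space_ = false;
    bool done_ = false;
};

} // namespace sketch

// Length of the leading run of characters up to the first break.
std::size_t first_word_length(const std::u32string &text)
{
    sketch::wordbreakscan scan;
    for (char32_t ch : text)
        if (scan << ch)
            break;          // first break found, no need to feed more
    return scan.finish();
}
```

The early exit is the point of the bool result: for long input, the caller can stop after the first word instead of feeding the whole sequence.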
SEE ALSO
courier-unicode(7),
unicode_word_break(3).
AUTHOR
Sam Varshavchik