Struct bstr::WordIndices

pub struct WordIndices<'a>(_);

An iterator over words in a byte string and their byte index positions.

This iterator is typically constructed by ByteSlice::word_indices.

This is similar to the WordsWithBreakIndices iterator, except it only returns elements that contain a “word” character. A word character is defined by UTS #18 (Annex C) to be the combination of the Alphabetic and Join_Control properties, along with the Decimal_Number, Mark and Connector_Punctuation general categories.

Since words are made up of one or more codepoints, this iterator yields &str elements (along with their start and end byte offsets). When invalid UTF-8 is encountered, replacement codepoints are substituted. Because of this, the span covered by a yielded pair of indices may differ from the byte length of the word yielded with it. For example, when this iterator encounters \xFF in the byte string, it will yield a pair of indices ranging over a single byte, but will provide an &str equivalent to "\u{FFFD}", which is three bytes in length. However, when given only valid UTF-8, all indices are in exact correspondence with their paired word.
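The length mismatch described above can be sketched with the standard library alone. This is a stand-in for the substitution, not bstr's own code: String::from_utf8_lossy also replaces invalid bytes with U+FFFD, and the helper name lossy_lengths is hypothetical.

```rust
/// Returns (number of input bytes, number of bytes after lossy decoding).
/// Hypothetical helper using only the standard library; it is not part of bstr.
fn lossy_lengths(raw: &[u8]) -> (usize, usize) {
    let text = String::from_utf8_lossy(raw);
    (raw.len(), text.len())
}

fn main() {
    // A single invalid byte is replaced by U+FFFD, which takes three bytes in UTF-8.
    assert_eq!(String::from_utf8_lossy(b"\xFF"), "\u{FFFD}");
    assert_eq!(lossy_lengths(b"\xFF"), (1, 3));

    // Valid UTF-8 passes through unchanged, so both lengths agree.
    assert_eq!(lossy_lengths("naïve".as_bytes()), (6, 6));
}
```

An index pair that spans one byte of input can therefore accompany a three-byte string, which is why the offsets should be treated as positions in the original byte string rather than as a measure of the yielded &str.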

This iterator yields words in accordance with the default word boundary rules specified in UAX #29. In particular, this may not be suitable for Japanese and Chinese scripts that do not use spaces between words.

Implementations

impl<'a> WordIndices<'a>

pub fn as_bytes(&self) -> &'a [u8]

View the underlying data as a subslice of the original data.

The slice returned has the same lifetime as the original slice, and so the iterator can continue to be used while this exists.

Examples

use bstr::ByteSlice;

let mut it = b"foo bar baz".word_indices();

// Nothing has been consumed yet.
assert_eq!(b"foo bar baz", it.as_bytes());
it.next(); // yields "foo"
it.next(); // yields "bar"
assert_eq!(b" baz", it.as_bytes());
it.next(); // yields "baz"
it.next(); // yields None
assert_eq!(b"", it.as_bytes());

Trait Implementations

impl<'a> Clone for WordIndices<'a>

impl<'a> Debug for WordIndices<'a>

impl<'a> Iterator for WordIndices<'a>

type Item = (usize, usize, &'a str)

The type of the elements being iterated over.

Auto Trait Implementations

impl<'a> RefUnwindSafe for WordIndices<'a>

impl<'a> Send for WordIndices<'a>

impl<'a> Sync for WordIndices<'a>

impl<'a> Unpin for WordIndices<'a>

impl<'a> UnwindSafe for WordIndices<'a>

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<I> IntoIterator for I where
    I: Iterator

type Item = <I as Iterator>::Item

The type of the elements being iterated over.

type IntoIter = I

Which kind of iterator are we turning this into?

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.