Struct bstr::WordsWithBreakIndices

pub struct WordsWithBreakIndices<'a> { /* fields omitted */ }

An iterator over all word breaks in a byte string, along with their byte index positions.

This iterator is typically constructed by ByteSlice::words_with_break_indices.

This iterator yields not only all words, but also the content that comes between them. In particular, concatenating every element yielded by this iterator reproduces the original string (subject to Unicode replacement codepoint substitutions).
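The "words plus the content between them" contract can be illustrated without the crate. The following is a minimal std-only sketch, not bstr's implementation: `pieces_with_indices` is a hypothetical helper, and it splits on whitespace runs rather than applying the UAX #29 rules, so its segmentation will differ from this iterator's on many inputs. It only demonstrates the shape of the yielded items and the round-trip property.

```rust
/// Hypothetical helper (not part of bstr): splits `s` into alternating runs
/// of whitespace and non-whitespace, returning `(start, end, piece)` triples.
/// Assumption: whitespace runs stand in for real UAX #29 word boundaries.
fn pieces_with_indices(s: &str) -> Vec<(usize, usize, &str)> {
    let mut out = Vec::new();
    let mut start = 0;
    let mut prev_ws: Option<bool> = None;
    for (i, ch) in s.char_indices() {
        let ws = ch.is_whitespace();
        // A new piece begins whenever the whitespace class changes.
        if let Some(p) = prev_ws {
            if p != ws {
                out.push((start, i, &s[start..i]));
                start = i;
            }
        }
        prev_ws = Some(ws);
    }
    // Flush the final piece, if any input remains.
    if start < s.len() {
        out.push((start, s.len(), &s[start..]));
    }
    out
}
```

For `"foo bar baz"` this returns five pieces, starting with `(0, 3, "foo")` and `(3, 4, " ")`, and joining the third tuple field of every piece gives back the input unchanged.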

Since words are made up of one or more codepoints, this iterator yields &str elements (along with their start and end byte offsets). When invalid UTF-8 is encountered, replacement codepoints are substituted. Because of this, the indices yielded by this iterator may not correspond to the length of the word yielded with those indices. For example, when this iterator encounters \xFF in the byte string, it yields a pair of indices spanning a single byte but provides an &str equivalent to "\u{FFFD}", which is three bytes long. When given only valid UTF-8, however, all indices correspond exactly to their paired word.
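The length mismatch described above can be seen with the standard library alone. This sketch uses `String::from_utf8_lossy` from std rather than bstr, on the assumption that both substitute a single U+FFFD for a lone `\xFF` byte; `lossy_lengths` is a hypothetical helper introduced only for illustration.

```rust
/// Hypothetical helper: returns `(source_len, decoded_len)`, that is, the
/// number of raw input bytes and the byte length of the lossily decoded text.
/// Uses std's lossy decoding as a stand-in for bstr's substitution.
fn lossy_lengths(bytes: &[u8]) -> (usize, usize) {
    // Invalid UTF-8 sequences are replaced with U+FFFD during decoding.
    let decoded = String::from_utf8_lossy(bytes);
    (bytes.len(), decoded.len())
}
```

`lossy_lengths(b"\xFF")` gives `(1, 3)`: one source byte, three decoded bytes. For valid input such as `b"foo"` the two lengths agree, which mirrors the exact correspondence promised for valid UTF-8.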

This iterator yields words in accordance with the default word boundary rules specified in UAX #29. In particular, this may not be suitable for Japanese and Chinese scripts that do not use spaces between words.

Implementations

impl<'a> WordsWithBreakIndices<'a>

pub fn as_bytes(&self) -> &'a [u8]

View the underlying data as a subslice of the original data.

The slice returned has the same lifetime as the original slice, and so the iterator can continue to be used while this exists.

Examples

use bstr::ByteSlice;

let mut it = b"foo bar baz".words_with_break_indices();

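// Nothing has been consumed yet.
assert_eq!(b"foo bar baz", it.as_bytes());
it.next(); // yields "foo"
assert_eq!(b" bar baz", it.as_bytes());
it.next(); // yields " "
it.next(); // yields "bar"
assert_eq!(b" baz", it.as_bytes());
it.next(); // yields " "
it.next(); // yields "baz"
assert_eq!(b"", it.as_bytes());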

Trait Implementations

impl<'a> Clone for WordsWithBreakIndices<'a>

impl<'a> Debug for WordsWithBreakIndices<'a>

impl<'a> Iterator for WordsWithBreakIndices<'a>

type Item = (usize, usize, &'a str)

The type of the elements being iterated over.
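Because the first two fields of each item are byte offsets into the original byte string, they can be used to recover the raw bytes an item was decoded from, even when the `&str` field holds a replacement codepoint. A minimal sketch, where `source_bytes` is a hypothetical helper and not part of bstr:

```rust
/// Hypothetical helper: given the original byte string and one
/// `(start, end, word)` item, return the raw bytes that item covers.
/// Assumes `start..end` is in bounds, as it is for items yielded over
/// `haystack` itself.
fn source_bytes<'a>(haystack: &'a [u8], item: (usize, usize, &str)) -> &'a [u8] {
    let (start, end, _word) = item;
    // Slice the original data rather than using the decoded `&str`.
    &haystack[start..end]
}
```

Given the item `(0, 1, "\u{FFFD}")` for a byte string that starts with `\xFF`, this returns the single original byte `\xFF` rather than the three-byte replacement codepoint.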

Auto Trait Implementations

impl<'a> RefUnwindSafe for WordsWithBreakIndices<'a>

impl<'a> Send for WordsWithBreakIndices<'a>

impl<'a> Sync for WordsWithBreakIndices<'a>

impl<'a> Unpin for WordsWithBreakIndices<'a>

impl<'a> UnwindSafe for WordsWithBreakIndices<'a>

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<I> IntoIterator for I where
    I: Iterator

type Item = <I as Iterator>::Item

The type of the elements being iterated over.

type IntoIter = I

Which kind of iterator are we turning this into?

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.