|  | /*! | 
|  | This library provides heavily optimized routines for string search primitives. | 
|  |  | 
|  | # Overview | 
|  |  | 
|  | This section gives a brief high level overview of what this crate offers. | 
|  |  | 
|  | * The top-level module provides routines for searching for 1, 2 or 3 bytes | 
|  | in the forward or reverse direction. When searching for more than one byte, | 
|  | positions are considered a match if the byte at that position matches any | 
|  | of the bytes. | 
|  | * The [`memmem`] sub-module provides forward and reverse substring search | 
|  | routines. | 
|  |  | 
|  | In all such cases, routines operate on `&[u8]` without regard to encoding. This | 
|  | is exactly what you want when searching either UTF-8 or arbitrary bytes. | 
|  |  | 
|  | # Example: using `memchr` | 
|  |  | 
|  | This example shows how to use `memchr` to find the first occurrence of `z` in | 
|  | a haystack: | 
|  |  | 
|  | ``` | 
|  | use memchr::memchr; | 
|  |  | 
|  | let haystack = b"foo bar baz quuz"; | 
|  | assert_eq!(Some(10), memchr(b'z', haystack)); | 
|  | ``` | 
|  |  | 
|  | # Example: matching one of three possible bytes | 
|  |  | 
This example shows how to use `memchr3_iter` to find occurrences of `a`, `b` or
`c`, starting from the end of the haystack by reversing the iterator:
|  |  | 
|  | ``` | 
|  | use memchr::memchr3_iter; | 
|  |  | 
|  | let haystack = b"xyzaxyzbxyzc"; | 
|  |  | 
|  | let mut it = memchr3_iter(b'a', b'b', b'c', haystack).rev(); | 
|  | assert_eq!(Some(11), it.next()); | 
|  | assert_eq!(Some(7), it.next()); | 
|  | assert_eq!(Some(3), it.next()); | 
|  | assert_eq!(None, it.next()); | 
|  | ``` | 
|  |  | 
|  | # Example: iterating over substring matches | 
|  |  | 
|  | This example shows how to use the [`memmem`] sub-module to find occurrences of | 
|  | a substring in a haystack. | 
|  |  | 
|  | ``` | 
|  | use memchr::memmem; | 
|  |  | 
|  | let haystack = b"foo bar foo baz foo"; | 
|  |  | 
|  | let mut it = memmem::find_iter(haystack, "foo"); | 
|  | assert_eq!(Some(0), it.next()); | 
|  | assert_eq!(Some(8), it.next()); | 
|  | assert_eq!(Some(16), it.next()); | 
|  | assert_eq!(None, it.next()); | 
|  | ``` | 
|  |  | 
|  | # Example: repeating a search for the same needle | 
|  |  | 
The overhead of constructing a substring searcher may be measurable in some
workloads. In cases where the same needle is used to search many haystacks,
that construction can be done once and amortized across all subsequent
searches. This can be done with a [`memmem::Finder`]:
|  |  | 
|  | ``` | 
|  | use memchr::memmem; | 
|  |  | 
|  | let finder = memmem::Finder::new("foo"); | 
|  |  | 
|  | assert_eq!(Some(4), finder.find(b"baz foo quux")); | 
|  | assert_eq!(None, finder.find(b"quux baz bar")); | 
|  | ``` | 
|  |  | 
|  | # Why use this crate? | 
|  |  | 
|  | At first glance, the APIs provided by this crate might seem weird. Why provide | 
|  | a dedicated routine like `memchr` for something that could be implemented | 
|  | clearly and trivially in one line: | 
|  |  | 
|  | ``` | 
fn memchr(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}
|  | ``` | 
|  |  | 
|  | Or similarly, why does this crate provide substring search routines when Rust's | 
|  | core library already provides them? | 
|  |  | 
|  | ``` | 
fn search(haystack: &str, needle: &str) -> Option<usize> {
    haystack.find(needle)
}
|  | ``` | 
|  |  | 
|  | The primary reason for both of them to exist is performance. When it comes to | 
|  | performance, at a high level at least, there are two primary ways to look at | 
|  | it: | 
|  |  | 
|  | * **Throughput**: For this, think about it as, "given some very large haystack | 
|  | and a byte that never occurs in that haystack, how long does it take to | 
|  | search through it and determine that it, in fact, does not occur?" | 
|  | * **Latency**: For this, think about it as, "given a tiny haystack---just a | 
|  | few bytes---how long does it take to determine if a byte is in it?" | 
|  |  | 
The `memchr` routine in this crate has _slightly_ worse latency than the
solution presented above; however, its throughput can easily be over an
order of magnitude faster. This is a good general purpose trade off to make.
You rarely lose, but often gain big.
|  |  | 
**NOTE:** The name `memchr` comes from the corresponding routine in `libc`. A
key advantage of using this library is that its performance is not tied to the
quality of the `memchr` implementation in whatever `libc` you happen to be
linking against, which can vary greatly from platform to platform.
|  |  | 
But what about substring search? This one is a bit more complicated. The
primary reason for its existence is still indeed performance, but it's also
useful because Rust's core library doesn't actually expose any substring
search routine on arbitrary bytes. The only substring search routine it
provides (`str::find` and friends) works exclusively on valid UTF-8.
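
For example, `memmem::find` happily searches a haystack that is not valid
UTF-8:

```
use memchr::memmem;

// The haystack is arbitrary bytes and not valid UTF-8.
let haystack = b"\xFF\xFEfoo bar";
assert_eq!(Some(2), memmem::find(haystack, b"foo"));
```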
|  |  | 
So if you have valid UTF-8, is there a reason to use this over the standard
library substring search routine? Yes. This routine is faster on almost every
metric, including latency. The natural question, then, is why this
implementation isn't in the standard library, even if only for searching on
UTF-8. The reason is that the implementation details for using SIMD in the
standard library haven't quite been worked out yet.
|  |  | 
|  | **NOTE:** Currently, only `x86_64`, `wasm32` and `aarch64` targets have vector | 
|  | accelerated implementations of `memchr` (and friends) and `memmem`. | 
|  |  | 
|  | # Crate features | 
|  |  | 
|  | * **std** - When enabled (the default), this will permit features specific to | 
|  | the standard library. Currently, the only thing used from the standard library | 
|  | is runtime SIMD CPU feature detection. This means that this feature must be | 
|  | enabled to get AVX2 accelerated routines on `x86_64` targets without enabling | 
|  | the `avx2` feature at compile time, for example. When `std` is not enabled, | 
|  | this crate will still attempt to use SSE2 accelerated routines on `x86_64`. It | 
|  | will also use AVX2 accelerated routines when the `avx2` feature is enabled at | 
|  | compile time. In general, enable this feature if you can. | 
* **alloc** - When enabled (the default), APIs in this crate requiring some
kind of allocation will become available. For example, the
[`memmem::Finder::into_owned`](crate::memmem::Finder::into_owned) API and the
[`arch::all::shiftor`](crate::arch::all::shiftor) substring search
implementation. (A brief `into_owned` example appears after this list.)
Otherwise, this crate is designed from the ground up to be
|  | usable in core-only contexts, so the `alloc` feature doesn't add much | 
|  | currently. Notably, disabling `std` but enabling `alloc` will **not** result | 
|  | in the use of AVX2 on `x86_64` targets unless the `avx2` feature is enabled | 
|  | at compile time. (With `std` enabled, AVX2 can be used even without the `avx2` | 
|  | feature enabled at compile time by way of runtime CPU feature detection.) | 
|  | * **logging** - When enabled (disabled by default), the `log` crate is used | 
|  | to emit log messages about what kinds of `memchr` and `memmem` algorithms | 
|  | are used. Namely, both `memchr` and `memmem` have a number of different | 
|  | implementation choices depending on the target and CPU, and the log messages | 
|  | can help show what specific implementations are being used. Generally, this is | 
|  | useful for debugging performance issues. | 
|  | * **libc** - **DEPRECATED**. Previously, this enabled the use of the target's | 
|  | `memchr` function from whatever `libc` was linked into the program. This | 
|  | feature is now a no-op because this crate's implementation of `memchr` should | 
|  | now be sufficiently fast on a number of platforms that `libc` should no longer | 
|  | be needed. (This feature is somewhat of a holdover from this crate's origins. | 
|  | Originally, this crate was literally just a safe wrapper function around the | 
|  | `memchr` function from `libc`.) | 
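
As a small illustration of the `alloc` feature described above,
[`memmem::Finder::into_owned`](crate::memmem::Finder::into_owned) converts a
borrowed finder into one that owns a copy of its needle:

```
use memchr::memmem;

// Building an owned finder requires the `alloc` feature (enabled by
// default) and yields a `Finder<'static>` that does not borrow its needle.
fn build_finder(needle: &str) -> memmem::Finder<'static> {
    memmem::Finder::new(needle).into_owned()
}

let finder = build_finder("foo");
assert_eq!(Some(4), finder.find(b"baz foo quux"));
```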
|  | */ | 
|  |  | 
|  | #![deny(missing_docs)] | 
|  | #![no_std] | 
|  | // It's just not worth trying to squash all dead code warnings. Pretty | 
|  | // unfortunate IMO. Not really sure how to fix this other than to either | 
|  | // live with it or sprinkle a whole mess of `cfg` annotations everywhere. | 
#![cfg_attr(
    not(any(
        all(target_arch = "x86_64", target_feature = "sse2"),
        target_arch = "wasm32",
        target_arch = "aarch64",
    )),
    allow(dead_code)
)]
|  | // Same deal for miri. | 
|  | #![cfg_attr(miri, allow(dead_code, unused_macros))] | 
|  |  | 
|  | // Supporting 8-bit (or others) would be fine. If you need it, please submit a | 
|  | // bug report at https://github.com/BurntSushi/memchr | 
#[cfg(not(any(
    target_pointer_width = "16",
    target_pointer_width = "32",
    target_pointer_width = "64"
)))]
|  | compile_error!("memchr currently not supported on non-{16,32,64}"); | 
|  |  | 
|  | #[cfg(any(test, feature = "std"))] | 
|  | extern crate std; | 
|  |  | 
|  | #[cfg(any(test, feature = "alloc"))] | 
|  | extern crate alloc; | 
|  |  | 
pub use crate::memchr::{
    memchr, memchr2, memchr2_iter, memchr3, memchr3_iter, memchr_iter,
    memrchr, memrchr2, memrchr2_iter, memrchr3, memrchr3_iter, memrchr_iter,
    Memchr, Memchr2, Memchr3,
};
|  |  | 
|  | #[macro_use] | 
|  | mod macros; | 
|  |  | 
|  | #[cfg(test)] | 
|  | #[macro_use] | 
|  | mod tests; | 
|  |  | 
|  | pub mod arch; | 
|  | mod cow; | 
|  | mod ext; | 
|  | mod memchr; | 
|  | pub mod memmem; | 
|  | mod vector; |