This article quickly explains the Rust types [T; N], &[T; N], &[T], Vec<T>, and &Vec<T> in terms of C code, and what str, &str, String, OsString, and CString add.
Arrays and Slices
Rust | C |
---|---|
[T; N] (array). Example: [i32; 100]. Allocated on the stack. | T[N]. Example: int[100]. Allocated on the stack. |
&[T; N] (array reference). Example: &[i32; 100]. N is tracked at compile time. Bounds checks are done at runtime (opt out using get_unchecked ¹). | const T[N] in function parameters, or const T*. Example: const int[100] or const int*. Partially tracked at compile time², no bounds checks on access at runtime. |
&mut [T; N] (exclusive array reference)Example: &mut [ Same as above, and allows writing to the array. | T[N] in function parameters or T* Example: int[100] or int* Same as above, and allows writing to the array. |
Box<[T; N]> (boxed array). Example: Box<[ Same as &mut [T; N] ³, but the underlying array is allocated on the heap. The memory is released when the object is dropped. | T* from malloc(). Example: int* Same as T*, but the underlying array is allocated on the heap. The memory must be released manually by calling free(). |
&[T] (slice reference). Example: &[ The size is not fixed at compile time. It is tracked in an additional variable stored along with the base pointer. The two make a “fat pointer”. As with &[T; N], you can opt out of runtime bounds checks using get_unchecked ¹. | struct { const T *base; size_t size; } Example: struct { const int *base; size_t size; } In practice, many C functions take the base pointer and the size as two separate parameters. For instance, you will see: memset(base, 0, size). The compiler won’t perform any bounds checks automatically. |
&mut [T] (exclusive slice reference). Example: &mut [i32] Same as &[i32], and allows writing to the array. | struct { T *base; size_t size; } Example: struct { int *base; size_t size; } Same as above, and allows writing to the array. |
Vec<T> (vector). Example: Vec<i32> Lets you push (append) an arbitrary number of elements. &Vec<T> can automatically be coerced to &[T] because Vec<T> implements Deref, and &mut Vec<T> can automatically be coerced to &mut [T] because Vec<T> implements DerefMut ⁴. | struct { T *data; size_t size; size_t avail; } Example: struct { int *data; size_t size; size_t avail; } Implementation of a dynamic array using realloc(). You can pass the data and size fields to functions that expect T* and size_t parameters. |
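To make the pointer layouts in the table concrete, here is a minimal sketch that checks the sizes of the reference types and reads out the three fields of a vector. The helper name raw_parts is made up for this illustration, and the tuple it returns only mirrors the C struct from the table; it is not how Vec is actually declared.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (data, size, avail) for a vector,
/// mirroring the fields of the C struct in the table above.
fn raw_parts(v: &Vec<i32>) -> (*const i32, usize, usize) {
    (v.as_ptr(), v.len(), v.capacity())
}

fn main() {
    // A reference to a fixed-size array is a thin pointer: N lives in the type.
    assert_eq!(size_of::<&[i32; 100]>(), size_of::<usize>());
    // A boxed array is also a single pointer to heap memory.
    assert_eq!(size_of::<Box<[i32; 100]>>(), size_of::<usize>());
    // A slice reference is a fat pointer: base pointer plus size.
    assert_eq!(size_of::<&[i32]>(), 2 * size_of::<usize>());

    let mut v: Vec<i32> = Vec::with_capacity(4);
    v.push(1);
    v.push(2);
    let (data, size, avail) = raw_parts(&v);
    assert_eq!(data, v.as_ptr());
    assert_eq!(size, 2);
    // The allocation is at least as large as requested.
    assert!(avail >= 4);
}
```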
¹ Note this is slice::get_unchecked. Rust lets you coerce a &[T; N] array reference into a &[T] slice (a bit like a const int[N] can decay into an int*). When you write v.get_unchecked(0), it implicitly means (&v).get_unchecked(0). The compiler then figures out that it can use slice::get_unchecked even though &v is a reference to an array.
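As a small sketch of this coercion, the function below calls get_unchecked directly on an array reference. The function name first_and_last is invented for the example; the indices are constants known to be in range, which is what makes the unsafe block sound here.

```rust
/// Hypothetical helper: reads the first and last elements without bounds checks.
fn first_and_last(v: &[i32; 100]) -> (i32, i32) {
    // Method lookup coerces the array reference to a slice, so this resolves
    // to slice::get_unchecked.
    // SAFETY: indices 0 and 99 are always in bounds for an array of length 100.
    unsafe { (*v.get_unchecked(0), *v.get_unchecked(99)) }
}

fn main() {
    let mut v = [0i32; 100];
    v[0] = 7;
    v[99] = 9;
    assert_eq!(first_and_last(&v), (7, 9));

    // The checked counterpart returns None instead of invoking undefined behavior.
    assert!(v.get(100).is_none());
}
```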
² From the point of view of the standard, T[N] in a function parameter is just syntactic sugar for T*, but compilers do emit warnings when they see an incorrect function call, such as in:
void f(int p[100]);
void g(void) {
    int v[10];
    f(v); /* v has only 10 elements, but f declares 100 */
}
³ Technically, you will need the Box<[T; N]> object itself to be mut to modify the underlying array:
fn f() {
    // The binding must be `mut` for the write below to compile.
    let mut v = Box::new([0; 100]);
    v[2] = 1;
}
⁴ This means you can write:
fn f(v: &[i32]) { }
fn g(v: &mut [i32]) { }
fn main() {
    let mut v = vec![1, 2, 3];
    f(&v);     // &Vec<i32> coerces to &[i32]
    g(&mut v); // &mut Vec<i32> coerces to &mut [i32]
}
With this, we can map the following patterns:
Rust | C |
---|---|
&p[..] | base, size |
&p[2..] | base + 2, size - 2 |
&p[..40] | base, 40 |
&p[2..40] | base + 2, 38 |
The main difference is that the Rust version performs bounds checks, while the C version does not. You can again use get_unchecked() to opt out of these checks, as it works with ranges just as well as with indices.
Strings
Once you understand arrays and slices well, strings become easy:
- str is a [u8] which is guaranteed to contain valid UTF-8 data
- String is a Vec<u8> which is guaranteed to contain valid UTF-8 data
- CStr is a [u8] which is guaranteed to be null-terminated
- CString is a Vec<u8> which is guaranteed to be null-terminated
- OsStr is a [u8] which is guaranteed to contain data valid for the system’s API⁵
- OsString is a Vec<u8> which is guaranteed to contain data valid for the system’s API⁵
- Path is an OsStr that is used to represent a path
- PathBuf is an OsString that is used to represent a path
⁵ To understand why OsStr/OsString is different from CStr/CString, take a look at WTF-8.
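The guarantees in the list above can be observed through the standard library, as in the following sketch. The helper c_bytes is made up for the example; everything else is a plain standard library call.

```rust
use std::ffi::{CString, OsStr};
use std::path::Path;

/// Hypothetical helper: the bytes a C function would see, terminator included.
/// Returns None if `s` contains an interior zero byte.
fn c_bytes(s: &str) -> Option<Vec<u8>> {
    CString::new(s).ok().map(|c| c.as_bytes_with_nul().to_vec())
}

fn main() {
    // str and String expose their UTF-8 bytes directly.
    let s: &str = "héllo";
    assert_eq!(s.as_bytes(), b"h\xC3\xA9llo");
    assert_eq!(String::from(s).into_bytes().len(), 6);

    // Bytes that are not valid UTF-8 are rejected.
    assert!(std::str::from_utf8(&[0xFF]).is_err());
    assert!(String::from_utf8(vec![0xFF]).is_err());

    // CString appends the terminator and rejects interior zero bytes.
    assert_eq!(c_bytes("hi"), Some(b"hi\0".to_vec()));
    assert_eq!(c_bytes("a\0b"), None);

    // A Path is a view over an OsStr.
    assert_eq!(Path::new("a/b").as_os_str(), OsStr::new("a/b"));
}
```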
Since str is just a [u8], the patterns below work as well. The difference is that every resulting &str must still contain valid UTF-8. In other words, you cannot slice in the middle of the UTF-8 encoding of a code point; doing so with the indexing syntax panics at runtime. As usual, you can opt out of the automatic checks by using get_unchecked, but you get undefined behavior if the range you pass cuts in the middle of a code point.
Rust | C |
---|---|
&s[..] | base, size |
&s[2..] | base + 2, size - 2 |
&s[..40] | base, 40 |
&s[2..40] | base + 2, 38 |
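As an illustration of the code point restriction, the sketch below uses str::get, the checked counterpart of the indexing syntax, which returns None where &s[a..b] would panic. The wrapper name slice_checked is invented for the example.

```rust
/// Hypothetical helper: checked equivalent of `&s[start..end]`.
/// Returns None when the range is out of bounds or cuts through a code point.
fn slice_checked(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // 'é' is encoded as two bytes, so "héllo" is 6 bytes long:
    // h = byte 0, é = bytes 1 and 2, l = byte 3, l = byte 4, o = byte 5.
    let s = "héllo";
    assert_eq!(s.len(), 6);

    assert!(s.is_char_boundary(1));  // start of 'é'
    assert!(!s.is_char_boundary(2)); // middle of 'é'

    assert_eq!(slice_checked(s, 0, 3), Some("hé"));
    assert_eq!(slice_checked(s, 3, 6), Some("llo"));
    assert_eq!(slice_checked(s, 0, 2), None); // cuts 'é' in half
    assert_eq!(slice_checked(s, 0, 7), None); // past the end
}
```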