~/writing/dovecot-attr-pure-panic
A Dovecot optimization the compiler had been deleting all along
An IMAP THREAD command was panicking inside Dovecot's array code. The crash was real, but the comment above it described an optimization that hadn't run in any -O2 build ever shipped. ATTR_PURE turned it into dead code, and the same annotation decided which line showed up in the backtrace.
An IMAP THREAD command can bring a Dovecot process down with an assertion failure, deep in libdovecot's array code:
Panic: file array.c: line 10 (array_idx_modifiable_i):
assertion failed: (idx < array->buffer->used / array->element_size)That assertion is the array layer catching an OOB access: someone asked for element idx in a buffer that doesn't have idx + 1 elements. A bounds check. It fired from a code path that, on the face of it, had just allocated the array large enough. The interesting part isn't the crash. It's the comment one line above it, describing an optimization that hadn't run in any optimized build in years.
What the code thought it was doing
The fault lives in mail_thread_strmap_remap() in src/lib-storage/index/index-thread.c, the routine that renumbers string-map nodes Dovecot uses to thread a mailbox. It builds a fresh array of nodes and tries to be clever about it:
i_array_init(&new_nodes, new_count + invalid_count + 32);
/* optimization: allocate all nodes initially */
(void)array_idx_modifiable(&new_nodes, new_count-1);
for (i = 0; i < old_count; i++) {
if (idx_map[i] != 0) {
node = array_idx_modifiable(&new_nodes, idx_map[i]);
/* ... */
}
}The intent reads cleanly. Initialize the array, touch the last index once to force it to full size up front, so the per-iteration accesses in the loop never have to reallocate. Reasonable micro-optimization for a hot path.
It doesn't work. Two separate counts, and they compound into something more interesting than either alone.
Count one: the array call does not grow anything
i_array_init() sets the array's capacity but leaves logical length at zero. The buffer can hold new_count + invalid_count + 32 elements, but used == 0. Nothing's in it yet.
array_idx_modifiable() is the wrong tool to grow it. It returns a writable pointer to an element that already exists and asserts the index is in bounds. It doesn't extend used. So calling it with an index past the end of a zero-length array isn't a request to grow; it's an OOB access, and the assertion is there to catch exactly that.
The loop runs against an array that's still logically empty. The moment idx_map[i] != 0, the access at line 200 asks for a node that isn't there, and the bounds check fires. At the crash, new_count was 8 and used was still 0.
#7 array_idx_modifiable_i (array=..., idx=<optimized out>) at array.c:10
#8 mail_thread_strmap_remap (idx_map=..., old_count=...,
new_count=8, ...) at index-thread.c:200
#9 mail_index_strmap_view_renumber (...) at mail-index-strmap.c:867
#10 mail_index_strmap_write (...) at mail-index-strmap.c:1192
#11 mail_index_strmap_view_sync_commit (...) at mail-index-strmap.c:1239
#12 mail_thread_index_map_build (...) at index-thread.c:359
#13 mail_thread_init (...) at index-thread.c:573
#14 imap_thread (...) at cmd-thread.c:90
#15 cmd_thread (...) at cmd-thread.c:282Notice the crash frame is index-thread.c:200, the access inside the loop, not :188, the "optimization" line. That's the second count, and it's the one that took a minute to see.
Count two: the optimization was never there
If the pre-allocation line were doing its job, it would've grown the array (or crashed itself) before the loop started. It did neither. In an optimized build it didn't execute at all.
array_idx_modifiable_i(), the fn behind the array_idx_modifiable() macro, is declared ATTR_PURE in src/lib/array.h. ATTR_PURE is a promise to the compiler: no side effects, result depends only on its arguments. That license lets the compiler delete a call whose return value is thrown away, because by definition that call can't change anything observable.
And that's exactly what the original code did with it:
(void)array_idx_modifiable(&new_nodes, new_count-1);Cast to void, discarded. Under -O2, the compiler reads the ATTR_PURE promise, sees a pure result going nowhere, and removes the line. "optimization: allocate all nodes initially" has been dead code in every optimized build ever shipped. Not slow, not crashing anywhere. Just not there. The comment describes an intention the binary never carried out.
Why the crash moved
The pre-grow line gets elided because its result is discarded and the fn is pure. The line that actually crashes is inside the loop, where the result is assigned to node and can't be elided. The annotation that made the optimization vanish is the same annotation that decided which line shows up in the backtrace. The bug and its disguise are the same mechanism.
Reproducing it in six lines
I wanted to prove this against stock libdovecot, nothing a maintainer couldn't run themselves. The whole bug fits in a main():
#include "lib.h"
#include "array.h"
#include <stdio.h>
int main(void)
{
lib_init();
ARRAY(int) arr;
i_array_init(&arr, 16);
int *node = array_idx_modifiable(&arr, 1);
printf("unreachable: node=%p\n", (void *)node);
return 0;
}The key detail: node = ... followed by the printf. Sinking the return value through a real use defeats ATTR_PURE elision, so the call survives even under -O2. That mirrors line 200, where the real panic fires, rather than line 188, where it was hiding. Run it and you get the exact assertion string from frame #7, at both -O0 and -O2. Three lines of body, the entire defect.
The same pattern, i_array_init() followed by a discarded array_idx_modifiable() on a target index, also sits in mail_thread_root_thread_merge() in index-thread-finish.c. Same latent bug, same invisible optimization.
The fix was already in the file
The right call is array_idx_get_space(). No ATTR_PURE, so it never gets elided, and it actually grows the buffer through buffer_get_space_unsafe() instead of asserting. For the "give me a pointer to the element at this index" use, it's a strict generalization of array_idx_modifiable(): on a buffer that's already big enough the two return the same pointer, and on one that's too small array_idx_modifiable() asserts while array_idx_get_space() grows. Swapping one for the other changes no caller's contract.
/* optimization: allocate all nodes initially */
- (void)array_idx_modifiable(&new_nodes, new_count-1);
+ (void)array_idx_get_space(&new_nodes, new_count-1);I didn't have to invent the fix. The correct idiom was already in use a few hundred lines away in the same source tree, in index-thread-finish.c, on the same kind of array, doing the same job the broken comment claimed to do. One call site had been written correctly and two hadn't, and the two wrong ones were wrong in a way the optimizer quietly covered up.
What I take from it
ATTR_PURE isn't the villain. It's correct: array_idx_modifiable_i() really is pure, discarding its result really is pointless, and the compiler did exactly the right thing by deleting the call. The problem is that a comment isn't a test. "optimization: allocate all nodes initially" had been false for the entire life of the code, and nothing failed loudly enough to say so, because the latent OOB access only bites when the array has to grow past what a single iteration touches. A discarded call to a pure fn is a no-op the compiler is entitled to erase, and if your only evidence that code runs is that it's written down, you don't actually know it runs.
It goes upstream as a pure code-correctness fix: the reproducer, the annotated stack, the two-line patch. Every frame in that stack is a Dovecot symbol, which is what makes it a clean patch a maintainer can verify in five minutes instead of a vague report nobody can reproduce.