Fix broken links - 1st run
diff --git a/docs/INSTALL.md b/docs/INSTALL.md
index 9d1309f..906d3f8 100644
--- a/docs/INSTALL.md
+++ b/docs/INSTALL.md
@@ -60,10 +60,9 @@
 * unit: perform unit tests (based on cmocka)
 * help: shows these build options
 
-[Unless you are on Mac OS
-X](https://developer.apple.com/library/archive/qa/qa1118/_index.html), you can
-also build statically linked versions of the AFL++ binaries by passing the
-`STATIC=1` argument to make:
+[Unless you are on Mac OS X](https://developer.apple.com/library/archive/qa/qa1118/_index.html),
+you can also build statically linked versions of the AFL++ binaries by passing
+the `STATIC=1` argument to make:
 
 ```shell
 make STATIC=1
@@ -169,5 +168,5 @@
 ```
 
 See
-[https://www.spy-hill.com/help/apple/SharedMemory.html](https://www.spy-hill.com/help/apple/SharedMemory.html)
+[http://www.spy-hill.com/help/apple/SharedMemory.html](http://www.spy-hill.com/help/apple/SharedMemory.html)
 for documentation for these settings and how to make them permanent.
\ No newline at end of file
diff --git a/docs/afl-fuzz_approach.md b/docs/afl-fuzz_approach.md
index a72087c..0188893 100644
--- a/docs/afl-fuzz_approach.md
+++ b/docs/afl-fuzz_approach.md
@@ -243,9 +243,10 @@
   together two random inputs from the queue at some arbitrarily selected
   midpoint.
 - sync - a stage used only when `-M` or `-S` is set (see
-  [parallel_fuzzing.md](parallel_fuzzing.md)). No real fuzzing is involved, but
-  the tool scans the output from other fuzzers and imports test cases as
-  necessary. The first time this is done, it may take several minutes or so.
+  [fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores)).
+  No real fuzzing is involved, but the tool scans the output from other fuzzers
+  and imports test cases as necessary. The first time this is done, it may take
+  several minutes or so.
 
 The remaining fields should be fairly self-evident: there's the exec count
 progress indicator for the current stage, a global exec counter, and a benchmark
@@ -254,8 +255,8 @@
 time - and if it stays below 100, the job will probably take very long.
 
 The fuzzer will explicitly warn you about slow targets, too. If this happens,
-see the [perf_tips.md](perf_tips.md) file included with the fuzzer for ideas on
-how to speed things up.
+see [best_practices.md#improving-speed](best_practices.md#improving-speed)
+for ideas on how to speed things up.
 
 ### Findings in depth
 
@@ -396,7 +397,8 @@
 
 If the value is shown in green, you are using fewer CPU cores than available on
 your system and can probably parallelize to improve performance; for tips on how
-to do that, see [parallel_fuzzing.md](parallel_fuzzing.md).
+to do that, see
+[fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores).
 
 If the value is shown in red, your CPU is *possibly* oversubscribed, and running
 additional fuzzers may not give you any benefits.
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 86ebf25..0952b96 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -583,10 +583,11 @@
 
 The FRIDA wrapper used to instrument binary-only code supports many of the same
 options as `afl-qemu-trace`, but also has a number of additional advanced
-options. These are listed in brief below (see [here](../frida_mode/README.md)
-for more details). These settings are provided for compatibiltiy with QEMU mode,
-the preferred way to configure FRIDA mode is through its
-[scripting](../frida_mode/Scripting.md) support.
+options. These are listed in brief below (see
+[frida_mode/README.md](../frida_mode/README.md) for more details). These
+settings are provided for compatibility with QEMU mode; the preferred way to
+configure FRIDA mode is through its [scripting](../frida_mode/Scripting.md)
+support.
 
 * `AFL_FRIDA_DEBUG_MAPS` - See `AFL_QEMU_DEBUG_MAPS`
 * `AFL_FRIDA_DRIVER_NO_HOOK` - See `AFL_QEMU_DRIVER_NO_HOOK`. When using the
@@ -627,7 +628,7 @@
   coverage information for unstable edges (e.g., to be loaded within IDA
   lighthouse).
 * `AFL_FRIDA_JS_SCRIPT` - Set the script to be loaded by the FRIDA scripting
-  engine. See [here](Scripting.md) for details.
+  engine. See [frida_mode/Scripting.md](../frida_mode/Scripting.md) for details.
 * `AFL_FRIDA_OUTPUT_STDOUT` - Redirect the standard output of the target
   application to the named file (supersedes the setting of `AFL_DEBUG_CHILD`)
 * `AFL_FRIDA_OUTPUT_STDERR` - Redirect the standard error of the target
diff --git a/docs/fuzzing_binary-only_targets.md b/docs/fuzzing_binary-only_targets.md
index eaed3a9..fd18b5c 100644
--- a/docs/fuzzing_binary-only_targets.md
+++ b/docs/fuzzing_binary-only_targets.md
@@ -107,10 +107,10 @@
 [frida_mode/README.md](../frida_mode/README.md).
 
 If possible, you should use the persistent mode, see
-[qemu_frida/README.md](../qemu_frida/README.md). The mode is approximately 2-5x
-slower than compile-time instrumentation, and is less conducive to
-parallelization. But for binary-only fuzzing, it gives a huge speed improvement
-if it is possible to use.
+[instrumentation/README.persistent_mode.md](../instrumentation/README.persistent_mode.md).
+The mode is approximately 2-5x slower than compile-time instrumentation, and is
+less conducive to parallelization. But for binary-only fuzzing, it gives a huge
+speed improvement if it is possible to use.
 
 If you want to fuzz a binary-only library, then you can fuzz it with frida-gum
 via frida_mode/. You will have to write a harness to call the target function in
diff --git a/docs/fuzzing_in_depth.md b/docs/fuzzing_in_depth.md
index 4a1ddf4..29e8f81 100644
--- a/docs/fuzzing_in_depth.md
+++ b/docs/fuzzing_in_depth.md
@@ -153,12 +153,12 @@
 
 There are many more options and modes available, however, these are most of the
 time less effective. See:
-* [instrumentation/README.ctx.md](../instrumentation/README.ctx.md)
-* [instrumentation/README.ngram.md](../instrumentation/README.ngram.md)
+* [instrumentation/README.llvm.md#6) AFL++ Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage)
+* [instrumentation/README.llvm.md#7) AFL++ N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage)
 
 AFL++ performs "never zero" counting in its bitmap. You can read more about this
 here:
-* [instrumentation/README.neverzero.md](../instrumentation/README.neverzero.md)
+* [instrumentation/README.llvm.md#8-neverzero-counters](../instrumentation/README.llvm.md#8-neverzero-counters)
 
 ### c) Selecting sanitizers
 
@@ -474,7 +474,8 @@
 
 ![resources/screenshot.png](resources/screenshot.png)
 
-All labels are explained in [status_screen.md](status_screen.md).
+All labels are explained in
+[afl-fuzz_approach.md#understanding-the-status-screen](afl-fuzz_approach.md#understanding-the-status-screen).
 
 ### b) Keeping memory use and timeouts in check
 
diff --git a/frida_mode/Scripting.md b/frida_mode/Scripting.md
index 63ab171..ad86fdd 100644
--- a/frida_mode/Scripting.md
+++ b/frida_mode/Scripting.md
@@ -109,8 +109,8 @@
 
 A persistent hook can be implemented using a conventional shared object, sample
 source code for a hook suitable for the prototype of `LLVMFuzzerTestOneInput`
-can be found in [hook/hook.c](hook/hook.c). This can be configured using code
-similar to the following.
+can be found in [hook/](hook/). This can be configured using code similar to the
+following.
 
 ```js
 const path = Afl.module.path;
diff --git a/instrumentation/README.llvm.md b/instrumentation/README.llvm.md
index fa02564..ca9ce93 100644
--- a/instrumentation/README.llvm.md
+++ b/instrumentation/README.llvm.md
@@ -234,4 +234,45 @@
 
 It is highly recommended to increase the MAP_SIZE_POW2 definition in config.h to
 at least 18 and maybe up to 20 for this as otherwise too many map collisions
-occur.
\ No newline at end of file
+occur.
+
+## 8) NeverZero counters
+
+In larger, complex, or reiterative programs, the byte sized counters that
+collect the edge coverage can easily fill up and wrap around. This is not that
+much of an issue - unless, by chance, it wraps just to a value of zero when the
+program execution ends. In this case, afl-fuzz is not able to see that the edge
+has been accessed and will ignore it.
+
+NeverZero prevents this behavior. If a counter wraps, it jumps over the value 0
+directly to a 1. This improves path discovery (by a very small amount) at a very
+low cost (one instruction per edge).
+
+(The alternative, saturating counters, has also been tested and proved to be
+inferior in terms of path discovery.)
+
+This is implemented in afl-gcc and afl-gcc-fast. For llvm_mode, however, it is
+optional if thread-safe counters are selected or the LLVM version is below 9,
+as there are severe performance costs in these cases.
+
+If you want to enable this for LLVM versions below 9 or for thread-safe
+counters, then set:
+
+```
+export AFL_LLVM_NOT_ZERO=1
+```
+
+In case you are on llvm 9 or greater and you do not want this behavior, then you
+can set:
+
+```
+export AFL_LLVM_SKIP_NEVERZERO=1
+```
+
+If the target does not have extensive loops or frequently called functions,
+then skipping NeverZero can give a small performance boost.
+
+Please note that the default counter implementations are not thread-safe!
+
+Support for thread-safe counters in LLVM CLASSIC mode can be activated by
+setting `AFL_LLVM_THREADSAFE_INST=1`.
\ No newline at end of file
diff --git a/utils/README.md b/utils/README.md
index 5f5745b..debc86e 100644
--- a/utils/README.md
+++ b/utils/README.md
@@ -48,7 +48,7 @@
   - defork               - intercept fork() in targets
 
   - distributed_fuzzing  - a sample script for synchronizing fuzzer instances
-                           across multiple machines (see parallel_fuzzing.md).
+                           across multiple machines.
 
   - libdislocator        - like ASAN but lightweight.