rearrange the NEON filter code so it can be shared among all of the "portable" functions
remove the remaining special-case NEON functions that are not (or no longer) faster than the portable ones
5 files changed