xfs: fix rmap inefficiencies

Reduce the performance impact of the reverse mapping btree when reflink
is enabled by using the much faster non-overlapped btree lookup
functions when we're searching the rmap index with a fully specified
key.  If we find the exact record we're looking for, great!  We don't
have to perform the full overlapped scan.  For filesystems with high
sharing factors this reduces the xfs_scrub runtime by a good 15%.
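
The fast-path-then-fallback idea above can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel
code: a sorted array stands in for the rmap btree, a linear scan stands
in for the overlapped btree query, and every name (toy_rmap,
toy_lookup_le, toy_rmap_find) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an rmap record; field names are illustrative only. */
struct toy_rmap {
	uint64_t startblock;
	uint64_t blockcount;
	uint64_t owner;
	uint64_t offset;
};

/* Order keys by (startblock, owner, offset); blockcount is not a key field. */
static int toy_key_cmp(const struct toy_rmap *a, const struct toy_rmap *b)
{
	if (a->startblock != b->startblock)
		return a->startblock < b->startblock ? -1 : 1;
	if (a->owner != b->owner)
		return a->owner < b->owner ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Does rec map the block and file offset named by key for key's owner? */
static bool toy_rec_matches(const struct toy_rmap *rec,
			    const struct toy_rmap *key)
{
	uint64_t delta;

	if (rec->owner != key->owner || rec->startblock > key->startblock)
		return false;
	delta = key->startblock - rec->startblock;
	return delta < rec->blockcount && rec->offset + delta == key->offset;
}

/* Non-overlapped lookup: binary search for the last record with key <= key. */
static const struct toy_rmap *toy_lookup_le(const struct toy_rmap *recs,
					    size_t n,
					    const struct toy_rmap *key)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (toy_key_cmp(&recs[mid], key) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &recs[lo - 1] : NULL;
}

/*
 * Try the cheap lookup first; only if it does not land on the record we
 * want, fall back to scanning every record that could overlap the block.
 */
static const struct toy_rmap *toy_rmap_find(const struct toy_rmap *recs,
					    size_t n,
					    const struct toy_rmap *key,
					    bool *used_fast_path)
{
	const struct toy_rmap *rec = toy_lookup_le(recs, n, key);
	size_t i;

	*used_fast_path = rec && toy_rec_matches(rec, key);
	if (*used_fast_path)
		return rec;

	/* Slow path: records are sorted, so stop once startblock passes key. */
	for (i = 0; i < n && recs[i].startblock <= key->startblock; i++)
		if (toy_rec_matches(&recs[i], key))
			return &recs[i];
	return NULL;
}
```

In this toy model the fast path costs one binary search, while the
fallback visits every record starting at or before the target block,
which is where high sharing factors hurt.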

This has been shown to reduce the fstests runtime for realtime rmap
configurations by 30%, since the lack of AGs severely limits
scalability.

v2: simplify the non-overlapped lookup code per Dave's comments