Multi-threaded implementation of RMA for distributed shared memory MPID_Win_fence(assert,win) * Mark the window epoch as closed so that we can detect attempts to perform unsynchronized RMA operations. NOTE: In the case of multiple threads calling RMA operations, closing the window epochs without waiting for other threads to finish their RMA calls will likely result in a race condition if the application is not properly synchronized. This is really an application error, so in the interest of high-performance, we make no attempt to minimize the non-determinism. * Atomically move the remote handler call counters to a buffer and zero the counters. This ensures that the ------------------------------------------------------------------------