mcoffey
Joined: 26 Mar 2011 Posts: 15
Posted: Sun Sep 18, 2011 12:57 pm    Post subject: large matrix multiply
I'm trying to multiply large matrices on a CUDA device and want to know if my supposition is correct.
I assume that if a device has limited RAM, the calling routines need to break the matrices into blocks to manipulate on the device. I'm currently trying to multiply two matrices of around 1.75 GB each and am writing a wrapper to do it in blocks. However, are there routines already out there that do this? Has anyone already done this? The idea is to produce a wrapper that automatically breaks a matrix into the number of blocks appropriate to the number of devices and the memory available.
The end goal is to matrix-multiply, then invert, a matrix of 30,000 x 50,000 in full precision.
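For what it's worth, here is a rough sketch of the blocking scheme I have in mind, written in NumPy so the tiling logic is easy to see. The `np.dot` on each tile is a stand-in for the device GEMM call (e.g. cublasSgemm/cublasDgemm) plus the host-to-device transfers; the loop structure would be the same. The function name and block size are just illustrative.

```python
import numpy as np

def blocked_matmul(A, B, block=1024):
    """Compute C = A @ B one tile at a time.

    Each (block x block) tile of C is accumulated from tile products
    small enough to fit in device memory. NumPy slicing clamps at the
    matrix edges, so ragged final tiles are handled automatically.
    """
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n), dtype=A.dtype)
    for i in range(0, m, block):          # rows of C
        for j in range(0, n, block):      # columns of C
            for p in range(0, k, block):  # accumulate over inner dimension
                # On a GPU this tile product would be a GEMM call,
                # with A/B tiles copied to the device and the partial
                # result accumulated back into C[i:i+block, j:j+block].
                C[i:i+block, j:j+block] += np.dot(
                    A[i:i+block, p:p+block],
                    B[p:p+block, j:j+block])
    return C
```

Three tiles (one each of A, B, and C) must fit on the device at once, so the block size would be chosen from the device's free memory.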
Thanks for any guidance