dddh@node9 ~/g/cusp/performance/spmv $ nvcc -O2 -o spmv spmv.cu -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\" -I/home/dddh/g -I../../
dddh@node9 ~/g/cusp/performance/spmv $ time ./spmv ~/g/m/pwtk.mtx --value_type=double --device=0
Computing SpMV with 'double' values.
There are 2 devices supporting CUDA
Device 0: "GeForce GTX TITAN"
  Major revision number: 3
  Minor revision number: 5
  Total amount of global memory: 6442123264 bytes
Device 1: "GeForce GTX 680"
  Major revision number: 3
  Minor revision number: 0
  Total amount of global memory: 2147287040 bytes
Running on Device 0
Read matrix (/home/dddh/g/m/pwtk.mtx) with shape (217918,217918) and 11634424 entries
  coo_flat       :  2.2136 ms ( 10.51 GFLOP/s 127.7 GB/s) [L2 error 0.000000]
  coo_flat_tex   :  1.8492 ms ( 12.58 GFLOP/s 152.9 GB/s) [L2 error 0.000000]
  csr_scalar     :  6.5901 ms (  3.53 GFLOP/s  36.1 GB/s) [L2 error 0.000000]
  csr_scalar_tex :  5.3945 ms (  4.31 GFLOP/s  44.1 GB/s) [L2 error 0.000000]
  csr_vector     :  2.0344 ms ( 11.44 GFLOP/s 116.9 GB/s) [L2 error 0.000000]
  csr_vector_tex :  2.0404 ms ( 11.40 GFLOP/s 116.6 GB/s) [L2 error 0.000000]
  Refusing to convert to DIA format
  Refusing to convert to ELL format
  hyb            :  1.0365 ms ( 22.45 GFLOP/s 229.7 GB/s) [L2 error 0.000000]
  hyb_tex        :  1.1736 ms ( 19.83 GFLOP/s 202.9 GB/s) [L2 error 0.000000]

real    0m16.576s
user    0m8.032s
sys     0m8.476s

dddh@node9 ~/g/cusp/performance/spmv $ time ./spmv ~/g/m/pwtk.mtx --value_type=double --device=1
Computing SpMV with 'double' values.
There are 2 devices supporting CUDA
Device 0: "GeForce GTX TITAN"
  Major revision number: 3
  Minor revision number: 5
  Total amount of global memory: 6442123264 bytes
Device 1: "GeForce GTX 680"
  Major revision number: 3
  Minor revision number: 0
  Total amount of global memory: 2147287040 bytes
Running on Device 1
Read matrix (/home/dddh/g/m/pwtk.mtx) with shape (217918,217918) and 11634424 entries
  coo_flat       :  2.8742 ms (  8.10 GFLOP/s  98.4 GB/s) [L2 error 0.000000]
  coo_flat_tex   :  2.4613 ms (  9.45 GFLOP/s 114.9 GB/s) [L2 error 0.000000]
  csr_scalar     : 18.2128 ms (  1.28 GFLOP/s  13.1 GB/s) [L2 error 0.000000]
  csr_scalar_tex : 18.3211 ms (  1.27 GFLOP/s  13.0 GB/s) [L2 error 0.000000]
  csr_vector     :  3.1678 ms (  7.35 GFLOP/s  75.1 GB/s) [L2 error 0.000000]
  csr_vector_tex :  3.1189 ms (  7.46 GFLOP/s  76.3 GB/s) [L2 error 0.000000]
  Refusing to convert to DIA format
  Refusing to convert to ELL format
  hyb            :  2.0665 ms ( 11.26 GFLOP/s 115.2 GB/s) [L2 error 0.000000]
  hyb_tex        :  1.7925 ms ( 12.98 GFLOP/s 132.8 GB/s) [L2 error 0.000000]

real    0m19.464s
user    0m7.552s
sys     0m11.900s

dddh@node2:~/g/m> time ~/spmv ~/g/m/pwtk.mtx --value_type=double
Computing SpMV with 'double' values.
There is 1 device supporting CUDA
Device 0: "GeForce GTX 580"
  Major revision number: 2
  Minor revision number: 0
  Total amount of global memory: 1610153984 bytes
Running on Device 0
Read matrix (/home/dddh/g/m/pwtk.mtx) with shape (217918,217918) and 11634424 entries
  coo_flat       :  3.1945 ms (  7.28 GFLOP/s  88.5 GB/s) [L2 error 0.000000]
  coo_flat_tex   :  2.7491 ms (  8.46 GFLOP/s 102.8 GB/s) [L2 error 0.000000]
  csr_scalar     : 14.6018 ms (  1.59 GFLOP/s  16.3 GB/s) [L2 error 0.000000]
  csr_scalar_tex :  7.2921 ms (  3.19 GFLOP/s  32.6 GB/s) [L2 error 0.000000]
  csr_vector     :  2.0886 ms ( 11.14 GFLOP/s 113.9 GB/s) [L2 error 0.000000]
  csr_vector_tex :  1.8733 ms ( 12.42 GFLOP/s 127.0 GB/s) [L2 error 0.000000]
  Refusing to convert to DIA format
  Refusing to convert to ELL format
  hyb            :  1.6643 ms ( 13.98 GFLOP/s 143.1 GB/s) [L2 error 0.000000]
  hyb_tex        :  1.9237 ms ( 12.10 GFLOP/s 123.8 GB/s) [L2 error 0.000000]

real    0m30.359s
user    0m18.456s
sys     0m11.724s