The following datasets were generated by profiling typical TVM layer workloads on different target devices:

Target Device | Compiler Backend | Conv2d | Dense | AvgPool2d | MaxPool2d | Dilated Conv2d | Depthwise Conv2d |
---|---|---|---|---|---|---|---|
Tesla A100 | TVM 0.9 - cuda | 3227/3245 | 18421/18862 | 4223/4223 | 4168/4168 | 7887/10504 | 2022/5173 |
Tesla K80 | TVM 0.9 - cuda | 2719/2734 | 25059/25064 | 3905/3905 | 3854/3854 | 7890/10520 | 1950/5062 |
Geforce 980ti 250W | TVM 0.9 - cuda | 2690/2705 | 23629/23634 | 4692/4692 | 3299/3299 | 7857/7902 | 1934/2523 |
Intel Xeon E5-2680 | TVM 0.9 - llvm | 2805/2820 | 15510/15511 | 2917/2917 | 2821/2821 | 8511/9131 | 2584/5779 |
Raspberry Pi 4B | TVM 0.9 - llvm | 1812/9031 | 0/0 | 0/0 | 0/0 | 2470/9938 | 0/0 |
ODroid XU4 (GPU) | TVM 0.9 - opencl | 2098/2110 | 87/87 | 0/0 | 0/0 | 0/0 | 2764/10641 |

Each row gives the target device name and the compiler backend (TVM version and flow); each layer-type column reports counts as unique/total.
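
The unique/total cells in the table above are easy to process programmatically. The following is a minimal sketch, assuming the cells are available as plain strings; the helper name `unique_share` is hypothetical and not part of any dataset tooling. It computes the share of unique counts per layer type and returns `None` where the total is zero.

```python
from typing import Optional


def unique_share(cell: str) -> Optional[float]:
    """Return unique/total as a fraction for a cell such as '3227/3245'.

    Hypothetical helper for illustration only. Zero-padded cells such as
    '00087/00087' are accepted because int() ignores leading zeros.
    """
    unique, total = (int(part) for part in cell.split("/"))
    # A total of zero means the table lists no entries for that combination.
    return unique / total if total else None


# Values copied from the Tesla A100 row of the table.
a100 = {
    "Conv2d": "3227/3245",
    "Dilated Conv2d": "7887/10504",
    "Depthwise Conv2d": "2022/5173",
}
shares = {layer: unique_share(cell) for layer, cell in a100.items()}

for layer, share in shares.items():
    print(f"{layer}: {share:.1%} unique")
```

Returning `None` for a zero total keeps empty combinations (for example, several Raspberry Pi 4B columns) distinguishable from combinations where every count is a duplicate.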