need to refine the design of DeviceContext #6415

QiJune · 2017-12-08T07:24:20Z

There are two problems with the current DeviceContext design:

The base class DeviceContext has an interface called GetEigenDevice.

Lines 38 to 51 in 00b64f6

    
    class DeviceContext {
     public:
      virtual ~DeviceContext() {}
      virtual Place GetPlace() const = 0;

      template <typename PlaceType,
                typename DeviceType =
                    typename EigenDeviceConverter<PlaceType>::EigenDeviceType>
      DeviceType* GetEigenDevice() const;

      virtual void Wait() const {}
      virtual void Finish() const {}
    };

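However, Eigen is not supported on every kind of device, e.g. AMD graphics cards. GetEigenDevice should therefore be moved out of the base class and into the derived DeviceContext classes that can actually provide an Eigen device.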
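A minimal sketch of what moving the accessor into the derived class could look like. Everything here is a simplified stand-in written for illustration: `FakeEigenCpuDevice`, `GetPlaceName`, `eigen_device`, `NoEigenDeviceContext` and `EigenDeviceName` are hypothetical names, not the Paddle API, and a real implementation would return an actual Eigen device type.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an Eigen device type (assumption, not Eigen's API).
struct FakeEigenCpuDevice {
  std::string name = "eigen-cpu";
};

// The base class keeps only device-agnostic operations.
class DeviceContext {
 public:
  virtual ~DeviceContext() {}
  // Simplified stand-in for GetPlace().
  virtual std::string GetPlaceName() const = 0;
  virtual void Wait() const {}
};

// A backend that supports Eigen exposes its device on the derived class.
class CPUDeviceContext : public DeviceContext {
 public:
  std::string GetPlaceName() const override { return "CPU"; }
  const FakeEigenCpuDevice* eigen_device() const { return &eigen_device_; }

 private:
  FakeEigenCpuDevice eigen_device_;
};

// A backend without Eigen support simply has no such accessor.
class NoEigenDeviceContext : public DeviceContext {
 public:
  std::string GetPlaceName() const override { return "NoEigen"; }
};

// Code that needs Eigen asks for the concrete context type, so passing
// a context without Eigen support fails at compile time.
inline std::string EigenDeviceName(const CPUDeviceContext& ctx) {
  assert(ctx.eigen_device() != nullptr);
  return ctx.eigen_device()->name;
}
```

With this layout, calling `EigenDeviceName` with a `NoEigenDeviceContext` does not compile, which is the point of removing the Eigen accessor from the base class.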

The math functors in the operators/math directory take Place as a template parameter.

Paddle/paddle/operators/math/math_function.cu

Lines 24 to 45 in 00b64f6

    
    template <>
    void gemm<platform::GPUPlace, float>(const platform::DeviceContext& context,
                                         const CBLAS_TRANSPOSE transA,
                                         const CBLAS_TRANSPOSE transB, const int M,
                                         const int N, const int K,
                                         const float alpha, const float* A,
                                         const float* B, const float beta,
                                         float* C) {
      // Note that cublas follows fortran order, so the order is different from
      // the cblas convention.
      int lda = (transA == CblasNoTrans) ? K : M;
      int ldb = (transB == CblasNoTrans) ? N : K;
      cublasOperation_t cuTransA =
          (transA == CblasNoTrans) ? CUBLAS_OP_N : CUBLAS_OP_T;
      cublasOperation_t cuTransB =
          (transB == CblasNoTrans) ? CUBLAS_OP_N : CUBLAS_OP_T;
      PADDLE_ENFORCE(platform::dynload::cublasSgemm(
          reinterpret_cast<const platform::CUDADeviceContext&>(context)
              .cublas_handle(),
          cuTransB, cuTransA, N, M, K, &alpha, B, ldb, A, lda, &beta, C, N));
    }

We have to cast the DeviceContext to CUDADeviceContext even though we already know that we are implementing the CUDA version of the functor.
Instead, the functor should take DeviceContext as its template parameter, so that each specialization receives the concrete context type directly.

Likewise, the template parameter of OpKernel should be DeviceContext instead of Place.
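The proposal can be sketched as follows. This is an illustration under stated assumptions, not the code from the referenced PR: `FakeCUDADeviceContext`, its `int` handle, `NaiveGemm`, `Gemm` and `MatMulKernel` are hypothetical names, and the "CUDA" specialization runs on the host so that the sketch compiles without CUDA. What it shows is that a functor specialized on the context type receives the concrete context, so no `reinterpret_cast` is needed.

```cpp
#include <cassert>
#include <vector>

class DeviceContext {
 public:
  virtual ~DeviceContext() {}
};

class CPUDeviceContext : public DeviceContext {};

// Hypothetical stand-in for CUDADeviceContext; an int plays the role of
// cublasHandle_t so the sketch builds without CUDA.
class FakeCUDADeviceContext : public DeviceContext {
 public:
  int cublas_handle() const { return 42; }
};

// Row-major C = A * B with A of shape MxK and B of shape KxN (host code).
template <typename T>
void NaiveGemm(int M, int N, int K, const T* A, const T* B, T* C) {
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      T sum = T(0);
      for (int k = 0; k < K; ++k) sum += A[i * K + k] * B[k * N + j];
      C[i * N + j] = sum;
    }
  }
}

// The functor is parameterized on the context type instead of on Place.
template <typename DeviceContextT, typename T>
struct Gemm;

template <typename T>
struct Gemm<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext&, int M, int N, int K, const T* A,
                  const T* B, T* C) const {
    NaiveGemm(M, N, K, A, B, C);
  }
};

template <typename T>
struct Gemm<FakeCUDADeviceContext, T> {
  void operator()(const FakeCUDADeviceContext& ctx, int M, int N, int K,
                  const T* A, const T* B, T* C) const {
    // The concrete context is available directly, with no cast.
    int handle = ctx.cublas_handle();
    assert(handle == 42);
    // A real specialization would pass the handle to cuBLAS here; this
    // sketch falls back to the host loop.
    NaiveGemm(M, N, K, A, B, C);
  }
};

// The kernel is parameterized on the context type as well.
template <typename DeviceContextT, typename T>
class MatMulKernel {
 public:
  void Compute(const DeviceContextT& ctx, int M, int N, int K, const T* A,
               const T* B, T* C) const {
    Gemm<DeviceContextT, T>()(ctx, M, N, K, A, B, C);
  }
};
```

A kernel instantiated as `MatMulKernel<CPUDeviceContext, float>` or `MatMulKernel<FakeCUDADeviceContext, float>` selects the matching `Gemm` specialization at compile time.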

This was referenced on Dec 8, 2017:

Multi-device support #6403 (Closed)

Refine device context #6433 (Merged)

QiJune closed this as completed in #6433 Dec 12, 2017
