
            chaogu --- A Person, Writ Large!

            LOG-2011-04

             

            //============================================================

            //============================================================

            DATE:2011-4-12

            TIME:01:18

            ICBC.pdf - finished

            //============================================================

            //============================================================

            DATE:2011-4-15

            TIME:00:00

            Reading “NoSQL Databases”

               Reasons for using NoSQL:

            1. Avoidance of Unneeded Complexity

            2. High Throughput

            3. Horizontal Scalability and Running on Commodity Hardware

            4. Avoidance of Expensive Object-Relational Mapping

            5. Complexity and Cost of Setting up Database Clusters

            6. Compromising Reliability for Better Performance

            7. The Current “One size fit’s it all” Databases Thinking Was and Is Wrong

            8. The Myth of Effortless Distribution and Partitioning of Centralized Data Models

            9. Movements in Programming Languages and Development Frameworks

            10. Requirements of Cloud Computing

            11. The RDBMS plus Caching-Layer Pattern/Workaround vs. Systems Built from Scratch with Scalability in Mind

            12. Yesterday’s vs. Today’s Needs

            Nosqldbs.pdf ---- page 19

             

            //============================================================
            //============================================================
            DATE:2011-4-16

            TIME:00:24

            Reading cudaArticle 05

            A multiprocessor takes four clock cycles to issue one memory instruction for a "warp".

            Accessing local or global memory incurs an additional 400 to 600 clock cycles of memory latency.

            -----------------------------------

            CUDA Memory

            Registers:

            The fastest form of memory on the multi-processor.

            Is only accessible by the thread.

            Has the lifetime of the thread.

            Shared Memory:

            Can be as fast as a register when there are no bank conflicts or when reading from the same address.

            Accessible by any thread of the block from which it was created.

            Has the lifetime of the block.

            Global memory:

            Potentially 150x slower than register or shared memory -- watch out for uncoalesced reads and writes which will be discussed in the next column.

            Accessible from either the host or device.

            Has the lifetime of the application.

            Local memory:

            A potential performance gotcha: it resides in global memory and can be 150x slower than register or shared memory.

            Is only accessible by the thread.

            Has the lifetime of the thread.

             

            // includes, system
            #include <stdio.h>
            #include <stdlib.h>  // malloc, free, exit, EXIT_FAILURE
            #include <assert.h>
             
            // Simple utility function to check for CUDA runtime errors
            void checkCUDAError(const char* msg);
             
            // Part 2 of 2: implement the fast kernel using shared memory
            __global__ void reverseArrayBlock(int *d_out, int *d_in)
            {
                extern __shared__ int s_data[];
             
                int inOffset = blockDim.x * blockIdx.x;
                int in = inOffset + threadIdx.x;
             
                // Load one element per thread from device memory and store it 
                // *in reversed order* into temporary shared memory
                s_data[blockDim.x - 1 - threadIdx.x] = d_in[in];
             
                // Block until all threads in the block have written
                // their data to shared memory
                __syncthreads();
             
                // write the data from shared memory in forward order, 
                // but to the reversed block offset as before
             
                int outOffset = blockDim.x * (gridDim.x - 1 - blockIdx.x);
             
                int out = outOffset + threadIdx.x;
                d_out[out] = s_data[threadIdx.x];
            }
             
            ////////////////////////////////////////////////////////////////////
            // Program main
            ////////////////////////////////////////////////////////////////////
            int main( int argc, char** argv) 
            {
                // pointer for host memory and size
                int *h_a;
                int dimA = 256 * 1024; // 256K elements (1MB total)
             
                // pointer for device memory
                int *d_b, *d_a;
             
                // define grid and block size
                int numThreadsPerBlock = 256;
             
                // Compute number of blocks needed based on array size
                // and desired block size (dimA must be a multiple of numThreadsPerBlock)
                int numBlocks = dimA / numThreadsPerBlock;
             
                // Part 1 of 2: Compute the number of bytes of shared memory needed
                // This is used in the kernel invocation below
                int sharedMemSize = numThreadsPerBlock * sizeof(int);
             
                // allocate host and device memory
                size_t memSize = numBlocks * numThreadsPerBlock * sizeof(int);
                h_a = (int *) malloc(memSize);
                cudaMalloc( (void **) &d_a, memSize );
                cudaMalloc( (void **) &d_b, memSize );
             
                // Initialize input array on host
                for (int i = 0; i < dimA; ++i) {
                    h_a[i] = i;
                }
             
                // Copy host array to device array
                cudaMemcpy( d_a, h_a, memSize, cudaMemcpyHostToDevice );
             
                // launch kernel
                dim3 dimGrid(numBlocks);
                dim3 dimBlock(numThreadsPerBlock);
                reverseArrayBlock<<< dimGrid, dimBlock, sharedMemSize >>>( d_b, d_a );
             
                // block until the device has completed
                // (cudaDeviceSynchronize replaces the deprecated cudaThreadSynchronize)
                cudaDeviceSynchronize();
             
                // check if kernel execution generated an error
                // Check for any CUDA errors
                checkCUDAError("kernel invocation");
             
                // device to host copy
                cudaMemcpy( h_a, d_b, memSize, cudaMemcpyDeviceToHost );
             
                // Check for any CUDA errors
                checkCUDAError("memcpy");
             
                // verify the data returned to the host is correct
                for (int i = 0; i < dimA; i++){
                    assert(h_a[i] == dimA - 1 - i );
                }
             
                // free device memory
                cudaFree(d_a);
                cudaFree(d_b);
             
                // free host memory
                free(h_a);
             
                // If the program makes it this far,
                // then the results are correct and
                // there are no run-time errors. Good work!
                printf("Correct!\n");
             
                return 0;
            }
             
            void checkCUDAError(const char *msg)
            {
                cudaError_t err = cudaGetLastError();
                if( cudaSuccess != err) 
                {
                    fprintf(stderr, "Cuda error: %s: %s.\n", msg, 
                                      cudaGetErrorString( err) );
                    exit(EXIT_FAILURE);
                }                         
            }

             

            //============================================================

            TIME:01:16

            Finished reading cudaArticle 06

             

            //============================================================

            DATE:2011-4-23

            TIME:09:31

            Reading the Berkeley view on cloud computing

               Page 10: classes of utility computing

             

            //============================================================

            DATE:2011-4-24

            TIME:00:16

            Reading Makefile.pdf

             

            --------------------------------------------------------------

            List the macros specified by default (Makefile)

               Using: make -p

            $@ name of the target

            $? list of dependencies more recent than the target

            $^ all dependencies, whether or not they are more recent than the target

            $+ same as $^, but keeps duplicate names

            $< the first dependency

             

            --------------------------------------------------------------

            Reading the Berkeley view on cloud computing

               Page 19, Obstacle Number 5: Performance Unpredictability

             

            //============================================================

            //============================================================

            DATE:2011-4-25

            TIME:01:40

            Finished reading the Berkeley view on cloud computing

             

            //============================================================

            //============================================================

            DATE:2011-4-28

            TIME:21:22

            Coding the motion project

            Visual Studio 2005 reports a stack overflow error:

            “Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.”

             

            --------------------------------------------------------------

            'motion.exe': Unloaded 'C:\WINDOWS\WinSxS\x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.4053_x-ww_e6967989\msvcr80.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\psapi.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\shimeng.dll'

            First-chance exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            The program '[2388] motion.exe: Native' has exited with code 0 (0x0).

            --------------------------------------------------------------

            Problem: using a huge object, which overflows the stack

             

            //============================================================

            //============================================================

            DATE:2011-4-30

            TIME:01:40

            Coding CSE332 project 2

               Adding other data-counter Implementations

             

            posted on 2011-05-03 21:57 by chaogu
