
            chaogu --- A Person Writ Large!

            LOG-2011-04

             

            //============================================================

            //============================================================

            DATE:2011-4-12

            TIME:01:18

            ICBC.pdf - finished

            //============================================================

            //============================================================

            DATE:2011-4-15

            TIME:00:00

            Reading “NoSQL Databases”

               Reasons to use NoSQL

            1. Avoidance of Unneeded Complexity

            2. High Throughput

            3. Horizontal Scalability and Running on Commodity Hardware

            4. Avoidance of Expensive Object-Relational Mapping

            5. Complexity and Cost of Setting up Database Clusters

            6. Compromising Reliability for Better Performance

            7. The Current “One Size Fits All” Database Thinking Was and Is Wrong

            8. The Myth of Effortless Distribution and Partitioning of Centralized Data Models

            9. Movements in Programming Languages and Development Frameworks

            10. Requirements of Cloud Computing

            11. The RDBMS plus Caching-Layer Pattern/Workaround vs. Systems Built from Scratch with Scalability in Mind

            12. Yesterday’s vs. Today’s Needs

            Nosqldbs.pdf ---- page 19

             

            //============================================================
            //============================================================
            DATE:2011-4-16

            TIME:00:24

            Reading the cudaArticle—05

            A multiprocessor takes four clock cycles to issue one memory instruction for a "warp"

            Accessing local or global memory incurs an additional 400 to 600 clock cycles of memory latency

            -----------------------------------

            CUDA Memory

            Registers:

            The fastest form of memory on the multi-processor.

            Is only accessible by the thread.

            Has the lifetime of the thread.

            Shared Memory:

            Can be as fast as a register when there are no bank conflicts or when reading from the same address.

            Accessible by any thread of the block from which it was created.

            Has the lifetime of the block.

            Global memory:

            Potentially 150x slower than register or shared memory -- watch out for uncoalesced reads and writes which will be discussed in the next column.

            Accessible from either the host or device.

            Has the lifetime of the application.

            Local memory:

            A potential performance gotcha: it resides in global memory and can be 150x slower than register or shared memory.

            Is only accessible by the thread.

            Has the lifetime of the thread.

             

            // includes, system
            #include <stdio.h>
            #include <stdlib.h>   // malloc, free, exit
            #include <assert.h>
             
            // Simple utility function to check for CUDA runtime errors
            void checkCUDAError(const char* msg);
             
            // Part 2 of 2: implement the fast kernel using shared memory
            __global__ void reverseArrayBlock(int *d_out, int *d_in)
            {
                extern __shared__ int s_data[];
             
                int inOffset = blockDim.x * blockIdx.x;
                int in = inOffset + threadIdx.x;
             
                // Load one element per thread from device memory and store it 
                // *in reversed order* into temporary shared memory
                s_data[blockDim.x - 1 - threadIdx.x] = d_in[in];
             
                // Block until all threads in the block have written
                // their data to shared memory
                __syncthreads();
             
                // write the data from shared memory in forward order, 
                // but to the reversed block offset as before
             
                int outOffset = blockDim.x * (gridDim.x - 1 - blockIdx.x);
             
                int out = outOffset + threadIdx.x;
                d_out[out] = s_data[threadIdx.x];
            }
             
            ////////////////////////////////////////////////////////////////////
            // Program main
            ////////////////////////////////////////////////////////////////////
            int main( int argc, char** argv) 
            {
                // pointer for host memory and size
                int *h_a;
                int dimA = 256 * 1024; // 256K elements (1MB total)
             
                // pointer for device memory
                int *d_b, *d_a;
             
                // define grid and block size
                int numThreadsPerBlock = 256;
             
                // Compute number of blocks needed based on array size
                // and desired block size
                int numBlocks = dimA / numThreadsPerBlock; 
             
                // Part 1 of 2: Compute the number of bytes of shared memory needed
                // This is used in the kernel invocation below
                int sharedMemSize = numThreadsPerBlock * sizeof(int);
             
                // allocate host and device memory
                size_t memSize = numBlocks * numThreadsPerBlock * sizeof(int);
                h_a = (int *) malloc(memSize);
                cudaMalloc( (void **) &d_a, memSize );
                cudaMalloc( (void **) &d_b, memSize );
             
                // Initialize input array on host
                for (int i = 0; i < dimA; ++i) {
                    h_a[i] = i;
                }
             
                // Copy host array to device array
                cudaMemcpy( d_a, h_a, memSize, cudaMemcpyHostToDevice );
             
                // launch kernel
                dim3 dimGrid(numBlocks);
                dim3 dimBlock(numThreadsPerBlock);
                reverseArrayBlock<<< dimGrid, dimBlock, sharedMemSize >>>( d_b, d_a );
             
                // block until the device has completed
                // (cudaDeviceSynchronize replaces the deprecated cudaThreadSynchronize)
                cudaDeviceSynchronize();
             
                // check if kernel execution generated an error
                checkCUDAError("kernel invocation");
             
                // device to host copy
                cudaMemcpy( h_a, d_b, memSize, cudaMemcpyDeviceToHost );
             
                // Check for any CUDA errors
                checkCUDAError("memcpy");
             
                // verify the data returned to the host is correct
                for (int i = 0; i < dimA; i++){
                    assert(h_a[i] == dimA - 1 - i );
                }
             
                // free device memory
                cudaFree(d_a);
                cudaFree(d_b);
             
                // free host memory
                free(h_a);
             
                // If the program makes it this far,
                // then the results are correct and
                // there are no run-time errors. Good work!
                printf("Correct!\n");
             
                return 0;
            }
             
            void checkCUDAError(const char *msg)
            {
                cudaError_t err = cudaGetLastError();
                if( cudaSuccess != err) 
                {
                    fprintf(stderr, "Cuda error: %s: %s.\n", msg, 
                                      cudaGetErrorString( err) );
                    exit(EXIT_FAILURE);
                }                         
            }
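The index arithmetic of `reverseArrayBlock` can be checked on the host without a GPU. The following is only a sketch in plain C++: the function name `reverseByBlocks` and its parameters are hypothetical, with `numBlocks` and `threadsPerBlock` standing in for `gridDim.x` and `blockDim.x`, and a small vector playing the role of `s_data`. It emulates the threads of each block sequentially, so it says nothing about performance or synchronization, only about where each element ends up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of the kernel's two phases (assumed names, not CUDA API).
std::vector<int> reverseByBlocks(const std::vector<int>& in,
                                 int numBlocks, int threadsPerBlock)
{
    // The kernel assumes the array length is exactly blocks * threads.
    assert(in.size() ==
           static_cast<std::size_t>(numBlocks) * threadsPerBlock);

    std::vector<int> out(in.size());
    std::vector<int> shared(threadsPerBlock);  // plays the role of s_data

    for (int block = 0; block < numBlocks; ++block) {
        // Phase 1: each "thread" stores its element reversed within the block.
        int inOffset = threadsPerBlock * block;
        for (int t = 0; t < threadsPerBlock; ++t) {
            shared[threadsPerBlock - 1 - t] = in[inOffset + t];
        }

        // On the device, __syncthreads() separates the two phases.

        // Phase 2: write in forward order to the mirrored block offset.
        int outOffset = threadsPerBlock * (numBlocks - 1 - block);
        for (int t = 0; t < threadsPerBlock; ++t) {
            out[outOffset + t] = shared[t];
        }
    }
    return out;
}
```

Reversing inside each block and then mirroring the block position gives a full reversal of the array, which is what the `assert` loop in `main` verifies for the GPU result.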

             

            //============================================================

            TIME:01:16

            Finished reading cudaArticle 06

             

            //============================================================

            DATE:2011-4-23

            TIME:09:31

            Reading the Berkeley view on cloud computing

               Page 10: classes of utility computing

             

            //============================================================

            DATE:2011-4-24

            TIME:00:16

            Reading Makefile.pdf

             

            --------------------------------------------------------------

            List the macros defined by default (Makefile)

               Using: make -p

            $@ name of the target

            $? list of dependencies more recent than the target

            $^ all dependencies, whether or not they are more recent than the target (duplicates removed)

            $+ same as $^, but keeps duplicate names

            $< the first dependency

             

            --------------------------------------------------------------

            Reading the Berkeley view on cloud computing

               Page 19, Obstacle Number 5: Performance Unpredictability

             

            //============================================================

            //============================================================

            DATE:2011-4-25

            TIME:01:40

            Finished reading the Berkeley view on cloud computing

             

            //============================================================

            //============================================================

            DATE:2011-4-28

            TIME:21:22

            Coding the motion project

            Visual Studio 2005 reports a stack overflow error:

            “Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.”

             

            --------------------------------------------------------------

            'motion.exe': Unloaded 'C:\WINDOWS\WinSxS\x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.4053_x-ww_e6967989\msvcr80.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\psapi.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\shimeng.dll'

            First-chance exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            The program '[2388] motion.exe: Native' has exited with code 0 (0x0).

            --------------------------------------------------------------

            Problem: using a huge object
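The debugger output does not show where the object lived, but a 0xC00000FD exception is commonly caused by a very large object in automatic (stack) storage; the Visual C++ linker reserves 1 MB of stack per thread by default. Assuming that is what happened here, one usual remedy is to move the storage to the heap. The sketch below is illustrative only: `kFrameBytes` and `processOnHeap` are hypothetical names, and the 8 MB size is an arbitrary stand-in for the real object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical size of the "huge object": 8 MB, well above the
// 1 MB default stack reserved by the Visual C++ linker.
const std::size_t kFrameBytes = 8 * 1024 * 1024;

// Risky variant, shown only as a comment: the whole array would be
// placed on the stack and could overflow it.
//   void processOnStack() { unsigned char frame[kFrameBytes]; /* ... */ }

// Safer variant: std::vector stores its elements on the heap, so only
// the small vector object itself occupies stack space.
std::size_t processOnHeap()
{
    std::vector<unsigned char> frame(kFrameBytes, 0);  // zero-filled buffer
    assert(!frame.empty());

    // Touch both ends to show the whole range is usable.
    frame[0] = 1;
    frame[kFrameBytes - 1] = 2;

    return frame.size();  // number of bytes held on the heap
}
```

Making the object `static` or raising the stack size with the linker's `/STACK` option are alternatives, but heap allocation keeps the function reentrant and does not depend on build settings.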

             

            //============================================================

            //============================================================

            DATE:2011-4-30

            TIME:01:40

            Coding CSE332 project 2

               Adding other data-counter Implementations

             

            posted on 2011-05-03 21:57 chaogu
