
            chaogu --- A Person Writ Large!

            LOG-2011-04

             

            //============================================================

            //============================================================

            DATE:2011-4-12

            TIME:01:18

            ICBC.pdf - finished

            //============================================================

            //============================================================

            DATE:2011-4-15

            TIME:00:00

            Reading “NoSQL Databases”

               Reasons for using NoSQL

            1. Avoidance of Unneeded Complexity

            2. High Throughput

            3. Horizontal Scalability and Running on Commodity Hardware

            4. Avoidance of Expensive Object-Relational Mapping

            5. Complexity and Cost of Setting up Database Clusters

            6. Compromising Reliability for Better Performance

            7. The Current “One Size Fits All” Database Thinking Was and Is Wrong

            8. The Myth of Effortless Distribution and Partitioning of Centralized Data Models

            9. Movements in Programming Languages and Development Frameworks

            10. Requirements of Cloud Computing

            11. The RDBMS plus Caching-Layer Pattern/Workaround vs. Systems Built from Scratch with Scalability in Mind

            12. Yesterday’s vs. Today’s Needs

            Nosqldbs.pdf ----page19

             

            //============================================================
            //============================================================
            DATE:2011-4-16

            TIME:00:24

            Reading the cudaArticle—05

            A multiprocessor takes four clock cycles to issue one memory instruction for a "warp"

            Accessing local or global memory incurs an additional 400 to 600 clock cycles of memory latency

            -----------------------------------

            CUDA Memory

            Registers:

            The fastest form of memory on the multi-processor.

            Is only accessible by the thread.

            Has the lifetime of the thread.

            Shared Memory:

            Can be as fast as a register when there are no bank conflicts or when reading from the same address.

            Accessible by any thread of the block from which it was created.

            Has the lifetime of the block.

            Global memory:

            Potentially 150x slower than register or shared memory -- watch out for uncoalesced reads and writes which will be discussed in the next column.

            Accessible from either the host or device.

            Has the lifetime of the application.

            Local memory:

            A potential performance gotcha: it resides in global memory and can be 150x slower than register or shared memory.

            Is only accessible by the thread.

            Has the lifetime of the thread.

             

            // includes, system
            #include <stdio.h>
            #include <stdlib.h>   // malloc, free, exit, EXIT_FAILURE
            #include <assert.h>
             
            // Simple utility function to check for CUDA runtime errors
            void checkCUDAError(const char* msg);
             
            // Part 2 of 2: implement the fast kernel using shared memory
            __global__ void reverseArrayBlock(int *d_out, int *d_in)
            {
                extern __shared__ int s_data[];
             
                int inOffset = blockDim.x * blockIdx.x;
                int in = inOffset + threadIdx.x;
             
                // Load one element per thread from device memory and store it 
                // *in reversed order* into temporary shared memory
                s_data[blockDim.x - 1 - threadIdx.x] = d_in[in];
             
                // Block until all threads in the block have written
                // their data to shared memory
                __syncthreads();
             
                // write the data from shared memory in forward order, 
                // but to the reversed block offset as before
             
                int outOffset = blockDim.x * (gridDim.x - 1 - blockIdx.x);
             
                int out = outOffset + threadIdx.x;
                d_out[out] = s_data[threadIdx.x];
            }
             
            ////////////////////////////////////////////////////////////////////
            // Program main
            ////////////////////////////////////////////////////////////////////
            int main( int argc, char** argv) 
            {
                // pointer for host memory and size
                int *h_a;
                int dimA = 256 * 1024; // 256K elements (1MB total)
             
                // pointer for device memory
                int *d_b, *d_a;
             
                // define grid and block size
                int numThreadsPerBlock = 256;
             
                // Compute the number of blocks needed based on array size
                // and desired block size
                int numBlocks = dimA / numThreadsPerBlock; 
             
                // Part 1 of 2: Compute the number of bytes of shared memory needed
                // This is used in the kernel invocation below
                int sharedMemSize = numThreadsPerBlock * sizeof(int);
             
                // allocate host and device memory
                size_t memSize = numBlocks * numThreadsPerBlock * sizeof(int);
                h_a = (int *) malloc(memSize);
                cudaMalloc( (void **) &d_a, memSize );
                cudaMalloc( (void **) &d_b, memSize );
             
                // Initialize input array on host
                for (int i = 0; i < dimA; ++i) {
                    h_a[i] = i;
                }
             
                // Copy host array to device array
                cudaMemcpy( d_a, h_a, memSize, cudaMemcpyHostToDevice );
             
                // launch kernel
                dim3 dimGrid(numBlocks);
                dim3 dimBlock(numThreadsPerBlock);
                reverseArrayBlock<<< dimGrid, dimBlock, sharedMemSize >>>( d_b, d_a );
             
                // block until the device has completed
                // (cudaThreadSynchronize() is deprecated in favor of cudaDeviceSynchronize())
                cudaDeviceSynchronize();
             
                // check if kernel execution generated an error
                // Check for any CUDA errors
                checkCUDAError("kernel invocation");
             
                // device to host copy
                cudaMemcpy( h_a, d_b, memSize, cudaMemcpyDeviceToHost );
             
                // Check for any CUDA errors
                checkCUDAError("memcpy");
             
                // verify the data returned to the host is correct
                for (int i = 0; i < dimA; i++){
                    assert(h_a[i] == dimA - 1 - i );
                }
             
                // free device memory
                cudaFree(d_a);
                cudaFree(d_b);
             
                // free host memory
                free(h_a);
             
                // If the program makes it this far,
                // then the results are correct and
                // there are no run-time errors. Good work!
                printf("Correct!\n");
             
                return 0;
            }
             
            void checkCUDAError(const char *msg)
            {
                cudaError_t err = cudaGetLastError();
                if( cudaSuccess != err) 
                {
                    fprintf(stderr, "Cuda error: %s: %s.\n", msg, 
                                      cudaGetErrorString( err) );
                    exit(EXIT_FAILURE);
                }                         
            }
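
            Note to self: the index arithmetic in reverseArrayBlock can be checked without a GPU. The sketch below is a host-side C++ emulation, not the article's code: the function name reverseByBlocks and its parameters are my own, the loop over blocks and threads runs sequentially, and finishing the first inner loop before starting the second stands in for __syncthreads(). It assumes the input length is exactly numBlocks * threadsPerBlock, as in the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of the reverseArrayBlock index arithmetic.
// Hypothetical helper: names and signature are illustrative only.
// Assumes in.size() == numBlocks * threadsPerBlock.
std::vector<int> reverseByBlocks(const std::vector<int>& in,
                                 std::size_t numBlocks,
                                 std::size_t threadsPerBlock)
{
    assert(in.size() == numBlocks * threadsPerBlock);
    std::vector<int> out(in.size());
    std::vector<int> shared(threadsPerBlock);  // stands in for s_data

    for (std::size_t block = 0; block < numBlocks; ++block) {
        // Phase 1: each "thread" stores its element reversed within the block.
        const std::size_t inOffset = threadsPerBlock * block;
        for (std::size_t t = 0; t < threadsPerBlock; ++t) {
            shared[threadsPerBlock - 1 - t] = in[inOffset + t];
        }

        // Completing phase 1 before phase 2 plays the role of __syncthreads().

        // Phase 2: write in forward order to the mirrored block offset.
        const std::size_t outOffset = threadsPerBlock * (numBlocks - 1 - block);
        for (std::size_t t = 0; t < threadsPerBlock; ++t) {
            out[outOffset + t] = shared[t];
        }
    }
    return out;
}
```

            Reversing inside each block and then mirroring the block offset gives a full reversal, which is why the kernel can keep both its global reads and writes in forward (coalescing-friendly) order.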

             

            //============================================================

            TIME:01:16

            Finished reading the cudaArticle 06

             

            //============================================================

            DATE:2011-4-23

            TIME:09:31

            Reading the Berkeley view on cloud computing

               Page 10: classes of utility computing

             

            //============================================================

            DATE:2011-4-24

            TIME:00:16

            Reading Makefile.pdf

             

            --------------------------------------------------------------

            List the macros defined by default (Makefile)

               Using: make -p

            $@ name of the target

            $? list of dependencies that are more recent than the target

            $^ all dependencies, whether or not they are more recent than the target

            $+ same as $^, but keeps duplicate names

            $< the first dependency

             

            --------------------------------------------------------------

            Reading the Berkeley view on cloud computing

               Page 19, Obstacle Number 5: Performance Unpredictability

             

            //============================================================

            //============================================================

            DATE:2011-4-25

            TIME:01:40

            Finished reading the Berkeley view on cloud computing

             

            //============================================================

            //============================================================

            DATE:2011-4-28

            TIME:21:22

            Coding the motion project

            Visual Studio 2005 reports a stack overflow error:

            “Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.”

             

            --------------------------------------------------------------

            'motion.exe': Unloaded 'C:\WINDOWS\WinSxS\x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.4053_x-ww_e6967989\msvcr80.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\psapi.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\shimeng.dll'

            First-chance exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            The program '[2388] motion.exe: Native' has exited with code 0 (0x0).

            --------------------------------------------------------------

            Problem: using a very large object
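
            A possible fix, assuming the large object was a local (stack) variable: move its storage to the heap. The log does not show the real type in motion.exe, so the names kSampleCount and sumOfSamples below are hypothetical stand-ins. The sketch relies on the MSVC linker's default stack size of 1 MB, which a multi-megabyte local array would exceed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical size: 4M doubles is about 32 MB, far above a 1 MB default stack.
const std::size_t kSampleCount = 4 * 1024 * 1024;

// A local array such as `double samples[kSampleCount];` would live on the
// stack and could raise 0xC00000FD. std::vector keeps only a small handle on
// the stack and places the elements on the heap.
double sumOfSamples(std::size_t count)
{
    std::vector<double> samples(count, 0.5);  // heap-backed storage
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        total += samples[i];
    }
    return total;
}
```

            Calling sumOfSamples(kSampleCount) allocates and frees the buffer inside the call. Raising the stack limit with the linker's /STACK option is another way out, but heap allocation avoids depending on a build setting.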

             

            //============================================================

            //============================================================

            DATE:2011-4-30

            TIME:01:40

            Coding CSE332 project 2

               Adding other data-counter implementations

             

            posted on 2011-05-03 21:57 chaogu
