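http://www.shnenglu.com/flashboy/archive/2009/12/05/102609.html
RedLight, Sat, 05 Dec 2009 09:36:00 GMT

Recently I was assigned to a small team whose project is a motion-sensing game. We had considered building it on the open-source Ogre engine, but I am not familiar with it, and I was told it also needs the third-party library CEGUI plus a fair amount of configuration. Just hearing that gave me a headache: learning it alone would take half a month, and our project had only two months. So I gave up on that idea. Didn't I still have a pile of miscellaneous code of my own? I made up my mind to go for it. Even if it meant more work, having a game engine of our own would make things easier later on.
After two months of hard work with one other colleague, with some difficulties along the way and frequent overtime, the project was finally finished.

A game engine prototype has more or less been born. We named it RedLight. It currently implements:
(0) A Win32 application rendering framework
(1) The basic UI message flow, XML-based window configuration, UI skin configuration, a basic UI control library, and seamless UI tiling across multiple resolutions
(2) Outdoor scene management
(3) Camera paths and cameras
(4) An export plug-in for 3D Max 8 models and skeletal animation
(5) Model rendering, with keyframe animation and skeletal animation control
(6) A basic water reflection effect
(7) A simple particle system
(8) A sound control interface

This is the prototype of a single-player game engine. Its features are limited, but it fulfilled its mission and shipped a project. It will be extended over time to make it more capable.

RedLight 2009-12-05 17:36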
]]>
BigWorld Engine: A First Look
http://www.shnenglu.com/flashboy/archive/2009/11/14/100914.html
RedLight, Sat, 14 Nov 2009 05:16:00 GMT

RedLight 2009-11-14 13:16
]]>
Translating Common APIs Between D3D and OpenGL
http://www.shnenglu.com/flashboy/archive/2009/11/14/100912.html
RedLight, Sat, 14 Nov 2009 04:49:00 GMT

As a 3D programmer I used OpenGL for two years. Recently I have been working on a project that moves from OpenGL to D3D. Although the project's outer framework is well encapsulated, converting completely from OpenGL to D3D still looked difficult. It took me two weeks, but the conversion is finally done.
Some of the more obvious differences between D3D and OpenGL:
(1) With an orthographic projection, OpenGL puts (0,0) at the top-left corner of the screen, whereas D3D puts (0,0) at the center of the screen.
(2) OpenGL uses a right-handed coordinate system, whereas D3D uses a left-handed one.
(3) OpenGL's rotation calls take their angle parameter in degrees, whereas D3D takes radians, so remember to convert with PI * Angle / 180.

The specific API correspondences are listed below (not complete; more will be added later).
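Point (3) above is the easiest one to get wrong when porting, so here is a minimal sketch of the conversion. `DegToRad` is a hypothetical helper name, not something from D3D, D3DX, or OpenGL.

```cpp
#include <cmath>

// Hypothetical helper (not part of D3D or OpenGL): converts an angle in
// degrees, as passed to glRotatef, into radians, as expected by the
// D3DXMatrixRotation* functions.
inline float DegToRad(float degrees)
{
    const float kPi = 3.14159265358979f;
    return kPi * degrees / 180.0f;
}
```

A call such as `glRotatef(90.0f, 0, 1, 0)` would then correspond to building a Y rotation from `DegToRad(90.0f)` on the D3D side.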


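1. Coordinate transforms
 D3DXVECTOR3 pos(0, 2, -1.5f);   // camera position
 D3DXVECTOR3 at(0, 0, 0);        // point the camera looks at
 D3DXVECTOR3 up(0, 1, 0);        // up direction
 D3DXMATRIX view;
 D3DXMatrixLookAtLH(&view, &pos, &at, &up);      //gluLookAt(...), but left-handed
 pd3dDevice->SetTransform(D3DTS_VIEW, &view);

2. Drawing
 pd3dDevice->SetRenderState(D3DRS_FILLMODE, D3DFILL_WIREFRAME);  //glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

 DrawPrimitive()
 DrawIndexedPrimitive()

 DrawPrimitiveUP()
 DrawIndexedPrimitiveUP()

3. Color


4. Fragment tests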

 (1) Depth test
 g_pDevice->SetRenderState(D3DRS_ZENABLE, TRUE);           //glEnable(GL_DEPTH_TEST);  
 g_pDevice->SetRenderState(D3DRS_ZFUNC, D3DCMP_LESSEQUAL);    //glDepthFunc(GL_LEQUAL);
 //--------------------------------------------------------------------------------------------------------
 g_pDevice->SetRenderState(D3DRS_CULLMODE, D3DCULL_CCW);      //glEnable(GL_CULL_FACE);

 
 (2) Alpha test
 //-------------------------------------------------------------------------------------------------------- 
 g_pDevice->SetRenderState(D3DRS_ALPHATESTENABLE, TRUE);                //glEnable(GL_ALPHA_TEST);
 g_pDevice->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_GREATER);    //glAlphaFunc(GL_GREATER, 0.1f);
 g_pDevice->SetRenderState(D3DRS_ALPHAREF, (DWORD)(0.1f * 255)); // value range is 0 ~ 255
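
The alpha reference is where the two APIs use different units: glAlphaFunc takes a float in [0, 1], while D3DRS_ALPHAREF takes an integer in 0 ~ 255. A small sketch of that mapping follows. `AlphaRefFromFloat` is a hypothetical helper, and it rounds to nearest, whereas the cast in the listing above truncates.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps an OpenGL-style alpha reference in [0, 1]
// to the 0..255 integer that D3DRS_ALPHAREF expects.
// Out-of-range inputs are clamped; the result is rounded to nearest.
inline std::uint32_t AlphaRefFromFloat(float ref)
{
    if (ref < 0.0f) ref = 0.0f;
    if (ref > 1.0f) ref = 1.0f;
    return static_cast<std::uint32_t>(std::lround(ref * 255.0f));
}
```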
 
 (3) Clip test (plane clipping)
 //--------------------------------------------------------------------------------------------------------
 // Enable clip plane for reflection map
 CMatrix44f pWorldViewProjIT=m_pWorldViewProj;
 //pWorldViewProjIT.Transpose();
 pWorldViewProjIT.Invert();   

 // Transform plane to clip-space
 float pClipSpacePlane[4];
 float pClipPlane[]= { 0, 0, 1, 0};   

 // Check if camera is below water surface, if so invert clip plane
 CVector3f pEye=(CVector3f)m_pCamera.GetPosition();
 if(-pEye.m_fZ<0.0)
 {
  pClipPlane[2]=-pClipPlane[2];
 }

 MatrixTransformPlane(pClipSpacePlane, pClipPlane, pWorldViewProjIT);

 // enable clip plane now
 g_pDevice->SetClipPlane(0, pClipSpacePlane);  
 g_pDevice->SetRenderState(D3DRS_CLIPPLANEENABLE, 1);
 
 
 
 (4) Stencil test
 //--------------------------------------------------------------------------------------------------------
 g_pDevice->SetRenderState(D3DRS_STENCILENABLE, TRUE);
 g_pDevice->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
 g_pDevice->SetRenderState(D3DRS_STENCILREF, 0x1); // value range is 0 ~ 255

 g_pDevice->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_KEEP);
 
5. Texture operations
 
 g_pDevice->SetSamplerState( 0, D3DSAMP_MINFILTER, D3DTEXF_LINEAR);
   g_pDevice->SetSamplerState( 0, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR);
   g_pDevice->SetSamplerState( 0, D3DSAMP_MIPFILTER, D3DTEXF_LINEAR);
   
 g_pDevice->SetSamplerState( 0, D3DSAMP_ADDRESSU, D3DTADDRESS_CLAMP);
  g_pDevice->SetSamplerState( 0, D3DSAMP_ADDRESSV, D3DTADDRESS_CLAMP);

6. Buffer operations

 (1) Color buffer
 //--------------------------------------------------------------------------------------------------------
 g_pDevice->SetRenderState(D3DRS_COLORWRITEENABLE, D3DCOLORWRITEENABLE_ALPHA);  // write the alpha channel only
 g_pDevice->SetRenderState(D3DRS_COLORWRITEENABLE, 0x0000000F);                 // write all four channels (R, G, B, A)
 
 (2) Depth buffer
 //--------------------------------------------------------------------------------------------------------
 g_pDevice->SetRenderState(D3DRS_ZENABLE, TRUE);           //glEnable(GL_DEPTH_TEST);
 g_pDevice->SetRenderState(D3DRS_ZWRITEENABLE, TRUE);        //glDepthMask(GL_TRUE);

 (3) Stencil buffer
 //--------------------------------------------------------------------------------------------------------

 (4) Render to texture
 //--------------------------------------------------------------------------------------------------------
 // Render targets
 IDirect3DSurface9 *m_plD3DBackbufferSurf,
  *m_plD3DDepthStencilSurfAA,
  *m_plD3DDepthStencilSurf;

 CRenderTarget *m_pRTRefraction, *m_pRTReflection; // (custom texture class)

 //-----------------------------------------------------------------------------------

 // Get backbuffer
 g_pDevice->GetRenderTarget(0, &m_plD3DBackbufferSurf);
 
 // Get depthstencil
 g_pDevice->GetDepthStencilSurface(&m_plD3DDepthStencilSurfAA);
 

 // Restore previous states
 g_pDevice->SetRenderTarget(0, m_plD3DBackbufferSurf);
 g_pDevice->SetDepthStencilSurface(m_plD3DDepthStencilSurfAA);
 
 // (1) Refraction map -------------------------------------------------------------------------

 // The statement below calls g_pDevice->CreateRenderTarget(iWidth, iHeight, (D3DFORMAT) iFormat, (D3DMULTISAMPLE_TYPE)iAASamples, 0, 0, &m_plD3Surf, 0));
 if(FAILED(m_pRTRefraction->Create(m_fWidth>>1, m_fHeight>>1, D3DFMT_A8R8G8B8)))
 {
  return APP_ERR_INITFAIL;
 }
 
 // Create depthstencil without multisampling
 g_pDevice->CreateDepthStencilSurface(m_fWidth, m_fHeight, D3DFMT_D24X8, (D3DMULTISAMPLE_TYPE)0, 0, 0, &m_plD3DDepthStencilSurf, 0);
 
 
 g_pDevice->SetRenderTarget(0, m_pRTReflection->GetSurface());
 
 g_pDevice->StretchRect(m_plD3DBackbufferSurf, 0, m_pRTRefraction->GetSurface(), 0, D3DTEXF_NONE);
 
 // (2) Reflection map ----------------------------------------------------------------------------------
 m_pRTReflection=new CRenderTarget;
 if(FAILED(m_pRTReflection->Create(m_fWidth>>2, m_fHeight>>2, D3DFMT_A8R8G8B8)))
 {
  return APP_ERR_INITFAIL;
 }
 
 g_pDevice->SetRenderTarget(0, m_pRTReflection->GetSurface());
 
 
 //-----------------------------------------------------------------------------------
 g_pDevice->SetRenderTarget(0, m_pRTReflection->GetSurface());
 g_pDevice->SetDepthStencilSurface(m_plD3DDepthStencilSurf);
 g_pDevice->Clear(0, 0, D3DCLEAR_TARGET|D3DCLEAR_ZBUFFER, D3DCOLOR_ARGB(255, 0, 0, 128), 1.0f, 0);  
 SetViewport(m_pRTReflection->GetWidth(), m_pRTReflection->GetHeight());
 //-----------------------------------------------------------------------------------
 
 D3DXSaveTextureToFile("imageTex.jpg",D3DXIFF_JPG,(IDirect3DTexture9*)m_pWavesBump->GetTexture(),NULL);
 
7. Blending
 g_pDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, FALSE);    //glDisable(GL_BLEND);
 g_pDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);  //glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
 g_pDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_INVSRCALPHA);
 

8. Lighting and materials
 g_pDevice->SetRenderState(D3DRS_LIGHTING, FALSE); //glDisable(GL_LIGHTING);
 
 D3DMATERIAL9 mtrl;
 mtrl.Ambient  = a;
 mtrl.Diffuse  = d;
 mtrl.Specular = s;
 mtrl.Emissive = e;
 mtrl.Power    = p;
 g_pDevice->SetMaterial(&mtrl);  // set the material before setting the texture
 // then set the texture currently in use



RedLight 2009-11-14 12:49
]]>
Commonly Used D3D APIs
http://www.shnenglu.com/flashboy/archive/2009/09/17/96536.html
RedLight, Thu, 17 Sep 2009 07:55:00 GMT

D3D API compendium (all functions)

Direct3D 9.0 SDK development reference: Direct3D 9.0 SDK documentation (Chinese edition)

Glossary

DirectX 8 tutorial

You can also think of a COM object as a complete set of library functions designed around one topic. DX provides a complete set of libraries for building 3D games.

http://baike.baidu.com/view/1169027.htm

To use the different DirectX components you need to link different static libraries. For example, to use the DirectDraw component you need ddraw.lib.

For DirectDraw, the header file is ddraw.h. (COM programming)

D3D.H

http://wowe1314.blog.163.com/blog/static/2358876200751191213583/

In Direct3D programming, the work we have to do can basically be summed up as:

Call the appropriate function to obtain an interface pointer;

Call the interface's methods (member functions) to do the required work;

When finished with the interface, call its Release method to release it. Note that interfaces should be released in the reverse of the order in which they were obtained.
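
The "release in reverse order" rule in the last step can be sketched without any DirectX headers. `FakeInterface` and `ReleaseAll` below are hypothetical stand-ins: real code would hold IUnknown-derived pointers (device, vertex buffer, and so on) and call their own Release().

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a COM interface pointer.
struct FakeInterface {
    std::string name;
    std::vector<std::string>* log;   // records the order of Release() calls
    void Release() { log->push_back(name); }
};

// Releases interfaces in the reverse of the order they were acquired,
// treating the vector as a stack (last acquired, first released).
inline void ReleaseAll(std::vector<FakeInterface>& acquired)
{
    while (!acquired.empty()) {
        acquired.back().Release();
        acquired.pop_back();
    }
}
```

The point of the stack discipline is that later objects (a vertex buffer) depend on earlier ones (the device), so the dependents go first.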

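http://www.lihuasoft.net/article/show.php?id=2928

The 9 DirectX DLLs in Microsoft_DirectX_9.0c

DX9 and DX10 differ enormously in their rendering pipelines. Fortunately, newer DX runtimes include implementations of the older ones, so a program developed with DX8 is well supported by the DX9 runtime, and running it on a system that has the DX9 runtime needs no separate DX8 install. Microsoft has recently been dropping this compatibility, however, and programs built against different DX9 SDK updates are sometimes not backward compatible either. For example, the DX9 Feb 2007 SDK and the DX9 Aug 2006 SDK differ in their shader compilation rules: the 2007 SDK dropped support for shaders below VS2.0 and PS2.0, and the default root directory for relative paths in HLSL #include also differs. OpenGL's shader extensions came as different extension packages from different vendors, although this improved with GLSL and OpenGL 2.0. OpenGL is also cross-platform while DX is not, which means a program written with OpenGL and GNU C++ can run on Linux, Unix, and Windows with a GNU environment installed. In terms of efficiency, DX writes data to video memory in batches, so it is somewhat faster than OpenGL's one-call-at-a-time writes. OpenGL has since added batched writes too, but at the cost of its traditional advantage, a concise API, because the number of functions has become unwieldy. In terms of results, DX9 and an OpenGL with GLSL or Cg extensions can produce the same visual effects. One difference is that DXUT and D3DX are a little weaker than glu and the OpenGL ARB extensions for some basic drawing, such as dashed lines, for which DX has no good function. On the other hand, DX's utility libraries beat OpenGL's in other areas, such as vector math, GUI controls, mesh optimization and surface unwrapping, PRT precomputation, and performance testing. Compared with DX10, OpenGL does not feel like a product of the same class: DX10 is far ahead of DX9 and OpenGL in its pipeline, its architecture, and the effects it can achieve. For a future-facing game product, try to use DX10.
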

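LPDIRECT3D9  Direct3D object interface pointer

LPDIRECT3DDEVICE9  D3D device interface pointer

LPDIRECT3DVERTEXBUFFER9  vertex buffer interface

LPD3DVIEWPORT9  viewport

LPD3DDISPLAYMODE  D3D device display mode

LPD3DPRESENT_PARAMETERS  structure that holds the device's presentation settings

LPD3DXVECTOR3  3D vector

LPDIRECT3DTEXTURE9  texture interface

ID3DXSprite  sprite interface

g.pvb  member functions

g_pD3D  member functions

g_pD3D->GetAdapterDisplayMode(D3DADAPTER_DEFAULT,&d3ddm)  get the display mode

g_pd3dDevice  member functions

g_pd3dDevice->SetRenderState( , BOOL)  whether lighting is enabled

g_pd3dDevice->SetTransform( D3DTS_WORLD, &(matWorld * matWorld_x)); // set the rotation matrix computed above as the world transform matrix

g_pd3dDevice->SetStreamSource( 0, g_pVB, 0, sizeof(CUSTOMVERTEX) );  set the vertex stream source

g_pd3dDevice->SetFVF()  set the FVF

g_pd3dDevice->DrawIndexedPrimitive( primitive type, 0, 0, vertex count, 0, triangle count );  draw

timeGetTime  current system time

DIRECT3DTXTURE  texture interface

BITMAPFILEHEADER  bitmap file header

BITMAPINFOHEADER  bitmap info header

fread  read a block of data

biBitCount  number of bits per pixel

biCompression  whether the bitmap is compressed

fseek  skip ahead by moving the file pointer

CreateTexture  create an empty texture

D3DLOCKED_RECT  locked-rectangle structure

SetTextureStageState  set texture stage operations

CONST D3DMATRIX*  the matrix used for a transform

Structures

D3DPRESENT_PARAMETERS  stores D3D device information

D3DXMATRIX  matrix

Arrays

CUSTOMVERTEX  holds vertex position and color information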

­

Matrix functions

D3DXMATRIX * D3DXMatrixIdentity(pOut)  identity matrix

D3DXMATRIX * D3DXMatrixTranspose(pOut, pM)  matrix transpose

D3DXMATRIX * D3DXMatrixInverse(pOut, pDeterminant, pM)  inverse matrix (as above, with a FLOAT* in the middle)

D3DXMATRIX * D3DXMatrixTransformation()

D3DXMATRIX * D3DXMatrixTranslation(output matrix, X, Y, Z)  translation

D3DXMATRIX * D3DXMatrixScaling(same as above)  scaling

FLOAT D3DXPlaneDotCoord(pP, pV)  relationship between a point and a plane

D3DXPLANE * D3DXPlaneFromPointNormal(pOut, pPoint, pNormal)  construct a plane from a point and a normal

D3DXPLANE * D3DXPlaneFromPoints(pOut, pV1, pV2, pV3)  describe a plane by three points

D3DXPLANE * D3DXPlaneNormalize(pOut, pP)  normalize a plane

D3DXPLANE * D3DXPlaneTransform(pOut, pP, pM)  transform a plane

D3DXM

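Transform functions

D3DXMATRIX* D3DXMatrixLookAtLH(output view matrix, camera position, point the camera looks at, camera up direction)  view transform matrix

D3DXMATRIX* D3DXMatrixOrthoLH(output orthographic projection matrix, view volume width, view volume height, near distance from the camera, far distance from the camera)  orthographic projection matrix

D3DXMATRIX* D3DXMatrixPerspectiveFovLH(output perspective projection matrix, vertical field of view of the camera, aspect ratio of the view frustum, near plane distance, far plane distance)  perspective projection matrix

Direct3DCreate9(D3D SDK version)  create the D3D object

Device functions

SetTransform(transform type, transform matrix)  set a transform; whether the scene is left- or right-handed depends on the matrices supplied

SetViewport(viewport pointer)  set the viewport, including its near and far depth range

GetClientRect(hWnd, *RECT)  get the window's client (drawing) area

memcpy(pointer, array, length)  copy

SetStreamSource(0, G.pvb interface pointer, 0, length)  data stream

GetAdapterDisplayMode(adapter ordinal, pointer that receives the display mode)  get the adapter's display mode

HRESULT CreateDevice(adapter ordinal, D3D device type, handle of the owning window, how D3D performs vertex processing, pointer to the device's presentation parameters, address of the returned D3D device interface pointer)  create the device interface

HRESULT CreateVertexBuffer(vertex buffer size in bytes, vertex buffer usage, flexible vertex format, memory pool of the vertex buffer, address of the vertex buffer pointer, reserved parameter, usually 0)  create a vertex buffer

HRESULT CreateIndexBuffer(index buffer size in bytes, buffer usage, index format (D3DFMT_*), memory pool of the buffer, address of the index buffer pointer, reserved parameter, usually 0)  create an index buffer

HRESULT Lock(starting offset of the memory to lock, size of the memory to lock, address of the returned memory pointer, lock flags)  lock memory

HRESULT Unlock()  unlock

HRESULT SetStreamSource(stream number, pointer to the vertex buffer to bind, starting offset of the data in the stream, size in bytes of one vertex in the stream)  bind a vertex buffer to a data stream

HRESULT SetFVF(flexible vertex format)  set the vertex format

HRESULT DrawPrimitive(primitive type to draw, index of the first vertex to draw, number of primitives to draw)  draw into the back buffer

HRESULT DrawPrimitiveUP()  draw directly from user memory

HRESULT Present(pointer to the source rectangle, pointer to the destination rectangle, window handle for the D3D device, pointer to the minimal update region)  flip the screen

HRESULT SetIndices(pointer to the index buffer to use)  set the index array used for drawing

DrawIndexedPrimitive(primitive type, base offset into the vertex buffer, minimum index value, number of vertices, first index array element, number of primitives to draw)  same as DrawPrimitive()

Drawing functions

HRESULT DrawPrimitive(primitive type, start vertex, number of primitives to draw)  draw primitives

HRESULT Clear(number of rectangles to clear, pointer to the array of rectangles to clear, which buffers to clear, color to reset to, depth to reset to (0 to 1.0), stencil value to reset to)  clear the drawing area

HRESULT BeginScene()  begin drawing

HRESULT EndScene()  end drawing

Texture functions

CreateTexture()  create a D3D texture object

LoadBmpTexture()  texture loading function

LoadBmpTexture24Bit (LPDIRECT3DDEVICE9 pDevice,LPCSTR  pSrcFile,LPDIRECT3DTEXTURE9* ppTexture)  24-bit texture

D3DXCreateTextureFromFile(D3D device pointer, texture image file, address of the pointer that receives the D3D texture)  load a texture directly from disk

D3DXCreateTextureFromFileEx(D3D device pointer, texture image file, texture width, texture height, number of mipmap levels, texture usage (usually 0), texture format, memory pool in which the texture is stored, texture filter, filter for the generated mipmap chain, transparent color key, optional address that receives the image file information, address for the palette, address of the pointer that receives the created D3D texture)  advanced texture loading

HRESULT SetTexture(texture stage index 0-7, D3D texture interface pointer)  set the texture to render with

HRESULT SetTextureStageState(texture stage index, type of texture stage state, value for that state type)  set a texture stage state

HRESULT SetSamplerState(sampler index 0-7, type of sampler state, value for that sampler state)  texture sampling

HRESULT CheckDeviceFormat(adapter ordinal, D3D device type, display mode format, buffer usage, resource type to query, format to query)  used to check support for compressed textures

HRESULT LockRect(mipmap level to lock, pointer to a D3DLOCKED_RECT structure, RECT region to lock (0 means the whole surface), lock flags (0 or one of the D3DLOCK values))  lock a texture

HRESULT UnlockRect(mipmap level to unlock)  unlock a texture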

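Vector functions

FLOAT D3DXVec3Length(pV)  length (magnitude) of a vector

D3DXVECTOR3 * D3DXVec3Normalize(pOut, pV)  normalize to unit length

D3DXVECTOR3 * D3DXVec3Add(pOut, u, v)  vector addition

D3DXVECTOR3 * D3DXVec3Subtract(same as above)  subtraction

D3DXVECTOR3 * D3DXVec3Cross(same as above)  cross product

D3DXVECTOR3 * D3DXVec3Lerp(pOut, pV1, pV2, s)  linear interpolation

D3DXVECTOR3 * D3DXVec3Maximize(same as Add)  component-wise maximum

D3DXVECTOR3 * D3DXVec3Minimize(same as Add)  component-wise minimum

D3DXVECTOR3 * D3DXVec3Scale(pOut, pV, FLOAT)  scale by a factor

FLOAT D3DXVec3Dot(pV1, pV2)  dot product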
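
As a reminder of what the vector functions above compute, here is a plain C++ sketch of the same arithmetic. `Vec3`, `Dot`, `Cross`, `Length`, and `Normalize` are illustrative names that mirror, but are not, the D3DX calls.

```cpp
#include <cmath>

// Illustrative equivalents of D3DXVec3Dot / Cross / Length / Normalize.
struct Vec3 { float x, y, z; };

inline float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

inline float Length(const Vec3& v) { return std::sqrt(Dot(v, v)); }

// Returns v scaled to unit length; a zero vector is returned unchanged.
inline Vec3 Normalize(const Vec3& v)
{
    const float len = Length(v);
    if (len == 0.0f) return v;
    return Vec3{ v.x / len, v.y / len, v.z / len };
}
```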

参见~程_a.chm中的COM中模块的导出函数

Private Type D3DVECTOR

    x As Single

    y As Single

    z As Single

End Type

'Returns the normalized form of a 3D vector

Private Declare Function D3DXVec3Normalize Lib "DX8VB.DLL" Alias "VB_D3DXVec3Normalize" (VOut As D3DVECTOR, v As D3DVECTOR) As Long

Private Declare Function D3DXVec3Add Lib "DX8VB.DLL" Alias "VB_D3DXVec3Add" (VOut As D3DVECTOR, v1 As D3DVECTOR, V2 As D3DVECTOR) As Long

Private Declare Function D3DXVec3Subtract Lib "DX8VB.DLL" Alias "VB_D3DXVec3Subtract" (VOut As D3DVECTOR, v1 As D3DVECTOR, V2 As D3DVECTOR) As Long

Private Declare Function D3DXVec3Length Lib "DX8VB.DLL" Alias "VB_D3DXVec3Length" (v As D3DVECTOR) As Single

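D3DFVF  flexible vertex format

D3DFVF_DIFFUSE  vertex contains a diffuse color

D3DFVF_NORMAL  vertex contains a normal

D3DFVF_PSIZE  vertex contains a point size, the size at which the point is drawn

D3DFVF_SPECULAR  vertex contains a specular color

D3DFVF_XYZ  vertex contains an untransformed position

D3DFVF_XYZRHW  vertex contains a transformed position

D3DFVF_XYZB1 through D3DFVF_XYZB5  vertex contains a position plus bone weights for skeletal animation

D3DFVF_XYZW  vertex contains a transformed and clipped position

D3DTRANSFORMSTATETYPE  type of transform

D3DPRIMITIVETYPE  defines the basic primitives

D3DPT_POINTLIST  a set of points

D3DPT_LINELIST  a set of separate lines

D3DPT_LINESTRIP  a set of line segments joined end to end

D3DPT_TRIANGLELIST  a set of separate triangles

D3DPT_TRIANGLESTRIP  triangles joined edge to edge, each sharing two vertices with the previous one

D3DPT_TRIANGLEFAN  a set of triangles forming a fan

D3DPT_FORCE_DWORD  not a real primitive type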

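D3DDISPLAYMODE  screen display mode

D3DFMT_UNKNOWN  unknown pixel format

D3DFMT_R8G8B8  24-bit color, 8 bits each for R, G, B

D3DFMT_A8R8G8B8  32-bit color, 8 bits each for alpha, R, G, B

D3DFMT_X8R8G8B8  32-bit color, 8 reserved bits (X), 8 bits each for R, G, B

D3DFMT_R5G6B5  16-bit color, R 5 bits, G 6 bits, B 5 bits

D3DFMT_X1R5G5B5  16-bit color, 1 reserved bit, 5 bits each for R, G, B

D3DFMT_A1R5G5B5  16-bit color, 1 alpha bit, 5 bits each for R, G, B

D3DFMT_A4R4G4B4  16-bit color, 4 bits each for alpha, R, G, B

D3DFMT_R3G3B2  8-bit color, R 3 bits, G 3 bits, B 2 bits

D3DFMT_A8  8 bits of alpha only

D3DFMT_A8R3G3B2  16-bit color, alpha 8, R 3, G 3, B 2 bits

D3DFMT_X4R4G4B4  16-bit color

D3DFMT_A2B10G10R10  32-bit color, 2 alpha bits, 10 bits each for R, G, B

D3DFMT_A8B8G8R8  32-bit color

D3DFMT_X8B8G8R8  32-bit color

D3DFMT_G16R16  32-bit color, red and green only

D3DFMT_A2R10G10B10  32-bit color

D3DFMT_A16B16G16R16  64-bit color

D3DFMT_A8P8  16 bits: 8 bits of alpha and 8 bits of palette index

D3DFMT_P8  8-bit color, expressed as a palette index

D3DFMT_L8  8 bits, luminance only

D3DFMT_L16  16 bits, luminance only

D3DFMT_A8L8  16 bits: 8 bits of alpha and 8 bits of luminance

D3DFMT_A4L4  8 bits: 4 bits of alpha and 4 bits of luminance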
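
The format names above spell out their bit layouts from the most significant bits down. As an illustration, the sketch below packs 8-bit channels into the D3DFMT_R5G6B5 and D3DFMT_A8R8G8B8 layouts. `PackR5G6B5` and `PackA8R8G8B8` are hypothetical helpers, and dropping the low bits is one simple way to reduce precision, not the only one.

```cpp
#include <cstdint>

// Hypothetical helper: packs 8-bit channels into the R5G6B5 layout
// (R in bits 15..11, G in bits 10..5, B in bits 4..0) by dropping
// the low bits of each channel.
inline std::uint16_t PackR5G6B5(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Hypothetical helper: packs 8-bit channels into the A8R8G8B8 layout
// (alpha in the top byte, then R, G, B), the same layout produced by
// the D3DCOLOR_ARGB macro.
inline std::uint32_t PackA8R8G8B8(std::uint8_t a, std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}
```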

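D3DDEVTYPE_HAL   hardware abstraction layer; rendering is done by the display hardware

D3DDEVTYPE_NULLREF

D3DDEVTYPE_REF   reference rasterizer, generally used to test D3D features the graphics card does not support

D3DDEVTYPE_SW   used to support third-party software devices

D3DDEVTYPE_FORCE_DWORD  forces the enumeration to 32 bits

D3DCREATE  how 3D processing is performed

D3DCREATE_ADAPTERGROUP_DEVICE

D3DCREATE_DISABLE_DRIVER_MANAGEMENT

D3DCREATE_DISABLE_DRIVER_MANAGEMENT_EX

D3DCREATE_FPU_PRESERVE   enables double-precision floating point and floating-point exception detection; setting it lowers system performance

D3DCREATE_HARDWARE_VERTEXPROCESSING  vertex processing is done by the D3D hardware

D3DCREATE_MIXED_VERTEXPROCESSING  vertex processing is done in mixed mode

D3DCREATE_MULTITHREADED   supports drawing from multiple threads; setting it lowers system performance

D3DCREATE_NOWINDOWCHANGES

D3DCREATE_PUREDEVICE   disables D3D's Get*() functions and stops D3D from emulating vertex processing on a virtual device

D3DCREATE_SCREENSAVER

D3DCREATE_SOFTWARE_VERTEXPROCESSING  vertex processing is done by D3D in software

D3DSWAPEFFECT  list of values

D3DSWAPEFFECT_DISCARD    the back buffer contents are discarded after being presented to the front

D3DSWAPEFFECT_FLIP    the back buffer contents stay unchanged after being presented; use when there are several back buffers

D3DSWAPEFFECT_COPY    the back buffer contents stay unchanged after being presented; use when there is only one back buffer

D3DSWAPEFFECT_FORCE_DWORD   forces the value to be stored as 32 bits; normally unused

D3DPRESENT  list of presentation (flip) modes

D3DPRESENT_DONOTWAIT

D3DPRESENT_INTERVAL_DEFAULT  the default, same as ONE

D3DPRESENT_INTERVAL_ONE  swap front and back buffers once per screen refresh

D3DPRESENT_INTERVAL_TWO  swap front and back buffers every two screen refreshes

D3DPRESENT_INTERVAL_THREE  swap front and back buffers every three screen refreshes

D3DPRESENT_INTERVAL_FOUR  swap front and back buffers every four screen refreshes

D3DPRESENT_INTERVAL_IMMEDIATE  swap immediately when drawing completes

D3DPRESENT_LINEAR_CONTENT

D3DUSAGE  list of buffer usage values

D3DUSAGE_AUTOGENMIPMAP

D3DUSAGE_DEPTHSTENCIL

D3DUSAGE_DMAP

D3DUSAGE_DONOTCLIP  disables clipping: the vertices in the buffer are not clipped; when this is set, the render state D3DRS_CLIPPING must be FALSE

D3DUSAGE_DYNAMIC  uses dynamic memory allocation

D3DUSAGE_NPATCHES  the vertex buffer is used to draw N-patches

D3DUSAGE_POINTS  the vertex buffer is used to draw points

D3DUSAGE_RENDERTARGET

D3DUSAGE_RTPATCHES  the vertex buffer is used to draw curved patches

D3DUSAGE_SOFTWAREPROCESSING  vertex processing is done in software; otherwise it is done in hardware

D3DUSAGE_WRITEONLY  write-only: the buffer cannot be read back; setting this can improve performance

D3DPOOL  list of memory pool values

D3DPOOL_DEFAULT  the default; the vertex buffer is kept in video memory where possible

D3DPOOL_MANAGED  D3D manages where the vertex buffer lives (video memory and system memory)

D3DPOOL_SCRATCH  the vertex buffer lives in temporary system memory; such a buffer cannot be rendered from directly and only supports locking, copying, and similar operations

D3DPOOL_SYSTEMMEM  the vertex buffer lives in system memory

D3DLOCK  buffer lock flags

D3DLOCK_DISCARD  the whole buffer is being rewritten

D3DLOCK_DONOTWAIT

D3DLOCK_NO_DIRTY_UPDATE  do not mark the locked region dirty (by default a lock sets the dirty flag)

D3DLOCK_NOOVERWRITE  promises not to overwrite existing buffer data; the memory pointer is returned immediately, which improves performance

D3DLOCK_NOSYSLOCK  the system may carry out other operations while the lock is held

D3DLOCK_READONLY  the buffer is locked read-only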

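D3DXVECTOR3  vector arithmetic

D3DXVECTOR3 u(x,y,z);

D3DXVECTOR3 v(x,y,z);

float result = D3DXVec3Dot(&u, &v);  dot product

D3DXMATRIX  matrices

D3DXMatrixIdentity  identity matrix

D3DXMatrixInverse  inverse matrix

Graphics transforms in D3D

D3DXMatrixTranslation  translation matrix

D3DXMatrixLookAtLH  view matrix

D3DXMatrixIdentity  set a matrix to identity

D3DXMatrixRotationY  rotate about the Y axis

D3DXMatrixRotationX  rotate about the X axis

D3DXMatrixRotationZ  rotate about the Z axis

D3DXMatrixScaling  scaling

D3DXMatrixRotationAxis  rotate about an arbitrary axis

D3DXMatrixMultiply  combine transforms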

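D3DUSAGE  texture usage

D3DUSAGE_AUTOGENMIPMAP  automatically generates the mipmap chain; has no effect when the resource is in D3DPOOL_SYSTEMMEM

D3DUSAGE_DEPTHSTENCIL  depth-stencil buffer; only valid when the resource is in D3DPOOL_DEFAULT

D3DUSAGE_DMAP  the texture is a displacement map

D3DUSAGE_DONOTCLIP

D3DUSAGE_DYNAMIC

D3DUSAGE_NPATCHES

D3DUSAGE_POINTS

D3DUSAGE_RENDERTARGET  the texture is a render target

D3DUSAGE_RTPATCHES

D3DUSAGE_SOFTWAREPROCESSING  applies coordinate transforms in software

D3DUSAGE_WRITEONLY

D3DTEXTURESTAGESTATETYPE  texture stage state types

D3DTSS_COLOROP   1  color blending operation for the texture stage

D3DTSS_COLORARG1  2  first argument of the color blend

D3DTSS_COLORARG2  3  second argument of the color blend

D3DTSS_ALPHAOP   4  alpha operation for the texture stage

D3DTSS_ALPHAARG1  5  first argument of the alpha blend

D3DTSS_ALPHAARG2  6  second argument of the alpha blend

D3DTSS_BUMPENVMAT00  7  bump-mapping matrix coefficient

D3DTSS_BUMPENVMAT01  8  bump-mapping matrix coefficient

D3DTSS_BUMPENVMAT10  9  bump-mapping matrix coefficient

D3DTSS_BUMPENVMAT11  10  bump-mapping matrix coefficient

D3DTSS_TEXCOORDINDEX  11  index of the texture coordinates used by this stage

D3DTSS_BUMPENVLSCALE  22  scale value for bump mapping

D3DTSS_BUMPENVLOFFSET   23  offset value for bump mapping

D3DTSS_TEXTURETRANSFORMFLAGS  24  flags controlling the transform of texture coordinates

D3DTSS_COLORARG0  26  third color argument of the blend

D3DTSS_ALPHAARG0  27  third argument of the alpha blend

D3DTSS_RESULTARG  28  output register for the result of the color blend

D3DTSS_CONSTANT  32  constant register for the color blend

D3DTSS_FORCE_DWORD  0x7fffffff  forces the enumeration to 32 bits; normally unused

D3DSAMPLERSTATETYPE  texture sampler states

D3DSAMP_ADDRESSU  1  texture addressing mode for u

D3DSAMP_ADDRESSV  2  texture addressing mode for v

D3DSAMP_ADDRESSW  3  texture addressing mode for w

D3DSAMP_BORDERCOLOR  4

D3DSAMP_MAGFILTER  5  magnification filter

D3DSAMP_MINFILTER  6  minification filter

D3DSAMP_MIPFILTER  7  mipmap filter

D3DSAMP_MIPMAPLODBIAS  8  mipmap level-of-detail bias, initially 0

D3DSAMP_MAXMIPLEVEL  9  largest mipmap level, initially 0

D3DSAMP_MAXANISOTROPY  10  anisotropy, initially 1

D3DSAMP_SRGBTEXTURE  11

D3DSAMP_ELEMENTINDEX   12

D3DSAMP_DMAPOFFSET  13

D3DSAMP_FORCE_DWORD  0x7fffffff  forces the enumeration to 32 bits; normally unused

Texture addressing

D3DTADDRESS_WRAP  1  wrap addressing

D3DTADDRESS_MIRROR  2  mirror addressing

D3DTADDRESS_CLAMP  3  clamp addressing

D3DTADDRESS_BORDER  4  border-color addressing

D3DTADDRESS_MIRRORONCE  5  mirror-once addressing

D3DTADDRESS_FORCE_DWORD  0x7fffffff  forces the enumeration to 32 bits; normally unused

World transform

D3DTS_WORLD  world transform



RedLight 2009-09-17 15:55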
]]>
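Some Updates in D3D9 (repost)
http://www.shnenglu.com/flashboy/archive/2009/09/12/96003.html
RedLight, Sat, 12 Sep 2009 07:40:00 GMT

The D3D9 phobia brought on by Aug 8 is now completely cured. This chapter covers every aspect of moving a 2D engine to D3D9, including the full-screen mode that finally makes its appearance. From this chapter on, the D2D tutorial continues with D3D9 as its working API.

【OP over, the main feature begins】

『Why?』

  The first thing everyone will ask is probably "Why?" Why move on to D3D9? The reasons are:
1. D3D9 fixes all the known bugs of D3D8, so it runs more stably and faster.
2. D3D9 adds many convenient features. Most of them target 3D, but quite a few apply to 2D, such as IDirect3DDevice9::StretchRect and the improvements to IDirect3DSurface9. The D3DX library adds even more, such as D3DXSaveSurfaceToFileInMemory. At first I saw no use for that function; now I can hardly do without it.
3. HLSL. As I said last time, the D2D tutorial will cover PixelShader later, and I do not want to write shaders in assembly; it would kill me (congratulate me, I have finally dropped assembly shaders). This is not a decisive reason, since there is also Cg and the like, but I want to write code that is independent of the graphics card, so I will not look into Cg (it is much like HLSL anyway) or things like R2VB.
4. ID3DXFont. Read on and you will see.

《Changes in D3D》

『Interface name changes』

  In one sentence: change every 8 to a 9.

『One change common to all "create" methods』

  Many Create*() methods, such as those that create the device, textures, or vertex buffers, gained a HANDLE* pSharedHandle parameter. It is unused; pass NULL. (Microsoft apparently planned some kind of shared handle, but it was swept away by the huge changes in D3D10.)

『Changes in creating the D3D device』

  FullScreen_PresentationInterval in D3DPRESENT_PARAMETERS became PresentationInterval, which means vertical sync is available even in windowed mode to prevent tearing (good news for 2D). Correspondingly, D3DSWAPEFFECT_COPY_VSYNC is gone; it never worked well anyway, so no loss.
  To get vertical sync, set PresentationInterval to D3DPRESENT_INTERVAL_DEFAULT or D3DPRESENT_INTERVAL_ONE. D3DPRESENT_INTERVAL_ONE gives a slightly better result than D3DPRESENT_INTERVAL_DEFAULT, at the cost of slightly more system resources. It really is only slightly more, so it hardly matters.
  If you do not want vertical sync and want to see the real frame rate, use D3DPRESENT_INTERVAL_IMMEDIATE.
  Note that in windowed mode you can only use these three present modes. In full-screen mode you can use the others, but check D3DCAPS9 first to see whether the card supports them. For 99% of games, I think these three are enough.
  Also, in windowed mode BackBufferFormat can be set to D3DFMT_UNKNOWN, and D3D will use the current desktop format for the back buffer, which saves a call to GetDisplayMode. In fact the back buffer in windowed mode no longer has to match the desktop format. You can check with IDirect3D9::CheckDeviceFormatConversion: if the device supports conversion between the two color formats, you can give the program's back buffer a different format. I tried setting the back buffer to D3DFMT_R5G6B5 (16-bit) on a 32-bit desktop (D3DFMT_X8R8G8B8) and saw a speed-up, so the setting is meaningful.
  There is one more device type you can create, D3DDEVTYPE_NULLREF. On a machine with the D3D SDK installed it is equivalent to D3DDEVTYPE_REF. On other machines it does not create a real D3D device; it only lets you create textures, surfaces, and other resources, while Render, Present, and similar operations have no effect (the resources are actually created in D3DPOOL_SCRATCH, whatever pool you asked for). In other words, it only simulates basic operation. You can use this device to write a program that converts image formats with the D3DX library, for example turning a pile of different formats into the DDS format that D3D9 handles easily. Because no real device is created, you can even write it as a console program and obtain an HWND through GetConsoleWindow. The converter for the MIF format used by Mercury 3 was made this way. Note that D3DDEVTYPE_NULLREF can only be used in IDirect3D9::CreateDevice; it does not work with any other method.

『Changes in creating surfaces』

  The method that creates a surface became IDirect3DDevice9::CreateOffscreenPlainSurface. The parameters are simple and need little explanation; just note that you can now choose the pool.

『Changes in setting the FVF』

  The FVF used to be set through IDirect3DDevice8::SetVertexShader. Now there is a dedicated method, IDirect3DDevice9::SetFVF. This is a good change, because it stops FVF and shaders from being confused (as an aside, the same change lets shaders survive a device Reset, which is nice).

『Getting the back buffer』

  D3D9 now allows several back-buffer swap chains, but 2D basically never needs them. Pass 0 for the new first parameter of IDirect3DDevice9::GetBackBuffer. If you are interested you can look into this; it is sometimes useful for split-screen rendering.

『SetStreamSource』

  This method was extended. As its parameters show, the new OffsetInBytes lets you choose an offset into a vertex buffer, and D3D9 reads data starting after that offset. You can therefore store several sets of quad vertices for rendering textures in a single vertex buffer.

『SetSamplerState』

  This is a new D3D9 method that takes over some of what SetTextureStageState used to do. The part most relevant to 2D is texture filtering. The former D3DTSS_MINFILTER became D3DSAMP_MINFILTER, D3DTSS_MAGFILTER became D3DSAMP_MAGFILTER, and D3DTSS_MAXANISOTROPY became D3DSAMP_MAXANISOTROPY. There is more, such as texture addressing; read the D3DSAMPLERSTATETYPE enumeration to see what "migrated".
  This change is convenient for shaders. The states that moved to the sampler still apply during PixelShader processing, while the states that did not move no longer apply there. D3D8 put all of these in one place, which easily caused confusion.

『SetRenderTarget』

  D3D9 now allows multiple render targets, but we basically use only one. Set RenderTargetIndex to 0; the second parameter is still the surface to set. As in D3D8, you must first obtain the surface with GetSurfaceLevel.

『Locking vertex buffers』

  Note that the third parameter of IDirect3DVertexBuffer9::Lock changed from BYTE** to void**. That is all.

『Some other changes』

1. CopyRects became UpdateSurface. Like UpdateTexture, it can only copy from D3DPOOL_SYSTEMMEM to D3DPOOL_DEFAULT.
2. A useful IDirect3DDevice9::ColorFill method was added. It fills a region of a D3DPOOL_DEFAULT surface with a color. It resembles Clear, but its purpose is much more specific, and because it does not touch the depth buffer it is somewhat faster.
3. An IDirect3DDevice9::StretchRect method was added. It performs filtered scaling between D3DPOOL_DEFAULT surfaces or textures without going through a render pass, which is very useful. Because it is done in hardware, though, it has many restrictions; read the Remarks section of the SDK documentation carefully.

《Changes in D3DX》

  D3DX actually changed a great deal, but as I said at the start, most of it targets 3D. The points we need to note are:
1. The D3DX***FromFile functions support more image formats, although the added ones are all rarely seen. BMP, TGA, and PNG are normally enough.
2. D3DXSave***ToFileInMemory was added, which writes the file into memory. Its use may not be obvious, but if you are writing a tool that combines conversion and packing it is very handy, because it avoids the problems of working through temporary files. If you know the format of the image file, you can also read the RAW data by accessing this in-memory file directly. Note that these functions write into an ID3DXBuffer. It is very simple, with only two methods of its own, and needs no further explanation.
3. ID3DXLine was added, which makes drawing lines in 2D easy. It is created with D3DXCreateLine. It is not complicated and is used a little like ID3DXSprite; a bit of study is enough. Note that each Draw draws a D3DPT_LINESTRIP. Its advantage over using a vertex buffer directly is that anti-aliasing is easy to turn on, and the result is basically satisfactory.
4. ID3DXRenderToSurface was added, which "in theory" simplifies using a render target, although I feel it makes things more complicated. It is created with D3DXCreateRenderToSurface. Anyone in the mood can study it; I will not cover it.

  ID3DXSprite and ID3DXFont changed a lot starting with the Summer 2004 DX9 SDK (the first DX9.0c release), as detailed below.

『ID3DXSprite』

  You will find that ID3DXSprite::DrawTransform is gone. Its job was folded into ID3DXSprite::SetTransform, which means that for scaling and rotation we now have to deal with matrices. It is not too complicated, because we only need a few matrix operations. Anyone who has studied linear algebra will find it familiar, and even if you have not, the D3DX library provides ready-made matrix functions; you only have to call them.

D3DXMatrixScaling
D3DXMatrixRotationZ
D3DXMatrixTranslation

  Call these three functions in order. Anyone who has done 3D probably thought of this at once, and they are right. Mind the order: Scaling -> Rotation -> Translation, or SRT for short (if you have seen Full Metal Panic, the word is easy to remember ^_^). Get the order wrong and you will not get the right result.
  Are you thinking of passing the same D3DXMATRIX to all three calls? Wrong! You need matrix multiplication. Create three D3DXMATRIX objects, say mat1, mat2, and mat3, build them into a scaling matrix, a rotation matrix, and a translation matrix with the three functions, and then write this when calling ID3DXSprite::SetTransform: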

SetTransform(mat1 * mat2 * mat3);
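
To see why the multiplication order matters, here is a self-contained 2D sketch that uses the same row-vector convention as D3DX (a point is multiplied on the left: p' = p * M). `Mat3`, `Vec2`, and the helper functions are illustrative stand-ins for D3DXMATRIX and the three D3DX calls, not the real API.

```cpp
#include <cmath>

// Illustrative 3x3 homogeneous 2D matrices, row-vector convention.
struct Mat3 { float m[3][3]; };
struct Vec2 { float x, y; };

inline Mat3 Multiply(const Mat3& a, const Mat3& b)
{
    Mat3 r{};  // value-initialized: all elements start at zero
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

inline Mat3 Scaling(float sx, float sy)
{
    return Mat3{ { {sx, 0, 0}, {0, sy, 0}, {0, 0, 1} } };
}

inline Mat3 RotationZ(float radians)
{
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return Mat3{ { {c, s, 0}, {-s, c, 0}, {0, 0, 1} } };
}

inline Mat3 Translation(float tx, float ty)
{
    return Mat3{ { {1, 0, 0}, {0, 1, 0}, {tx, ty, 1} } };
}

// Applies p' = [x y 1] * M.
inline Vec2 Transform(const Vec2& p, const Mat3& m)
{
    return Vec2{ p.x * m.m[0][0] + p.y * m.m[1][0] + m.m[2][0],
                 p.x * m.m[0][1] + p.y * m.m[1][1] + m.m[2][1] };
}
```

With scale 2, a quarter-turn rotation, and a translation of (10, 0), the point (1, 0) is first scaled to (2, 0), then rotated to (0, 2), then moved to (10, 2). Multiplying in the opposite order translates first, so the rotation and scale are applied to the translated point as well and the sprite ends up somewhere quite different.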

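  Quite a hassle, isn't it? ID3DXSprite made life easier for 3D developers and harder for 2D developers, so I no longer use it directly (what "not directly" means is explained below).

『ID3DXFont』

  Let us all cheer! The ID3DXFont improved in Summer 2004 has thoroughly killed off the font engine from the previous episode.
  How should I describe the improvement? It is a complete makeover: its speed and output quality are in a different league from before. Poor PixelFont, abandoned after existing for a single episode.
  The new ID3DXFont methods of the Preload*() kind read the glyphs of frequently used characters into memory ahead of time to speed things up. It can also render through ID3DXSprite for a further speed-up. GDI is still involved internally, but the way it works has clearly changed a great deal. My guess is that this ID3DXFont cleverly uses GDI to obtain the glyph outlines and then renders them through textures. That is much faster, and the text quality is well controlled, basically the same as using GDI directly.
  PreloadCharacters() and PreloadGlyphs() are not that easy to understand, so PreloadText() is usually enough. I suggest preloading all ASCII characters, punctuation marks, and some Chinese characters. Preloading is slightly slow, and the memory it takes depends on the number of characters and the font size you created. Here is a block of text you can simply copy in:

Quote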

const char strPreloadText[] = " 1234567890qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM~!@#$%^&*()-=[]\\;',./_+{}|:\"<>? 、?#183;ˉˇ¨〃—~?#8230;‘’“”〔〕〈〉《》「」『』〖〗【】!Q#K%Q'Q)Q+Q-Q/Q1Q3Q5Q7Q9Q;Q=Q?QAQEQIQGQテQOQ݋QOQͼQԌQ_QӞDQYQEQ|Q_`abcdefghijklmnopqrstuvwxyz{|}我人有的和M不ؓq工要在W一上是中国l已发了民同";

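  Note that the first character is a space! Preloading the space is important ^_^
  It does not look like much, because with memory use and speed in mind I only preloaded some symbols and the one-key characters of the Wubi input method. At a 24-point size these characters already take up far more memory than the PixelFont character set did. Heaven knows what ID3DXFont actually preloads.
  For the second parameter of PreloadText(), do not use strlen; sizeof(strPreloadText) will do.
  Next comes rendering through ID3DXSprite. Note that the first parameter of ID3DXFont::DrawText is an LPD3DXSPRITE, so to use ID3DXSprite you put ID3DXFont::DrawText between ID3DXSprite::Begin and ID3DXSprite::End. This is what I meant earlier by not using ID3DXSprite directly: ID3DXFont makes all the ID3DXSprite calls for you, so you need not worry about them.
  You should also note that ID3DXSprite::Begin gained a parameter. The DX documentation does not say so, but the samples show it: to make ID3DXSprite do its job with the greatest gain in efficiency, pass D3DXSPRITE_ALPHABLEND | D3DXSPRITE_SORT_TEXTURE. The meaning is clear: turn on alpha blending and sorting by texture. The DX documentation has an error here that has never been fixed: it gives D3DXSprite__SORT_TEXTURE, but try it and you will get a compile error.
  There is not much else. How to use ID3DXFont was covered in the previous episode. Note that both D3DXCreateFont and D3DXCreateFontIndirect changed. D3DXCreateFont no longer involves GDI, and the structure used by D3DXCreateFontIndirect became D3DXFONT_DESC. Compared with LOGFONT it drops some unneeded members and adds MipLevels, the number of mipmap levels; for 2D, 1 is all you need. Everything else was covered last time. Since D3DXCreateFont is no longer tied to GDI, D3DXCreateFontIndirect exists only for historical reasons (to suit the habits of people like me). Just use D3DXCreateFont; it is less trouble.
  I will not post screenshots; they would be pointless. If you find that drawing text straight onto the back buffer does not look good enough, draw it to a texture first, then render that texture to the back buffer with a slight offset and linear filtering turned on. That gives the same result as PixelFont.
  As for speed: I drew a full screen of text without caching it (this "caching" is not the same thing as ID3DXFont's own text cache; anyone who read the previous episode knows what I mean), and the frame rate stayed far above what D3D8 managed. You may still think it is a little slow, but drawing such a screenful with the D3D8 ID3DXFont leaves only about 20 FPS.
  The advantage of replacing PixelFont with ID3DXFont is that font and size are easy to customize, and you are no longer limited to the GB2312 character set. So switch, everyone. Forget PixelFont.

『A stable DX9 SDK version』

  I currently use April 2006 and will probably keep using it for a long time. I will certainly not use August 2006! Even though I no longer fear D3D9, I will keep my distance from that SDK. For 2D, I feel April 2006 is all you need; the later DX9 SDKs mainly update D3DX and the 3D function libraries, and they will not be around for long anyway, since D3D10 is about to arrive. As for D3D10, see my other blog entry; in short, nothing could make me use it for 2D.

  For 2D alone, moving from D3D8 to D3D9 does not change much; the main gain is stability. As long as you do not call D3D9-only features, 2D written with D3D9 still runs on the great majority of graphics cards. GF2 class and up, say; anything older than a GF2 is too old and can be ignored.

《One more dish: full-screen mode》

  This is not a complicated problem, yet I have put it off for a long time. No more delays: here is how to do full-screen mode and how to handle device loss.

『Creating a full-screen mode』

  In D3DPRESENT_PARAMETERS, set Windowed to false and be sure to set BackBufferWidth and BackBufferHeight. Done.
  Yes, it is that simple, and perhaps someone has already tried it. But try pressing Alt+Tab and then switching back: I guarantee you will see nothing.
  I said before that up to DX8, working in full-screen was easier than in a window, and that from DX8 on it is the other way round. In windowed mode you do not have to worry about device loss (unless you change the desktop resolution); in full-screen mode you do. The details follow.

『Device and resource loss』

  Device loss happens in full-screen mode when the user switches back to the desktop, whether through Alt+Tab or because someone sent them a picture on QQ -_-bbb. The device is also lost if an error occurs while calling IDirect3DDevice9::Reset (it is D3D9 from now on; forget D3D8).
  Device loss causes resource loss. Every resource created in D3DPOOL_DEFAULT is lost and must be recreated; its contents vanish as well and must be rewritten.
  Resources created in D3DPOOL_SYSTEMMEM and D3DPOOL_SCRATCH are not affected. Resources created in D3DPOOL_MANAGED are not lost either; once the device is usable again they can be used at once, with their contents unchanged. As the name "managed pool" suggests, D3D handles everything for you.
  The simplest way to avoid losing resources along with the device is therefore to create everything in D3DPOOL_MANAGED. That is not a good method, though, because it means you cannot use render targets: remember, a render target can only be created in D3DPOOL_DEFAULT. The best method is to track all D3DPOOL_DEFAULT resources, for instance by hooking them into a std::list. Release them when the device is lost, recreate them when the device can be used again, and remember to write their data back. Resources in the other pools need none of this.

『After the device is lost』

  However the device was lost, almost every operation then fails and only Release() can be used. D3D does guarantee that some operations can succeed, but only that they "can", not that they "will", so you are better off assuming that everything fails while the device is lost. The one other call that works is IDirect3DDevice9::TestCooperativeLevel. So after the device is lost, stop the whole game loop and call IDirect3DDevice9::TestCooperativeLevel repeatedly to find out whether the device is usable.

『IDirect3DDevice9::TestCooperativeLevel』

  This method checks the current state of the device. There are four return values: D3D_OK, everything is normal; D3DERR_DEVICELOST, the device is lost; D3DERR_DEVICENOTRESET, the device can be Reset. D3D9 also adds D3DERR_DRIVERINTERNALERROR. If you get that one you are finished; recovery is basically impossible, so terminate the program.
  In sequence: while the game runs normally, D3D_OK is returned. If the device is lost and cannot be restored at that moment, for example because the user switched to the Windows desktop from full-screen mode, D3DERR_DEVICELOST is returned. When the user switches back to the game and the device can be restored (it is not restored yet, it only "can" be), D3DERR_DEVICENOTRESET is returned.
  IDirect3DDevice9::Present returns similar values, but you had better not rely on it; stick with TestCooperativeLevel, because Present still returns D3DERR_DEVICELOST when the device can already be restored. (As an aside, in D3D10 TestCooperativeLevel is fully merged into Present, which is good news.)

『Handling device loss』

  See the pseudocode below:
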
switch (IDirect3DDevice9::TestCooperativeLevel()){
  case D3D_OK:
    GameLoop();
    break;
  case D3DERR_DEVICELOST:
    break;
  case D3DERR_DEVICENOTRESET:
    OnLostDevice();
    IDirect3DDevice9::Reset();
    OnResetDevice();
    break;
  default:
    QuitGame();
    break;
}

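  GameLoop() is the code that runs your game. Put this switch in the GameMain() part of our game framework; for the exact place, see the source code attached to any episode.
  I do not seem to have covered the parameter of IDirect3DDevice9::Reset. There is only one, a pointer to D3DPRESENT_PARAMETERS. Keep the D3DPRESENT_PARAMETERS structure you used when you first created the device so that Reset can use it.
  OnLostDevice() releases every D3DPOOL_DEFAULT resource, and OnResetDevice() restores them with Create*(). You may have noticed that ID3DXFont, ID3DXSprite, and others have methods with the same names; this is when to call them. If you do not, that is, if you still hold any D3DPOOL_DEFAULT resource, IDirect3DDevice9::Reset is certain to fail.
  In OnResetDevice you also have to redo SetRenderState, SetSamplerState, and so on, because those settings are lost after a Reset too. Reset is in fact much like creating the device again; the difference is that if you recreate the device you must also release the D3DPOOL_MANAGED resources. I will not go into that here.
  As the code shows, the program does nothing on D3DERR_DEVICELOST; it just waits. I think this is a good habit, because there is no guarantee that anything other than Release works while the device is lost, so it is better to wait until the device is usable.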
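
The pseudocode and the resource-tracking advice can be combined into one small sketch that runs without DirectX. Everything here is an assumption made for illustration: `DeviceState` stands in for the HRESULT values returned by TestCooperativeLevel, the `reset` callable stands in for IDirect3DDevice9::Reset, and `DefaultPoolResource` is a hypothetical interface for objects living in D3DPOOL_DEFAULT.

```cpp
#include <list>

// Stand-ins for D3D_OK, D3DERR_DEVICELOST, D3DERR_DEVICENOTRESET and
// D3DERR_DRIVERINTERNALERROR.
enum class DeviceState { Ok, Lost, NotReset, InternalError };

// Hypothetical interface for anything created in D3DPOOL_DEFAULT.
struct DefaultPoolResource {
    virtual ~DefaultPoolResource() = default;
    virtual void OnLostDevice() = 0;   // release the underlying D3D object
    virtual void OnResetDevice() = 0;  // recreate it and refill its contents
};

enum class FrameAction { RanGameLoop, Waited, ResetDevice, Quit };

class DeviceLossHandler {
public:
    void Track(DefaultPoolResource* r)   { resources_.push_back(r); }
    void Untrack(DefaultPoolResource* r) { resources_.remove(r); }

    // One pass of the main loop. `reset` returns true when the (simulated)
    // Reset call succeeds; `gameLoop` runs one frame of the game.
    template <class ResetFn, class GameLoopFn>
    FrameAction Tick(DeviceState state, ResetFn reset, GameLoopFn gameLoop)
    {
        switch (state) {
        case DeviceState::Ok:
            gameLoop();
            return FrameAction::RanGameLoop;
        case DeviceState::Lost:
            return FrameAction::Waited;       // nothing to do but wait
        case DeviceState::NotReset:
            // Release everything first, otherwise Reset cannot succeed.
            // OnLostDevice may run again on a later pass if Reset fails,
            // so implementations should tolerate repeated calls.
            for (DefaultPoolResource* r : resources_) r->OnLostDevice();
            if (!reset()) return FrameAction::Waited;
            for (DefaultPoolResource* r : resources_) r->OnResetDevice();
            return FrameAction::ResetDevice;
        default:
            return FrameAction::Quit;         // unrecoverable driver error
        }
    }

private:
    std::list<DefaultPoolResource*> resources_;
};
```

Keeping the tracked list in one place means render targets and other D3DPOOL_DEFAULT objects register themselves once and are then released and restored together, which is the bookkeeping the text recommends.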
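  If you really cannot be bothered to manage resources, put everything in D3DPOOL_MANAGED. As for render targets, you will have to work something out yourself.

『Causing a "device loss" on purpose』

  "Why would anyone cause a device loss on purpose?" Changing the game's resolution or color depth, or switching between full-screen and windowed mode, also goes through Reset. As before, you must release all D3DPOOL_DEFAULT resources before the Reset (strictly speaking more kinds of resources have to be released, but 2D code basically never creates them, so you need not care) and call ID3DXSprite::OnLostDevice and the like. This is the man-made "device loss". The device is not really lost during this process; it is only unusable for a while. Until Reset returns, the whole D3D device seems dead. For example, when you change the desktop resolution the monitor shows nothing for a moment and then quickly returns to normal; the cause is the same. Remember to restore your resources once Reset succeeds.
  You may have noticed that this Reset is not the same as the Reset above. That is right: here the purpose is to change settings, not to recover the device. So the Reset that changes resolution or color depth belongs outside the switch; do not mix the two -_-bb. All you need is OnLostDevice -> Reset -> OnResetDevice. Remember that calling Reset correctly does not cause device loss; do not confuse the two ideas.

『Things to watch when switching to full-screen mode』

  Mind the change in window style. After switching to full-screen mode you can only use WS_POPUP, otherwise the display looks odd. You can change the window's appearance with SetWindowLongPtr, passing GWL_STYLE as the second parameter. Do not forget WS_VISIBLE, or you will see nothing.

『More detailed documentation』

  I have only briefly discussed the causes of device loss and how to handle it. For more detail, see the Lost Device article in the DX SDK documentation; it is the authority.

【That ends the main feature; the ED follows】

  We have moved on to D3D9 and caught up with the times.
  We have created a full-screen game and caught up with the times.
  My head, however, has turned to mush, and the audience has deserted me.
  Just joking, but this episode really is messy. Both the move to D3D9 and device loss involve many scattered details, so the episode ended up scattered as well (and once again there is no code attached). Please bear with it; if you cannot, feel free to come and PIA me.

  For more on moving to D3D9, see "Converting to Direct3D 9" in the SDK documentation.

【That ends the ED; what follows is...】

  Season one is complete.
  Looking back, from the first episode, which created a Windows window, to this one on device loss, the topics have gone steadily deeper, and we have now reached the point where this is "research" rather than "learning". I no longer want only to "teach"; I want to "discuss" with you. Season one was still mostly teaching, though. Anyone who has followed the D2D tutorial this far should be able to write a complete 2D demo. If you have questions, ask; I will answer as soon as I see them, as long as the question is not too outrageous.
  So what will season two look like?
  Season two will no longer be teaching; it starts a discussion between you and me. In its first episode, episode 09, I will present some advanced techniques and hope that interested readers will research them with me. In season two we will also build a 2D graphics engine. I had planned to walk through Medux 2, but it now feels too childish and would surely earn your scorn, so I will simply introduce Mercury 3 instead. Any objections?
  A hint about the next episode: blur precision and multi-pass texture rendering. It sounds advanced, but it is actually simple; it only depends on whether you think of it.
  I hope that after this episode you will go back and reread the earlier ones; I am sure you will learn something new. You may even catch a few bugs! I write whatever comes to mind without much structure, so bugs are inevitable.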
 


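Appendix:

Fonts and text display in Direct3D<br>

A graphics program often needs to show text on screen to report information about the running application. ID3DXFont, an interface from the Direct3D extension library, offers a convenient way to do this.

Creating an ID3DXFont object

To draw text with ID3DXFont, first create an ID3DXFont object with D3DXCreateFont(). The ID3DXFont interface wraps a Windows font and a Direct3D device pointer, and D3DXCreateFont() builds the object from those two. The function is declared as follows: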

Creates a font object for a device and font.

HRESULT D3DXCreateFont(
  LPDIRECT3DDEVICE9 pDevice,
  INT Height,
  UINT Width,
  UINT Weight,
  UINT MipLevels,
  BOOL Italic,
  DWORD CharSet,
  DWORD OutputPrecision,
  DWORD Quality,
  DWORD PitchAndFamily,
  LPCTSTR pFacename,
  LPD3DXFONT * ppFont
);
Parameters
pDevice
[in] Pointer to an IDirect3DDevice9 interface, the device to be associated with the font object.
Height
[in] The height of the characters in logical units.
Width
[in] The width of the characters in logical units.
Weight
[in] Typeface weight. One example is bold.
MipLevels
[in] The number of mipmap levels.
Italic
[in] True for italic font, false otherwise.
CharSet
[in] The character set of the font.
OutputPrecision
[in] Specifies how Windows should attempt to match the desired font sizes and characteristics with actual fonts. Use OUT_TT_ONLY_PRECIS for instance, to ensure that you always get a TrueType font.
Quality
[in] Specifies how Windows should match the desired font with a real font. It applies to raster fonts only and should not affect TrueType fonts.
PitchAndFamily
[in] Pitch and family index.
pFacename
[in] String containing the typeface name. If the compiler settings require Unicode, the data type LPCTSTR resolves to LPCWSTR. Otherwise, the string data type resolves to LPCSTR. See Remarks.
ppFont
[out] Returns a pointer to an ID3DXFont interface, representing the created font object.
Return Values
If the function succeeds, the return value is S_OK. If the function fails, the return value can be one of the following: D3DERR_INVALIDCALL, D3DXERR_INVALIDDATA, E_OUTOFMEMORY.

Remarks
The creation of an ID3DXFont object requires that the device supports 32-bit color.

The compiler setting also determines the function version. If Unicode is defined, the function call resolves to D3DXCreateFontW. Otherwise, the function call resolves to D3DXCreateFontA because ANSI strings are being used.

If you want more information about font parameters, see The Logical Font.

Sample code:</p>

D3DXCreateFont(g_device, 50, 20, 20, 0, FALSE, DEFAULT_CHARSET, 0, 0, 0, "Arial", &g_font);



RedLight 2009-09-12 15:40 Post a comment
]]>
Optimizing 3D graphics rendering pipeline load (repost)http://www.shnenglu.com/flashboy/archive/2009/09/01/94944.htmlRedLightRedLightTue, 01 Sep 2009 01:32:00 GMThttp://www.shnenglu.com/flashboy/archive/2009/09/01/94944.htmlhttp://www.shnenglu.com/flashboy/comments/94944.htmlhttp://www.shnenglu.com/flashboy/archive/2009/09/01/94944.html#Feedback0http://www.shnenglu.com/flashboy/comments/commentRss/94944.htmlhttp://www.shnenglu.com/flashboy/services/trackbacks/94944.html
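   A bottleneck before rasterization is generally called "transform bound"; a bottleneck after triangle setup is called "fill bound". To locate the bottleneck:
  • 1. Change the color depth of the frame buffer or render target (16 to 32 bit). If the frame rate changes, the bottleneck is the fill rate of the frame buffer (render target).
  • 2. Otherwise, try changing the texture size and texture filtering settings. If the frame rate changes, the bottleneck is texturing.
  • 3. Otherwise, change the resolution. If the frame rate changes, change the number of pixel shader instructions; if the frame rate changes again, the bottleneck is the pixel shader, otherwise it is in rasterization.
  • 4. Otherwise, change the size of the vertex format. If the frame rate changes, the bottleneck is graphics card bandwidth.
  • 5. If none of the above applies, the bottleneck is on the CPU side.
  • 36 optimization tips:
  • 1. Cut unused vertex data such as texture coordinates. If some objects use two sets and others use one, don't put them in the same vertex buffer; this reduces the amount of data transferred.
  • 2. Use multiple stream sources. When rendering a SkinMesh, for example, put data that changes every frame (positions, normals) in a dynamic VB and data that never changes (such as texture coordinates) in a static VB. This reduces the amount of data transferred.
  • 3. Prefer 16-bit index buffers and avoid 32-bit ones. They waste bandwidth, and not every graphics card supports 32-bit index buffers.
  • 4. Consider using a vertex shader to compute on data held in a static VB. SkinMesh vertices, for example, can be skinned in the vertex shader, which avoids sending data from AGP memory to the card every frame and lets you use a static VB.
  • 5. Never draw polygons with the Draw**UP family of functions.
  • 6. Plan video memory usage before designing the program, so that the frame buffer, textures and static VBs fit in the card's local memory.
  • 7. Try to make the vertex size a multiple of 32 bytes. Consider a compressed vertex format that the vertex shader unpacks, or leave padding so that the vertex size is exactly a multiple of 32 bytes.
  • 8. Keep vertices in the vertex buffer in drawing order as far as possible, and consider strips instead of lists.
  • 9. Where possible, use static vertex buffers instead of dynamic ones.
  • 10. Lock dynamic VBs with DISCARD to replace their contents and with NOOVERWRITE to append. Avoid calling lock with no flags (0).
  • 11. Lock as rarely as possible. Not everything needs its VB updated every frame; for character animation, updating the VB 30 times per second is generally enough.
  • 12. If there is too much vertex data to draw, consider LOD. Current cards draw very fast, however, so weigh whether LOD really pays off; overdoing LOD can move the bottleneck to the CPU.
  • 13. Avoid excessive per-vertex work such as too many lights or an overly complex lighting model. Automatic texture coordinate generation also adds per-vertex work, as does a texture matrix that is not the identity. So once a texture transform is no longer needed, set the texture matrix back to identity and adjust the texture coordinates instead.
  • 14. Avoid vertex shaders with too many instructions or branches; keep them short and simple. Prefer swizzling to mov.
  • 15. If the image-quality work (pixel shader) covers a large area and is complex, try full-screen antialiasing instead. It may be faster.
  • 16. Draw in front-to-back order where possible.
  • 17. Testing Z in the shader can avoid drawing invisible pixels, but NVIDIA advises against it for simple shaders (Don't do this in a simple shader).
  • 18. Where possible, use the vertex shader instead of the pixel shader, turning per-pixel work into per-vertex work.
  • 19. Keep textures small. Oversized textures can overload the texture cache and lower its hit rate, and can overflow video memory, in which case textures are fetched from system memory.
  • 20. Use 16-bit textures wherever possible, for example for environment maps or shadow maps. 32-bit textures are a waste for these.
  • 21. Consider DXT texture compression.
  • 22. Where possible, use simple texture filtering or mip maps. Avoid trilinear and anisotropic filtering unless necessary; light maps and environment maps rarely need them.
  • 23. Mark only textures that really need to change as Dynamic, and lock them with DISCARD and WRITEONLY.
  • 24. If there is too much frame buffer traffic, consider disabling Z-writes where possible, for example in the later passes of multipass rendering or for translucent geometry such as particle systems.
  • 25. Where possible, use alpha test instead of alpha blending.
  • 26. If you don't need a stencil buffer, use a 16-bit Z buffer.
  • 27. Shrink render target textures such as shadow maps and environment maps. They may look fine at a much smaller size.
  • 28. Clear the stencil and Z buffers together. They are one buffer to begin with.
  • 29. Minimize render state changes and draw as many polygons per call as you can. (How many depends on the card, but in practice the number is never very large unless you need no texture or render state changes at all.)
  • 30. Prefer shaders to the fixed pipeline.
  • 31. Prefer shaders to multipass rendering for effects.
  • 32. Create the important resources first (render targets, shaders, textures, VBs, IBs and so on) so that they are not placed in system memory when video memory runs out.
  • 33. Never create resources inside the render loop.
  • 34. Group by shader and texture before rendering: first by shader, then by texture.
  • 35. Clear the color, stencil and Z buffers in a single Clear call where possible.
  • 36. A vertex buffer works best at a size between 2 MB and 4 MB.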
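
Tips 29 and 34 above can be illustrated with a small sketch in plain C. It makes no graphics calls: `draw_item`, `compare_draws`, `count_state_changes` and `sort_draws` are hypothetical names, and integer ids stand in for real shader and texture handles. The idea is only to show that sorting by shader first and texture second reduces the number of switches a submission order causes.

```c
#include <stdlib.h>

/* Hypothetical draw record: integer ids stand in for shader/texture handles. */
typedef struct {
    int shader;
    int texture;
} draw_item;

/* Order by shader first, then by texture within the same shader. */
int compare_draws(const void *a, const void *b)
{
    const draw_item *x = a;
    const draw_item *y = b;
    if (x->shader != y->shader)
        return (x->shader > y->shader) - (x->shader < y->shader);
    return (x->texture > y->texture) - (x->texture < y->texture);
}

/* Count the shader and texture switches a submission order would cause. */
int count_state_changes(const draw_item *items, int n)
{
    int changes = 0;
    for (int i = 1; i < n; i++) {
        if (items[i].shader != items[i - 1].shader)
            changes++;
        if (items[i].texture != items[i - 1].texture)
            changes++;
    }
    return changes;
}

/* Sort the draw list in place so that equal states become adjacent. */
void sort_draws(draw_item *items, int n)
{
    qsort(items, (size_t)n, sizeof items[0], compare_draws);
}
```

For a list that alternates between two shader/texture pairs, calling `sort_draws` before submission typically cuts the count reported by `count_state_changes` substantially. Whether this helps a real renderer depends on how expensive each switch is on the target hardware.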


RedLight 2009-09-01 09:32 Post a comment
]]>
OpenGL Performance Optimization (repost)http://www.shnenglu.com/flashboy/archive/2009/08/25/94364.htmlRedLightRedLightTue, 25 Aug 2009 06:05:00 GMThttp://www.shnenglu.com/flashboy/archive/2009/08/25/94364.htmlhttp://www.shnenglu.com/flashboy/comments/94364.htmlhttp://www.shnenglu.com/flashboy/archive/2009/08/25/94364.html#Feedback0http://www.shnenglu.com/flashboy/comments/commentRss/94364.htmlhttp://www.shnenglu.com/flashboy/services/trackbacks/94364.html
SIGGRAPH '97

Course 24: OpenGL and Window System Integration

OpenGL Performance Optimization






1. Hardware vs. Software

OpenGL may be implemented by any combination of hardware and software. At the high-end, hardware may implement virtually all of OpenGL while at the low-end, OpenGL may be implemented entirely in software. In between are combination software/hardware implementations. More money buys more hardware and better performance.

Intro-level workstation hardware and the recent PC 3-D hardware typically implement point, line, and polygon rasterization in hardware but implement floating point transformations, lighting, and clipping in software. This is a good strategy since the bottleneck in 3-D rendering is usually rasterization and modern CPUs have sufficient floating point performance to handle the transformation stage.

OpenGL developers must remember that their application may be used on a wide variety of OpenGL implementations. Therefore one should consider using all possible optimizations, even those which have little return on the development system, since other systems may benefit greatly.

From this point of view it may seem wise to develop your application on a low-end system. There is a pitfall however; some operations which are cheap in software may be expensive in hardware. The moral is: test your application on a variety of systems to be sure the performance is dependable.



2. Application Organization

At first glance it may seem that the performance of interactive OpenGL applications is dominated by the performance of OpenGL itself. This may be true in some circumstances but be aware that the organization of the application is also significant.

2.1 High Level Organization

Multiprocessing

Some graphical applications have a substantial computational component other than 3-D rendering. Virtual reality applications must compute object interactions and collisions. Scientific visualization programs must compute analysis functions and graphical representations of data.

One should consider multiprocessing in these situations. By assigning rendering and computation to different threads they may be executed in parallel on multiprocessor computers.

For many applications, supporting multiprocessing is just a matter of partitioning the render and compute operations into separate threads which share common data structures and coordinate with synchronization primitives.

SGI's Performer is an example of a high level toolkit designed for this purpose.

Image quality vs. performance

In general, one wants high-speed animation and high-quality images in an OpenGL application. If you can't have both at once a reasonable compromise may be to render at low complexity during animation and high complexity for static images.

Complexity may refer to the geometric or rendering attributes of a database. Here are a few examples.

  • During interactive rotation (i.e. mouse button held down) render a reduced-polygon model. When drawing a static image draw the full polygon model.
  • During animation, disable dithering, smooth shading, and/or texturing. Enable them for the static image.
  • If texturing is required, use GL_NEAREST sampling and glHint( GL_PERSPECTIVE_CORRECTION_HINT, GL_FASTEST ).
  • During animation, disable antialiasing. Enable antialiasing for the static image.
  • Use coarser NURBS/evaluator tessellation during animation. Use glPolygonMode( GL_FRONT_AND_BACK, GL_LINE ) to inspect tessellation granularity and reduce if possible.

Level of detail management and culling

Objects which are distant from the viewer may be rendered with a reduced complexity model. This strategy reduces the demands on all stages of the graphics pipeline. Toolkits such as Inventor and Performer support this feature automatically.

Objects which are entirely outside of the field of view may be culled. This type of high level cull testing can be done efficiently with bounding boxes or spheres and have a major impact on performance. Again, toolkits such as Inventor and Performer have this feature.
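
The bounding-sphere cull test mentioned above can be sketched in a few lines of C. This is an illustration under stated assumptions, not code from Inventor or Performer: the `plane` and `sphere` types and `sphere_is_culled` are hypothetical, and the planes are assumed to have unit-length normals pointing toward the inside of the view volume.

```c
/* Plane a*x + b*y + c*z + d = 0 with a unit-length normal (a,b,c)
 * pointing toward the inside of the view volume. */
typedef struct { float a, b, c, d; } plane;

/* Bounding sphere of an object. */
typedef struct { float x, y, z, radius; } sphere;

/* Returns 1 if the sphere lies entirely outside at least one plane
 * (the object can be skipped), 0 if it may be visible. */
int sphere_is_culled(const sphere *s, const plane *planes, int num_planes)
{
    for (int i = 0; i < num_planes; i++) {
        /* signed distance from the sphere center to the plane */
        float dist = planes[i].a * s->x + planes[i].b * s->y
                   + planes[i].c * s->z + planes[i].d;
        if (dist < -s->radius)
            return 1;
    }
    return 0;
}
```

The test is conservative: a sphere that straddles a plane is kept, so some invisible objects may still be drawn, but no visible object is dropped. An application would call it once per object with the six planes of its view frustum before issuing any OpenGL calls for that object.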

2.2 Low Level Organization

The objects which are rendered with OpenGL have to be stored in some sort of data structure. Some data structures are more efficient than others with respect to how quickly they can be rendered.

Basically, one wants data structures which can be traversed quickly and passed to the graphics library in an efficient manner. For example, suppose we need to render a triangle strip. The data structure which stores the list of vertices may be implemented with a linked list or an array. Clearly the array can be traversed more quickly than a linked list. The way in which a vertex is stored in the data structure is also significant. High performance hardware can process vertices specified by a pointer more quickly than those specified by three separate parameters.

An Example

Suppose we're writing an application which involves drawing a road map. One of the components of the database is a list of cities specified with a latitude, longitude and name. The data structure describing a city may be:
	struct city {
		float latitude, longitude;	/* city location */
		char *name;			/* city's name */
		int large_flag;			/* 0 = small, 1 = large */
	};
A list of cities may be stored as an array of city structs.

Our first attempt at rendering this information may be:

	void draw_cities( int n, struct city citylist[] )
	{
		int i;
		for (i=0; i < n; i++) {
			if (citylist[i].large_flag) {
				glPointSize( 4.0 );
			}
			else {
				glPointSize( 2.0 );
			}
			glBegin( GL_POINTS );
			glVertex2f( citylist[i].longitude, citylist[i].latitude );
			glEnd();
			glRasterPos2f( citylist[i].longitude, citylist[i].latitude );
			glCallLists( strlen(citylist[i].name),
			             GL_BYTE,
			             citylist[i].name );
		}
	}
This is a poor implementation for a number of reasons:
  • glPointSize is called for every loop iteration.
  • only one point is drawn between glBegin and glEnd
  • the vertices aren't being specified in the most efficient manner
Here's a better implementation:
	void draw_cities( int n, struct city citylist[] )
	{
		int i;
		/* draw small dots first */
		glPointSize( 2.0 );
		glBegin( GL_POINTS );
		for (i=0; i < n ;i++) {
			if (citylist[i].large_flag==0) {
				glVertex2f( citylist[i].longitude, citylist[i].latitude );
			}
		}
		glEnd();
		/* draw large dots second */
		glPointSize( 4.0 );
		glBegin( GL_POINTS );
		for (i=0; i < n ;i++) {
			if (citylist[i].large_flag==1) {
				glVertex2f( citylist[i].longitude, citylist[i].latitude );
			}
		}
		glEnd();
		/* draw city labels third */
		for (i=0; i < n ;i++) {
			glRasterPos2f( citylist[i].longitude, citylist[i].latitude );
			glCallLists( strlen(citylist[i].name),
			             GL_BYTE,
			             citylist[i].name );
		}
	}
In this implementation we're only calling glPointSize twice and we're maximizing the number of vertices specified between glBegin and glEnd.

We can still do better, however. If we redesign the data structures used to represent the city information we can improve the efficiency of drawing the city points. For example:

	struct city_list {
		int num_cities;		/* how many cities in the list */
		float *position;	/* pointer to lat/lon coordinates */
		char **name;		/* pointer to city names */
		float size;		/* size of city points */
	};
Now cities of different sizes are stored in separate lists. Positions are stored sequentially in a dynamically allocated array. By reorganizing the data structures we've eliminated the need for a conditional inside the glBegin/glEnd loops. Also, we can render a list of cities using the GL_EXT_vertex_array extension if available, or at least use a more efficient version of glVertex and glRasterPos.
	/* indicates if server can do GL_EXT_vertex_array: */
	GLboolean varray_available;

	void draw_cities( struct city_list *list )
	{
		int i;
		/* default to glBegin/glEnd; cleared below if vertex arrays are used */
		GLboolean use_begin_end = GL_TRUE;

		/* draw the points */
		glPointSize( list->size );
	#ifdef GL_EXT_vertex_array
		if (varray_available) {
			glVertexPointerEXT( 2, GL_FLOAT, 0, list->num_cities, list->position );
			glDrawArraysEXT( GL_POINTS, 0, list->num_cities );
			use_begin_end = GL_FALSE;
		}
	#endif
		if (use_begin_end) {
			glBegin(GL_POINTS);
			for (i=0; i < list->num_cities; i++) {
				glVertex2fv( &list->position[i*2] );
			}
			glEnd();
		}

		/* draw city labels */
		for (i=0; i < list->num_cities ;i++) {
			glRasterPos2fv( &list->position[i*2] );
			glCallLists( strlen(list->name[i]),
			             GL_BYTE, list->name[i] );
		}
	}
As this example shows, it's better to know something about efficient rendering techniques before designing the data structures. In many cases one has to find a compromise between data structures optimized for rendering and those optimized for clarity and convenience.
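
The text does not show how the per-size lists are built from the original array of city structs. The following is one possible sketch, not the author's code: `build_city_list` and `free_city_list` are hypothetical helpers, and the positions are assumed to be stored as longitude followed by latitude so that each pair can be passed directly as (x, y).

```c
#include <stdlib.h>

struct city {
    float latitude, longitude;  /* city location */
    char *name;                 /* city's name */
    int large_flag;             /* 0 = small, 1 = large */
};

struct city_list {
    int num_cities;   /* how many cities in the list */
    float *position;  /* packed pairs: longitude, latitude (assumed order) */
    char **name;      /* city names, borrowed from the source array */
    float size;       /* size of city points */
};

/* Collect the cities whose large_flag matches into one contiguous list.
 * Returns 0 on success, -1 if allocation fails. */
int build_city_list(const struct city *cities, int n, int large_flag,
                    float point_size, struct city_list *out)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (cities[i].large_flag == large_flag)
            count++;

    size_t slots = (size_t)(count > 0 ? count : 1);  /* avoid zero-size malloc */
    out->num_cities = 0;
    out->size = point_size;
    out->position = malloc(sizeof(float) * 2 * slots);
    out->name = malloc(sizeof(char *) * slots);
    if (out->position == NULL || out->name == NULL) {
        free(out->position);
        free(out->name);
        out->position = NULL;
        out->name = NULL;
        return -1;
    }

    for (int i = 0; i < n; i++) {
        if (cities[i].large_flag != large_flag)
            continue;
        int k = out->num_cities++;
        out->position[2 * k]     = cities[i].longitude;
        out->position[2 * k + 1] = cities[i].latitude;
        out->name[k] = cities[i].name;
    }
    return 0;
}

/* Release the arrays owned by the list (the name strings are not owned). */
void free_city_list(struct city_list *list)
{
    free(list->position);
    free(list->name);
    list->position = NULL;
    list->name = NULL;
    list->num_cities = 0;
}
```

The conversion runs once when the data is loaded, so the conditional on `large_flag` is paid there rather than on every redraw.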

In the following sections the techniques for maximizing performance, as seen above, are explained.



3. OpenGL Optimization

There are many ways to improve OpenGL performance. The impact of any single optimization can vary a great deal depending on the OpenGL implementation. Interestingly, items which have a large impact on software renderers may have no effect on hardware renderers, and vice versa! For example, smooth shading can be expensive in software but free in hardware, while glGet* can be cheap in software but expensive in hardware.

After each of the following techniques look for a bracketed list of symbols which relates the significance of the optimization to your OpenGL system:

  • H - beneficial for high-end hardware
  • L - beneficial for low-end hardware
  • S - beneficial for software implementations
  • all - probably beneficial for all implementations

3.1 Traversal

Traversal is the sending of data to the graphics system. Specifically, we want to minimize the time taken to specify primitives to OpenGL.
Use connected primitives
Connected primitives such as GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, and GL_QUAD_STRIP require fewer vertices to describe an object than individual line, triangle, or polygon primitives. This reduces data transfer and transformation workload. [all]
Use the vertex array extension
On some architectures function calls are somewhat expensive so replacing many glVertex/glColor/glNormal calls with the vertex array mechanism may be very beneficial. [all]
Store vertex data in consecutive memory locations
When maximum performance is needed on high-end systems it's good to store vertex data in contiguous memory to maximize throughput of data from host memory to the graphics subsystem. [H,L]
Use the vector versions of glVertex, glColor, glNormal and glTexCoord
The glVertex, glColor, etc. functions which take a pointer to their arguments such as glVertex3fv(v) may be much faster than those which take individual arguments such as glVertex3f(x,y,z) on systems with DMA-driven graphics hardware. [H,L]
Reduce quantity of primitives
Be careful not to render primitives which are over-tessellated. Experiment with the GLU primitives, for example, to determine the best compromise of image quality vs. tessellation level. Textured objects in particular may still be rendered effectively with low geometric complexity. [all]
Display lists
Use display lists to encapsulate frequently drawn objects. Display list data may be stored in the graphics subsystem rather than host memory thereby eliminating host-to-graphics data movement. Display lists are also very beneficial when rendering remotely. [all]
Don't specify unneeded per-vertex information
If lighting is disabled don't call glNormal. If texturing is disabled don't call glTexCoord, etc.
Minimize code between glBegin/glEnd
For maximum performance on high-end systems it's extremely important to send vertex data to the graphics system as fast as possible. Avoid extraneous code between glBegin/glEnd.

Example:

	glBegin( GL_TRIANGLE_STRIP );
	for (i=0; i < n; i++) {
		if (lighting) {
			glNormal3fv( norm[i] );
		}
		glVertex3fv( vert[i] );
	}
	glEnd();

This is a very bad construct. The following is much better:

	if (lighting) {
		glBegin( GL_TRIANGLE_STRIP );
		for (i=0; i < n ;i++) {
			glNormal3fv( norm[i] );
			glVertex3fv( vert[i] );
		}
		glEnd();
	}
	else {
		glBegin( GL_TRIANGLE_STRIP );
		for (i=0; i < n ;i++) {
			glVertex3fv( vert[i] );
		}
		glEnd();
	}
Also consider manually unrolling important rendering loops to maximize the function call rate.
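
The saving from connected primitives noted earlier in this section can be put in numbers. The two helper functions below are hypothetical and make no OpenGL calls; they only count how many vertices must be specified for n triangles drawn independently versus as a single triangle strip.

```c
/* Vertices specified for n triangles drawn as independent GL_TRIANGLES. */
int vertices_independent(int n)
{
    return 3 * n;
}

/* Vertices specified for n triangles drawn as one GL_TRIANGLE_STRIP:
 * the first triangle needs three, each further triangle adds one. */
int vertices_strip(int n)
{
    return n > 0 ? n + 2 : 0;
}
```

For long strips the ratio approaches three to one, which is why stripping reduces both traversal and transformation load. Real meshes rarely form a single strip, so the practical gain is somewhat lower.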

3.2 Transformation

Transformation includes the transformation of vertices from glVertex to window coordinates, clipping and lighting.

Lighting
  • Avoid using positional lights, i.e. light positions should be of the form (x,y,z,0) [L,S]
  • Avoid using spotlights. [all]
  • Avoid using two-sided lighting. [all]
  • Avoid using negative material and light color coefficients [S]
  • Avoid using the local viewer lighting model. [L,S]
  • Avoid frequent changes to the GL_SHININESS material parameter. [L,S]
  • Some OpenGL implementations are optimized for the case of a single light source.
  • Consider pre-lighting complex objects before rendering, à la radiosity. You can get the effect of lighting by specifying vertex colors instead of vertex normals. [S]
Two sided lighting
If you want both the front and back of polygons shaded the same try using two light sources instead of two-sided lighting. Position the two light sources on opposite sides of your object. That way, a polygon will always be lit correctly whether it's back or front facing. [L,S]
Disable normal vector normalization when not needed
glEnable/Disable(GL_NORMALIZE) controls whether normal vectors are scaled to unit length before lighting. If you do not use glScale you may be able to disable normalization without ill effects. Normalization is disabled by default. [L,S]
Use connected primitives
Connected primitives such as GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, and GL_QUAD_STRIP decrease traversal and transformation load.
glRect usage
If you have to draw many rectangles consider using glBegin(GL_QUADS) ... glEnd() instead. [all]

3.3 Rasterization

Rasterization is the process of generating the pixels which represent points, lines, polygons, bitmaps and the writing of those pixels to the frame buffer. Rasterization is often the bottleneck in software implementations of OpenGL.
Disable smooth shading when not needed
Smooth shading is enabled by default. Flat shading doesn't require interpolation of the four color components and is usually faster than smooth shading in software implementations. Hardware may perform flat and smooth-shaded rendering at the same rate though there's at least one case in which smooth shading is faster than flat shading (E&S Freedom). [S]
Disable depth testing when not needed
Background objects, for example, can be drawn without depth testing if they're drawn first. Foreground objects can be drawn without depth testing if they're drawn last. [L,S]
Disable dithering when not needed
This is easy to forget when developing on a high-end machine. Disabling dithering can make a big difference in software implementations of OpenGL on lower-end machines with 8 or 12-bit color buffers. Dithering is enabled by default. [S]
Use back-face culling whenever possible.
If you're drawing closed polyhedra or other objects for which back facing polygons aren't visible there's probably no point in drawing those polygons. [all]
The GL_SGI_cull_vertex extension
SGI's Cosmo GL supports a new culling extension which looks at vertex normals to try to improve the speed of culling.
Avoid extra fragment operations
Stenciling, blending, stippling, alpha testing and logic ops can all take extra time during rasterization. Be sure to disable the operations which aren't needed. [all]
Reduce the window size or screen resolution
A simple way to reduce rasterization time is to reduce the number of pixels drawn. If a smaller window or reduced display resolution are acceptable it's an easy way to improve rasterization speed. [L,S]

3.4 Texturing

Texture mapping is usually an expensive operation in both hardware and software. Only high-end graphics hardware can offer free to low-cost texturing. In any case there are several ways to maximize texture mapping performance.
Use efficient image formats
The GL_UNSIGNED_BYTE component format is typically the fastest for specifying texture images. Experiment with the internal texture formats offered by the GL_EXT_texture extension. Some formats are faster than others on some systems (16-bit texels on the Reality Engine, for example). [all]
Encapsulate texture maps in texture objects or display lists
This is especially important if you use several texture maps. By putting textures into display lists or texture objects the graphics system can manage their storage and minimize data movement between the client and graphics subsystem. [all]
Use smaller texture maps
Smaller images can be moved from host to texture memory faster than large images. More small textures can be stored simultaneously in texture memory, reducing texture memory swapping. [all]
Use simpler sampling functions
Experiment with the minification and magnification texture filters to determine which performs best while giving acceptable results. Generally, GL_NEAREST is fastest and GL_LINEAR is second fastest. [all]
Use the same sampling function for minification and magnification
If both the minification and magnification filters are GL_NEAREST or GL_LINEAR then there's no reason OpenGL has to compute the lambda value which determines whether to use minification or magnification sampling for each fragment. Avoiding the lambda calculation can be a good performance improvement.
Use a simpler texture environment function
Some texture environment modes may be faster than others. For example, the GL_DECAL or GL_REPLACE_EXT functions for 3 component textures are a simple assignment of texel samples to fragments, while GL_MODULATE multiplies texel samples by the incoming fragment colors. [S,L]
Combine small textures
If you are using several small textures consider tiling them together as a larger texture and modify your texture coordinates to address the subtexture you want. This technique can eliminate texture bindings.
Use glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_FASTEST)
This hint can improve the speed of texturing when perspective- correct texture coordinate interpolation isn't needed, such as when using a glOrtho() projection.
Animated textures
If you want to use an animated texture, perhaps live video textures, don't use glTexImage2D to repeatedly change the texture. Use glTexSubImage2D or glCopyTexSubImage2D. These functions are standard in OpenGL 1.1 and available as extensions to 1.0.
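
For the "Combine small textures" item above, the only arithmetic involved is mapping a tile index to a rectangle of texture coordinates. The sketch below assumes the small textures are packed as a regular grid of equally sized tiles, numbered row by row starting at texture coordinate (0,0); `tex_rect` and `atlas_tile_rect` are hypothetical names.

```c
/* Texture-coordinate rectangle of one tile inside a larger texture. */
typedef struct { float s0, t0, s1, t1; } tex_rect;

/* The large texture is assumed to be a grid of cols x rows equal tiles,
 * numbered row by row starting at texture coordinate (0,0). */
tex_rect atlas_tile_rect(int cols, int rows, int tile)
{
    tex_rect r;
    int col = tile % cols;
    int row = tile / cols;
    r.s0 = (float)col / (float)cols;
    r.t0 = (float)row / (float)rows;
    r.s1 = (float)(col + 1) / (float)cols;
    r.t1 = (float)(row + 1) / (float)rows;
    return r;
}
```

The corners of the returned rectangle replace the 0 and 1 texture coordinates that would have been used with a separate texture. With GL_LINEAR filtering, neighboring tiles can bleed into each other at the edges, so an application may want to inset the rectangle slightly or leave a border between tiles.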

3.5 Clearing

Clearing the color, depth, stencil and accumulation buffers can be time consuming, especially when it has to be done in software. There are a few tricks which can help.
Use glClear carefully [all]
Clear all relevant buffers with one glClear.

Wrong:

	glClear( GL_COLOR_BUFFER_BIT );
	if (stenciling) {
		glClear( GL_STENCIL_BUFFER_BIT );
	}
Right:

	if (stenciling) {
		glClear( GL_COLOR_BUFFER_BIT | GL_STENCIL_BUFFER_BIT );
	}
	else {
		glClear( GL_COLOR_BUFFER_BIT );
	}
Disable dithering
Disable dithering before clearing the color buffer. Visually, the difference between dithered and undithered clears is usually negligible.
Use scissoring to clear a smaller area
If you don't need to clear the whole buffer use glScissor() to restrict clearing to a smaller area. [L].
Don't clear the color buffer at all
If the scene you're drawing opaquely covers the entire window there is no reason to clear the color buffer.
Eliminate depth buffer clearing
If the scene you're drawing covers the entire window there is a trick which lets you omit the depth buffer clear. The idea is to only use half the depth buffer range for each frame and alternate between using GL_LESS and GL_GREATER as the depth test function.

Example:

	int EvenFlag;

	/* Call this once during initialization and whenever the window
	 * is resized.
	 */
	void init_depth_buffer( void )
	{
		glClearDepth( 1.0 );
		glClear( GL_DEPTH_BUFFER_BIT );
		glDepthRange( 0.0, 0.5 );
		glDepthFunc( GL_LESS );
		EvenFlag = 1;
	}

	/* Your drawing function */
	void display_func( void )
	{
		if (EvenFlag) {
			glDepthFunc( GL_LESS );
			glDepthRange( 0.0, 0.5 );
		}
		else {
			glDepthFunc( GL_GREATER );
			glDepthRange( 1.0, 0.5 );
		}
		EvenFlag = !EvenFlag;
		/* draw your scene */
	}
Avoid glClearDepth( d ) where d!=1.0
Some software implementations may have optimized paths for clearing the depth buffer to 1.0. [S]
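
To see why the alternating depth range trick described above works without a clear, it can help to simulate a single pixel in plain C. The sketch below is an illustration only: `depth_sim` and its functions are hypothetical, no OpenGL is involved, and the mapping from normalized depth to stored depth simply mirrors the two glDepthRange calls in the example.

```c
typedef struct {
    double depth;    /* stored depth of one simulated pixel */
    int even_flag;   /* plays the role of EvenFlag */
    int use_less;    /* depth test chosen for the current frame */
} depth_sim;

/* Mirrors init_depth_buffer: buffer cleared to 1.0, first frame uses GL_LESS. */
void depth_sim_init(depth_sim *s)
{
    s->depth = 1.0;
    s->even_flag = 1;
    s->use_less = 1;
}

/* Start a frame WITHOUT clearing; mirrors the flag handling in display_func. */
void depth_sim_begin_frame(depth_sim *s)
{
    s->use_less = s->even_flag;
    s->even_flag = !s->even_flag;
}

/* zn is the normalized depth in [0,1], 0 = nearest.
 * Returns 1 if the fragment passes the depth test and is written. */
int depth_sim_fragment(depth_sim *s, double zn)
{
    double z;
    int pass;
    if (s->use_less) {
        z = 0.5 * zn;          /* glDepthRange(0.0, 0.5): near -> 0.0 */
        pass = z < s->depth;
    } else {
        z = 1.0 - 0.5 * zn;    /* glDepthRange(1.0, 0.5): near -> 1.0 */
        pass = z > s->depth;
    }
    if (pass)
        s->depth = z;
    return pass;
}
```

In an even frame every stored value ends up at or below 0.5, so in the following odd frame the first fragment at a pixel lands at or above 0.5 and passes GL_GREATER against the stale value, after which nearer fragments keep winning as usual. This only holds if every pixel is covered in every frame, which is the precondition stated in the text.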

3.6 Miscellaneous

Avoid "round-trip" calls
Calls such as glGetFloatv, glGetIntegerv, glIsEnabled, glGetError, glGetString require a slow, round trip transaction between the application and renderer. Especially avoid them in your main rendering code.

Note that software implementations of OpenGL may actually perform these operations faster than hardware systems. If you're developing on a low-end system be aware of this fact. [H,L]

Avoid glPushAttrib
If only a few pieces of state need to be saved and restored it's often faster to maintain the information in the client program. glPushAttrib( GL_ALL_ATTRIB_BITS ) in particular can be very expensive on hardware systems. This call may be faster in software implementations than in hardware. [H,L]
Check for GL errors during development
During development call glGetError inside your rendering/event loop to catch errors. GL errors raised during rendering can slow down rendering speed. Remove the glGetError call for production code since it's a "round trip" command and can cause delays. [all]
Use glColorMaterial instead of glMaterial
If you need to change a material property on a per vertex basis, glColorMaterial may be faster than glMaterial. [all]
glDrawPixels
  • glDrawPixels often performs best with GL_UNSIGNED_BYTE color components [all]
  • Disable all unnecessary raster operations before calling glDrawPixels. [all]
  • Use the GL_EXT_abgr extension to specify color components in alpha, blue, green, red order on systems which were designed for IRIS GL. [H,L].
Avoid using viewports which are larger than the window
Software implementations may have to do additional clipping in this situation. [S]
Alpha planes
Don't allocate alpha planes in the color buffer if you don't need them. Specifically, they are not needed for transparency effects. Systems without hardware alpha planes may have to resort to a slow software implementation. [L,S]
Accumulation, stencil, overlay planes
Do not allocate accumulation, stencil or overlay planes if they are not needed. [all]
Be aware of the depth buffer's depth
Your OpenGL may support several different sizes of depth buffers, 16 and 24-bit for example. Shallower depth buffers may be faster than deep buffers both for software and hardware implementations. However, the precision of a 16-bit depth buffer may not be sufficient for some applications. [L,S]
Transparency may be implemented with stippling instead of blending
If you need simple transparent objects consider using polygon stippling instead of alpha blending. The former is typically faster and may actually look better in some situations. [L,S]
Group state changes together
Try to minimize the number of GL state changes in your code. When GL state is changed, internal state may have to be recomputed, introducing delays. [all]
Avoid using glPolygonMode
If you need to draw many polygon outlines or vertex points use glBegin with GL_POINTS, GL_LINES, GL_LINE_LOOP or GL_LINE_STRIP instead as it can be much faster. [all]

3.7 Window System Integration

Minimize calls to the make current call
The glXMakeCurrent call, for example, can be expensive on hardware systems because the context switch may involve moving a large amount of data in and out of the hardware.
Visual / pixel format performance
Some X visuals or pixel formats may be faster than others. On PCs for example, 24-bit color buffers may be slower to read/write than 12 or 8-bit buffers. There is often a tradeoff between performance and quality of frame buffer configurations. 12-bit color may not look as nice as 24-bit color. A 16-bit depth buffer won't have the precision of a 24-bit depth buffer.

The GLX_EXT_visual_rating extension can help you select visuals based on performance or quality. GLX 1.2's visual caveat attribute can tell you if a visual has a performance penalty associated with it.

It may be worthwhile to experiment with different visuals to determine if there's any advantage of one over another.

Avoid mixing OpenGL rendering with native rendering
OpenGL allows both itself and the native window system to render into the same window. For this to be done correctly synchronization is needed. The GLX glXWaitX and glXWaitGL functions serve this purpose.

Synchronization hurts performance. Therefore, if you need to render with both OpenGL and native window system calls try to group the rendering calls to minimize synchronization.

For example, if you're drawing a 3-D scene with OpenGL and displaying text with X, draw all the 3-D elements first, call glXWaitGL to synchronize, then call all the X drawing functions.

Don't redraw more than necessary
Be sure that you're not redrawing your scene unnecessarily. For example, expose/repaint events may come in batches describing separate regions of the window which must be redrawn. Since one usually redraws the whole window image with OpenGL you only need to respond to one expose/repaint event. In the case of X, look at the count field of the XExposeEvent structure. Only redraw when it is zero.

Also, when responding to mouse motion events you should skip extra motion events in the input queue. Otherwise, if you try to process every motion event and redraw your scene there will be a noticeable delay between mouse input and screen updates.

It can be a good idea to put a print statement in your redraw and event loop function so you know exactly what messages are causing your scene to be redrawn, and when.

SwapBuffer calls and graphics pipe blocking
On systems with 3-D graphics hardware the SwapBuffers call is synchronized to the monitor's vertical retrace. Input to the OpenGL command queue may be blocked until the buffer swap has completed. Therefore, don't put more OpenGL calls immediately after SwapBuffers. Instead, put application computation instructions which can overlap with the buffer swap delay.

3.8 Mesa-specific

Mesa is a free library which implements most of the OpenGL API in a compatible manner. Since it is a software library, performance depends a great deal on the host computer. There are several Mesa-specific features to be aware of which can affect performance.

Double buffering
The X driver supports two back color buffer implementations: Pixmaps and XImages. The MESA_BACK_BUFFER environment variable controls which is used. Which of the two that's faster depends on the nature of your rendering. Experiment.
X Visuals
As described above, some X visuals can be rendered into more quickly than others. The MESA_RGB_VISUAL environment variable can be used to determine the quickest visual by experimentation.
Depth buffers
Mesa may use a 16 or 32-bit depth buffer as specified in the src/config.h configuration file. 16-bit depth buffers are faster but may not offer the precision needed for all applications.
Flat-shaded primitives
If one is drawing a number of flat-shaded primitives all of the same color the glColor command should be put before the glBegin call.

Don't do this:

	glBegin(...);
	glColor(...);
	glVertex(...);
	...
	glEnd();

Do this:

	glColor(...);
	glBegin(...);
	glVertex(...);
	...
	glEnd();
glColor*() commands
The glColor[34]ub[v] are the fastest versions of the glColor command.
Avoid double precision valued functions
Mesa does all internal floating point computations in single precision floating point. API functions which take double precision floating point values must convert them to single precision. This can be expensive in the case of glVertex, glNormal, etc.


4. Evaluation and Tuning

To maximize the performance of an OpenGL application one must be able to evaluate the application to learn what is limiting its speed. Because of the hardware involved it's not sufficient to use ordinary profiling tools. Several different aspects of the graphics system must be evaluated.

Performance evaluation is a large subject and only the basics are covered here. For more information see "OpenGL on Silicon Graphics Systems".

4.1 Pipeline tuning

The graphics system can be divided into three subsystems for the purpose of performance evaluation:
  • CPU subsystem - application code which drives the graphics subsystem
  • Geometry subsystem - transformation of vertices, lighting, and clipping
  • Rasterization subsystem - drawing filled polygons, line segments and per-pixel processing
At any given time, one of these stages will be the bottleneck. The bottleneck must be reduced to improve performance. The strategy is to isolate each subsystem in turn and evaluate changes in performance. For example, by decreasing the workload of the CPU subsystem one can determine if the CPU or graphics system is limiting performance.

4.1.1 CPU subsystem

To isolate the CPU subsystem one must reduce the graphics workload while preserving the application's execution characteristics. A simple way to do this is to replace glVertex and glNormal calls with glColor calls. If performance does not improve then the CPU stage is the bottleneck.

4.1.2 Geometry subsystem

To isolate the geometry subsystem one wants to reduce the number of primitives processed, or reduce the transformation work per primitive, while producing the same number of pixels during rasterization. This can be done by replacing many small polygons with fewer large ones or by simply disabling lighting or clipping. If performance increases then your application is bound by geometry/transformation speed.

4.1.3 Rasterization subsystem

A simple way to reduce the rasterization workload is to make your window smaller. Another way is to disable per-pixel processing such as texturing, blending, or depth testing. If performance increases, your program is fill limited.

After bottlenecks have been identified the techniques outlined in section 3 can be applied. The process of identifying and reducing bottlenecks should be repeated until no further improvements can be made or your minimum performance threshold has been met.

4.2 Double buffering

For smooth animation one must maintain a high, constant frame rate. Double buffering has an important effect on this. Suppose your application needs to render at 60Hz but is only getting 30Hz. It's a mistake to think that you must reduce rendering time by 50% to achieve 60Hz. The reason is that the swap-buffers operation is synchronized to occur during the display's vertical retrace period (at 60Hz for example). It may be that your application is taking only a tiny bit too long to meet the 1/60 second rendering time limit for 60Hz, so each frame waits for the following retrace and the displayed rate drops to 30Hz.

Measure the performance of rendering in single buffer mode to determine how far you really are from your target frame rate.
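
The effect described above can be sketched numerically. The following is a minimal illustration, not code from the original article: the function name effective_fps is hypothetical, and it assumes an idealized swap that always waits for the next vertical retrace.

```cpp
#include <cassert>
#include <cmath>

// Displayed frame rate when buffer swaps are locked to the vertical retrace.
// A frame that misses a retrace waits for the next one, so every frame
// occupies a whole number of retrace periods.
// Assumption: idealized model with no driver queuing or triple buffering.
double effective_fps(double render_seconds, double refresh_hz) {
    assert(render_seconds > 0.0 && refresh_hz > 0.0);
    // Number of retrace periods one frame occupies (at least one).
    const double periods = std::ceil(render_seconds * refresh_hz);
    return refresh_hz / periods;
}
```

Under this model a frame that takes 17 ms on a 60Hz display is shown at 30Hz, while one that takes 16 ms is shown at 60Hz, which is why measuring in single buffer mode gives a truer picture of how much time has to be saved.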

4.3 Test on several implementations

The performance of OpenGL implementations varies a lot. One should measure performance and test OpenGL applications on several different systems to be sure there are no unexpected problems.




RedLight 2009-08-25 14:05

Loading and displaying .X files without DirectX: OpenGL rendering (repost)
http://www.shnenglu.com/flashboy/archive/2009/05/20/83507.html
RedLight, Wed, 20 May 2009 12:15:00 GMT

What you need to know before interviewing for a game programming position (repost)
http://www.shnenglu.com/flashboy/archive/2009/01/04/71169.html
RedLight, Sun, 04 Jan 2009 14:02:00 GMT

Using lighting to convey a sense of depth (repost)
http://www.shnenglu.com/flashboy/archive/2009/01/03/71060.html
RedLight, Sat, 03 Jan 2009 05:59:00 GMT

Using blending to achieve translucency
http://www.shnenglu.com/flashboy/archive/2009/01/03/71057.html
RedLight, Sat, 03 Jan 2009 05:45:00 GMT
U理的用入?/title><link>http://www.shnenglu.com/flashboy/archive/2009/01/03/71056.html</link><dc:creator>RedLight</dc:creator><author>RedLight</author><pubDate>Sat, 03 Jan 2009 05:43:00 GMT</pubDate><guid>http://www.shnenglu.com/flashboy/archive/2009/01/03/71056.html</guid><wfw:comment>http://www.shnenglu.com/flashboy/comments/71056.html</wfw:comment><comments>http://www.shnenglu.com/flashboy/archive/2009/01/03/71056.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.shnenglu.com/flashboy/comments/commentRss/71056.html</wfw:commentRss><trackback:ping>http://www.shnenglu.com/flashboy/services/trackbacks/71056.html</trackback:ping><description><![CDATA[     摘要:   <a href='http://www.shnenglu.com/flashboy/archive/2009/01/03/71056.html'>阅读全文</a><img src ="http://www.shnenglu.com/flashboy/aggbug/71056.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.shnenglu.com/flashboy/" target="_blank">RedLight</a> 2009-01-03 13:43 <a href="http://www.shnenglu.com/flashboy/archive/2009/01/03/71056.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>OpenGL片断试http://www.shnenglu.com/flashboy/archive/2009/01/03/71055.htmlRedLightRedLightSat, 03 Jan 2009 05:42:00 GMThttp://www.shnenglu.com/flashboy/archive/2009/01/03/71055.htmlhttp://www.shnenglu.com/flashboy/comments/71055.htmlhttp://www.shnenglu.com/flashboy/archive/2009/01/03/71055.html#Feedback0http://www.shnenglu.com/flashboy/comments/commentRss/71055.htmlhttp://www.shnenglu.com/flashboy/services/trackbacks/71055.html阅读全文

RedLight 2009-01-03 13:42

Low-level mistakes in OpenGL programming (repost)
http://www.shnenglu.com/flashboy/archive/2008/09/27/62913.html
RedLight, Sat, 27 Sep 2008 09:13:00 GMT
地Ş׃n面的点法线的计?/title><link>http://www.shnenglu.com/flashboy/archive/2008/09/19/62263.html</link><dc:creator>RedLight</dc:creator><author>RedLight</author><pubDate>Fri, 19 Sep 2008 03:45:00 GMT</pubDate><guid>http://www.shnenglu.com/flashboy/archive/2008/09/19/62263.html</guid><wfw:comment>http://www.shnenglu.com/flashboy/comments/62263.html</wfw:comment><comments>http://www.shnenglu.com/flashboy/archive/2008/09/19/62263.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.shnenglu.com/flashboy/comments/commentRss/62263.html</wfw:commentRss><trackback:ping>http://www.shnenglu.com/flashboy/services/trackbacks/62263.html</trackback:ping><description><![CDATA[     摘要:   <a href='http://www.shnenglu.com/flashboy/archive/2008/09/19/62263.html'>阅读全文</a><img src ="http://www.shnenglu.com/flashboy/aggbug/62263.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.shnenglu.com/flashboy/" target="_blank">RedLight</a> 2008-09-19 11:45 <a href="http://www.shnenglu.com/flashboy/archive/2008/09/19/62263.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>骨骼动画解释(?http://www.shnenglu.com/flashboy/archive/2008/09/19/62260.htmlRedLightRedLightFri, 19 Sep 2008 03:18:00 GMThttp://www.shnenglu.com/flashboy/archive/2008/09/19/62260.htmlhttp://www.shnenglu.com/flashboy/comments/62260.htmlhttp://www.shnenglu.com/flashboy/archive/2008/09/19/62260.html#Feedback0http://www.shnenglu.com/flashboy/comments/commentRss/62260.htmlhttp://www.shnenglu.com/flashboy/services/trackbacks/62260.html阅读全文

RedLight 2008-09-19 11:18

How to use the MAXScript debugger (repost)
http://www.shnenglu.com/flashboy/archive/2008/09/19/62239.html
RedLight, Fri, 19 Sep 2008 00:33:00 GMT

Author: 李英? (repost)

 

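  Personally I have not used the 3ds max 8 script debugger much; here I describe how it works and its basic usage. The script debugger is simply a debugging program built into max 8. It can pause the 3ds max main thread and other threads. The threads command lists all current 3ds max threads and their ids, and setThread with a thread id switches the thread being debugged. By default the debugger works on the topmost thread. (By "topmost" I mean something like a stack of plates at home: the top plate is the one put on last, and plates can only be taken off starting from the top.) 3ds max itself uses many threads for some operations; rendering, for example, creates a new thread. For the difference between threads and processes, consult a programming book. The current debugger in 3ds max 8 does not support setting a breakpoint by selecting a line with the mouse (at least I have not found such a feature; please tell me if you do), so my verdict is that it is not a very practical tool. Still, 3ds max 8 provides it and it is useful, just slightly cumbersome.

  To use the script debugger (assuming you have already written a script), first set breakpoints where you want the script to pause. A breakpoint is set with the break() function: the program pauses when execution reaches it. When you run the script, the "Output" pane of the debugger shows the corresponding thread information. (My guess is that 3ds max creates a thread for every script it runs, which is why the debugger can trace scripts. Can anyone confirm this? My e-mail: liyingjang@21cn.com.) When the script stops at a breakpoint, you can use getVar "" and setVar "" to inspect a variable and to change it temporarily. A more convenient way to inspect a variable is to click the "Watch" button, type the variable name and press Enter; its value is then displayed. Clicking the "Run" button continues the script until the next breakpoint, where variables can be inspected and set in the same way.

  To sum up: the debugger in 3ds max 8 is still far from complete, you could even call it ugly, and I hope later versions of Max improve it (sadly, I saw no change at all in the MAX 9 debugger). Poor as it is, it is not useless: at least you need fewer print and format calls to inspect run-time variables and debug scripts.

  Below is a simple test script; when i reaches 10 it starts the script debugger.

global ps=10
for i=0 to 100 do
(
    ps = ps + i
    if i==10 do break()
)

To inspect the variable i at the breakpoint, click the "Watch" button, type i and press Enter; the value of i is shown as 10.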

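RedLight 2008-09-19 08:33

Real-time terrain rendering based on quadtree space partitioning
http://www.shnenglu.com/flashboy/archive/2008/05/11/49572.html
RedLight, Sun, 11 May 2008 13:54:00 GMT

    Terrain rendering has long been an important research area in computer graphics, and many excellent algorithms have come out of it. They include voxel-based methods and polygon-based methods. Early games such as Delta Force are successful examples of voxel rendering. The voxel method resembles ray tracing: starting from screen space, it finds where the ray cast from each screen pixel meets the terrain and then determines that pixel's color. The method does not depend on any particular graphics hardware, and the whole process runs on the CPU, so it cannot be accelerated by modern hardware. A scene usually contains more than terrain, too: other objects are described with polygons, and an image rendered with voxels is hard to combine with polygons rendered in hardware. For these reasons the voxel method is now rarely used and polygon rendering has become the mainstream. There are many reasons to describe and render terrain with polygons. The most important are that polygons make good use of hardware acceleration and that the terrain can be managed in the same way as every other polygonal object.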

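    Many good polygon-based terrain rendering algorithms exist. A classic one is the ROAM algorithm proposed by M. Duchaineau et al. It describes the whole terrain with a binary triangle tree. At the coarsest level the terrain consists of two large isosceles right triangles, which can be subdivided repeatedly to reveal more detail. Each subdivision adds a vertex, taken from the elevation data, at the midpoint of a triangle's hypotenuse and splits that triangle in two. The algorithm also defines rules that keep cracks from appearing where two adjacent triangles have different levels of detail. Many games have adopted it. Another family of algorithms obtains different levels of detail by regularly subdividing the terrain's projection onto the X-Z plane; the quadtree space partitioning algorithm presented in this article belongs to that family. A more recent algorithm also deserves mention: Geometry Clipmaps, proposed by Hugues Hoppe in 2004. It uses vertex textures, supported by the newest hardware, to define the shape of the terrain, and uses different texture levels for regions at different distances from the camera, so that the hardware accelerates as much of the rendering as possible. The method sounds wonderful, but at present little hardware supports it: vertex textures are a Shader Model 3.0 feature, so only DirectX 9.0c-class graphics cards can run the algorithm. That is a drawback for graphics applications that must run on a wide range of machines, games in particular, so most people still use the classic terrain rendering methods.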

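    The quadtree-based method takes elevation data (a heightmap) as its input, and the algorithm requires the heightmap to be a square with a side length of 2^n + 1. A heightmap is a grayscale image with values in the range 0-255, where different gray levels stand for different heights. If the highest point of the heightmap is specified to have a Y coordinate of 4000, then a pixel with value 255 means that the terrain it represents has a height of 4000, and a pixel with value 127 means a height of 4000 × (127/255) ≈ 1992, roughly half the maximum. Every pixel of the heightmap corresponds to one vertex of the rendered mesh. Two further parameters are needed: the horizontal distance between neighboring vertices and the maximum height. The basic terrain data structure is therefore: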

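    struct Terrain
    {
        unsigned char **DEM; // two-dimensional array holding the elevation samples (0-255)
        float CellSpace;     // horizontal distance between neighboring vertices
        float HeightScale;   // height represented by a sample value of 255
    };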

    The meaning of each variable is illustrated in the figure below:
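
    The mapping from heightmap samples to mesh vertices described above can be sketched as follows. This is an illustration under assumptions, not the author's code: the names sample_to_height and vertex_position are hypothetical, the heightmap is stored row-major in a std::vector rather than as char**, and the terrain is centered on the origin as the article does later.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Converts one 8-bit elevation sample to a height.
// height_scale plays the role of Terrain::HeightScale (height at sample 255).
float sample_to_height(unsigned char sample, float height_scale) {
    return height_scale * (static_cast<float>(sample) / 255.0f);
}

// Position of the vertex at grid cell (col, row) of a size x size heightmap.
// cell_space plays the role of Terrain::CellSpace.
// Assumption: the middle sample of the heightmap sits at X = Z = 0.
Vec3 vertex_position(const std::vector<unsigned char>& dem, int size,
                     int col, int row, float cell_space, float height_scale) {
    assert(size > 0 && static_cast<int>(dem.size()) == size * size);
    assert(col >= 0 && col < size && row >= 0 && row < size);
    const int half = (size - 1) / 2;
    return Vec3{ static_cast<float>(col - half) * cell_space,
                 sample_to_height(dem[row * size + col], height_scale),
                 static_cast<float>(row - half) * cell_space };
}
```

    With a height scale of 4000, a sample of 255 maps to 4000 and a sample of 127 to about 1992.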

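    With these parameters it is easy to build the polygon mesh that the heightmap describes. Once we have the mesh, we can simply put it into a vertex array, build an index list for it and hand it to the hardware for rendering. Things are not that simple, however. For a small heightmap (say 129×129) this really does work, but as the heightmap grows, the number of vertices and triangles needed explodes, and soon becomes more than even the newest graphics card can handle. A 1025×1025 heightmap, for example, needs 1025 × 1025 = 1,050,625 vertices and 1024 × 1024 × 2 = 2,097,152 triangles (two per grid cell). Even if your graphics card can render 10 million triangles per second, you will get fewer than 5 fps, and the scene probably contains more than terrain. We must therefore find a way to reduce the number of rendered triangles without hurting the visual result, and we should also upload as much data as possible to the hardware in advance to save bandwidth.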
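
    The arithmetic above can be checked with a few lines of code. This is only a sketch; the names MeshSize, full_resolution_mesh and frames_per_second are hypothetical, and the frame-rate estimate assumes triangle throughput is the only limit.

```cpp
#include <cassert>
#include <cstdint>

struct MeshSize {
    std::uint64_t vertices;
    std::uint64_t triangles;
};

// Vertex and triangle counts of a regular grid rendered at full resolution:
// one vertex per heightmap sample and two triangles per grid cell.
MeshSize full_resolution_mesh(std::uint64_t side) {
    assert(side >= 2);
    const std::uint64_t cells = (side - 1) * (side - 1);
    return MeshSize{ side * side, cells * 2 };
}

// Upper bound on the frame rate when only triangle throughput limits rendering.
double frames_per_second(std::uint64_t triangles, double triangles_per_second) {
    assert(triangles > 0);
    return triangles_per_second / static_cast<double>(triangles);
}
```

    For a 1025×1025 heightmap and a card that renders 10 million triangles per second this gives about 4.8 fps, in line with the estimate in the text.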

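    The goal of the algorithm explained here is to reduce the number of rendered triangles, with no visible effect or with an effect that stays visually acceptable, so that rendering reaches real-time rates. In tests, the algorithm held a steady 150 fps or more while roaming a 1025*1025 terrain (measured on an nVidia Geforce 6200 with a 1.6GHz P4).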

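    Terrain covers a large area, but its projection onto the XZ plane is uniformly distributed (from here on we use the OpenGL coordinate system, with the Y axis pointing straight up), so it makes sense to partition the terrain spatially. The uniform distribution makes the partitioning easy: we never have to split an individual triangle, we only have to choose planes that pass through vertices and are perpendicular to the X or Z axis as partitioning planes. For a heightmap, for example, we can take the coordinate origin as the center of the terrain and lay out the vertices along the X and Z axes, as shown in the figure below.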

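    First we choose the two planes X=0 and Z=0 and split the terrain into four regions of equal size. We then partition the four sub-regions recursively, each time using the two mutually perpendicular planes that meet at the region's center, until every sub-region contains only a single terrain cell (two triangles) and cannot be split further. For the 9*9 terrain block in the figure above, the result of the partitioning is shown in the figure below: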

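    The figure shows that the terrain can be partitioned evenly only if the heightmap is a square with a side length of 2^n + 1. The result of the partitioning can be described as a tree; every split produces four child nodes, so the tree is called a quadtree. What should the tree store? First, each node must specify the terrain region it represents. We do not put the actual mesh vertices into the tree; the tree only states which part of the terrain a node covers. A node should, for example, have a Center(X,Y) variable giving the index, or number, of the vertex at the node's center. For convenience the center of the terrain can be numbered (0,0), with indices increasing along the coordinate axes. A second variable states how many vertices of the terrain the node covers, as shown in the figure below.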

    Our quadtree data structure so far is:

    struct QuadTreeNode
    {
        QuadTreeNode *Children[4];
        int CenterX,CenterY;
        int HalfRange;
    };
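
    The recursive partitioning can be sketched in code. This is a minimal illustration under assumptions rather than the author's implementation: it uses std::unique_ptr instead of raw pointers, the names build_quadtree and count_nodes are hypothetical, and recursion stops at a half range of 1 so that every node still has an integer center vertex.

```cpp
#include <cassert>
#include <memory>

struct QuadTreeNode {
    std::unique_ptr<QuadTreeNode> children[4]; // all null for a leaf
    int centerX = 0;   // vertex index of the node center, terrain center is (0,0)
    int centerY = 0;
    int halfRange = 0; // number of cells from the center to an edge of the node
};

// Builds the quadtree for the square region centered at (centerX, centerY).
// For a (2^n + 1)-sided heightmap the root is build_quadtree(0, 0, 2^(n-1)).
std::unique_ptr<QuadTreeNode> build_quadtree(int centerX, int centerY, int halfRange) {
    assert(halfRange >= 1 && (halfRange & (halfRange - 1)) == 0); // power of two
    auto node = std::make_unique<QuadTreeNode>();
    node->centerX = centerX;
    node->centerY = centerY;
    node->halfRange = halfRange;
    if (halfRange > 1) {
        const int h = halfRange / 2;
        const int dx[4] = { -1, 1, -1, 1 };
        const int dy[4] = { -1, -1, 1, 1 };
        for (int i = 0; i < 4; ++i)
            node->children[i] = build_quadtree(centerX + dx[i] * h, centerY + dy[i] * h, h);
    }
    return node;
}

// Counts the nodes of a (sub)tree.
int count_nodes(const QuadTreeNode& node) {
    int total = 1;
    for (const auto& child : node.children)
        if (child) total += count_nodes(*child);
    return total;
}
```

    For the 9*9 terrain block the root has a half range of 4, and the tree holds 1 + 4 + 16 = 21 nodes.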

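    How do we take advantage of the quadtree? We start with simple View Frustum Culling (VFC). Anyone who has done basic graphics optimization should know VFC: it discards polygons that lie clearly outside the view frustum before they are sent to the graphics card. The CPU does this work. It is simple but very effective. VFC proceeds as follows: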

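    1. Compute a bounding sphere for each node. The sphere can simply be centered on the node's center vertex, with a radius equal to the distance from that center to the maximum-coordinate point (the point whose coordinates are the largest X, Y and Z values of all vertices the node covers).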

    2. Compute the six plane equations of the current view frustum from the current projection and transformation matrices. For this step see an article on Azure's blog, which gives concrete VFC code.

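    3. Traverse the tree depth-first from the root. At every node visited, test the node's bounding sphere against the view frustum. The sphere intersects the frustum in either of these cases: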

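        1) The sphere center lies inside the convex region enclosed by the six planes.
        2) The sphere center lies outside the convex region enclosed by the six planes, but its distance to one of the planes is less than the radius.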

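    4. If the test shows that the bounding sphere and the view frustum intersect, continue recursively with the node's four children; if the node is already a leaf, it should be drawn. If they do not intersect, discard the node and do not check any of its children, because when a node is invisible its children are certainly invisible too.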
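
    The intersection test of step 3 can be sketched as follows. This is a common conservative variant, given as an illustration rather than as the code the article links to: a sphere is rejected only when it lies entirely behind one plane, so spheres near the frustum corners may occasionally be kept. The names Plane, Sphere, Frustum, sphere_intersects_frustum and make_box_frustum are hypothetical, and the planes are assumed to have unit normals pointing into the frustum.

```cpp
#include <array>
#include <cassert>

struct Plane { float a, b, c, d; };        // a*x + b*y + c*z + d = 0, unit normal pointing inward
struct Sphere { float x, y, z, radius; };
using Frustum = std::array<Plane, 6>;

// Returns false only when the sphere is completely behind one of the planes.
bool sphere_intersects_frustum(const Sphere& s, const Frustum& frustum) {
    assert(s.radius >= 0.0f);
    for (const Plane& p : frustum) {
        const float signedDistance = p.a * s.x + p.b * s.y + p.c * s.z + p.d;
        if (signedDistance < -s.radius)
            return false; // entirely outside this plane, so outside the frustum
    }
    return true;
}

// Axis-aligned box [-h, h]^3 written as six inward-facing planes.
// It stands in for a real view frustum in small experiments.
Frustum make_box_frustum(float h) {
    return Frustum{{
        {  1.0f,  0.0f,  0.0f, h }, { -1.0f,  0.0f,  0.0f, h },
        {  0.0f,  1.0f,  0.0f, h }, {  0.0f, -1.0f,  0.0f, h },
        {  0.0f,  0.0f,  1.0f, h }, {  0.0f,  0.0f, -1.0f, h }
    }};
}
```

    During the depth-first traversal a node whose sphere fails this test is skipped together with its whole subtree.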

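    This removes the terrain regions outside the view frustum and saves some resources, but it is not enough. In some situations VFC will still report that the whole terrain is visible, and drawing that many triangles is clearly not acceptable.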

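    We therefore also have to consider the terrain's level of detail (LOD). Terrain is never uniformly flat or uniformly steep. Describing flat parts with many triangles is pointless, whereas strongly undulating regions need many triangles to avoid a jagged, angular look. Furthermore, however rough the terrain is, regions far from the viewpoint do not deserve many resources: their projection on the screen is small, so simplifying them is necessary as well.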

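    Since regions with different amounts of relief are to get different levels of detail, we first need a quantity that describes the relief. Rather than relief, it is better to think of it as the error introduced at a point of the terrain when that point is simplified away. To compute this error we first have to understand how the terrain is simplified.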

    Consider the terrain block shown in the figure below; its rendered result is shown on the right of the figure.

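    To simplify the triangles to be rendered, we look at the vertex in the middle of each edge of the block (the red points on the left of the figure below):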

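    If these red points are removed, we get the simplified mesh shown on the right of the figure below. This is where the error arises. Once a red point is removed, the terrain height it used to represent is replaced by the height interpolated between the two black points on either side. The difference between the two heights is the error, as the figure below shows.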

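    For each node we therefore first compute the errors caused by removing the midpoints of its four edges, written ΔH1, ΔH2, ΔH3, ΔH4. If the node has children, we compute their errors recursively and write the errors of the four children as ΔHs1, ΔHs2, ΔHs3, ΔHs4. The error of the node is the largest of these eight values. Because the computation is recursive, it should be folded into the construction of the quadtree, and an error variable should be added to the quadtree data structure, as follows.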

    struct QuadTreeNode
    {
        QuadTreeNode *Children[4];
        int CenterX,CenterY;
        int HalfRange;
        float DeltaH;  // error value of the node
    };
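
    The recursive error computation can be sketched as below. It is an illustration under assumptions, not the author's code: the heightfield is a row-major std::vector<float>, coordinates run from 0 to size-1 instead of being centered on (0,0), and the names HeightField, edge_error and node_error are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct HeightField {
    int size;                  // side length, 2^n + 1
    std::vector<float> height; // row-major heights, size * size entries
    float at(int x, int y) const { return height[y * size + x]; }
};

// Error made by dropping edge midpoint (mx,my): the true height minus the
// height interpolated between the edge's two end points (ax,ay) and (bx,by).
float edge_error(const HeightField& f, int mx, int my, int ax, int ay, int bx, int by) {
    return std::fabs(f.at(mx, my) - 0.5f * (f.at(ax, ay) + f.at(bx, by)));
}

// DeltaH of the node centered at (cx,cy): the largest of its four edge-midpoint
// errors and, if the node can be split further, of its children's errors.
float node_error(const HeightField& f, int cx, int cy, int half) {
    assert(half >= 1);
    const int l = cx - half, r = cx + half, t = cy - half, b = cy + half;
    assert(l >= 0 && t >= 0 && r < f.size && b < f.size);
    float err = std::max({ edge_error(f, cx, t, l, t, r, t),    // top edge
                           edge_error(f, cx, b, l, b, r, b),    // bottom edge
                           edge_error(f, l, cy, l, t, l, b),    // left edge
                           edge_error(f, r, cy, r, t, r, b) }); // right edge
    if (half > 1) {
        const int q = half / 2;
        for (int dy = -q; dy <= q; dy += 2 * q)
            for (int dx = -q; dx <= q; dx += 2 * q)
                err = std::max(err, node_error(f, cx + dx, cy + dy, q));
    }
    return err;
}
```

    In a real implementation the value would be stored in DeltaH while the tree is built, so that rendering only has to read it.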

    Now let us look at how the terrain is actually rendered.

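    We start at the root of the quadtree and look at its error. If the error is below a threshold, we render a triangle fan directly from the root's center point and its four corner points, and that fan is the rendered terrain. Much more often the root's error is large, so the algorithm decides that the root must be subdivided to show more detail. The same step is then repeated for each child of the root: if the child's error is below the threshold, the child is rendered directly; if it is above the threshold, the child is subdivided recursively. The pseudocode of the algorithm so far is as follows.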

    procedure DrawTerrain(QuadTreeNode *node)
    {
      if (node->DeltaH > k)
      {
           for (i=0;i<4;i++)
           {
                DrawTerrain(node->Children[i]); // subdivide recursively
           }
      }
      else
      {
           GraphicsAPI->DrawPrimitive(node); // draw a triangle fan from the node's center point and four corner points
      }   
    }

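    This pseudocode expresses the basic idea of the algorithm at a high level, but many issues remain. The first is that so far we have only related the level of detail to the relief of the terrain surface; we should also relate it to the distance between a terrain block and the viewpoint. This is easy to solve: we only need to add distance to the condition in the pseudocode. That is, change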

        if (node->DeltaH > k)
        {
            ...
        }
        else ...

    to:

        if (node->DeltaH / d > k)
        {
            ...
        }
        else ...

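    where d is the distance between the node's center point and the viewpoint. In practice, making the level of detail inversely proportional to the square of the distance removes more triangles and looks better: as long as the threshold k is chosen well, you cannot notice the terrain changing shape as the viewpoint moves. Our final condition is therefore: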

    node->DeltaH / (d*d) > k
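
    A small sketch of this test follows. The function name needs_subdivision and its parameters are hypothetical. The inequality is multiplied through by d*d, which gives the same result for d > 0 while avoiding both the square root and a division by zero when the viewpoint sits exactly on the node center.

```cpp
#include <cassert>

// Decides whether a node has to be split.
// deltaH is the node error; (dx, dy, dz) is the vector from the viewpoint to
// the node center; k is the user-chosen threshold.
// Equivalent to deltaH / (d*d) > k for d > 0.
bool needs_subdivision(float deltaH, float dx, float dy, float dz, float k) {
    assert(deltaH >= 0.0f && k > 0.0f);
    const float distanceSquared = dx * dx + dy * dy + dz * dz;
    return deltaH > k * distanceSquared;
}
```

    With k = 0.05, a node with an error of 10 is split at a distance of 10 units but left whole at 100 units.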

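    Another important problem is that the terrain produced by this algorithm shows cracks where neighboring nodes have different levels of detail. The figure below shows how the cracks arise.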

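    There are two ways to solve this. One is to delete the vertex in the left node that causes the crack, so that the two edges coincide. The other is to insert an extra edge into the terrain block on the right, connecting its center point to the vertex that causes the crack, which closes the crack. When rendering the terrain, cracks can be avoided as follows: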

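    1. In a preprocessing step, create a flag array covering all vertices. Each flag records whether the node centered on that vertex has been subdivided in the current frame: 1 if it has, 0 otherwise.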

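    2. Starting at the root, traverse the quadtree breadth-first and use the condition above to decide whether each node needs to be split. If the formula says it does, and the center points of the four nodes adjacent to the node are all flagged 1, set the flags of the node and of its four children to 1 and subdivide the node. Otherwise set the node's flag to 1, set the flags of its four children to 0, and draw the terrain block as follows: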

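        1) Add the node's center point and four corner points to the list for the triangle fan about to be drawn.
        2) Check in turn the flags of the nodes adjacent to the four edges. If a flag is 1, add the midpoint of that edge to the fan's vertex list; otherwise skip that point.
        3) Draw the triangle fan.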
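
    Steps 1) to 3) can be sketched as a function that assembles the fan's vertex list. This is an illustration with hypothetical names (GridPoint, build_fan); the four booleans stand for the flags of the neighboring nodes, and the winding order shown is an assumption that has to match the culling mode of the renderer.

```cpp
#include <cassert>
#include <vector>

struct GridPoint { int x, y; };

// Vertex list of the triangle fan for the node centered at (cx, cy).
// An edge midpoint is included only when the neighbor across that edge is
// flagged as subdivided, so both sides of the edge share the same vertices.
std::vector<GridPoint> build_fan(int cx, int cy, int half,
                                 bool top, bool right, bool bottom, bool left) {
    assert(half >= 1);
    const int l = cx - half, r = cx + half, t = cy - half, b = cy + half;
    std::vector<GridPoint> fan;
    fan.push_back({cx, cy});              // fan center
    fan.push_back({l, t});                // first corner
    if (top)    fan.push_back({cx, t});   // midpoint of the top edge
    fan.push_back({r, t});
    if (right)  fan.push_back({r, cy});   // midpoint of the right edge
    fan.push_back({r, b});
    if (bottom) fan.push_back({cx, b});   // midpoint of the bottom edge
    fan.push_back({l, b});
    if (left)   fan.push_back({l, cy});   // midpoint of the left edge
    fan.push_back({l, t});                // repeat the first corner to close the fan
    return fan;
}
```

    The resulting list holds between 6 and 10 grid positions, which can be turned into vertex indices and drawn as one triangle fan.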

    Our final pseudocode is as follows.

bool IsNodeInFrustum(QuadTreeNode *node)
{
   return (node->BoundingSphere in frustum);
}

bool NeighbourIsValid(QuadTreeNode *node)
{
   return (all four neighbours of node are flagged 1);
}

void RenderTerrain()
{
   list<QuadTreeNode *> next, current, draw;
   int level = 0;
   current.push_back(root);
   while (current.size() != 0)
   {
      for each thisNode in current
      {
         if (!IsNodeInFrustum(thisNode))
            continue;
         if (level == MaxResolution)
            draw.push_back(thisNode);
         else if (thisNode->DeltaH / (d*d) > k   // d: distance from the viewpoint to the node center
                  && NeighbourIsValid(thisNode))
         {
            SetFlag(thisNode, 1);
            for j = 0 to 3
            {
               next.push_back(thisNode->Children[j]);
               SetFlag(thisNode->Children[j], 1);
            }
         }
         else
         {
            SetFlag(thisNode, 1);
            for j = 0 to 3
               SetFlag(thisNode->Children[j], 0);
            draw.push_back(thisNode);            // the node itself is drawn as one triangle fan
         }
      }
      SwapList(current, next);
      next.clear();
      level++;
   }
   GraphicsAPI->DrawPrimitives(draw);
}

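    Another important optimization is to use the hardware's vertex buffers, or vertex arrays on hardware without vertex buffer support. However the terrain is simplified, its vertex data never changes; the only thing generated dynamically each frame is the vertex indices. We should therefore load all of the terrain's vertex data into a vertex buffer in advance and, when rendering, send all the indices to the graphics card in one go. Experiments show that using a vertex buffer is more than 5 times faster than drawing directly with glBegin/glEnd.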

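    The above explains how to render large terrain in real time, mainly by using LOD and VFC to reduce the triangle count. VFC, however, can only discard geometry outside the view frustum; it can do nothing about geometry inside the frustum that is hidden by nearer objects. For self-occlusion culling of the terrain, the horizon algorithm is a good choice. But when the scene contains more than terrain, the horizon algorithm still handles only the terrain's self-occlusion, because it works only for 2.5D terrain (scenes whose projections onto the XZ plane do not overlap) and does not work well for fully 3D scenes. So when you use terrain in an engine, consider splitting it into blocks and putting them into the scene management tree, such as a BSP tree or an octree, and then use portals, PVS or occlusion culling, whichever suits the engine. It is worth mentioning that occlusion testing is a very flexible real-time culling technique that needs no precomputation, but implementing it efficiently is not easy. I once split the terrain into blocks and used occlusion culling for the terrain's self-occlusion; rendering did not get faster and actually became somewhat slower. Occlusion culling therefore has to be integrated with the engine so that occlusion tests are performed for the whole scene together; only then is it likely to improve performance.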

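    You should now understand the basic method of real-time terrain rendering. Making the terrain look more realistic takes more work: it needs texturing and lighting. Consider lighting first. The terrain's polygon mesh is generated in real time and changes as the viewpoint moves, so if you use OpenGL's built-in vertex lighting directly, the lighting will be extremely unstable and the terrain surface will appear to flicker as you move. We must use a different lighting method to avoid this, and light maps come to mind. Light mapping is a lighting technique commonly used in games. A light map is a texture that covers all polygons in the scene; by assigning values to its texels we can obtain complex lighting on the polygon surfaces. A light map computed with a good algorithm can simulate very realistic light and shadow, with a visual quality far beyond OpenGL's built-in lighting. For the computation of light maps, see an article I translated on the radiosity algorithm (Radiosity).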

   

   You can simply cover the terrain with a single texture, which makes it look slightly more realistic:

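    In the figure above we created a terrain and applied one repeating texture. Every region of the terrain then looks the same (all grass, for example), which is clearly unrealistic and rather dull. You could instead create one huge image and stretch it over the terrain surface, but the memory cost would be enormous and the approach quickly runs into hardware limits. We should use a better texturing method: texture index mapping.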

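    Texture index mapping indexes three repeatable textures to decide which blend of them is applied to each region of the terrain. Any texture consists of pixels with three color channels (R, G and B). The pixels of the index map do not represent the actual color of a terrain region; they represent which textures that region uses. Because the actual texture detail is stored in the three repeatable textures, the index map is also stretched over the terrain surface, but its resolution can be much lower.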

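    Texture index mapping works as follows. For each pixel that the terrain projects onto the screen, look up the index map pixel it maps to. The R, G and B components of that pixel give the blend factors of the textures that R, G and B stand for. The three repeatable textures are blended with these factors, and the resulting color is written to the screen.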

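    For example, let the R component of the index map stand for a desert texture, the G component for grass and the B component for rock. If a pixel of the index map has the value (0,255,0), that is green, the terrain region it corresponds to is textured with grass. If the pixel is (127,127,0), that is yellow, the region is textured with a blend of grass and desert and looks partly grassy and partly sandy. The figure below shows a sample index map and the result rendered with it.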

Index map (R = desert, G = grass, B = rock)

Rendered result
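
    The per-pixel blend can also be written out on the CPU to make the arithmetic explicit. This is only a sketch with hypothetical names (Color, blend_detail); all components are assumed to be in the range 0 to 1, and the final multiplication by a light map value follows the article's approach to lighting.

```cpp
#include <cassert>
#include <cmath>

struct Color { float r, g, b; };

// Blends three detail-texture samples with the index map's R, G and B
// components as weights, then modulates the result with the light map sample.
Color blend_detail(const Color& index, const Color& desert, const Color& grass,
                   const Color& rock, const Color& light) {
    return Color{
        light.r * (index.r * desert.r + index.g * grass.r + index.b * rock.r),
        light.g * (index.r * desert.g + index.g * grass.g + index.b * rock.g),
        light.b * (index.r * desert.b + index.g * grass.b + index.b * rock.b)
    };
}
```

    An index pixel of (0, 1, 0) returns the grass sample unchanged, apart from lighting, and (0.5, 0.5, 0) returns an even mix of desert and grass.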

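    The principle is simple; here is how to implement it. First prepare four textures: one index map, which is stretched over the whole terrain, and three detail textures, and bind them to their texture units. The vertex shader then computes the texture coordinates for every vertex. In the fragment shader, sample the index map and use the RGB components of the result to blend the three detail textures, which gives the color of the current pixel. Finally, multiply this color by the value from the light map to get the final result. The shader code, written in GLSL, follows.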

Vertex Shader:

uniform float TexInc;   // texture coordinate scale, applied to gl_TexCoord[2] (used by the fragment shader to sample the detail textures)
void main()
{
  gl_TexCoord[6] = gl_Vertex;
  gl_TexCoord[0] = gl_MultiTexCoord0;
  gl_TexCoord[2] = TexInc*vec4(gl_Vertex.xz,0.0,0.0);
  gl_Position = ftransform();
}

Fragment Shader:

uniform sampler2D IndexMap;
uniform sampler2D LightMap;
uniform sampler2D texR,texG,texB,texA;
void main()
{
  vec4 idx,lm,r,g,b,color;
  idx = texture2D(IndexMap,gl_TexCoord[0].xy); // index map value
  lm = texture2D(LightMap,gl_TexCoord[0].xy);  // light map value
  r = texture2D(texR,gl_TexCoord[2].xy);   // texture for the R channel
  g = texture2D(texG,gl_TexCoord[2].xy);   // texture for the G channel
  b = texture2D(texB,gl_TexCoord[2].xy);   // texture for the B channel
  color = lm*(idx.x*r + idx.y*g+idx.z*b);  // blended color
  gl_FragColor = color;
}

    Finally, if anything in this article is unclear, you are welcome to discuss it with me.



RedLight 2008-05-11 21:54
X文g的用(完整Q?/title><link>http://www.shnenglu.com/flashboy/archive/2008/05/11/49571.html</link><dc:creator>RedLight</dc:creator><author>RedLight</author><pubDate>Sun, 11 May 2008 13:32:00 GMT</pubDate><guid>http://www.shnenglu.com/flashboy/archive/2008/05/11/49571.html</guid><wfw:comment>http://www.shnenglu.com/flashboy/comments/49571.html</wfw:comment><comments>http://www.shnenglu.com/flashboy/archive/2008/05/11/49571.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.shnenglu.com/flashboy/comments/commentRss/49571.html</wfw:commentRss><trackback:ping>http://www.shnenglu.com/flashboy/services/trackbacks/49571.html</trackback:ping><description><![CDATA[     摘要:   <a href='http://www.shnenglu.com/flashboy/archive/2008/05/11/49571.html'>阅读全文</a><img src ="http://www.shnenglu.com/flashboy/aggbug/49571.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.shnenglu.com/flashboy/" target="_blank">RedLight</a> 2008-05-11 21:32 <a href="http://www.shnenglu.com/flashboy/archive/2008/05/11/49571.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>投媄矩阵的实C及如何从投媄矩阵中获取各视裁体^?教程)http://www.shnenglu.com/flashboy/archive/2008/04/16/47292.htmlRedLightRedLightWed, 16 Apr 2008 09:49:00 GMThttp://www.shnenglu.com/flashboy/archive/2008/04/16/47292.htmlhttp://www.shnenglu.com/flashboy/comments/47292.htmlhttp://www.shnenglu.com/flashboy/archive/2008/04/16/47292.html#Feedback0http://www.shnenglu.com/flashboy/comments/commentRss/47292.htmlhttp://www.shnenglu.com/flashboy/services/trackbacks/47292.html阅读全文

RedLight 2008-04-16 17:49

Coordinate system transformations in DirectX 9
http://www.shnenglu.com/flashboy/archive/2008/04/16/47289.html
RedLight, Wed, 16 Apr 2008 09:45:00 GMT

Mouse picking of primitives in Direct3D
http://www.shnenglu.com/flashboy/archive/2008/04/16/47288.html
RedLight, Wed, 16 Apr 2008 09:44:00 GMT