Subject:
Performance hints?
From: "Richard Copeman" ####@####.####
Date: 26 Nov 2007 12:04:14 +0000
Message-Id: <008401c83024$76d18e60$642aa8c0@Buffy>

Hi,

I am using the 16-bit frame buffer driver and am getting killed on performance. I have hooked up a JTAG debugger and run some profiling on the code, and it is spending about 35% of the CPU time in a small loop copying memory in linear16_drawhorzline(). It is here:

                  linear16_drawhorzline(PSD psd, MWCOORD x1, MWCOORD x2, MWCOOR
                  {
                      ADDR16 addr = psd->addr;

                      assert (addr != 0);
                      assert (x1 >= 0 && x1 < psd->xres);
                      assert (x2 >= 0 && x2 < psd->xres);
                      assert (x2 >= x1);
                      assert (y >= 0 && y < psd->yres);
                      assert (c < psd->ncolors);

                      DRAWON;
                      addr += x1 + y * psd->linelen;
                      if(gr_mode == MWMODE_COPY) {
                          /* FIXME: memsetw(dst, c, x2-x1+1)*/
    =========>>>>>        while(x1++ <= x2)
                              *addr++ = c;
                      } else {
                          while (x1++ <= x2) {
                              applyOp(gr_mode, c, addr, ADDR16);
                              ++addr;
                          }
                      }
                      DRAWOFF;

Can somebody tell me why the memsetw() function has been commented out with a FIXME? Surely that would have been a faster approach.

Maybe I am expecting too much of Nano-X. If so, can somebody please confirm or deny this for me?

I am trying to draw to an off-screen buffer (640x640x16bpp) and then flush it to the screen using GrArea(). Maybe there is a better way of doing it?

If I am barking up the wrong tree here, can some kind soul point me at a direct fb library I can use?

Thanks,
Richard.
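For reference, the FIXME in the listing above asks for a 16-bit fill routine. A minimal sketch of what such a routine might look like follows. The name memsetw16 is hypothetical (standard C has no memsetw), and the sketch assumes ADDR16 points at 16-bit pixels; it is not the Nano-X implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store the 16-bit value c into count consecutive
 * pixels starting at dst. Two pixels are packed into one 32-bit value so
 * that each memcpy can compile down to a single 32-bit store.
 */
static void memsetw16(uint16_t *dst, uint16_t c, size_t count)
{
    /* Both halves hold the same pixel, so byte order does not matter. */
    uint32_t pair = ((uint32_t)c << 16) | c;

    assert(dst != NULL);

    /* Fill two pixels at a time. */
    while (count >= 2) {
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }

    /* Fill the odd trailing pixel, if any. */
    if (count != 0)
        *dst = c;
}
```

In the MWMODE_COPY branch, the inner loop would then become a single call along the lines of memsetw16(addr, c, x2 - x1 + 1), matching the count given in the FIXME comment. Whether this actually beats the original loop depends on the compiler and the target CPU, so it is worth profiling again after the change.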
Subject:
Re: [nanogui] Performance hints?
From: "Greg Haerr" ####@####.####
Date: 27 Nov 2007 20:17:47 +0000
Message-Id: <100f01c83132$8013db90$0300a8c0@RDP>

> it is spending about 35% of the cpu time in a small loop copying memory in
> linear16_drawhorzline().

That's pretty interesting. Is it just drawhorzline, or is there significant time being spent in other low-level draw routines?

I don't recall why the memsetw is not in use; perhaps there wasn't a memsetw in the C library for the systems at the time. I would suggest that you uncomment it and see where the next bottleneck is. We should also be using Duff's loop unrolling in many places.

The blit routine is another major area that could be sped up. It's a bit of a longer discussion, since there are emulation layers upstairs in engine/devdraw.c, including GdArea and GdAreaInternal, that should be combined with GdBlit and the two or three low-level routines there now.

> I am trying to draw to an off screen buffer (640x640x16bpp) and then flush it to screen using GrArea(). Maybe there is a better way of doing it?

It would be better to use GrCopyArea in this case. That allows all bits to stay on the server. GrArea is usually used with GrReadArea, or when the bits are in the client process and need to be moved to the server.

Regards,

Greg
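As an illustration of the Duff's loop unrolling mentioned above, here is a minimal sketch of an 8x unrolled 16-bit fill. The function name fill16_duff is hypothetical and the code is not taken from Nano-X; it only shows the general shape of the technique, assuming 16-bit pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill count 16-bit pixels with c using Duff's
 * device. The switch jumps into the middle of the unrolled loop body to
 * handle the count % 8 leftover pixels on the first pass; every later
 * pass writes a full group of eight.
 */
static void fill16_duff(uint16_t *dst, uint16_t c, size_t count)
{
    size_t passes;

    assert(dst != NULL);
    if (count == 0)
        return;

    passes = (count + 7) / 8;   /* number of trips through the do-while */
    switch (count % 8) {
    case 0: do { *dst++ = c;    /* fall through */
    case 7:      *dst++ = c;    /* fall through */
    case 6:      *dst++ = c;    /* fall through */
    case 5:      *dst++ = c;    /* fall through */
    case 4:      *dst++ = c;    /* fall through */
    case 3:      *dst++ = c;    /* fall through */
    case 2:      *dst++ = c;    /* fall through */
    case 1:      *dst++ = c;
            } while (--passes > 0);
    }
}
```

The unrolling reduces the loop-condition checks to one per eight stores. Modern optimizing compilers often unroll simple fill loops on their own, so any gain from writing it by hand should be confirmed with the profiler on the actual target.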