nanogui: Ways to speed up a simple application?
Subject: Re: Ways to speed up a simple application?
From: "Aaron J. Grier" ####@####.####
Date: 18 Mar 2009 21:49:52 -0000
Message-Id: <20090318214842.GK3628@arwen.poofy.goof.com>
On Tue, Mar 17, 2009 at 10:47:23PM -0300, Ricardo Jasinski wrote:
> Aaron J. Grier ####@####.#### wrote:
> > to mitigate this a little, don't clear the existing screen until the
> > new one is ready. draw the new screen to an off-screen buffer, then
> > copy off-screen to on-screen. this won't make the change operation
> > any faster, but it will mean less time staring at a blank screen.
>
> Good point, I'll look into that. However, I'm not sure how it could be
> done, since all I do is update the button labels; everything else is
> done after the callback function ends, and control returns to the
> fltk::run loop.
at the FLTK level, this might be as simple as calling
Window::set_double_buffer().
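if you end up doing it by hand below the toolkit, the idea is roughly the following. this is only a sketch: the 320x240 16bpp geometry, backbuf, render_new_screen() and present() are all made up for illustration and are not FLTK or nano-X calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SCREEN_W = 320, SCREEN_H = 240 };    /* assumed panel size */

typedef uint16_t pixel_t;                   /* assumed 16bpp framebuffer */

/* Hypothetical off-screen buffer with the same layout as the screen. */
static pixel_t backbuf[SCREEN_W * SCREEN_H];

/* Draw the complete new screen off-screen; here it is just a solid fill
 * standing in for the real (slow) drawing code. */
static void render_new_screen(pixel_t *buf, pixel_t color)
{
    assert(buf != NULL);
    for (size_t i = 0; i < (size_t)SCREEN_W * SCREEN_H; i++)
        buf[i] = color;
}

/* Copy the finished frame to the visible framebuffer in one pass.  The old
 * screen stays visible until this call, so there is no blank period. */
static void present(pixel_t *screen, const pixel_t *buf)
{
    assert(screen != NULL && buf != NULL);
    memcpy(screen, buf, sizeof(pixel_t) * SCREEN_W * SCREEN_H);
}
```

the total work goes up by one copy, so this trades a little speed for never showing a half-drawn screen.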
> What do you think, should we implement these fonts as "built-in" /
> "compiled-in" fonts? Do you believe this could result in a significant
> speed boost?
start working on nano-X only if you've determined that freetype is a
problem. as a sanity check, switch to a built-in font (say, "System")
and see if the speed changes.
do you have a profiler? what is it showing you?
if freetype is eating all your cycles, working on nano-X is not going to
help you.
> Maybe [1-bit expansion] this is something I can implement in our
> system, since we are using a soft-core processor and custom
> instructions may be added to its instruction set. Btw, maybe you have
> an idea for an operation that could speed things up significantly if
> implemented in hardware?
low or zero-overhead copy/write loops are usually a big win for
graphics. I imagine that an accelerator like the Epson can take
advantage of DRAM's constant refreshing for blitting. I don't know how
much of this would be feasible with your hardware.
on the 68331 I was using, knowing the C constructs that translated into
zero-overhead loops gave big speed wins. (the 68k counterpart of "rep
stos" on the i386.)
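as a rough illustration of the kind of C construct I mean (fill16 is a made-up name, and whether your compiler actually turns this into a decrement-and-branch loop or a block store depends on the compiler and flags, so check the generated assembly):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `count` 16-bit pixels with `value`.
 *
 * The loop body is a single store with post-increment, and the loop
 * condition is a plain count-down.  That shape is the easiest one for a
 * compiler to map onto a decrement-and-branch instruction (DBRA on 68k)
 * or a block-store instruction, but nothing here guarantees it. */
static void fill16(uint16_t *dst, uint16_t value, size_t count)
{
    assert(dst != NULL || count == 0);
    while (count--)
        *dst++ = value;
}
```

the point is to keep function calls, bounds checks and address arithmetic out of the inner loop so there is nothing left in it but the store.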
> > in hindsight, creating a low-level DrawBitmap function (as indicated
> > in engine/devdraw.c) would likely give the widest benefit.
>
> Do you mean in case I had hardware accelerated graphics? Or is it
> something I can do with the framebuffer only?
removing per-pixel overhead when doing 1-to-screenbpp expansion writes
gave me a significant speed gain even on a dumb framebuffer.
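a minimal sketch of what I mean, assuming a 16bpp framebuffer and 1bpp source data with the leftmost pixel in the most significant bit. expand_row_1to16 is a hypothetical helper, not a function from the nano-X tree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand `width` pixels of a 1bpp row into 16bpp pixels.
 * Set bits become `fg`, clear bits become `bg`.
 * Assumes MSB-first bit order within each source byte. */
static void expand_row_1to16(uint16_t *dst, const uint8_t *src, size_t width,
                             uint16_t fg, uint16_t bg)
{
    assert(dst != NULL && src != NULL);
    const uint16_t lut[2] = { bg, fg };

    /* Whole source bytes: eight direct stores per byte, with no per-pixel
     * function call, clipping test, or address calculation. */
    while (width >= 8) {
        uint8_t bits = *src++;
        dst[0] = lut[(bits >> 7) & 1];
        dst[1] = lut[(bits >> 6) & 1];
        dst[2] = lut[(bits >> 5) & 1];
        dst[3] = lut[(bits >> 4) & 1];
        dst[4] = lut[(bits >> 3) & 1];
        dst[5] = lut[(bits >> 2) & 1];
        dst[6] = lut[(bits >> 1) & 1];
        dst[7] = lut[bits & 1];
        dst += 8;
        width -= 8;
    }

    /* Remaining 1..7 pixels from a partial final byte. */
    if (width) {
        uint8_t bits = *src;
        while (width--) {
            *dst++ = lut[(bits >> 7) & 1];
            bits = (uint8_t)(bits << 1);
        }
    }
}
```

clipping would be done once per row by the caller, before this runs, rather than once per pixel.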
> > how deep down the rabbit hole do you want to go? (=
>
> Let me think. I think I would start with the first choice in the list
> below, and go down all the way to the bottom, until I achieve the
> performance we need:
> - some structural change in my application source code
> - some hardware tweaking that could be done without touching any
> source code
> - driver optimizations that could be done without touching the
> application code
>
> And, what would be the most fun of all, implement some sort of
> hardware acceleration, but I don't think I'd have the time to do that
> within our current deadlines. Anyway, given enough dead ends, I've
> seen many deadlines change... :)
sounds like you need to answer the following before worrying about
nano-X:
- is FLTK double-buffering?
- is freetype eating your CPU time?
nano-X might end up being a wild goose chase...
--
Aaron J. Grier | "Not your ordinary poofy goof." | ####@####.####