nanogui: More on Image handling and optimizations

Previous by date:	26 Jan 2000 17:15:13 -0000 How to draw images in Microwindows/Nano-X, Greg Haerr
Next by date:	26 Jan 2000 17:15:13 -0000 Re: More on Image handling and optimizations, Greg Haerr
Previous in thread:	26 Jan 2000 17:15:13 -0000 Re: More on Image handling and optimizations, Kyle Harris
Next in thread:	26 Jan 2000 17:15:13 -0000 Re: More on Image handling and optimizations, Greg Haerr

Subject: Re: More on Image handling and optimizations
From: Morten Rolland ####@####.####
Date: 26 Jan 2000 17:15:13 -0000
Message-Id: <388F3698.8966EDFD@screenmedia.no>

Hello Greg and all!

Sorry for not being able to work as much on nano-X as
I'd like, and as is probably needed.  There are just
too many areas that need tending...  I have supplied
my patches to pre4 that illustrate our needs, and
a possible way to solve them.

Notice that I have implemented a 'driver_gc_t' low
level graphics context that works very well in the
low level Area handling code and the interface to
it.  It helps keep the low level drivers both
elegant and fast (only a pointer gets passed around
inside and into the driver).

> I spent the whole weekend coming up to speed on alpha
> blending.  I will have a version for Microwindows out immediately
> after 0.87.  We will support alpha blending for 16bpp and 32bpp
> directly, as well as producing RGB->palette index conversion
> tables for 8bpp systems.  (I have to make the 8bpp work,
> since that's the only mode I can get my damn framebuffer
> to operate in).  The table sizes for 8bpp will total 64k.  No
> tables are required for 16bpp or 32bpp truecolor.

OK, nice!  I have alpha blending as well, for 16 bits, though
not of the highest quality (there is some quantization noise in
the green channel, because the lookup of green is split into two
separate lookups plus an addition to save table space).

Do you unpack, multiply, add and pack for each pixel in
the 16 and 32 bit cases?  How fast is this method?
I rejected this method without even testing it because
I figured it would be too slow...
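
To make the question concrete, the per-pixel method could look
roughly like the sketch below.  It assumes RGB565 pixels and an
8-bit alpha value; the name blend565 is hypothetical and is not
taken from either code base.  How fast it is on a given CPU would
still have to be measured.

```c
#include <stdint.h>

/* Blend one RGB565 source pixel over a destination pixel.
 * alpha = 0 keeps the destination, alpha = 255 takes the source.
 * (Hypothetical helper, assuming a 5-6-5 channel layout.) */
static uint16_t blend565(uint16_t src, uint16_t dst, uint8_t alpha)
{
    /* Unpack the 5-6-5 channels. */
    unsigned sr = (src >> 11) & 0x1f, sg = (src >> 5) & 0x3f, sb = src & 0x1f;
    unsigned dr = (dst >> 11) & 0x1f, dg = (dst >> 5) & 0x3f, db = dst & 0x1f;
    unsigned inv = 255u - alpha;

    /* Multiply and add per channel, rounding to nearest. */
    unsigned r = (sr * alpha + dr * inv + 127u) / 255u;
    unsigned g = (sg * alpha + dg * inv + 127u) / 255u;
    unsigned b = (sb * alpha + db * inv + 127u) / 255u;

    /* Pack back into one 16-bit pixel. */
    return (uint16_t)((r << 11) | (g << 5) | b);
}
```

That is three multiply-add pairs and three divisions per pixel,
which is the cost a table-based approach tries to avoid.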

> In the beginning I plan to support constant alpha
> blending for an entire image, as well as per-pixel
> alpha blending with 24bpp images, with the alpha
> channel in an additional 8bpp.

OK, my code is ugly as it adds another pointer to memory
so that the 16-bit image data is kept separate from an 8-bit
alpha map.  This will require an API change, which I have
not figured out how to do yet.  One possible and
attractive solution is to make new Gd* and Gr*
functions.
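
As an illustration of the "separate alpha map" layout, here is a
minimal sketch.  The struct alpha_image, its fields, and the
functions mix and blend_row are all hypothetical names made up for
this example; the real patch and any future Gd*/Gr* signatures may
look quite different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: colour and alpha live in separate buffers. */
struct alpha_image {
    const uint16_t *pixels; /* RGB565 colour data, width * height entries */
    const uint8_t  *alpha;  /* one 8-bit alpha value per pixel, same layout */
    int width, height;
};

/* Mix one channel: a = 255 takes s, a = 0 keeps d. */
static unsigned mix(unsigned s, unsigned d, unsigned a)
{
    return (s * a + d * (255u - a) + 127u) / 255u;
}

/* Blend row y of img onto dst_row (RGB565, at least img->width pixels). */
static void blend_row(const struct alpha_image *img, int y, uint16_t *dst_row)
{
    const uint16_t *src = img->pixels + (size_t)y * img->width;
    const uint8_t  *a   = img->alpha  + (size_t)y * img->width;

    for (int x = 0; x < img->width; x++) {
        uint16_t s = src[x], d = dst_row[x];
        unsigned r = mix(s >> 11, d >> 11, a[x]);
        unsigned g = mix((s >> 5) & 0x3f, (d >> 5) & 0x3f, a[x]);
        unsigned b = mix(s & 0x1f, d & 0x1f, a[x]);
        dst_row[x] = (uint16_t)((r << 11) | (g << 5) | b);
    }
}
```

Keeping the alpha plane separate leaves the 16-bit colour data in
the same format as an image without alpha, at the price of the
extra pointer mentioned above.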

> : I have been thinking of requiring word or dword padding
> : on GrArea as well - better safe than sorry.
> 
> It would probably be a good idea to require dword padding now.
> then we would have a common format for all monochrome
> and color images within Microwindows, and the data could
> be used between all routines...

Yes... Too bad my version of an optimized pixel area function
uses the reverse of the bit order that is both natural and
common, because of the T1lib output format....  The reason the
Area function has been extended to support this is that I need
an optimized version in the driver, and Area was the natural
place as I didn't want to intrude too much on Blit.  They should
probably be merged.
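
If the two bit orders ever have to be converted into each other,
the per-byte operation is small.  The sketch below only shows the
generic bit reversal; the name reverse_bits8 is hypothetical, and
which order each side actually uses is not restated here.

```c
#include <stdint.h>

/* Reverse the bit order within one byte (bit 0 <-> bit 7, and so on). */
static uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2)); /* swap bit pairs */
    b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1)); /* swap neighbours */
    return b;
}
```

A 256-entry lookup table built once from this function would be
the usual way to apply it to whole scanlines.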

> This is a problem.  Microwindows, being message oriented,
> doesn't allocate any space for events, it just passes messages.
> Nano-X, however, is constantly allocating space for client
> connections and event structures.
> 
> In addition, with the new clipping code, we will be moving
> towards dynamic rectangle allocations.  [Yes, you can use
> the old clipping...]
> 
> I don't think it will be easy to meet your specific zero-memory
> after initialization need...

I don't really require *no* allocations after initialization, but
I'd like to be able to touch all the data-pages I *think* will
be needed, so that there is less chance of a sudden memory
shortage unless the server uses more memory than testing
indicated that it would (e.g. map a lot of pages initially
"for keeps" and have them used by malloc later on).
Maybe the page allocation code in Linux takes care of this
problem when there is no swap - I don't know.
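
A minimal sketch of the "touch the pages up front" idea, assuming
a POSIX system.  The function name alloc_touched is hypothetical.
Whether memory that is later freed stays mapped for reuse by
malloc depends on the allocator, so this only shows the touching
step for a pool that is kept.

```c
#include <stdlib.h>
#include <unistd.h>

/* Allocate a pool and write one byte per page, so that the kernel
 * has to back every page now rather than on first use later. */
static void *alloc_touched(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                /* fallback guess if the query fails */

    unsigned char *pool = malloc(size);
    if (pool == NULL)
        return NULL;

    /* volatile keeps the compiler from dropping the "useless" stores. */
    volatile unsigned char *p = pool;
    for (size_t off = 0; off < size; off += (size_t)page)
        p[off] = 0;
    if (size > 0)
        p[size - 1] = 0;            /* make sure the last page is hit too */

    return pool;
}
```

If the touching fails, it fails during initialization, where it is
easier to handle than in the middle of serving a client.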

We will probably have very few, maybe even zero, overlapping
windows, so the clipping code will be predictable, and events
and client connections are well behaved, I hope (little
fragmentation, same-size reuse etc... and *no* leaks? :-).
We'll expend quite some effort to ensure there are no
memory allocation problems, and to track down those that
exist, when bringing Nano-X to production quality.

Back to the server side image storage question; what I'm
afraid of in this respect is the nano-X server suddenly
dying because some seldom used app requests a lot of
server side memory while Opera or another app is busy using
all available memory.  Preparing for all such possibilities is
error prone and wastes memory when not needed (if
pre-allocated).

The way we handle images (mmap them into the client app, then
transfer them to the server) is better for our use, as a client
crash is more acceptable than a nano-X server crash...
Also, when mmaping the image file, the Linux buffer cache may
reuse the mmaped pages when they are deemed not needed (the
original contents can be read again from flash memory), and
common design themes can be shared easily.  This practically
eliminates the need for image data in RAM at the client (which is
mainly what inspired the "extract smaller image from larger"
feature of GrArea or similar).
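
The client side of this could be sketched as below, assuming a
POSIX system and, purely for illustration, an image file that is
raw 16bpp pixel data with no header.  The names map_file and
pixel_at are hypothetical.  pixel_at shows how a smaller image can
be addressed inside the larger mapping without copying it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only.  Returns NULL on failure; on success
 * the file length is stored in *len. */
static const void *map_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}

/* Address of pixel (x, y) in a raw 16bpp image; pitch is in bytes. */
static const uint16_t *pixel_at(const void *base, size_t pitch, int x, int y)
{
    return (const uint16_t *)((const unsigned char *)base
                              + (size_t)y * pitch) + x;
}
```

Because the mapping is read-only and file-backed, the kernel can
drop its pages under memory pressure and read them again on the
next access.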

Just a weird thought: we could mmap the image data
right into the server... The client says "Map this file" to the
server and has pieces of it painted later on... Hmmm.
However, most big images are dynamic, so the gain will
probably be small (images in Opera).

> : Yes... But we will probably not get around this completely
> : anyway - ie. doing word aligned memcopy on an 8 bit display
> : would restrict your choices on where to put the image...:-)
> 
> No - the DWORD padding just ensures that the source
> bitmap is aligned for high-speed access across cache lines,
> etc.  Images are NOT restricted to multiples of the padding, and
> the copy to the destination isn't restricted either.  This is
> just a speed issue, not a limitation on size or placement.

Good - that's what I figured too, without giving it too much
thought.  This is not in my patch, but I agree we should settle
for 4 byte alignment.
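
For reference, the bytes per source scanline under 4-byte padding
can be computed as in this small sketch.  The helper name
padded_pitch is hypothetical.

```c
#include <stddef.h>

/* Bytes per scanline when each row is padded up to a 4-byte boundary. */
static size_t padded_pitch(unsigned width, unsigned bits_per_pixel)
{
    size_t bits = (size_t)width * bits_per_pixel;
    return ((bits + 31) / 32) * 4;
}
```

A 33 pixel wide monochrome row thus takes 8 bytes, and a 3 pixel
wide 16bpp row also takes 8 bytes.  Only the stride of the source
data changes; the image width itself stays unrestricted.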

> : Yes, but *inter* device blitting?  (Blitting from one gfx card
> : to another...?)
> 
> That's a horse of a different wheelbase.  Adding plans for
> multiple graphics cards is probably not something
> Microwindows will do in the near future.

Yes, but the screen->memory optimization is still an issue, no?
But I agree we leave it for now - I was thinking long term.

> : If doing GrArea with blit, we need to setup a suitable psd for the
> : operation on every call to GrArea, which is kind of not needed.
> : One could pre-allocate a memory psd and only update the bits inside
> : it that are relevant to the blit in question, but this is kind of
> : an unclean situation.  I'd like the device-drivers and the memory
> : driver to fiddle with the internals of the psd as much as possible,
> : and not the engine code?
> 
> I've already accomplished this, but you may not have noticed.
> The first 10 words of a SCREENDEVICE are exactly what you're
> looking for.  The difference is that the function pointers are also
> included, so that another level of indirection isn't required.
> I haven't finished all the work with the function pointers, though.

No, not quite.  At least not in my version - I'd like to pass arguments
to the low level driver like "void *pixels" and "int dstx" for the
low level primitives.  Quickly changing arguments shouldn't
get passed in a psd IMHO.
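
A rough sketch of the split I mean.  The fields shown for
driver_gc_t and the function drv_put_row are hypothetical and only
illustrate the idea; they are not the layout used in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: slowly changing state, set up once. */
typedef struct {
    uint8_t *fb;        /* start of the destination framebuffer */
    int      pitch;     /* bytes per destination scanline */
    int      bpp_bytes; /* bytes per pixel */
} driver_gc_t;

/* Quickly changing values (pixels, dstx, dsty, width) arrive as
 * plain arguments; only one context pointer is passed along. */
static void drv_put_row(const driver_gc_t *gc, const void *pixels,
                        int dstx, int dsty, int width)
{
    uint8_t *dst = gc->fb + (size_t)dsty * gc->pitch
                          + (size_t)dstx * gc->bpp_bytes;
    memcpy(dst, pixels, (size_t)width * gc->bpp_bytes);
}
```

The engine code then never writes per-call values into a shared
structure before calling the driver.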

OK.  Have a look at the patch - you may want to remove the patches to
the config files; you either have them already, or they are specific
to my setup.

I'm not done yet, but it illustrates very well what functionality
we need, and the optimizations I've attempted may prove valuable
input, and I'd love comments!

Best regards,
Morten Rolland

PS: The code should also run under X11 and other modes than 16bpp by
    using the older code in GdArea as a fallback, but this is not
    very well tested...

[Content type application/octet-stream not shown.]
