Subject:
Re: Microwindows for Hercules
From: Ben Pfaff ####@####.####
Date: 16 Jul 1999 22:25:54 -0000
Message-Id: <87yaggi69b.fsf@pfaffben.user.msu.edu>

Chipzz ####@####.#### writes:

   > A good idea, almost. The BOGL library performs this for the packed pixel
   > modes, but the VGA requires OUT instructions in between memory accesses,
   > so it can't run on a generalized bit-depth algorithm in planes mode. (The VGA
   > design has to be seen/studied to be believed, I've never seen such a complicated
   > piece of hardware for something kinda-conceptually simple)

   Hmm then that's something that could be checked for in between STOSB
   instructions (or the like). We could for example use something like
   this (just an idea), where ? is a flag that isn't used (maybe the
   carry flag?):

      PUSH flags register
      CLI
      ...
      {If out needed} ST?
      ...
      {Bresenham}
      ...
      STOSB {or something like that, like OR}
      LOOPN?
      JN? :End
      ...
      {Perform OUT}
      ...
      LOOP
      ...
      :End
      ...
      POP flags register

   We could of course also use something else than a flag, like a
   register, if Bresenham doesn't already use all of them...
   Just an idea, I never did VGA 4 bit programming, I always used mode 13h.

Hmm, it's a good idea. Unfortunately, I don't think that it will work
out. The OUTs that need to be performed are not the same for each
pixel; rather, they are dependent on what needs to be written. At any
rate, adding those jumps will kill your performance on 8086-class
processors, since it increases code size (read Abrash's _Zen of
Assembly Language Programming_).

Most VGA16 code ends up looking something like this:

   Set up lots of internal VGA registers with OUT operations.

   Read a byte from the memory byte that contains the pixel(s) of
   interest to load the internal VGA latches.

   Write an arbitrary byte whose value doesn't matter to the same
   memory byte.

   Start over.

   What may be more of a problem is if we would use res > 320x200x256.
   These don't fit in one page, and we would have to do page swapping.
   (Except if we got a linear framebuffer). But there won't be many
   8088-80286 that support those res anyway..

Actually there's semi-standard 800x600x16 support that works on most
SVGA controllers using the same VGA16 code; that requires 800 * 600 /
2 = 240,000 bytes memory, but it's mapped into 800 * 600 / 8 = 60,000
bytes memory at one bit per pixel with the VGA16 code.
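The arithmetic in the last paragraph, together with the byte offset and bit position that VGA16 code works out before the latch-loading read, can be sketched in C. This is only an illustration under assumptions: the function names are invented here and belong to neither Microwindows nor BOGL, the screen width is assumed to be a multiple of 8, and no port I/O is done, so the OUT register setup is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Bytes needed to hold a 16-colour screen as packed 4-bit pixels. */
uint32_t packed4_bytes(uint32_t width, uint32_t height)
{
    return width * height / 2;
}

/* Bytes of CPU address space used in planar mode: one bit per pixel,
   because the four planes sit behind the same addresses. */
uint32_t planar_window_bytes(uint32_t width, uint32_t height)
{
    assert(width % 8 == 0);
    return width * height / 8;
}

/* Byte offset of pixel (x, y) inside that window. */
uint32_t planar_offset(uint32_t width, uint32_t x, uint32_t y)
{
    return y * (width / 8) + x / 8;
}

/* Bit selecting pixel x within its byte; the leftmost pixel of each
   byte is the most significant bit. */
uint8_t planar_mask(uint32_t x)
{
    return (uint8_t)(0x80u >> (x & 7));
}
```

For 800x600 this gives 240,000 bytes packed and 60,000 bytes of address space, so the whole screen still fits below 64K even though the packed representation would not.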
Subject:
RE: Microwindows for Hercules
From: Greg Haerr ####@####.####
Date: 16 Jul 1999 22:42:53 -0000
Message-Id: <01BECFA9.B2A20DA0.greg@censoft.com>

: Actually there's semi-standard 800x600x16 support that works on most
: SVGA controllers using the same VGA16 code; that requires 800 * 600 /
: 2 = 240,000 bytes memory, but it's mapped into 800 * 600 / 8 = 60,000
: bytes memory at one bit per pixel with the VGA16 code.
:

Speaking of this, what would be *really* cool would be to add
SVGA bios support to my scr_bios driver, so that we could support
the higher than 640x480 modes....

Any volunteers?

Greg
Subject:
Re: Microwindows for Hercules
From: Alan Cox ####@####.####
Date: 18 Jul 1999 14:26:14 -0000
Message-Id: <E115rki-0006Q7-00@the-village.bc.nu>

> No need. MicroWindows handles the Bresenham algorithm in the mid
> level code in devdraw.c. It uses successive calls to drawpixel to make it work.
> In this way, people like you and me don't have to rewrite bresenham for every
> card someone wants....

The code in devdraw.c is very naive. It assumes pixel plotting is the underlying
op. On many cards line slices are the underlying operation, horizontal or
vertical. What you probably want to do is generate a series of

	draw_horizontal(x,y,l)

or
	draw_vertical(x,y,l)

calls for most things

> This might be useful when bitblt is implemented though...

Having 32K of offscreen memory is always useful.

Alan
Subject:
RE: Microwindows for Hercules
From: Greg Haerr ####@####.####
Date: 19 Jul 1999 17:10:43 -0000
Message-Id: <01BED1D6.BD213830.greg@censoft.com>

: The code in devdraw.c is very naive. It assumes pixel plotting is the underlying
: op. On many cards line slices are the underlying operation, horizontal or
: vertical. What you probably want to do is generate a series of
:
: 	draw_horizontal(x,y,l)
:
: or
: 	draw_vertical(x,y,l)
:
: calls for most things
:

That's a good idea. This would certainly speed up diagonal lines
on systems with a fast horizontal line draw. The vertical doesn't add much,
as most video planes aren't optimized for vertical line drawing. Currently,
there aren't any applications that draw diagonal lines though, so the speed
issue is moot.

: > This might be useful when bitblt is implemented though...
:
: Having 32K of offscreen memory is always useful.
:

Definitely. I plan on adding offscreen drawing memory, but it requires
some big architecture changes.

Greg
Subject:
Re: Microwindows for Hercules
From: Ben Pfaff ####@####.####
Date: 19 Jul 1999 17:28:02 -0000
Message-Id: <87g12ky1k1.fsf@pfaffben.user.msu.edu>

Greg Haerr ####@####.#### writes:

   : The code in devdraw.c is very naive. It assumes pixel plotting is the underlying
   : op. On many cards line slices are the underlying operation, horizontal or
   : vertical. What you probably want to do is generate a series of
   :
   : 	draw_horizontal(x,y,l)
   :
   : or
   : 	draw_vertical(x,y,l)
   :
   : calls for most things

   That's a good idea. This would certainly speed up diagonal lines
   on systems with a fast horizontal line draw. The vertical doesn't add much,
   as most video planes aren't optimized for vertical line drawing. Currently,
   there aren't any applications that draw diagonal lines though, so the speed
   issue is moot.

A few days ago I was considering a faster-than-Bresenham(sp?)
algorithm along the lines of what Alan was saying. I came up with two
problems, both of which would only apply to assembly-language
implementations on the 8086 through 80286:

   1. AFAICT it would require at least one division operation,
      whereas standard Bresenham doesn't need any. This
      wouldn't be a problem for long diagonal lines, just for
      short ones. Division is expensive.

   2. I can't think of a way to fit all the necessary info into
      the 8086 register set. The standard Bresenham algorithm
      fits, just barely, but it looks like an ``extended''
      algorithm that keeps track of spans would need to use
      memory as well. This is a big loss on the 8086 IIRC.

Can anyone inform me how long DIV r/m16 takes on an 8086? I seem to
have lost my cycle-timing books, or perhaps I threw them out in a fit
of optimism.

--
"Debian for hackers, Red Hat for suits, Slackware for loons."
--CmdrTaco <URL:http://slashdot.org/articles/99/03/22/0928207.shtml>
Subject:
RE: Microwindows for Hercules
From: Greg Haerr ####@####.####
Date: 19 Jul 1999 17:51:03 -0000
Message-Id: <01BED1DC.91334820.greg@censoft.com>

: A few days ago I was considering a faster-than-Bresenham(sp?)
: algorithm along the lines of what Alan was saying. I came up with two
: problems, both of which would only apply to assembly-language
: implementations on the 8086 through 80286:

What would be *really* cool would be a super-fast implementation
of VGA_drawhline() in assembly. That's something that would vastly
improve nano-X and microwindows *now*, since fillrectangle is
based on drawhline. This routine for VGA and standard memory ops
would be great.

Greg
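To make the request concrete, here is a rough C sketch of the masked horizontal-line fill that such a routine would perform on a single bit plane, with a rectangle fill built on top of it. It assumes a plain memory buffer at one bit per pixel with the most significant bit leftmost. Real VGA16 code would also program the VGA registers with OUT and do a latch-loading read before each write, none of which is shown. The names hline_1bpp and fillrect_1bpp are invented for this example and are not the Microwindows entry points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Set pixels x1..x2 (inclusive) on row y of a 1-bit-per-pixel buffer.
   pitch is the number of bytes per row. */
void hline_1bpp(uint8_t *buf, size_t pitch, int x1, int x2, int y)
{
    assert(x1 >= 0 && x1 <= x2 && y >= 0);

    uint8_t *row = buf + (size_t)y * pitch;
    size_t first = (size_t)x1 / 8;   /* byte holding the left end  */
    size_t last = (size_t)x2 / 8;    /* byte holding the right end */
    uint8_t lmask = (uint8_t)(0xFFu >> (x1 & 7));        /* x1 and rightwards */
    uint8_t rmask = (uint8_t)(0xFFu << (7 - (x2 & 7)));  /* up to and incl. x2 */

    if (first == last) {
        /* Both ends fall in the same byte: combine the two masks. */
        row[first] |= (uint8_t)(lmask & rmask);
        return;
    }
    row[first] |= lmask;                              /* partial left byte  */
    memset(row + first + 1, 0xFF, last - first - 1);  /* whole middle bytes */
    row[last] |= rmask;                               /* partial right byte */
}

/* Fill a rectangle one scanline at a time using the routine above. */
void fillrect_1bpp(uint8_t *buf, size_t pitch, int x1, int y1, int x2, int y2)
{
    for (int y = y1; y <= y2; y++)
        hline_1bpp(buf, pitch, x1, x2, y);
}
```

The point of the edge masks is that every byte in the middle of the line covers eight pixels per write, which is where the gain over per-pixel plotting comes from.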
Subject:
Re: Microwindows for Hercules
From: Alan Cox ####@####.####
Date: 19 Jul 1999 18:10:53 -0000
Message-Id: <E116Hix-0007sI-00@the-village.bc.nu>

>    1. AFAICT it would require at least one division operation,
>       whereas standard Bresenham doesn't need any. This
>       wouldn't be a problem for long diagonal lines, just for
>       short ones. Division is expensive.

No. You can do it by using Bresenham and still speed it up

>       fits, just barely, but it looks like an ``extended''
>       algorithm that keeps track of spans would need to use
>       memory as well. This is a big loss on the 8086 IIRC.

It's not a big deal

Firstly:

	if(x2-x1 > y2-y1)
		horizontal_optimised();
	else
		vertical_optimised();

Next for Bresenham you drop the plot_pixel call and instead when you
bump x (or y in vertical mode) you do

	plot_line(oldx,oldy, x,y);
	bump it
	oldx=x
	oldy=y

Saves you function calls costs you four memory accesses per line - that's
a win on everything.

> Can anyone inform me how long DIV r/m16 takes on an 8086? I seem to

"weeks" 8)
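One reading of this suggestion, sketched in C for the mostly-horizontal case only: run ordinary Bresenham, but instead of plotting every pixel, hand one horizontal span to a callback each time y steps. The function and type names (line_xmajor_spans, hspan_fn, record_span) are assumptions made for this example, not the devdraw.c interface, and the mostly-vertical case would mirror it with x and y swapped. No division is needed beyond halving dx.

```c
#include <assert.h>
#include <stdlib.h>

/* Callback that draws pixels x1..x2 (inclusive) on row y. */
typedef void (*hspan_fn)(int x1, int x2, int y, void *ctx);

/* Bresenham for lines with |dx| >= |dy|: emits one span per row. */
void line_xmajor_spans(int x0, int y0, int x1, int y1,
                       hspan_fn hspan, void *ctx)
{
    if (x0 > x1) {                 /* always walk left to right */
        int t = x0; x0 = x1; x1 = t;
        t = y0; y0 = y1; y1 = t;
    }
    int dx = x1 - x0;
    int dy = abs(y1 - y0);
    int ystep = (y1 >= y0) ? 1 : -1;
    assert(dx >= dy);

    int err = dx / 2;
    int oldx = x0;                 /* left end of the span in progress */
    int y = y0;
    for (int x = x0; x <= x1; x++) {
        err -= dy;
        if (err < 0 && x < x1) {   /* y steps after this pixel */
            hspan(oldx, x, y, ctx);
            oldx = x + 1;
            y += ystep;
            err += dx;
        }
    }
    hspan(oldx, x1, y, ctx);       /* final span */
}

/* Small recorder so the spans can be inspected. */
struct span { int x1, x2, y; };
struct span_log { struct span spans[16]; int count; };

void record_span(int x1, int x2, int y, void *ctx)
{
    struct span_log *rec = ctx;
    if (rec->count < 16)
        rec->spans[rec->count++] = (struct span){ x1, x2, y };
}
```

With these definitions, a line from (0,0) to (7,2) produces three callback invocations instead of eight pixel plots, so the saving grows with how flat the line is.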
Subject:
Re: Microwindows for Hercules
From: Ben Pfaff ####@####.####
Date: 19 Jul 1999 18:22:50 -0000
Message-Id: <87btd8xz1w.fsf@pfaffben.user.msu.edu>

Alan Cox ####@####.#### writes:

   >    1. AFAICT it would require at least one division operation,
   >       whereas standard Bresenham doesn't need any. This
   >       wouldn't be a problem for long diagonal lines, just for
   >       short ones. Division is expensive.

   No. You can do it by using Bresenham and still speed it up

   >       fits, just barely, but it looks like an ``extended''
   >       algorithm that keeps track of spans would need to use
   >       memory as well. This is a big loss on the 8086 IIRC.

   It's not a big deal

   Firstly:

   	if(x2-x1 > y2-y1)
   		horizontal_optimised();
   	else
   		vertical_optimised();

Well, yes, obviously.

   Next for Bresenham you drop the plot_pixel call and instead when you
   bump x (or y in vertical mode) you do

   	plot_line(oldx,oldy, x,y);
   	bump it
   	oldx=x
   	oldy=y

   Saves you function calls costs you four memory accesses per line - that's
   a win on everything.

Okay that's one way to look at it. The routine that I was looking at
is in Wilton's _Programmer's Guide to PC and PS/2 Video Systems_. He
has a routine that does one pixel per ten or so CPU instructions on
VGA16. Getting that fast is easy; I was looking to do even better
than that using clever things with bit masks to write multiple pixels
at once.

You're looking at optimizing at the generic level with calls to a
hardware-specific routine; I was thinking about optimizing an already
fast x86 asm routine. Oh well.

--
"Unix... is not so much a product
as it is a painstakingly compiled oral history
of the hacker subculture."
--Neal Stephenson
Subject:
RE: Microwindows for Hercules
From: Greg Haerr ####@####.####
Date: 19 Jul 1999 19:21:18 -0000
Message-Id: <01BED1E9.24C8FF10.greg@censoft.com>

: Next for Bresenham you drop the plot_pixel call and instead when you
: bump x (or y in vertical mode) you do plot_line(oldx,oldy, x,y); bump it
: oldx=x oldy=y
:
: Saves you function calls costs you four memory accesses per line - that's
: a win on everything.
:

I agree. Easy and simple, saves function calls and costs two subtracts and
two stores.
Subject:
RE: Microwindows for Hercules
From: Greg Haerr ####@####.####
Date: 19 Jul 1999 19:25:16 -0000
Message-Id: <01BED1E9.B0376C80.greg@censoft.com>

: Okay that's one way to look at it. The routine that I was looking at
: is in Wilton's _Programmer's Guide to PC and PS/2 Video Systems_. He
: has a routine that does one pixel per ten or so CPU instructions on
: VGA16. Getting that fast is easy; I was looking to do even better
: than that using clever things with bit masks to write multiple pixels
: at once.

It'd be cool to optimize that. My asmplan4.s replacement high-speed
driver for vgaplan4.c uses Wilton's code as a base. Feel free to test and
enhance that code.

:
: You're looking at optimizing at the generic level with calls to a
: hardware-specific routine; I was thinking about optimizing an already
: fast x86 asm routine. Oh well.
:

Currently, there's not a direct entry point for the line draw, it's
commented out. Only if the entire line draw is unclipped will a low-level
routine be called anyways, but you could test by uncommenting that code
in GdLine in devdraw.c and calling it outside the driver interface.

Greg