nanogui: nasty client/server bug fix


Previous by date: 3 Dec 2000 15:41:22 -0000 TrueType fonts dir, Ado Nishimura
Next by date: 3 Dec 2000 15:41:22 -0000 Re: Reading and Displaying the Bitmap at Run Time, Greg Haerr
Previous in thread: 3 Dec 2000 15:41:22 -0000 nasty client/server bug fix, Greg Haerr
Next in thread:

Subject: Re: nasty client/server bug fix
From: Morten Rolland ####@####.####
Date: 3 Dec 2000 15:41:22 -0000
Message-Id: <3A2A6ACA.AD3F11D1@screenmedia.no>

Hello Greg,

> Morten,
>     I've finally solved the rare bug that you talked about
> previously between Nano-X clients and the server.

Oh.  Does that mean you didn't get my mail about this very
same problem, dated Tue 10 Oct 2000 13:22:07 +0200, with the
subject "Hello"?

Thing is, we fixed it back then, and one hell of a debugging
effort it was too, as you found out yourself.  We found the
problem to be partly what you describe, and also closely bound
to shared memory operation.

We also had to introduce a "NOP" protocol operation to be used
to make the semantics around GrPrepareSelect/GrServiceSelect
work in the case when a stored event was received just prior to
going to sleep - the NOP reply is used to make the select
"wake up" so that the GrServiceSelect function can call the
event handler with the stored event without (much) delay.
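
As a rough illustration of the wake-up idea (not the actual Nano-X code),
the sketch below uses a socketpair as a stand-in for the client/server
connection.  The names prepare_select, fake_server_step, wait_readable,
demo_nop_wakeup and the REQ_NOP/REPLY_NOP codes are all hypothetical; the
only point is that a pending NOP reply makes select() return at once, so a
locally stored event does not sit unnoticed while the client sleeps.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

enum { REQ_NOP = 1, REPLY_NOP = 1 };    /* hypothetical wire codes */

/* Hypothetical client state: is an event already stored locally? */
static int have_stored_event;

/* Called before the application sleeps in select(): if an event is
 * already stored, ask the server for a NOP reply so that the socket
 * becomes readable.  Returns 1 if a NOP was sent, 0 if not, -1 on error. */
static int prepare_select(int server_fd)
{
    unsigned char req = REQ_NOP;

    if (!have_stored_event)
        return 0;
    return write(server_fd, &req, 1) == 1 ? 1 : -1;
}

/* Stand-in for the server: answer one NOP request with a NOP reply. */
static int fake_server_step(int client_fd)
{
    unsigned char byte;

    if (read(client_fd, &byte, 1) != 1 || byte != REQ_NOP)
        return -1;
    byte = REPLY_NOP;
    return write(client_fd, &byte, 1) == 1 ? 0 : -1;
}

/* Returns 1 if server_fd becomes readable within timeout_ms, else 0. */
static int wait_readable(int server_fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(server_fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(server_fd + 1, &rfds, NULL, NULL, &tv) > 0;
}

/* Returns 1 if select() woke up, 0 if it timed out, -1 on setup error. */
static int demo_nop_wakeup(int stored)
{
    int sv[2], woke;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    have_stored_event = stored;
    if (prepare_select(sv[0]) == 1)
        fake_server_step(sv[1]);
    woke = wait_readable(sv[0], 50);
    close(sv[0]);
    close(sv[1]);
    return woke;
}
```

With a stored event, demo_nop_wakeup(1) sees the socket become readable
right away; without one, demo_nop_wakeup(0) simply times out, which is
the normal "nothing to do" case.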

> This
> manifested itself when the server posted a GsError, for
> instance, and clients would get out of sync, as well as
> when folks used GrGetNextEventTimeout, or GrPrepareSelect.

We have not looked much at GsError, but we predicted that a
timed out GrGetNextEvent would cause the same problem that
we experienced, yes.

> The reason for this is actually quite complicated, but a simple
> explanation is that if a GetNextEvent request was sent, but
> timed out before getting a response, then the server, if
> multiple events were queued and the system was busy,
> would write more than one event on the wire.

Yes, this sounds right.  Also, the sending of an async
"GetNextEvent" by the server may be interpreted as
"shared memory command execution completed" by the client,
causing the shared memory to be reused before it was acted
upon by the server, causing a lot of troubles...

With our fix, this is a typical function in client.c that needs
to return information from the server:

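void
GrCheckNextEvent(GR_EVENT *ep)
{
        /* If an event is already stored, return it without asking the server. */
        if ( nxGetStoredEvent(ep) )
                return;

        /* Queue the request and make sure it reaches the server now. */
        AllocReq(CheckNextEvent);
        nxFlushWait();

        /* Read the reply into *ep, then finish the flush. */
        nxSocketReadTyped(GrNumGetNextEvent, ep, sizeof(*ep));
        nxFlushFinish();
}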

Now, the nxFlushWait() indicates that this operation needs to be
sent to the server asap so the reply can and will be received.  This
function does one of two things: it flushes the buffer by write()ing
it to the socket, or it sends a command to execute the shared memory
segment.

The nxFlushFinish() cleans up by possibly waiting until a reply to the
flush (that may have been sent) is received.
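
To make the role of a typed read more concrete, here is a minimal sketch
under stated assumptions; it is not the nxproto.c code.  It assumes every
message carries a type tag, and that an event arriving while the client
waits for a reply is stored for later rather than mistaken for the reply.
The struct layouts, the in-memory "wire" and the names read_typed,
store_event, get_stored_event and demo_typed_read are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MSG_EVENT = 1, MSG_REPLY = 2 };  /* hypothetical type tags */

struct msg { int type; int payload; };

/* Hypothetical stand-in for the socket: messages the server already wrote. */
struct wire { const struct msg *msgs; size_t count; size_t next; };

/* Small FIFO of events that arrived while waiting for something else. */
#define STORE_MAX 8
static struct msg store[STORE_MAX];
static size_t store_head, store_len;

/* Append an event to the store; silently dropped if the store is full. */
static void store_event(const struct msg *m)
{
    if (store_len < STORE_MAX) {
        store[(store_head + store_len) % STORE_MAX] = *m;
        store_len++;
    }
}

/* Pop the oldest stored event into *out.  Returns 1 if there was one. */
static int get_stored_event(struct msg *out)
{
    if (store_len == 0)
        return 0;
    *out = store[store_head];
    store_head = (store_head + 1) % STORE_MAX;
    store_len--;
    return 1;
}

/* Read until a message of the wanted type arrives, storing any events
 * seen on the way.  Returns 1 on success, 0 if the wire ran dry. */
static int read_typed(struct wire *w, int wanted, struct msg *out)
{
    while (w->next < w->count) {
        const struct msg *m = &w->msgs[w->next++];

        if (m->type == wanted) {
            *out = *m;
            return 1;
        }
        if (m->type == MSG_EVENT)
            store_event(m);
    }
    return 0;
}

/* The server wrote a stray event before the reply; both must survive. */
static int demo_typed_read(int *reply_payload, int *event_payload)
{
    static const struct msg sent[] = { { MSG_EVENT, 7 }, { MSG_REPLY, 42 } };
    struct wire w = { sent, 2, 0 };
    struct msg reply, ev;

    if (!read_typed(&w, MSG_REPLY, &reply))
        return 0;
    if (!get_stored_event(&ev))
        return 0;
    *reply_payload = reply.payload;
    *event_payload = ev.payload;
    return 1;
}
```

In this sketch the reply is never confused with the extra event, and the
event is still available to a later call that checks the store first, in
the same way GrCheckNextEvent calls nxGetStoredEvent before contacting
the server.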

Here are two others:

void 
GrFlush(void)
{
        nxFlushAuto();
        nxFlushFinish();
}

void
GrSync(void)
{
        nxFlushWait();
        nxFlushFinish();
}


GrFlush will ship all queued commands to the Nano-X server, while
the GrSync function will wait until the Nano-X server has executed
the queued commands before continuing.  The nxFlush* commands will
do one thing if shared memory is used and another if it is not.
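
The difference between the two calls can be sketched with a simulated
connection.  This is a toy model only: struct conn, ship, flush_auto,
flush_wait, flush_finish and demo_executed_after are hypothetical names,
and the rule that a shared memory flush always waits for a reply is an
assumption drawn from the reuse problem described earlier, not a
statement about the real nxFlush* functions.

```c
#include <assert.h>

struct conn {
    int use_shm;        /* 1: commands live in a shared memory segment */
    int queued;         /* commands buffered by the client */
    int in_flight;      /* handed to the server, not yet executed */
    int executed;       /* executed by the (simulated) server */
    int reply_pending;  /* a flush reply must be read before continuing */
};

/* Hand all queued commands to the server.  Over a socket this would be a
 * write() of the buffer; with shared memory, a short "execute the
 * segment" command.  Both paths share the same bookkeeping here. */
static void ship(struct conn *c, int want_reply)
{
    if (c->queued == 0 && !want_reply)
        return;
    c->in_flight += c->queued;
    c->queued = 0;
    /* Assumption: the segment must not be reused until the server is
     * done with it, so a shared memory flush always waits for a reply. */
    if (c->use_shm || want_reply)
        c->reply_pending = 1;
}

static void flush_auto(struct conn *c) { ship(c, 0); }
static void flush_wait(struct conn *c) { ship(c, 1); }

/* Wait for the flush reply, if one is due.  The simulated server runs
 * everything in flight before it replies. */
static void flush_finish(struct conn *c)
{
    if (!c->reply_pending)
        return;
    c->executed += c->in_flight;
    c->in_flight = 0;
    c->reply_pending = 0;
}

/* Number of commands known to be executed when the call returns. */
static int demo_executed_after(int use_shm, int sync, int queued)
{
    struct conn c = { 0, 0, 0, 0, 0 };

    c.use_shm = use_shm;
    c.queued = queued;
    if (sync)
        flush_wait(&c);     /* GrSync-like */
    else
        flush_auto(&c);     /* GrFlush-like */
    flush_finish(&c);
    return c.executed;
}
```

Over the simulated socket, a GrFlush-like call returns with the commands
shipped but not necessarily executed, while a GrSync-like call returns
only after they have run.  In the shared memory case of this model both
calls end up waiting.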

As I wrote in the aforementioned mail, I will have to work a little
bit to produce a patch for the latest releases, as we use 0.88pre3 as
a base for our changes, but I would do it if you want it.

The patch will contain a rewrite of nxproto.c that makes it a bit
more shallow and easier to follow imho.  Or I can send you the entire
thing that we are working on now if you want for inspection and ideas.

Regards,
Morten Rolland, Screen Media
