
A Discussion about Synchronization

 
axsoft
Moderator


Posts: 681
Replies: 1056
Points: 969
Registered: 2002-03-13

#1  Posted: 2002-09-23 08:55:32

A Discussion about Synchronization

From "The Bits..." the C Builder Information & Tutorial Site http://www.thebits.org ?The Bits..."2001. ?TheBits.org. All rights reserved This article was originally printed by UK Borland User Group magazine, www.ukbug.co.uk. In a break from code, (my brain hurts at the time of writing) Im going to look at something thats annoyed me recently, the reporting of some bugs?in TMultiReadExclusiveWriteSynchronizer. These reports are hitting thebits.org mailing lists, the Borland newsgroups, and a couple have even made it to the excellent The Delphi Bug List. They show, to me at least, a misunderstanding about what should be happening with synchronization. The first goes something like this. Thread A acquires a Read Lock. Okay, then Thread B tries to acquire a Write Lock, which it cant because Thread A is currently holding a Read Lock. Fair enough. Now, if Thread A tries to acquire another Read Lock, it cant because Thread B has it blocked for writing, and ... you have a deadlock. Is this a bug. Nope! Simple. To clarify, if Thread C attempted to acquire the second Read Lock instead of Thread A, then there wouldnt be a deadlock, it is the fact that Thread A attempts to obtain the lock that is causing the problem. The reason this isnt a bug, more bad practice was shown when someone said that the following code is perfectly legal,
    ThreadA->BeginRead();
    ...
    ThreadA->BeginRead();
    ...
    ThreadA->EndRead();
    ...
    ThreadA->EndRead();
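If the second stretch of work needs nothing more than the read access already held, the cure is simply not to ask again: take the lock once, do everything that needs it, and release it once. A minimal sketch of that shape, borrowing the MySynch name the article gives the shared TMultiReadExclusiveWriteSynchronizer further down (rather than calling through the thread object):

    MySynch->BeginRead();     // one request for read access
    // ... first piece of work that reads the shared data ...
    // ... second piece of work; the permission is already held ...
    MySynch->EndRead();       // one matching release, and the track is clear again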
The nested snippet is legal in the sense that it will compile, but totally illegal in logic. Again, to clarify: if you've got the permission you require, then why do you need it again? The problem with that code is that it shows the author has not properly understood the concept of synchronization. If you acquire a lock of any kind, you must assume that someone else is also trying to acquire a lock with the same, or indeed different, access rights. If, as in the case of the "bug", you forget this, then you are asking for a deadlock to occur.

Promoting a Lock?

This reported bug made me squirm even more. Again, whilst it may be "legal" in coding terms, it does not mean the code is correct in logic:
    Thread->BeginRead();
    ...
    Thread->BeginWrite();
    ...
    Thread->EndWrite();
    ...
    Thread->EndRead();
The comment that went with the above states that if two threads run this code concurrently then a deadlock will occur. I'll be honest: if one thread runs the above code and a deadlock didn't occur, then that, in my opinion, is a bug!

The Principle

In both the above cases a simple principle of concurrent processing has been ignored: that of the single-track logical approach. If you think of a single railway line with multiple trains trundling up and down it, then you should never have any trouble with threading. The authors of the above "bugs" have forgotten this. When you enter a synchronization lock, whether it be a TCriticalSection, TMultiReadExclusiveWriteSynchronizer or any other of the available Windows API synchronization objects (discussed by Steve Scott in the Jul/Aug 99 and Sep/Oct 99 issues), you must remember this single-line principle.

When a train enters a single piece of track (assuming constants in speed, length, reliability and so on) it is possible to send another train in the same direction down the track without worrying too much. This, in effect, is the BeginRead of TMultiReadExclusiveWriteSynchronizer: you're going into the synchronizer one way. However, when you want to send a train halfway down the track, then turn it round and send it back the way it came (let's call this a WriteLock), you can't have anything else on the track, in either direction, and remain safe.

Now apply this logic back to the first case. Thread A has entered the track in one direction and, quite correctly, Thread B, which needs exclusivity, is waiting. Now Thread A wants to follow itself down the track. Hang on: science fiction, time travel and holes in the space-time continuum being ignored for a moment, that's not physically possible. Thread A can't re-enter the track at least until it's come out the other end. This is true even for multiple instances of the same object. (Sci-Fi lives!) We can of course have multiple instances of Thread A entering the track in one direction, but the same principle still applies: whilst each instance itself can enter the section of line, that instance can't physically enter again until it has physically left the other end.

This same principle puts the second "bug" into perspective too. A train entering a section of track can, in theory at least, acquire the right to turn round halfway down. In logic, however, it would be a stupid thing to do. You have a little man, with very long arms, able to reach either end of the section of track you're protecting. A train enters the track and asks the little man for permission to go one way. He gives it permission and a token with the appropriate rights of passage. Another train comes along and asks to travel the same way you've gone; he checks and says okay, and off it comes after you. Now then, the logical part. Halfway down the section, the first train decides he wants to turn round. How can he? He's not at either end, so the little man can't give him permission until the train reaches the end of the section. It doesn't matter whether there's another train following you in or not, he simply can't give you permission until you come out; it's against the rules. (Which is probably true: it must be laid down in the rules and regulations of single-line railway working somewhere too, or there'd have been an awful lot more accidents than there have been already!)
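In railway terms, a train that knows it will need to turn round should ask for exclusive use of the track before it enters it at all. A minimal sketch of what the promotion example was arguably trying to do, again borrowing the MySynch synchronizer named below and taking the write lock up front:

    MySynch->BeginWrite();    // ask for exclusive access before entering the track
    // ... read the shared data and modify it under the one write lock ...
    MySynch->EndWrite();      // leave the track; readers may now enter again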
Bug Or Not

These two examples show a misunderstanding of the principles involved. Whilst theory may account for many things, in the real world you have to deal with reality. If you're trying to implement either of the above "bugs" then you are at fault, not the compiler for letting you do it (though it should maybe flag the situation), as you're trying to achieve the impossible; like reading a NULL pointer address, it's plain wrong. If you think of these things in terms of the simple single-track railway principle you'll see that deadlocks are easily avoided; if you try and do something clever, you'll come unstuck. In very simple terms, if you acquire a lock in a thread, make sure you release it before you try and acquire it again in the same thread. This is more pertinent in the second example: there the author should have acquired a write lock in the first place, not tried to promote the read lock he already had.

More importantly, think your code through carefully. In the following logic the programmer hasn't really made the best of the synchronizer:

    get read lock
    read counter from Form1
    for(int x = 0; x < counter; x++)
        do something
    release read lock

This would be far better in logic as:

    get read lock
    read counter from Form1 into LocalCounter
    release read lock
    for(int x = 0; x < LocalCounter; x++)
        do something

thus releasing the lock before tying it up in a potentially lengthy process. In a similar manner, the following:

    get write lock
    read counter from Form1
    do lots and lots with Counter
    write counter back to Form1
    release write lock

is far better written as:

    get read lock
    read counter from Form1 into LocalCounter
    release read lock
    do lots and lots with LocalCounter
    get write lock
    write counter back to Form1
    release write lock

The only time that the first method is preferable is when you actually wish to protect the global value of Counter from other threads whilst you're working on it, something you should try to structure your threads to avoid.

The Classic Deadlock

However, it is easy to make a mistake. The classic deadlock comes from the following code, assuming MySynch is a TMultiReadExclusiveWriteSynchronizer and ZedVariable is an application-global variable. You have a function:
int __fastcall MyThread::IncrementZ(void)
{
int NewValue = 0;
MySynch->BeginWrite();
try
        {//protect shared memory whilst we modify it
        Form1->ZedVariable++;
        NewValue = Form1->ZedVariable;
        }
catch(...)
        {//swallow the exception so the EndWrite below always runs
        }
MySynch->EndWrite();
return NewValue;    //hand the new value back to the caller
}

and then elsewhere in your thread you do:

int a;
MySynch->BeginRead();
try
        {//get a from the main form
        a = Form1->SpareVariable;
        a = a + IncrementZ();   //deadlock: IncrementZ wants the write lock,
                                //but this thread is still holding a read lock
        }
catch(...)
        {
        }
MySynch->EndRead();
...
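One way out, in the spirit of the advice above, is to finish with the read lock before asking for anything else: copy what you need out of the form under the read lock, release it, and only then call the function that takes the write lock. A sketch of that rearrangement, keeping the article's names (and dropping the try blocks for brevity):

    int a;
    MySynch->BeginRead();
    a = Form1->SpareVariable;    //only the read happens under the read lock
    MySynch->EndRead();          //leave the track before asking to come back down it
    a = a + IncrementZ();        //IncrementZ can now take its write lock safely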
Trivial as the deadlocking example is, it shows the problem, and it is also a very common mistake, especially on multi-programmer projects. When you call the function, you've already entered the piece of railway line you're trying to protect. Once in the function, well, you see the problem: you'll never achieve the write lock you want. If you want to study thread-deadlocking principles properly, take a look at The Edinburgh Concurrency Workbench (http://www.dcs.ed.ac.uk/home/cwb/index.html), a wonderful tool for spotting deadlocks in your logic and code. If you're really desperate you can look up the Calculus of Communicating Systems (CCS), a subject I did at university. At the time, although I always came top of the group, I didn't understand a word or equation of it (and I still don't 8o), but it has obviously stood me in good stead when working with threads.

And Event Synchronization is not a simple matter either!

Finally, whilst I've discussed so far true synchronization between multiple thread instances, we have to remember also that Windows itself throws some wonderful problems our way. Windows is, and thus our applications are, based on the event model of processing. What this means is that multiple events of the same type can occur, probably before you want them to. This was brought home recently when someone was reporting some weird results in their code; again, it was a "bug". At first glance you can see why. To simplify greatly (Flag is a private Boolean variable and Counter a private integer of the form, set to false and 0 respectively in the OnCreate event):
void __fastcall TForm1::Button1Click(TObject *Sender)
{
Flag = !Flag;
Memo2->Lines->Add("Loop starting");    for(int x = 0; x < 1000; x  )
   {
    if(Flag)
        Counter  ;
    else
        Counter--;
        Memo1->Lines->Add(Counter);
   }
Memo2->Lines->Add("Loop Finished");
}
For simplicity, create a new project with two TMemos and a TButton, with the above in the button's OnClick. Run the application and hit the button as many times as you like; when it has all finished, Memo1 should report either 0 or 1000. This works because C++ Builder holds the events in the message queue as they occur. The "bug" took some tracking down because the event he was using was much larger. Now add the following just before the call to Memo1->Lines->Add(Counter):

    Application->ProcessMessages();

and run the application again. Hit the button multiple times in quick succession, and I'll be very surprised if you end up with 0 or 1000 this time. Why?

Remembering the Wrong Things

Well, when we enter an event, our single thread runs sequentially through it. However, we've stacked up some more events by repeatedly hitting the button. This isn't a problem whilst we leave the message queue alone, but when we add the call to ProcessMessages, events fire off: in effect the newest event becomes the active one, with the others, in effect, stacked up behind it. To explain: we enter the event for the first time, and BCB gives us a local instance of the event. When we jump out of the event with ProcessMessages, BCB pauses the local event and takes a snapshot of where we are. Somewhere in the loop, Flag will be true, Counter will be something depending on how fast we are with our button, and the next instance of the event will fire. In this second occurrence we check the global Flag; it's true, so we set the flag to false, start our loop, and, taking the value of Counter as whatever it was in the first occurrence of the event when we called ProcessMessages, we decrement by 1000. Well, that's okay, but what happens when we get back to the first occurrence of our event? We return to the point where we left; Flag is now false, Counter has been changed 1000 down by the second event, and things are not well at all. As we now return somewhere within the loop, we never reset Flag, so when we read it, it's false; we now decrement by the remainder of our counter, and so on. So we end up with a negative number. If you fire the event three times, the flag will be true in the third instance of the event, and thus true for the remainder of the other two, so you get a positive number. Even number of events, negative result; odd number of events, positive result. Whilst it is true that the negative/positive results are consistent, the number returned is inconsistent. This is because what is going on is affected somewhat by the time it takes the second event to actually fire (that is, the first event won't pause immediately the ProcessMessages call is made, only when the second event fires). Is this a bug? I haven't a clue. I mentioned earlier that my brain hurts, and there's no way I can work out what the compiler should do when a programme is being deliberately awkward!

Correcting the Snapshot

There's more than one solution to this, ranging from the simple to the exotic. Move the ProcessMessages call outside the loop, or write a thread, or even invoke the nulling of the Button OnClick handler I described in an earlier article (Nov/Dec 98); but there is a simpler, more elegant solution. Change the function so it looks like this:
void __fastcall TForm1::Button1Click(TObject *Sender)
{
bool MyFlag;
Flag = !Flag;
MyFlag = Flag;

Memo2->Lines->Add("Loop starting");
for(int x = 0; x < 1000; x++)
   {
   if(MyFlag)
       Counter++;
   else
       Counter--;
   Application->ProcessMessages();
   Memo1->Lines->Add(Counter);
   }
Memo2->Lines->Add("Loop Finished");
Flag = !MyFlag;        //put it back to the way it should be
}
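For comparison, the first of the simpler escapes mentioned above, moving the ProcessMessages call out of the loop, would look roughly like this; each click then runs its whole loop against a stable Flag before any queued clicks get their turn (a sketch only, leaving the original handler otherwise unchanged):

void __fastcall TForm1::Button1Click(TObject *Sender)
{
Flag = !Flag;
Memo2->Lines->Add("Loop starting");
for(int x = 0; x < 1000; x++)
   {
   if(Flag)
       Counter++;
   else
       Counter--;
   Memo1->Lines->Add(Counter);
   }
Application->ProcessMessages();    //queued clicks run only after this loop is done
Memo2->Lines->Add("Loop Finished");
}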
The MyFlag version, though, is the neater fix. The variable MyFlag is now protected between calls to the function: when BCB takes a snapshot of our event for future use, it records a local instance of the variable MyFlag, and, importantly, each instance of the event gets its own instance of MyFlag. Without this local variable, BCB can't take a snapshot of the global variables concerned. You could, if you wished, protect Counter in a similar way, though the result would be the same. As with all synchronization issues, the more complex the problem, the simpler the solution.

Alliance ---- Visita site: http://www.vista.org.tw
---[ Please search the old posts before asking a question ]---
Posted by axsoft on 2002/09/23 08:59:56