Client response times + orphans

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2273 - Posted: 5 Jan 2006, 11:45:32 UTC
Last modified: 5 Jan 2006, 11:53:12 UTC

If you are interested in what is currently going on with client development, you can read the technical details below. Basically, it is about improving the response times between the client application and the BOINC core client, so that the core client does not orphan the Blender application (or, as has happened a few times, let the Blender application write its debug information into another project's work slot).

The current response times range from 0.001 to 120 seconds, with an average of around 20 seconds. This is because the controlling application can only handle BOINC messages (pause, shutdown, etc.) each time it gets a progress update from the controlled application (Blender). Blender only sends a progress update once it has rendered an entire line, which can take a long time if the image to be rendered is very complex.

To improve the response times, the idea is to continue execution in the controller even when there are no progress updates. That way the message-parsing part of the controller can rely on something other than the progress updates to run (for instance, a timer). In the upcoming tests this timer will be set to 0.6 seconds, which should cap the response times at 0.6 seconds. Lower response times are still possible, depending on when the last progress update arrived and when the timer last ticked.
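
As a rough illustration of the timer idea above, here is a minimal Python sketch (not the actual controller code). It assumes progress updates and BOINC messages arrive on two in-process queues; the function name controller_loop, the queues, and the message names "suspend" and "quit" are placeholders made up for this example. Waiting on the progress queue with a timeout plays the role of the 0.6 second timer, so pending messages are checked at least once per interval even while Blender is silent.

```python
import queue

POLL_INTERVAL = 0.6  # seconds; the timer value mentioned in the post


def controller_loop(progress_q, message_q, poll_interval=POLL_INTERVAL):
    """Handle messages on every wake-up, whether caused by progress or by the timer.

    Returns the ordered list of events seen, ending with the "quit" message.
    All names here are hypothetical stand-ins, not part of BOINC or BURP.
    """
    log = []
    while True:
        try:
            # Wake up as soon as a progress update arrives...
            line = progress_q.get(timeout=poll_interval)
            log.append(("progress", line))
        except queue.Empty:
            # ...or after poll_interval seconds without one (the timer tick).
            log.append(("tick", None))

        # Drain every message that is waiting, regardless of why we woke up.
        while True:
            try:
                msg = message_q.get_nowait()
            except queue.Empty:
                break
            log.append(("message", msg))
            if msg == "quit":
                return log
```

With this structure the worst-case delay before a message is seen is roughly one poll interval, instead of the time Blender needs to render a full line.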

Furthermore, a single condition was found that could kill the controller but leave the controlled application (Blender) running. This is considered a serious bug, since the Blender instance will run to completion before exiting, possibly exceeding the memory limit set in the BOINC preferences (because other projects can be loaded alongside it). The fix for this will not be included in the first couple of tests, though, since it is important not to vary too many factors at the same time. The condition only occurs when the built-in error preventer (called the terminator) kills the application. The terminator is designed to shut down everything if the controller loses contact with the controlled application for an extended amount of time, but in fact it only shuts down the controller. Judging from the debug output returned in previous sessions, this happens very rarely.
...So you could say that the error preventer caused an error ;)
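
For readers who want to picture the intended terminator behaviour, here is a minimal Python sketch under stated assumptions; it is not the real terminator. The function names terminate_all and check_contact, the grace period, and the idea of tracking a "last contact" timestamp are all invented for illustration. The point it shows is the ordering: the controlled process is stopped first, so it cannot be left running after the controller gives up.

```python
import subprocess
import time


def terminate_all(child, grace=5.0):
    """Stop the controlled process so it cannot outlive the controller.

    `child` is a subprocess.Popen object; `grace` is a hypothetical number
    of seconds to wait for a clean exit before forcing one.
    """
    if child.poll() is None:          # still running
        child.terminate()             # ask it to stop
        try:
            child.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            child.kill()              # force it if it ignores the request
            child.wait()


def check_contact(child, last_contact, max_silence):
    """Return True (after stopping the child) if contact was lost for too long.

    `last_contact` is a time.monotonic() timestamp of the last sign of life.
    """
    if time.monotonic() - last_contact > max_silence:
        terminate_all(child)
        return True
    return False
```

A controller built this way would call check_contact periodically and exit itself only after it returns True, so both processes go down together.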

The information needed to find the above bugs was all gathered during the most recent sessions (137-139) as well as some older sessions (131-132, 135). Many thanks go to the people who participated in those sessions and to the mirror owners who made it possible to run the tests at such a large scale.
Profile Raimund Barbeln
Project donor
Joined: 14 Mar 05
Posts: 73
Credit: 25,881
RAC: 1
Message 2286 - Posted: 9 Jan 2006, 9:01:28 UTC

Thanks for the update Janus.
You are doing a great job.

So, when do you think those new tests will be run?

when life gives you lemons, make lemonade!
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2287 - Posted: 9 Jan 2006, 12:26:32 UTC - in response to Message 2286.  
Last modified: 9 Jan 2006, 12:31:16 UTC

The first internal tests didn't show good results. The Blender application got killed far too often. I need to look into this before releasing a series of public tests, as it points to a so-far unknown issue with the terminator.

I'm afraid the public tests will have to be rescheduled to the next test timeslot (approx. Jan 25) or even further, to the first February timeslot. January is a very busy month for me, so development is a bit slow. Hopefully it will be possible to get back up to speed at the beginning of February.

The first version that will reach public testing is the Windows one, shortly followed by the Linux one.

This doesn't change the previous schedule much. The only difference is that the client tests that were supposed to be released publicly over the last couple of days were kept in-house due to too many errors.
Profile Seventh Serenity
Project donor
Joined: 16 Apr 05
Posts: 4
Credit: 3,574
RAC: 0
Message 2297 - Posted: 21 Jan 2006, 17:46:25 UTC
Last modified: 21 Jan 2006, 17:52:05 UTC

Thanks for all the hard work. Since I know the donations have been put to use (unlike PlanetQuest, where I sent $40 and they have now been quiet for months), I've donated a couple of Euros and will hopefully donate more in the coming weeks. I've also re-attached my systems (3 of them), 2 of which are to be upgraded to much faster CPUs in a couple of weeks.

Also, I even set the project homepage as my browser's homepage since I'm so interested in this project (I'm a PSP & PC gamer, so animations/models etc. grab my attention easily =D ).
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2303 - Posted: 26 Jan 2006, 22:08:14 UTC

Today the client went through 31 recode and evaluation cycles. It is currently running another evaluation cycle and things are looking really good so far.
There were a few surprises as well - more about that later.

As a consequence of some portability issues, the client has been split from 2 into 4 different binaries, 2 of which are tiny third-party programs (zip and unzip). This will hopefully end the constant battle with library inclusions, overlaps, errors etc. that has taken place at each previous public release.

Also, the Windows client is now compiled in Cygwin, and a GPL'ed Cygwin DLL is sent along with the binary to all users.
In other words, you should expect more files to be downloaded than usual in the next test, although each of them will probably be a bit smaller than the previous client. Here's the expected content of the next release:
cygwin1.dll - the cygwin library
x.zip - Blender release archive
x.exe - Controller
zip - Archival program used to compress the output before returning it
unzip - Archival program used to uncompress the input files and executables sent from the server
in.zip - An input file archive containing a part to be rendered
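
To make the roles of the two archive tools a little more concrete, here is a small Python sketch of the unpack-then-pack flow. It is only an analogy: the real client ships separate zip and unzip binaries, whereas this example uses Python's standard zipfile module as a stand-in, and the function names unpack_input and pack_output are made up for the example.

```python
import zipfile
from pathlib import Path


def unpack_input(archive, workdir):
    """Extract an input archive (the role of unzip) and list what arrived."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(workdir)
    return sorted(p.name for p in Path(workdir).iterdir())


def pack_output(files, archive):
    """Compress result files into one archive (the role of zip) before upload."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file under its bare name, without directory parts.
            zf.write(f, arcname=Path(f).name)
```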

Almost all goals for this release have been achieved now, so expect a public test to pop up either tomorrow or, if the current test window is missed, in early February.
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2318 - Posted: 27 Jan 2006, 11:40:26 UTC
Last modified: 27 Jan 2006, 12:24:56 UTC

Been through 4 more cycles now. A small problem persists: it doesn't respond at all...
This seems related to the use of a multimedia timing driver. There are a few potential solutions to this; one will be tested in version 4.24 shortly.

[Edit:] 4.24 seems to work better. It will have to run for another 1.5 hours before it is clear whether it works flawlessly. [/Edit]
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2417 - Posted: 19 Feb 2006, 12:07:59 UTC

Been through yet another load of test cycles now. Some of the problems were related to a strange (and difficult to debug) platform issue when compiling Windows programs in Cygwin.
Version 4.30 has the new code and is currently being tested.
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2501 - Posted: 9 Mar 2006, 17:03:17 UTC - in response to Message 2417.  

After yet another bunch of cycles it seems the core of this issue has finally been found by one of the participants. Version 4.34 carries a new version of the unzip program as well as updated code.
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4570
Credit: 2,100,463
RAC: 8
Message 2503 - Posted: 10 Mar 2006, 6:44:40 UTC - in response to Message 2501.  

Yup, version 4.34 seems to be the lucky winner of this client development round. So far it has a much lower error rate than the previous versions (which were at about 60%; this one is more like 5%, and the errors do not seem to be related to BURP at all).
Profile Steve Cressman
Joined: 27 Mar 05
Posts: 142
Credit: 3,243
RAC: 0
Message 2523 - Posted: 10 Mar 2006, 16:06:42 UTC - in response to Message 2503.  
Last modified: 10 Mar 2006, 16:27:25 UTC

Yes, it is working quite well, except for those using BOINC v5.3.xx. But that is what you get when you don't use the recommended version, lol. Well, someone has to test them. Those testing BOINC v5.3.xx, don't forget to report on the BOINC site that it has problems with BURP so they can fix it.
:)
Win98SE XP2500+ Boinc v5.8.8

And God said "Let there be light." But then the program crashed because he was trying to access the 'light' property of a NULL universe pointer.
