project communication failed

zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 93
Credit: 2,492,267
RAC: 649
Message 7021 - Posted: 21 Nov 2007, 1:21:27 UTC
Last modified: 21 Nov 2007, 1:24:16 UTC

When I click update to try to get work (as I see there is a bit available right now), I get:

11/20/2007 5:16:42 PM|BURP|Fetching scheduler list
11/20/2007 5:17:04 PM||Project communication failed: attempting access to reference site
11/20/2007 5:17:05 PM||Access to reference site succeeded - project servers may be temporarily down.
11/20/2007 5:17:07 PM|BURP|Deferring communication for 1 days 0 hr 0 min 0 sec
11/20/2007 5:17:07 PM|BURP|Reason: 10 consecutive failures fetching scheduler list

Others that I have talked to are able to get work. This is the only project that I am having this problem with. FWIW, I use BAM. I have tried resetting, with no change.

Edit: I also tried detach/attach. Same problem.
Reno, NV
Team: SETI.USA

zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 93
Credit: 2,492,267
RAC: 649
Message 7023 - Posted: 21 Nov 2007, 1:55:49 UTC - in response to Message 7021.  

Problem solved by quitting and restarting BOINC. FWIW, this was across all 15 of my Windows XP boxes, but not my Linux box. Also, at least one other person on #boinc was experiencing the same problem, and solved it the same way.
Reno, NV
Team: SETI.USA

AC
Project donor
Joined: 30 Sep 07
Posts: 121
Credit: 143,874
RAC: 0
Message 7024 - Posted: 21 Nov 2007, 2:41:42 UTC
Last modified: 21 Nov 2007, 3:01:16 UTC

Incidentally, I'm getting similar behavior on my machines. Restarting the BOINC Manager (v.5.10.13/winXP) didn't fix it though... restarting the machine did. I left the client running on another machine as a control, and it still isn't connecting.

EDIT: My "control" machine still had BOINC-related processes (BOINC & 2x SETI) running after shutting down the manager, so I'm guessing that when I restarted it, it just reattached to the existing processes. After killing the processes manually (sorry SETI) and restarting the manager, I was able to connect to BURP. I'm not sure if that's why the processes were still running, but I do have the "keep in memory when suspended" option enabled, and they were suspended before restarting the client.
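
Checking for leftover processes before restarting the manager can be scripted. The following is a rough Python sketch, assuming Windows and the built-in tasklist command with CSV output. The function names and the "boinc" name prefix are assumptions for illustration; science applications such as SETI use their own executable names, which this prefix would not match unless added.

```python
import csv
import io
import subprocess

# Assumption: the client and manager executables both start with "boinc"
# (for example boinc.exe and boincmgr.exe). Add other prefixes as needed.
BOINC_PREFIXES = ("boinc",)


def leftover_boinc_processes(tasklist_csv):
    """Return (image_name, pid) pairs for BOINC-looking rows.

    Expects the text produced by `tasklist /FO CSV /NH`, where each row is
    "Image Name","PID","Session Name","Session#","Mem Usage".
    """
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[0].lower().startswith(BOINC_PREFIXES):
            found.append((row[0], int(row[1])))
    return found


def read_tasklist():
    """Windows only: capture the current process list as CSV text."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a Windows machine, leftover_boinc_processes(read_tasklist()) should return an empty list once the manager and client have really exited. If it does not, restarting the manager will probably just reconnect to the old client.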
AC
Project donor
Joined: 30 Sep 07
Posts: 121
Credit: 143,874
RAC: 0
Message 7025 - Posted: 21 Nov 2007, 7:48:17 UTC - in response to Message 7023.  

"Problem solved by quitting and restarting BOINC."

This one worked for my 3rd XP box. This time I checked first to make sure no BOINCish processes were left running after closing the manager. There weren't, and restarting the manager seemed to do the trick.
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4574
Credit: 2,100,463
RAC: 8
Message 7026 - Posted: 21 Nov 2007, 8:33:28 UTC
Last modified: 21 Nov 2007, 8:34:52 UTC

This is a result of the main server crashing again yesterday at around 21:00 UTC. The server was brought down for 2 hours for some debugging and restarted a little later.
BOINC seemingly has an issue with the dynamic DNS system in use here when the server crashes. The DNS entries are set so that clients should, in theory, refresh the DNS information every 10 minutes, but I'm not sure that BOINC ever does another DNS lookup once it has done the first one. Restarting BOINC forces it to forget the DNS information and fetch it again.

I wonder how long it will take to automatically pick up on the new info - if it ever does that... anyone up for a test?
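
To illustrate what "doing another DNS lookup" would mean in practice, here is a minimal Python sketch of a cache that forgets an address after a TTL and resolves it again. It is not BOINC's actual code. The class name TtlResolver, the injectable lookup and clock parameters, and the 600-second default (taken from the 10-minute refresh described above) are assumptions for illustration. Note that socket.gethostbyname does not report the record's real TTL, so the TTL here is simply configured by hand.

```python
import socket
import time


class TtlResolver:
    """Cache hostname lookups, but only for ttl_seconds."""

    def __init__(self, ttl_seconds=600, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._lookup = lookup      # function: hostname -> address string
        self._clock = clock        # function: () -> seconds, monotonic
        self._cache = {}           # hostname -> (address, time_fetched)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        # Reuse the cached address only while it is younger than the TTL.
        if cached is not None and now - cached[1] < self.ttl_seconds:
            return cached[0]
        # Otherwise ask DNS again and remember when we did.
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now)
        return address
```

A client that resolves once and keeps the address for the life of the process behaves like this class with ttl_seconds set to float("inf"). That would be consistent with the symptom in this thread, where only a restart clears the stale address.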