Re: Milkyway issues

Posted: Sat May 16, 2020 11:23 am
by UBT - Timbo
Hi Richard

I doubt the version of BOINC is going to be a problem - as the Manager program is separate to the boinc program and as long as both of their respective filenames are what each expects to "see" (irrespective of version number) then neither will care too much.

re: app_config

This is my version:
<app_config>
<app>
<name>milkyway</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.75</cpu_usage>
</gpu_versions>
</app>
</app_config>
This is for an older CPU and an older (single) GPU, both of which are less powerful than yours, so I doubt that this is the issue. If you want to try the values shown in mine: suspend any MW tasks, find the "app_config.xml" file within the ProgramData/BOINC/projects/milkyway.cs.rpi.edu_milkyway folder, right-click on the filename and choose to open it with Notepad, change the values and then save it.

Note that "max_concurrent" limits the total number of MW tasks that run at once. If you are not using CPUs for MW, then this is fine...but if you do use CPUs for this project in the future, you'll still only get 2 tasks running at once.

Then go back to BOINC Manager and click on "Options> Read config files" and then re-allow any MW tasks.
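
A typo is easy to make when editing XML in Notepad, so it may be worth checking the file before asking BOINC to re-read it. Below is a minimal Python sketch, assuming the file has the same layout as the one shown above; the function name and the SAMPLE text are made up for illustration and are not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Sample text copied from the app_config shown in this post.
SAMPLE = """<app_config>
<app>
<name>milkyway</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.75</cpu_usage>
</gpu_versions>
</app>
</app_config>"""


def _to_number(text, kind):
    """Convert element text to int/float, or return None if the element is missing."""
    return kind(text) if text is not None else None


def summarize_app_config(xml_text):
    """Return one dict per <app> block.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "app_config":
        raise ValueError(f"unexpected root element: {root.tag}")
    apps = []
    for app in root.findall("app"):
        gpu = app.find("gpu_versions")
        apps.append({
            "name": app.findtext("name"),
            "max_concurrent": _to_number(app.findtext("max_concurrent"), int),
            "gpu_usage": _to_number(gpu.findtext("gpu_usage"), float) if gpu is not None else None,
            "cpu_usage": _to_number(gpu.findtext("cpu_usage"), float) if gpu is not None else None,
        })
    return apps


if __name__ == "__main__":
    for app in summarize_app_config(SAMPLE):
        print(app)
```

To check a real file, read its text from the project folder mentioned above and pass it to summarize_app_config; a ParseError usually points at the line with the broken tag.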

Your current setting of <gpu_usage>0.1</gpu_usage> is allowing 10 instances of the project's tasks to run on each GPU...that may be too much even for a high-end GPU. So, using <gpu_usage>1.0</gpu_usage> for now will at least get you going.

If that works in terms of allowing tasks to flow, the next step is to change the <gpu_usage>1.0</gpu_usage> value to <gpu_usage>0.5</gpu_usage> - this will then allow 2 instances of the project tasks to run concurrently.

Then go back to BOINC Manager and click on "Options> Read config files" and then re-allow any MW tasks.
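
For anyone wanting to check the arithmetic: the rule of thumb used above is that the number of tasks per GPU is roughly 1 divided by gpu_usage, rounded down. A minimal Python sketch of that rule follows; the function name is made up for illustration, and this is not BOINC's own code.

```python
import math


def tasks_per_gpu(gpu_usage):
    """Approximate number of tasks that fit on one GPU for a given <gpu_usage>."""
    if not 0 < gpu_usage <= 1:
        raise ValueError("this sketch only covers 0 < gpu_usage <= 1")
    # Small tolerance so that values like 0.1 are not rounded down by float error.
    return math.floor(1.0 / gpu_usage + 1e-9)


for usage in (1.0, 0.5, 0.1):
    print(f"gpu_usage={usage} -> {tasks_per_gpu(usage)} task(s) per GPU")
```

So 1.0 gives one task per GPU, 0.5 gives two, and 0.1 gives ten, which matches the figures quoted in this post.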

re: app_info.xml

Also: you may want an "app_info.xml" file within the MW project directory which contains the following code:
<app_info>
<app>
<name>milkyway</name>
</app>

<app_version>
<app_name>milkyway</app_name>
<coproc>
<type>NVIDIA</type>
<count>1.0</count>
</coproc>
</app_version>
</app_info>
re: GPU settings

This time, suspend ALL crunching and close BOINC Manager down.

Within the ProgramData/BOINC directory is a file called "cc_config.xml". Once you find this, open it with Notepad and look within the file for the following line:
<use_all_gpus>1</use_all_gpus>
If it's there (exactly as shown above) then BOINC should use both GPUs for crunching. If the value is "0" then change it to "1".
Then save the file, close Notepad and then resume BOINC Manager.
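
If you would rather not eyeball the file, the same check can be sketched in Python. This assumes the usual layout in which the flag sits inside an <options> block under <cc_config>; the function name and the SAMPLE text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed layout: <cc_config><options><use_all_gpus>...</use_all_gpus></options></cc_config>
SAMPLE = """<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
</options>
</cc_config>"""


def use_all_gpus_enabled(xml_text):
    """Return True only if <use_all_gpus> is present under <options> and set to 1."""
    root = ET.fromstring(xml_text)
    value = root.findtext("options/use_all_gpus")
    return value is not None and value.strip() == "1"


if __name__ == "__main__":
    print("use_all_gpus enabled:", use_all_gpus_enabled(SAMPLE))
```

A result of False covers both cases described above: the line is missing, or its value is "0".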

re: Deferring tasks

In simple terms, BOINC has a built-in mechanism (the client-side "scheduler") where it "knows" your crunching history and it assigns a "weighting" to each project. This then means that you are not in full control of which tasks are given "priority"...and it will stay that way until BOINC Manager "decides" that you have now crunched enough tasks of each project to give them all equal "weight".

https://boinc.berkeley.edu/wiki/REC-based_scheduler

The advice here is to try a couple of things:

1) Try setting ALL projects to "Won't get new tasks" and then RESET each of the projects listed with BOINC Manager - do this once you have NO tasks to crunch (otherwise it'll simply abort them all and if you are halfway through a task then any crunching would be wasted).

2) Set the cache to be a bit higher than you would normally use on the "troublesome project" (in this case MW), as the revised scheduler (in v7.x of BOINC) needs about 10+ tasks to reset its internal settings for the "REC" (Recent Estimated Credit).

Then only allow MW to download tasks and give it a couple of hours.

See if any of the above helps.

regards
Tim

PS: If any other forum members can correct any mistakes I've made please mention it and I'll update the above.

Re: Milkyway issues

Posted: Sat May 16, 2020 2:53 pm
by ChelseaOilman
Unless Boinc Manager is showing 7.15.0.0 in the bottom right corner I don't think you have it set up correctly. You have to stop the client, rename or remove the 7.16.5 boinc.exe file and replace it with the 7.15.0.0 boinc.exe file before restarting the client. I don't think they can both run at the same time; it's one or the other.
homefarm wrote: Sat May 16, 2020 9:59 am Am I missing something pls? ( my 73 year old brain can't seem to get any further ;-) )
I'm 72 and struggle with this stuff as well. Especially Linux! lol

Re: Milkyway issues

Posted: Sat May 16, 2020 7:50 pm
by homefarm
Hi. Yes, definitely 7.15.0. Also, the files were replaced as per the instructions.

I worked all morning on this. I think I freaked out the MW server, so it was punishing me. A bit like Oliver Twist: "Please, sir, may I have some more please?"

Having a break to do some farming, and selling a young bull, I am now in peak of mental ability :?

Re: Milkyway issues

Posted: Sat May 16, 2020 7:55 pm
by homefarm
Hi Tim, Thank you so much for your copious notes and suggestions, much appreciated. I will now re-do the app_config etc.

Re: Milkyway issues

Posted: Sat May 16, 2020 9:18 pm
by homefarm
Hi Tim

1. Removed MW entirely.
2. Rejoined MW. Using 7.15.0 boinc.exe
3. app_config.xml as below
<app_config>
<app>
<name>milkyway</name>
<max_concurrent>6</max_concurrent>
<gpu_versions>
<gpu_usage>0.1666666</gpu_usage>
<cpu_usage>.5</cpu_usage>
</gpu_versions>
</app>
</app_config>

4. MW downloads a batch of 6 tasks, then runs 6 on GV100 only, takes up to 85% usage of GPU, run time 56 secs for 6 tasks simultaneously.
5. MW then goes to sleep, no reloads without "update"
6. Leaving it to run for a while

Addendum: The other 7890XE with 2 GPUs , running 7.15.0, has developed a cache of tasks (?)
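
One possible reading of step 4, offered as a guess rather than a diagnosis: with these values each GPU has room for 6 tasks and max_concurrent is also 6, so the overall cap could already be reached with every task sitting on one card. The Python sketch below only does that arithmetic; the function names are made up for illustration, and how the client actually spreads tasks across GPUs is not modelled.

```python
import math


def tasks_per_gpu(gpu_usage):
    """Approximate number of tasks that fit on one GPU (1 / gpu_usage, rounded down)."""
    if not 0 < gpu_usage <= 1:
        raise ValueError("this sketch only covers 0 < gpu_usage <= 1")
    return math.floor(1.0 / gpu_usage + 1e-9)


def concurrent_limit(gpu_usage, n_gpus, max_concurrent=None):
    """Upper bound on running tasks: GPU capacity, capped by max_concurrent if set."""
    capacity = tasks_per_gpu(gpu_usage) * n_gpus
    if max_concurrent is None:
        return capacity
    return min(capacity, max_concurrent)


if __name__ == "__main__":
    # Values from the app_config in this post, with two GPUs in the PC.
    print("room on one GPU:", tasks_per_gpu(0.1666666))
    print("room on two GPUs:", concurrent_limit(0.1666666, n_gpus=2))
    print("with max_concurrent=6:", concurrent_limit(0.1666666, n_gpus=2, max_concurrent=6))
```

If that reading is right, raising max_concurrent (or removing it) would be the value to experiment with, but that is untested here.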

Re: Milkyway issues

Posted: Sat May 16, 2020 9:27 pm
by ChelseaOilman
This is my app_config.xml file for MW to run 4 tasks per GPU:

<app_config>
<app>
<name>milkyway</name>
<gpu_versions>
<gpu_usage>0.25</gpu_usage>
<cpu_usage>0.50</cpu_usage>
</gpu_versions>
</app>
</app_config>
It's pretty simple. I'm not using anything else. With MW you may be able to use less CPU, as it doesn't need much.
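
As a rough guide to what these two values add up to, here is a small Python sketch. It assumes cpu_usage is the amount of CPU the client budgets per GPU task for scheduling purposes, not a limit on what the task really uses, and it assumes two identical GPUs; the function name is made up for illustration.

```python
import math


def cpu_budget_for_gpu_tasks(gpu_usage, cpu_usage, n_gpus):
    """CPU threads budgeted when every GPU is full of tasks."""
    per_gpu = math.floor(1.0 / gpu_usage + 1e-9)  # tasks that fit on one GPU
    return per_gpu * n_gpus * cpu_usage


if __name__ == "__main__":
    # gpu_usage and cpu_usage from the app_config above; n_gpus=2 is an assumption.
    budget = cpu_budget_for_gpu_tasks(gpu_usage=0.25, cpu_usage=0.50, n_gpus=2)
    print(f"CPU threads budgeted for MW GPU tasks: {budget}")
```

Lowering cpu_usage shrinks that budget, which leaves more threads free for CPU work from other projects.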

Re: Milkyway issues

Posted: Sat May 16, 2020 9:35 pm
by homefarm
Hi, now running 6 on the Titan V CEO, using exactly your app_config but with updated numeric values.

My only issue seems to be its reluctance to send more tasks, even on 7.15.0

<app_config>
<app>
<name>milkyway</name>
<max_concurrent>6</max_concurrent>
<gpu_versions>
<gpu_usage>0.15</gpu_usage>
<cpu_usage>.2</cpu_usage>
</gpu_versions>
</app>
</app_config>

Re: Milkyway issues

Posted: Sat May 16, 2020 9:40 pm
by ChelseaOilman
Maybe MW has to figure out it's dealing with a reliable host before it starts sending extra tasks. I always get a bunch.
I never do MW CPU tasks if that has any effect.
Also, since using the 7.15.0.0 client I never use the batch file to hammer the server and my cache is set to 0.1

Re: Milkyway issues

Posted: Sat May 16, 2020 11:34 pm
by UBT - Timbo
Hi Richard

A quick question for you:

Did you try the "<gpu_usage>1.0</gpu_usage>" setting as I mentioned in my "long message" above?

If so, did BOTH GPUs get work?

Because I'm wondering this:

If you had the original setting as this: "<gpu_usage>0.1</gpu_usage>" then this might try to run 10 tasks on EACH GPU...as BOINC Manager doesn't know the capabilities of both GPUs, unless you tell it...and I doubt the 2080ti can run 10 MW tasks at once

Using my setting (of 1.0) would only run ONE task per GPU and in theory that should work.

If you are now running "<gpu_usage>0.15</gpu_usage>" that will try to run 6 tasks on EACH GPU...and I don't know if the 2080ti can cope with even 6 concurrent tasks.

Obviously if both GPUs were identical, then this shouldn't matter...but they aren't the same and so one might have to create a specific "app_info.xml" that gives the parameters for both GPUs separately...

re: getting new tasks.

As mentioned, you need to let BOINC's scheduler adjust to your new settings, and it can only do this once it has processed and completed some MW tasks...so, set the GPU usage to 1.0, allow new tasks to download from MW ONLY and then leave it for a few hours, as the scheduler needs time to adjust before it allows more tasks to download.

regards
Tim

Re: Milkyway issues

Posted: Sun May 17, 2020 5:19 am
by homefarm
Hi Tim
Yes I tried it, and it limited the tasks but still only finds 1 GPU.

I have detached the MW project on the problem PC, but left the other 2 PCs attached and running.

Next week I will start to do a new build, and will transfer the 2080ti to that new PC. That will make life simpler, and possibly more productive for MW.

Ironically MW is not currently a task I would choose to run for my own interest, but was trying to prepare the project for use in a Sprint. At least I still have 2 PCs happily running MW.

This 7890XE successfully runs Folding in the background at low priority, whilst also running Prime Grid in Boinc. The 2 projects share the GPUs and make full use of their power. It's a good combination for me, allowing a beneficial project (Folding, i.e. Covid-19) to run alongside a fun project (Prime Grid).

Many thanks for all the tips, and advice.. :)

Quote for today

"It is the long history of humankind (and animal kind, too) that those who learned to collaborate and improvise most effectively have prevailed."
– Charles Darwin

Re: Milkyway issues

Posted: Sun May 17, 2020 9:31 am
by UBT - Timbo
homefarm wrote: Sun May 17, 2020 5:19 am Hi Tim
Yes I tried it, and it limited the tasks but still only finds 1 GPU.
Hi Richard

OK - well, the GPU config was only to fix whether you could process MW tasks on both GPUs - this "tip" didn't really have anything to do with the task limiting you were experiencing, which I think is down to other factors to do with the scheduler.
homefarm wrote: Sun May 17, 2020 5:19 am I have detached the MW project on the problem PC, but left the other 2 PCs attached and running.

Next week I will start to do a new build, and will transfer the 2080ti to that new PC. That will make life simpler, and possibly more productive for MW.

Ironically MW is not currently a task I would choose to run for my own interest, but was trying to prepare the project for use in a Sprint. At least I still have 2 PCs happily running MW.

This 7890XE successfully runs Folding on background low priority, whilst also running Prime Grid in Boinc. The 2 projects share the GPUs, and make full use of their power. It's a good combination for me, allowing a beneficial project (Folding ie Covid-19) to run with a fun project (prime Grid).

Many thanks for all the tips, and advice.. :)

Quote for today

"It is the long history of humankind (and animal kind, too) that those who learned to collaborate and improvise most effectively have prevailed."
– Charles Darwin
Yup - and fair enough.

Trying to squeeze as much as possible out of a PC by loading it with various extras, such as multiple (and different) GPUs, is always tricky...I know I had some issues a few years back when I had 2x GTX580s in the same PC. Eventually I got both crunching concurrently...but it took a while to figure out the tweaks needed.

Ideally, of course, when BOINC Manager is installed, it *should* examine all your working hardware and, as you add projects (and other hardware), it *should* re-configure itself and make changes to the various XML files it uses... It should also have a built-in editor, so one can tweak these settings "on the fly", instead of having to use an external editor and keeping fingers crossed that it works.

But BM hasn't evolved to that point and so we're stuck with making manual changes and having to trawl websites finding fixes when things DON'T work as they should. :-(

Hope the new build goes well. :-)

regards
Tim

Re: Milkyway issues

Posted: Sun May 17, 2020 12:06 pm
by homefarm
The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:

Re: Milkyway issues

Posted: Sun May 17, 2020 1:30 pm
by UBT - Timbo
homefarm wrote: Sun May 17, 2020 12:06 pm The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:
Hi Richard

I can understand your frustration, as it does sound crazy that one dual-GPU setup works and the other doesn't.

One thing to note is that if BOINC detects a hardware change it can sometimes abort all the tasks...so I'd be tempted to stop allowing any new work to download (on this other PC) and then run down all the existing tasks and THEN change the GPUs over.

(Or of course you can change it anyways, but don't do it "mid-crunch" while tasks are still being crunched as you could lose those tasks and any time spent on them to date would be wasted).
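
For those who prefer the command line, the "stop new work, run down, then re-allow" sequence can also be done with boinccmd, the command-line tool that ships with BOINC. The Python sketch below only builds the two command lines and prints them; it does not run anything. The project URL is an assumption inferred from the project folder name mentioned earlier in this thread, and the helper name is made up for illustration.

```python
# Assumption: master URL inferred from the folder name milkyway.cs.rpi.edu_milkyway.
MW_URL = "http://milkyway.cs.rpi.edu/milkyway/"

# The two boinccmd project operations used in this sketch.
ALLOWED_OPS = ("nomorework", "allowmorework")


def project_command(url, op):
    """Build the boinccmd argument list for a per-project operation."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op}")
    return ["boinccmd", "--project", url, op]


if __name__ == "__main__":
    # Before the GPU swap: stop fetching new MW work, then let the cache run down.
    print(" ".join(project_command(MW_URL, "nomorework")))
    # After the swap, once BOINC is running again: allow new MW work.
    print(" ".join(project_command(MW_URL, "allowmorework")))
```

The first command corresponds to the "No new tasks" button in BOINC Manager. boinccmd may need to be run from the BOINC data directory so that it can find the GUI RPC password.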

regards
Tim

Re: Milkyway issues

Posted: Sun May 17, 2020 2:11 pm
by ChelseaOilman
Once you move the 2080 Ti out you'll have room for another Titan V in that box. Might empty your bank account unless you can find a good deal on a used one though.

I never mix GPUs in any of my computers. All my boxes have dual GPUs and GPUs in each box are identical. Saves a lot of hassle.

Re: Milkyway issues

Posted: Sun May 17, 2020 2:25 pm
by homefarm
ChelseaOilman wrote: Sun May 17, 2020 2:11 pm Once you move the 2080 Ti out you'll have room for another Titan V in that box. Might empty your bank account unless you can find a good deal on a used one though.

I never mix GPUs in any of my computers. All my boxes have dual GPUs and GPUs in each box are identical. Saves a lot of hassle.
What a good idea :lol:

This one was on eBay, as new and unused. Once I received it and checked it out I found it was the Titan V CEO JH version. It's substantially uprated in almost every way, and very similar to the Tesla GV100, except it does not have ECC RAM, thus making it a consumer/semi-professional model. I got lucky!

I will take your advice and double up on the 2080ti, and move the 2060 to an office i9-9900K with a MEG mobo that doesn't have an output for the Intel video processor. I can still run remotely on it.

Re: Milkyway issues

Posted: Sun May 17, 2020 2:27 pm
by homefarm
UBT - Timbo wrote: Sun May 17, 2020 1:30 pm
homefarm wrote: Sun May 17, 2020 12:06 pm The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:
Hi Richard

I can understand your frustration, as it does sound crazy that one dual-GPU setup works and the other doesn't.

One thing to note is that if BOINC detects a hardware change it can sometimes abort all the tasks...so I'd be tempted to stop allowing any new work to download (on this other PC) and then run down all the existing tasks and THEN change the GPU's over.

(Or of course you can change it anyways, but don't do it "mid-crunch" while tasks are still being crunched as you could lose those tasks and any time spent on them to date would be wasted).

regards
Tim
Yes, I agree. I'll close down gracefully.

Re: Milkyway issues

Posted: Sun May 17, 2020 5:46 pm
by homefarm
Hi all,

https://boinc.berkeley.edu/forum_thread ... =&start=20

Massive discussions re the problem I and many others have experienced, with task supply.
I have enjoyed asking, learning and applying the suggestions herein and elsewhere.
I have come to the decision that the PCs working properly can continue with MW, if MW is in Sprints. Otherwise it's cheerio MW, hello "new friend that can feed GPUs properly".
I am running 7.15.0 now (again) and letting it sort itself out.
Should I find anything helpful to forum members I'll post it, but perhaps enough said for now.
Cheers all, have a nice day!