Tweaking app xml files

Post by **UBT - Timbo** » Sat Dec 31, 2016 7:09 pm

Each project will normally just crunch tasks based on each PC's capabilities.

However, there are times, especially now that CPU and GPU processing power is improving in leaps and bounds, when your PC could actually do more.

And likewise, there could be times when you want to specifically limit certain projects from doing too much (as it could slow your PC down).

This is partly where the app_config.xml and app_info.xml files come in, as you can subtly tweak the project so that it performs based on your willingness to share your computer's resources (or not, as the case may be).

I'll list some project-specific settings that might help members in further posts to this thread.

Post by **Woodles** » Tue Jan 03, 2017 10:44 am

For speeding up processing times on the Collatz Conjecture project, locate the collatz_sieve_1.21_windows_intelx86__opencl_amd_gpu.config (32 bit OS) or collatz_sieve_1.21_windows_x86_64__opencl_amd_gpu.config (64 bit OS) file in the C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz project directory.

This will initially be an empty file. There are various settings that can be entered into this file to control the processing of the Collatz workunits (with an empty configuration file, the workunits take approximately twice as long to process).

(name) (values)
verbose 1/0
If enabled, more data appears in the output - no effect on crunching
kernels/reduction 1 - 64
Number of kernels done before reducing. Higher numbers speed processing, too high crashes the video driver. 8 - 48 originally recommended although higher values work well with more powerful cards.
threads 6 - 11
Amount of GPU registers used (2^6 = 64, 2^11 = 2048). Bigger numbers allow more threads to be run in parallel; too big and slow external RAM needs to be used instead of the fast GPU registers. AMD cards work best at 6 - 8, NVIDIA cards at 8 - 9. Bigger values for more modern cards.
lut_size 2 - 31
Size of the lookup table in powers of 2, each entry takes 8 bytes (17 = 2^17 = 131,072 entries = 1,048,576 bytes or 1 MB). Larger is better, >20 will probably crash the GPU driver. Scale to fit in the GPU L1/L2 cache. 4 MB cache = 512K entries = 2^19 entries -> 19
sieve_size 25 - 32
Size of the sieve used (2^25 to 2^32) as well as the number of items per kernel (26 = roughly 1 million items per kernel, 27 = roughly 2 million items per kernel). Higher is better, too high crashes the GPU driver.
sleep >0
Number of milliseconds for the CPU to sleep while waiting for the kernel to be processed. Bigger values give less CPU usage and better response but slow down the workunit processing.
cache_sieve 1/0
Cache the local sieve table between workunits. If 0, it's re-created for each workunit. Stored on disc, not on the card, so GPU memory size is immaterial.
reduce_cpu 1/0
If enabled, more CPU utilisation (weirdly opposite to what the item name would suggest) but a more responsive graphics system.
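
The arithmetic behind lut_size and threads can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the 8 bytes per entry figure is the one quoted above. You would still need to look up your own card's cache size.

```python
# Rough sizing helpers for the lut_size and threads settings.
# All names here are hypothetical; 8 bytes per entry is taken from the post.

def lut_bytes(lut_size: int, entry_bytes: int = 8) -> int:
    """Memory used by a lookup table with 2**lut_size entries."""
    return (2 ** lut_size) * entry_bytes


def largest_lut_size(cache_bytes: int, entry_bytes: int = 8) -> int:
    """Largest lut_size whose table still fits in cache_bytes."""
    if cache_bytes < entry_bytes:
        raise ValueError("cache is too small to hold even one entry")
    entries = cache_bytes // entry_bytes
    # bit_length() - 1 is floor(log2(entries)) for a positive integer
    return entries.bit_length() - 1


def registers_for_threads(threads: int) -> int:
    """Number of GPU registers implied by the threads setting (2**threads)."""
    return 2 ** threads


if __name__ == "__main__":
    print(lut_bytes(17))                        # bytes needed for lut_size 17
    print(largest_lut_size(4 * 1024 * 1024))    # lut_size for a 4 MB cache
    print(registers_for_threads(8))             # registers at threads = 8
```

Whatever value this suggests is just a starting point, as anything above 20 for lut_size will probably crash the GPU driver regardless of the maths.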

For maximum credits per workunit:
threads should be scaled to the number of GPU registers present on your card.
lut_size should be scaled to fit in the GPU level 1/level 2 cache.
sleep should be set to 1.
cache_sieve should be set to 1.
reduce_cpu should be set to 0.

kernels/reduction and sieve_size are the ones to experiment with. (Note: not many cards list the number of GPU registers present so threads is usually also one for experimenting with)
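
If you want to be methodical about the experimenting, something like the following could list the combinations to try, lowest first, so you hit the point where the driver crashes as late as possible. It's a sketch only: the step of 8 for kernels/reduction is my own assumption, the ranges are the ones given above, and it doesn't touch the config file itself.

```python
from itertools import product

# Candidate values, taken from the ranges quoted in the post.
# The step of 8 for kernels/reduction is an assumption for illustration.
KERNELS_PER_REDUCTION = range(8, 49, 8)   # 8, 16, ..., 48
SIEVE_SIZES = range(25, 33)               # 25 .. 32


def sweep_plan(kernels=KERNELS_PER_REDUCTION, sieves=SIEVE_SIZES):
    """Return (kernels/reduction, sieve_size) pairs, lowest settings first."""
    return list(product(sorted(kernels), sorted(sieves)))


if __name__ == "__main__":
    for kernels, sieve in sweep_plan():
        print(f"kernels/reduction={kernels} sieve_size={sieve}")
```

Run a few workunits at each setting, note the times, and stop going higher once the video driver starts resetting.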

Sample configurations grabbed from the top hosts:
AMD 480 (~13 minutes)