3.11-1 intel_pstate governor doesn't downclock. [SOLVED]
Hi!
A long post, but hopefully I get at least some insight :) As the topic states: since the 3.11-1.dmz.1-liquorix-amd64 update, my CPU (i7-3770K) seems to stay in turbo mode, i.e. 3.9 GHz. The i7z program says that all cores stay at 3898 MHz (max turbo for this CPU). Core 0 stays in the C0 state 100% of the time, while the rest seem to switch between C1, C3 and C6 depending on load. It stays this way even in recovery mode, when there's almost no CPU utilization. top and ps seem to indicate no out-of-control processes.
:: Code ::
Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  True Frequency 3598.97 MHz (99.97 x [36])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 39x/39x/39x/39x
  Current Frequency 3898.86 MHz [99.97 x 39.00] (Max of below)
    Core [core-id] :Actual Freq (Mult.)   C0%    Halt(C1)%  C3 %   C6 %   C7 %   Temp
    Core 1 [0]:     3898.86 (39.00x)      100    0          0      0      0      39
    Core 2 [1]:     3897.28 (38.98x)      1      0.0864     0      99.7   0      33
    Core 3 [2]:     3897.46 (38.99x)      1.7    2          0      96.1   0      30
    Core 4 [3]:     3895.78 (38.97x)      1      4.86       0      94.9   0      32

/sys/devices/system/cpu/cpu[0-7]/cpufreq/*
:: Code ::
gvoima@alpha:/sys/devices/system/cpu$ cat cpu*/cpufreq/scaling_driver
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
gvoima@alpha:/sys/devices/system/cpu$ cat cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
gvoima@alpha:/sys/devices/system/cpu$ cat cpu*/cpufreq/scaling_min_freq   # cpuinfo_min_freq reports the same values
1600000
1600000
1600000
1600000
1600000
1600000
1600000
1600000
gvoima@alpha:/sys/devices/system/cpu$ cat cpu*/cpufreq/scaling_max_freq   # cpuinfo_max_freq reports the same values
3900000
3900000
3900000
3900000
3900000
3900000
3900000
3900000
gvoima@alpha:/sys/devices/system/cpu$ cat cpu*/cpufreq/scaling_available_governors
performance powersave
performance powersave
performance powersave
performance powersave
performance powersave
performance powersave
performance powersave
performance powersave

Inxi:
:: Code ::
gvoima@alpha:~$ inxi -bxx
System:    Host: alpha Kernel: 3.11-1.dmz.1-liquorix-amd64 x86_64 (64 bit, gcc: 4.7.3)
           Desktop: Gnome dm: gdm3 Distro: Debian GNU/Linux jessie/sid
Machine:   Mobo: ASUSTeK model: P8Z77-M version: Rev 1.xx Bios: American Megatrends version: 1806 date: 01/03/2013
CPU:       Quad core Intel Core i7-3770K CPU (-HT-MCP-) clocked at 3885.00 MHz
Graphics:  Card: NVIDIA GT200b [GeForce GTX 275] bus-ID: 01:00.0 chip-ID: 10de:05e6
           X.Org: 1.12.4 driver: nvidia Resolution: 2560x1440@60.0hz
           GLX Renderer: GeForce GTX 275/PCIe/SSE2 GLX Version: 3.3.0 NVIDIA 325.15 Direct Rendering: Yes
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8169 ver: 2.3LK-NAPI port: d000 bus-ID: 03:00.0 chip-ID: 10ec:8168
Drives:    HDD Total Size: 2376.5GB (23.4% used)
RAID:      System: supported: raid1
           Device: 1: /dev/md127
           Unused Devices: none
Info:      Processes: 254 Uptime: 3 min Memory: 1169.2/7938.2MB Runlevel: 2 Gcc sys: 4.7.3 alt: 4.6/4.8
           Client: Shell (bash 4.2.45 running in gnome-terminal-) inxi: 1.9.14

I'm quite sure no packages that could affect this were updated the last time I did an apt-get upgrade. And liquorix 3.10 did have a working governor, because I did some load testing and monitored cpuinfo with:
watch grep MHz /proc/cpuinfo
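For anyone who wants to collect the same cpufreq details in one pass, here is a minimal sketch. The function name `show_cpufreq` and its optional directory argument are made up for illustration; the attribute names are the sysfs files read in the post above, and attributes that are missing on a given system are simply skipped.

```shell
#!/bin/sh
# Print one line per CPU with its cpufreq driver, governor and limits.
# The optional first argument (hypothetical, for testing) overrides the
# sysfs directory; by default the live tree is read.
show_cpufreq() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        printf '%s:' "$cpu"
        for attr in scaling_driver scaling_governor \
                    scaling_min_freq scaling_max_freq \
                    cpuinfo_min_freq cpuinfo_max_freq; do
            # Only print attributes that exist and are readable.
            if [ -r "$dir/$attr" ]; then
                printf ' %s=%s' "$attr" "$(cat "$dir/$attr")"
            fi
        done
        printf '\n'
    done
}
```

Calling `show_cpufreq` with no arguments reads the running system. The frequency values are in kHz, as in the output above.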
|||||
I had this issue start in 3.10, different cpu:
:: Code ::
inxi -MCxxS
System:    Host: yawn Kernel: 3.10-11.dmz.1-liquorix-686 i686 (32 bit, gcc: 4.7.3)
           Desktop: KDE 4.10.5 (Qt 4.8.5) dm: kdm Distro: sidux-20070102-d:1
Machine:   Mobo: ASRock model: A770DE+ Bios: American Megatrends version: P1.70 date: 09/07/2010
CPU:       Dual core AMD Athlon 64 X2 5000+ (-MCP-) cache: 1024 KB flags: (lm nx pae sse sse2 sse3 svm) bmips: 0
           Clock Speeds: 1: 2600.00 MHz 2: 2600.00 MHz

Debian sid, updated to present. No idea if the two are related. Sadly, the excessive rate of Linux kernel releases almost guarantees (no, it does guarantee) that things that were working will break, that things that were broken might or might not get fixed, and that new features will be added which may break old features. This is why, in sane software development, you do not release 4 primary versions a year, give or take: it's too many, with too little time to stabilize and fix bugs as they happen.
|||||
Hmm, this must be a bug in intel_pstate. I was using intel_pstate in 3.10 and my temperatures were the same as the temperatures when using acpi-cpufreq with ondemand.
Anyway, check this post on the Arch Linux forums. It appears that when implemented correctly, intel_pstate should be significantly better than acpi-cpufreq. There's no denying though that something is wrong with intel_pstate. I'm just watching Breaking Bad and it has one core stuck in C0:
:: Code ::
Cpu speed from cpuinfo 2394.00Mhz
cpuinfo might be wrong if cpufreq is enabled. To guess correctly try estimating via tsc
Linux's inbuilt cpu_khz code emulated now
True Frequency (without accounting Turbo) 2394 MHz
  CPU Multiplier 24x || Bus clock frequency (BCLK) 99.75 MHz

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  True Frequency 2493.75 MHz (99.75 x [25])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 34x/33x/32x/32x
  Current Frequency 3269.68 MHz [99.75 x 32.78] (Max of below)
    Core [core-id] :Actual Freq (Mult.)   C0%    Halt(C1)%  C3 %   C6 %   Temp
    Core 1 [0]:     3269.68 (32.78x)      98.8   0          0      0      57
    Core 2 [1]:     3204.18 (32.12x)      14.8   77.8       2.45   0      55
    Core 3 [2]:     3212.76 (32.21x)      12.3   75.6       7.92   0      49
    Core 4 [3]:     3198.73 (32.07x)      11.6   83.1       1.41   0      51

C0 = Processor running without halting
C1 = Processor running with halts (States >C0 are power saver)
C3 = Cores running with PLL turned off and core cache turned off
C6 = Everything in C3 + core state saved to last level cache
  Above values in table are in percentage over the last 1 sec
[core-id] refers to core-id number in /proc/cpuinfo
'Garbage Values' message printed when garbage values are read

Source: bbs.archlinux.org/viewtopic.php?pid=1294415#p1294415

In the meantime, we can add intel_pstate=disable to our boot parameters and see what happens. I'll test it myself too, and if the temperatures drop, I'll set intel_pstate to disabled by default. If you don't mind, can you test with intel_pstate=disable and see if acpi-cpufreq with ondemand is indeed lowering your temperatures?
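When testing with the boot parameter, it helps to confirm that the running kernel actually received it. A minimal sketch follows; the function name `has_boot_param` and its optional file argument are made up for illustration, and it only checks for an exact word on the kernel command line.

```shell
#!/bin/sh
# has_boot_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# The second argument is a hypothetical override for testing; by default
# /proc/cmdline of the running kernel is read.
has_boot_param() {
    # Put each parameter on its own line, then match the whole line literally.
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

Usage: `has_boot_param intel_pstate=disable && echo "intel_pstate is disabled at boot"`. On a Debian system using GRUB, the parameter is usually made persistent by adding it to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` and running `update-grub`. After a reboot, `scaling_driver` should then report something other than intel_pstate, typically acpi-cpufreq.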
|||||
Sorry for the late reply, but I've been very busy lately.
I'll test as soon as I get some free time.
|||||
I just thought I'd add my two cents.
I'm getting the same behavior with liquorix 3.11-1. My i7 2630QM stays in the high range of 2500-2800 MHz and only occasionally drops below 1500 MHz. Version 3.11-2 is much worse: it never drops below 2700 MHz, and the temperature skyrockets from ~32°C to the mid 70s in under half a minute from a cold boot. I think my nVidia 325.08 edgers driver may be staying at higher frequencies for longer periods, although that could easily be me.

I'm also running a self-compiled 3.11, clean except for using oldconfig, and there the frequency scaling for the CPU is fine. It ranges between 760 and 1400 MHz and only hits the 2000+ mark rarely and as needed.
|||||
I just rebooted into liquorix 3.11-2 with intel_pstate=disable set as a boot parameter. Frequency scaling is much better but not great. It now normally stays at 800 MHz, never lower. It only occasionally peaks, and then it goes straight to 2001 MHz but no higher. There is no middle ground: it's either 800 or 2001 MHz and that's it.
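To compare the observed frequency range between kernels without eyeballing the output of watch, the per-CPU "cpu MHz" lines can be summarised. This is a rough sketch: the function name `mhz_range` and its optional file argument are made up for illustration, and it assumes /proc/cpuinfo reports "cpu MHz" lines, as it does on the x86 systems in this thread.

```shell
#!/bin/sh
# Print the lowest and highest "cpu MHz" value currently reported.
# The optional argument is a hypothetical override for testing; by
# default /proc/cpuinfo is read.
mhz_range() {
    awk -F: '/^cpu MHz/ {
        v = $2 + 0                       # numeric value after the colon
        if (n == 0 || v < min) min = v
        if (n == 0 || v > max) max = v
        n++
    }
    END { if (n) printf "min=%.0f max=%.0f cpus=%d\n", min, max, n }' \
        "${1:-/proc/cpuinfo}"
}
```

Running it under watch, e.g. after saving the function to a file and sourcing it, gives one summary line per refresh. Keep in mind that each reading is only a snapshot of the moment it was taken.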
|||||
Same behavior with 3.11-3. My i7 stays in turbo mode.
|||||
warfacegod, this is intended behavior. Liquorix will burn more power than a typical stock kernel that uses the default scaling configuration. This is to improve performance and reduce latency. There's no magic configuration that lets me save power and reduce latency at the same time, so the power has to give.
As far as intel_pstate goes, yes, there's some kind of bug where the CPU prefers turbo and high frequencies over low frequencies when the system is relatively idle. I really don't know what's going on with that. It could be that with Sandy Bridge and Ivy Bridge processors the frequency of the CPU doesn't matter as much as the C state that a core is in, so intel_pstate intelligently keeps your CPU at a higher frequency. This is why I asked gvoima what his temperatures were with intel_pstate in liquorix versus a kernel that appears to be working "correctly". All that matters is that the CPU governor is reducing power usage, which results in reduced heat.

For example, when my laptop is idle, my CPU cores are at about 40°C. Right now I'm typing up this response while watching a 720p video on YouTube. The temperatures on my cores are 45°C right now and intel_pstate is putting my CPU cores in turbo:
:: Code ::
damentz@damentz64:~$ inxi -C
CPU:       Quad core Intel Core i7-3630QM CPU (-HT-MCP-) cache: 6144 KB flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx)
           Clock Speeds: 1: 3192.00 MHz 2: 2880.00 MHz 3: 3144.00 MHz 4: 3192.00 MHz 5: 3000.00 MHz 6: 2328.00 MHz 7: 3192.00 MHz 8: 2760.00 MHz

And here's i7z:
:: Code ::
Cpu speed from cpuinfo 2394.00Mhz
cpuinfo might be wrong if cpufreq is enabled. To guess correctly try estimating via tsc
Linux's inbuilt cpu_khz code emulated now
True Frequency (without accounting Turbo) 2394 MHz
  CPU Multiplier 24x || Bus clock frequency (BCLK) 99.75 MHz

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 2493.75 MHz (99.75 x [25])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 34x/33x/32x/32x
  Real Current Frequency 3054.37 MHz [99.75 x 30.62] (Max of below)
    Core [core-id] :Actual Freq (Mult.)   C0%    Halt(C1)%  C3 %   C6 %   C7 %   Temp
    Core 1 [0]:     2957.64 (29.65x)      13.6   4.14       2.1    0      77     45
    Core 2 [1]:     3054.37 (30.62x)      15.1   2.39       1      0      77.3   48
    Core 3 [2]:     3015.63 (30.23x)      12.9   2          2.13   0      79.6   45
    Core 4 [3]:     3046.31 (30.54x)      16.4   9.59       1.02   0      68.5   46

And lastly, the last kernel upload changes nohz behavior from 'FULL' to 'IDLE'. I have a suspicion that intel_pstate has not been fully tested with nohz_full and that it relies on the kernel to have a steady tick while the CPU is doing work.
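To see which tick mode a given kernel was built with, the kernel config can be inspected. A minimal sketch, assuming the config is installed as /boot/config-<release> the way Debian kernel packages normally ship it; the function name `tick_mode` and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Report the timer tick mode a kernel was compiled with.
# The optional argument is a hypothetical override for testing; the
# default path assumes a Debian-style /boot/config-<release> file.
tick_mode() {
    cfg="${1:-/boot/config-$(uname -r)}"
    [ -r "$cfg" ] || { echo unknown; return 1; }
    if grep -q '^CONFIG_NO_HZ_FULL=y' "$cfg"; then
        echo full        # full dynticks compiled in
    elif grep -q '^CONFIG_NO_HZ_IDLE=y' "$cfg"; then
        echo idle        # tick stopped only on idle CPUs
    elif grep -q '^CONFIG_HZ_PERIODIC=y' "$cfg"; then
        echo periodic    # classic periodic tick
    else
        echo unknown
    fi
}
```

This only shows what was compiled in. Whether full dynticks is actually in effect on a given core can also depend on boot-time settings, so treat the result as a hint when comparing a kernel that misbehaves with one that does not.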
|||||
Ok, first and foremost sorry for the very late answer. I've been so busy that I completely forgot this thread :)
And secondly, the only problem here is that the one core is stuck in the C0 state at 100%. With the kernel parameter provided, the only difference is that the temperature of that one core (core 0) drops by about 3-5 degrees Celsius. This is not a problem at the moment, because I use water cooling and an mATX case, and it's winter, but in the summer, when the overall temperature is much higher, the one core can raise the whole CPU temperature quite a lot. Let's hope there's a fix when we get to kernel 3.12; there should be some (major?) fixes relating to the governors. And by the way, I'm still using the same kernel as when I posted this thread.

[edit] Oh, and if there's anything I could test, please feel free to ask. I should have some spare time now.
|||||
Looks like 3.11.8 may have fixed the issue. Here is the new output from i7z - no 100% C0 on Core 0:
:: Code ::
  Max Frequency without considering Turbo 2493.75 MHz (99.75 x [25])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 34x/33x/32x/32x
  Real Current Frequency 2333.36 MHz [99.75 x 23.39] (Max of below)
    Core [core-id] :Actual Freq (Mult.)   C0%    Halt(C1)%  C3 %   C6 %   C7 %   Temp
    Core 1 [0]:     2333.36 (23.39x)      1      0.257      0      0      98.9   35
    Core 2 [1]:     2234.35 (22.40x)      1      0.932      0      0      98.4   35
    Core 3 [2]:     2275.17 (22.81x)      1      3.43       0      0      95.8   35
    Core 4 [3]:     2163.00 (21.68x)      1.29   0.439      0      0      98.4   35

EDIT: Just read that you never upgraded your kernel. My early kernels for 3.11 have full dynticks enabled. This is wrong and increases heat output by never letting Core 0 get out of C0. Upgrade to the latest in the repository and that should fix your heat problems.
|||||