nvme(4): some non-operational power states are broken
Alexey Sukhoguzov
2024-09-23 09:40:15 UTC
Hi,

My NVMe controller is a Toshiba XG5 with 6 power states: the first
three (0-2) are operational and the last three (3-5) are
non-operational (NOPS). Here is the output of
'nvmecontrol power -l nvme0':

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0:  8.0000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W  0
 1:  3.9000W    0.000ms   0.000ms  1  1  1  1  0.0000W  0.0000W  0
 2:  2.0000W    0.000ms   0.000ms  2  2  2  2  0.0000W  0.0000W  0
 3:  0.0500W*   1.500ms   1.500ms  3  3  3  3  0.0000W  0.0000W  0
 4:  0.0050W*   6.000ms  14.000ms  4  4  4  4  0.0000W  0.0000W  0
 5:  0.0030W*  50.000ms  80.000ms  5  5  5  5  0.0000W  0.0000W  0
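
For anyone who wants to work with this listing programmatically, here is a minimal Python sketch that parses the rows and picks out the non-operational states. It assumes the asterisk after the max power value marks a non-operational state, which matches the description above; the function name and dictionary keys are made up for illustration.

```python
import re

# Rows copied from the listing above (header and separator omitted).
LISTING = """\
0: 8.0000W 0.000ms 0.000ms 0 0 0 0 0.0000W 0.0000W 0
1: 3.9000W 0.000ms 0.000ms 1 1 1 1 0.0000W 0.0000W 0
2: 2.0000W 0.000ms 0.000ms 2 2 2 2 0.0000W 0.0000W 0
3: 0.0500W* 1.500ms 1.500ms 3 3 3 3 0.0000W 0.0000W 0
4: 0.0050W* 6.000ms 14.000ms 4 4 4 4 0.0000W 0.0000W 0
5: 0.0030W* 50.000ms 80.000ms 5 5 5 5 0.0000W 0.0000W 0
"""

# state index, max power, optional '*' flag, entry latency, exit latency
ROW = re.compile(r"^\s*(\d+):\s+([\d.]+)W(\*?)\s+([\d.]+)ms\s+([\d.]+)ms")


def parse_power_states(text):
    """Return one dict per power state row; non-matching lines are skipped."""
    states = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, or blank line
        states.append({
            "state": int(m.group(1)),
            "max_power_w": float(m.group(2)),
            # Assumption: '*' flags a non-operational power state.
            "non_operational": m.group(3) == "*",
            "enter_ms": float(m.group(4)),
            "exit_ms": float(m.group(5)),
        })
    return states


states = parse_power_states(LISTING)
nops = [s["state"] for s in states if s["non_operational"]]
print(nops)  # [3, 4, 5]
```
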

The problem is that only one of the non-operational states (state 3)
works as expected. The other two (states 4-5) push the controller's
power consumption far above that of the operational states (0-2), and
far beyond reasonable. For example, with the controller in state 3 my
system draws about 3-3.5 W at idle (according to acpiconf, with the
laptop power cable unplugged); in states 0-2 it draws about 4 W, and
in states 4-5 it approaches 6 W. As a result, the NVMe becomes the
hottest part of the system (>50C, still idle) and alone eats up
almost half of the battery.
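
To put rough numbers on the "almost half" estimate, the arithmetic can be sketched as follows. This uses the midpoint of the reported 3-3.5 W range as the state 3 baseline and attributes the whole difference to the NVMe device, which is an assumption; the 50 Wh battery capacity is purely hypothetical and only there to show the effect on runtime.

```python
# System draw at idle, in watts, as reported in the post.
idle_state3_w = (3.0 + 3.5) / 2   # midpoint of the reported 3-3.5 W range
idle_states_0_2_w = 4.0
idle_states_4_5_w = 6.0           # "approaching 6 W", taken as an upper bound

# Extra draw in states 4-5 relative to the state 3 baseline.
extra_w = idle_states_4_5_w - idle_state3_w
share = extra_w / idle_states_4_5_w
print(f"{extra_w:.2f} W extra, {share:.0%} of total draw")

# Hypothetical battery capacity, not taken from the post.
BATTERY_WH = 50.0


def runtime_hours(draw_w, capacity_wh=BATTERY_WH):
    """Idle runtime in hours for a constant draw."""
    return capacity_wh / draw_w


for label, draw in [("state 3", idle_state3_w),
                    ("states 0-2", idle_states_0_2_w),
                    ("states 4-5", idle_states_4_5_w)]:
    print(f"{label}: {runtime_hours(draw):.1f} h")
```

Under these assumptions the extra draw comes to 2.75 W, or about 46% of the total, which is consistent with the estimate above.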

Linux doesn't have this issue, so it seems to be nvme(4) related.
All the above data is collected on 14.1-RELEASE Live USB with no
filesystem mounted. 15-CURRENT has the same problem.

Any ideas what it might be?

Regards,
Alexey


--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Warner Losh
2024-09-23 09:47:16 UTC
Post by Alexey Sukhoguzov
Any ideas what it might be?
Does Linux have active power state management?

Warner
Alexey Sukhoguzov
2024-09-23 12:42:47 UTC
Post by Warner Losh
Does Linux have active power state management?
I didn't know there was such a thing, thanks! I booted Linux with the
'pcie_aspm=off' kernel parameter, and dmesg confirmed that ASPM was
disabled, but I saw no difference in temperature or in the
/sys/class/power_supply/BAT0/power_now value; both stayed within
normal bounds.
Post by Warner Losh
And what's the workload?
This machine is my workstation, so I would say there is almost no
workload in terms of IO.
Post by Warner Losh
What performance are you seeing?
I have no other issues with the device, except for power consumption
in states 4 and 5. After enabling APST (I patched the kernel for
this) and limiting idle transitions to state 3 this problem was
mostly resolved as well, but I think it would still be worth finding
out what is actually going on.
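
For readers unfamiliar with APST, here is a rough Python sketch of what "limiting idle transitions to state 3" could look like at the data level. It is not the kernel patch mentioned above, which is not shown in this thread. It assumes the Autonomous Power State Transition table layout from the NVMe base specification (32 entries of 64 bits, idle transition power state in bits 7:3, idle time in milliseconds in bits 31:8). The helper names and the idle_factor heuristic (wait 50 times the combined entry and exit latency) are hypothetical.

```python
import math
import struct

# (is_non_operational, enter_latency_ms, exit_latency_ms) per power state,
# taken from the nvmecontrol listing earlier in the thread.
STATES = [
    (False, 0.0, 0.0),
    (False, 0.0, 0.0),
    (False, 0.0, 0.0),
    (True, 1.5, 1.5),
    (True, 6.0, 14.0),
    (True, 50.0, 80.0),
]


def apst_entry(target_state, idle_ms):
    """Pack one APST entry: target state in bits 7:3, idle time (ms) in bits 31:8."""
    if not 0 <= target_state < 32:
        raise ValueError("power state must fit in 5 bits")
    if not 0 <= idle_ms < (1 << 24):
        raise ValueError("idle time must fit in 24 bits")
    return (idle_ms << 8) | (target_state << 3)


def build_apst_table(states, deepest_allowed, idle_factor=50):
    """Return 32 entries; each state idles into the deepest allowed NOPS state."""
    table = [0] * 32
    last = min(deepest_allowed, len(states) - 1)
    for current in range(len(states)):
        # Non-operational states deeper than `current`, capped by the limit.
        candidates = [s for s in range(current + 1, last + 1) if states[s][0]]
        if not candidates:
            continue  # entry stays 0: no autonomous transition from this state
        target = candidates[-1]
        _, enter_ms, exit_ms = states[target]
        # Hypothetical heuristic: idle for idle_factor x the round-trip latency.
        idle_ms = math.ceil(idle_factor * (enter_ms + exit_ms))
        table[current] = apst_entry(target, idle_ms)
    return table


table = build_apst_table(STATES, deepest_allowed=3)
payload = struct.pack("<32Q", *table)  # 256-byte little-endian data buffer
print([hex(e) for e in table[:4]], len(payload))
```

With deepest_allowed=3, states 0-2 all target state 3 after 150 ms of idle time, and states 3-5 get empty entries, so the controller never drops into states 4 or 5 on its own.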
Post by Warner Losh
And what's the reported model number? What form factor?
Model number is KXG5AZNV256G, form factor is M.2 2280. If necessary,
I can send Identify command output.
Post by Warner Losh
We used these at work years ago, but in only one hw spin.
For production machines idle power consumption may not be relevant
at all, but for a laptop it's a huge difference. And it's good news
if it's just my faulty device and no one else is having the same
problem. No additional work is needed then, which is always
great :)


Warner Losh
2024-09-23 10:10:15 UTC
Post by Warner Losh
Post by Alexey Sukhoguzov
Any ideas what it might be?
Does Linux have active power state management?
And what's the workload? What performance are you seeing? And what's the
reported model number? What form factor? I have an M.2 XG5 hanging around.
We used these at work years ago, but in only one hw spin. We measured no
power difference between the states in that system with our streaming
workload. Their latency also rose sooner than other vendors' drives did,
but not enough to matter so far (the machines they were in are nearing EOL),
and it was only during high write loads, iirc. I never looked at the
temperature since they never got above our limits.

Warner