Discussion:
ACPI? problem with release 8.0
Malcolm Kay
2010-04-10 07:52:43 UTC
My machine had two SATA 300GB drives
(WDC WD3200KS-00PFB0 21.00M21) one carrying FreeBSD RELEASE-6.3
and the other RELEASE-7.0 all of which worked OK.

Recently added SATA 1TB (WDC WD10EADS-00P8B0 01.00A01) and
installed RELEASE 8.0 thereon. When I boot to RELEASE 8.0
I find that after some time, anywhere from a few minutes to considerably
longer, the system just powers down without warning or any obvious cause.
It seems to mostly happen when the system is relatively quiet.

Suspecting the ACPI I added:
hint.acpi.0.disabled=1
to loader.conf.
I then found RELEASE 8.0 would not boot -- or at least
it was unable to mount root. I get a "mountroot>" prompt
but this seemed not to accept anything I could think of,
and "?" to list available targets yielded nothing. Rebooting and
overriding this with option 2 (enable ACPI) in the boot menu
took me back to a bootable but fragile system.

Changing the loader.conf entry to:
debug.acpi.disabled=all
had the same effect as the hint.acpi.0.disabled=1.

I then thought to be somewhat selective with debug.acpi.disabled
and intended to try:
debug.acpi.disabled=acad button cpu lid thermal timer video
only now as I write this I discover I actually entered:
debug.acpi.disabled=acadbutton cpu lid thermal timer video

Now the RELEASE-8.0 booted but remained fragile.

I've repaired this last entry and will proceed to try it.
Meanwhile I feel I am fumbling about in the dark without
sufficient (or any real) knowledge of the range of tasks
performed by ACPI.
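
For anyone hitting the same typo: a small sh sketch that flags any word in a debug.acpi.disabled value that is not a recognised subsystem name. The function name unknown_acpi_names is made up for illustration, and the list of names is only the longer one quoted elsewhere in this thread, so it may not be complete for every release. Note also that loader.conf(5) documents values in double quotes, so a space-separated list is probably safest written as debug.acpi.disabled="acad button cpu".

```sh
#!/bin/sh
# Subsystem names as they appear in the longer debug.acpi.disabled list
# quoted in this thread (an assumption: other releases may accept more).
KNOWN="acad bus children button cmbat cpu ec isa lid pci pci_link sysresource thermal timer video"

# Print each word of "$1" that is not one of the known names.
unknown_acpi_names() {
    for word in $1; do
        case " $KNOWN " in
            *" $word "*) ;;          # recognised, say nothing
            *) echo "$word" ;;       # unrecognised, report it
        esac
    done
}

# The mistyped value from the post: prints "acadbutton".
unknown_acpi_names "acadbutton cpu lid thermal timer video"
```

An empty result means every word was recognised; it says nothing about whether disabling those parts is a good idea.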

Is my guess that I have an interaction problem between ACPI and
RELEASE-8.0 a reasonable one? Where can I go from here?

The system uses a Gigabyte GA-M55SLI-S4 motherboard and the
processor is AMD Athlon(tm) 64 X2 Dual Core Processor 5600+

Please offer suggestions or comments.

Malcolm Kay
Malcolm Kay
2010-04-12 06:01:33 UTC
I desperately need to make some progress on this issue.

Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on the same
machine.

I have now confirmed that:
debug.acpi.disabled=acad button cpu lid thermal timer video
still leaves the system crashing and powering down when idle for
a while. And the more extensive:
debug.acpi.disabled=acad bus children button cmbat cpu ec isa
lid pci pci_link sysresource thermal timer video
does the same.

I don't really need power management but with acpi disabled the
disks are not visible to the system.

Are there sysctl variables that can influence this behaviour?
Currently I believe we have:

hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,

However on the earlier RELEASEs that work I note we do not have
machdep.idle or machdep.idle_available. Instead I find:
machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0

Although I've not been able to relate this directly to my problem,
from Googling it seems that there are some issues with amdc1e under
BSD, Linux and perhaps Windows. But all the references to amd c1e
seem to relate to systems in 64 bit mode, while I am running
(or trying to run) i386, so I wonder why I have:
machdep.idle: amdc1e

Maybe my problem is not acpi as such but this idle mode.

My thought is to change this to
machdep.idle: hlt
or even
machdep.idle: acpi
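
A minimal sh sketch of that change, assuming machdep.idle can be written at run time on this release (I believe it can, but that should be checked on the machine itself). The helper idle_offers is hypothetical; it only confirms that the wanted method is listed in machdep.idle_available, and the sysctl command is printed rather than executed.

```sh
#!/bin/sh
# Succeed if idle method "$2" is listed in "$1", a comma-separated string
# in the style of machdep.idle_available ("spin, amdc1e, hlt, acpi,").
idle_offers() {
    list=$(printf '%s' "$1" | tr -d ' ')
    case ",$list," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Value reported in this post; on the live system one would use
#   available=$(sysctl -n machdep.idle_available)
available="spin, amdc1e, hlt, acpi,"

# Prefer hlt, fall back to acpi, and stop at the first method on offer.
for method in hlt acpi; do
    if idle_offers "$available" "$method"; then
        # Printed rather than executed, so the sketch is harmless to run.
        echo "sysctl machdep.idle=$method"
        break
    fi
done
```

If the run-time change helps, a line such as machdep.idle=hlt in /etc/sysctl.conf should make it persistent across reboots.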

Any comments or ideas please!

Thank you for your attention.

Malcolm Kay
Adam Vande More
2010-04-12 07:10:01 UTC
On Mon, Apr 12, 2010 at 1:01 AM, Malcolm Kay
Post by Malcolm Kay
I desperately need to make some progress on this issue.
Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on the same
machine.
debug.acpi.disabled=acad button cpu lid thermal timer video
still leaves the system crashing and powering down when idle for
debug.acpi.disabled=acad bus children button cmbat cpu ec isa
lid pci pci_link sysresource thermal timer video
does the same.
I don't really need power management but with acpi disabled the
disks are not visible to the system.
Are there sysctl variables that can influence this behaviour?
hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,
However on the earlier RELEASEs that work I note we do not have
machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0
Although I've not been able to relate this directly to my problem,
from Googling it seems that there are some issues with amdc1e under
BSD, Linux and perhaps Windows. But all the references to amd c1e
seem to relate to systems in 64 bit mode, while I am running
(or trying to run) i386, so I wonder why I have:
machdep.idle: amdc1e
Maybe my problem is not acpi as such but this idle mode.
My thought is to change this to
machdep.idle: hlt
or even
machdep.idle: acpi
Any comments or ideas please!
Thank you for your attention.
Is there anything in /var/log/messages which indicates the cause? Can you
monitor cpu temp?
--
Adam Vande More
Malcolm Kay
2010-04-13 04:23:49 UTC
Post by Adam Vande More
On Mon, Apr 12, 2010 at 1:01 AM, Malcolm Kay
Post by Malcolm Kay
I desperately need to make some progress on this issue.
Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on the
same machine.
debug.acpi.disabled=acad button cpu lid thermal timer video
still leaves the system crashing and powering down when idle
debug.acpi.disabled=acad bus children button cmbat cpu ec
isa lid pci pci_link sysresource thermal timer video
does the same.
I don't really need power management but with acpi disabled
the disks are not visible to the system.
Are there sysctl variables that can influence this
hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,
However on the earlier RELEASEs that work I note we do not
machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0
Although I've not been able to relate this directly to my
problem, from Googling it seems that there are some issues with
amdc1e under BSD, Linux and perhaps Windows. But all the
references to amd c1e seem to relate to systems in 64 bit
mode, while I am running (or trying to run) i386, so I wonder
why I have:
machdep.idle: amdc1e
Maybe my problem is not acpi as such but this idle mode.
My thought is to change this to
machdep.idle: hlt
or even
machdep.idle: acpi
Any comments or ideas please!
Thank you for your attention.
Is there anything in /var/log/messages which indicates the
cause? Can you monitor cpu temp?
No clues in messages -- seems to just power down without any
warning.

I don't seem to have any thermal monitoring readily available
except in the BIOS screens -- which seem to indicate everything
is fine. But I guess this is not really indicative of what is
happening with a running system. But the same machine has run
earlier versions of FreeBSD staying up months at a time and only
going down on power failures or on odd occasions I might want
to look at BIOS settings or some such, so I feel fairly
confident it is not a thermal issue.

Hmm, I think there might be a BIOS setting to switch on health
reporting which I expect would show up under sysctl.
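
If a thermal zone does show up, something like the following sh sketch could log it. It assumes the zone is exposed as hw.acpi.thermal.tz0.temperature with readings of the form "45.0C"; both the zone name tz0 and the helper temp_at_least are assumptions for illustration, and the 70 degree limit is an arbitrary example, not a spec value for this CPU.

```sh
#!/bin/sh
# Succeed if a reading such as "45.0C" is at or above the whole-degree limit "$2".
temp_at_least() {
    whole=${1%%.*}     # "45.0C" -> "45"
    whole=${whole%C}   # also copes with a reading like "45C"
    [ "$whole" -ge "$2" ]
}

# Sample reading only; on the live system one would use
#   reading=$(sysctl -n hw.acpi.thermal.tz0.temperature)
reading="45.0C"
if temp_at_least "$reading" 70; then
    echo "hot: $reading"
else
    echo "ok: $reading"
fi
```

Appending each reading to a file every few seconds, followed by sync, would leave the last value on disk even if the machine powers off without warning.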

Thanks for the contribution.

The more I think about it the more I believe the issue is
connected with machdep.idle: amdc1e
I am going to try changing this.

Thanks and regards,

Malcolm
Ian Smith
2010-04-12 18:33:55 UTC
In freebsd-questions Digest, Vol 306, Issue 1, Message: 18
Post by Malcolm Kay
I desperately need to make some progress on this issue.
Then I suggest taking it to freebsd-acpi@ without passing go .. maybe
with a bit more data to hand, as outlined in the ACPI debugging section
of the handbook.
Post by Malcolm Kay
Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on the same
machine.
Sounds like a real issue, but I don't know the hardware. Does it have
the latest available BIOS update? If not, that's step one. Will it
stay up long enough to get a verbose dmesg off it? Do you have a
verbose dmesg from an earlier working release for comparison?
Post by Malcolm Kay
debug.acpi.disabled=acad button cpu lid thermal timer video
still leaves the system crashing and powering down when idle for
debug.acpi.disabled=acad bus children button cmbat cpu ec isa
lid pci pci_link sysresource thermal timer video
does the same.
I don't really need power management but with acpi disabled the
disks are not visible to the system.
ACPI needs to work on modern hardware, no question.
Post by Malcolm Kay
Are there sysctl variables that can influence this behaviour?
hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
May help to set hw.acpi.verbose=1 in /boot/loader.conf while debugging;
especially useful after verbose boot for detail in dmesg and messages.
Post by Malcolm Kay
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
Is that with acpi.thermal disabled? If so, showing hw.acpi and
debug.acpi with everything enabled might provide more clues.
Post by Malcolm Kay
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,
However on the earlier RELEASEs that work I note we do not have
machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0
Although I've not been able to relate this directly to my problem,
from Googling it seems that there are some issues with amdc1e under
BSD, Linux and perhaps Windows. But all the references to amd c1e
seem to relate to systems in 64 bit mode, while I am running
(or trying to run) i386, so I wonder why I have:
machdep.idle: amdc1e
Maybe my problem is not acpi as such but this idle mode.
Could well be. Someone on acpi@ will know about amdc1e, I don't,
but any BIOS setting re C1E could be relevant to this.
Post by Malcolm Kay
My thought is to change this to
machdep.idle: hlt
or even
machdep.idle: acpi
Maybe try setting it to acpi first (without any disabled parts) and see?
It can't do any worse than crash the same.
Post by Malcolm Kay
Any comments or ideas please!
Thank you for your attention.
Malcolm Kay
Post by Malcolm Kay
My machine had two SATA 300GB drives
(WDC WD3200KS-00PFB0 21.00M21) one carrying FreeBSD
RELEASE-6.3 and the other RELEASE-7.0 all of which worked OK.
Recently added SATA 1TB (WDC WD10EADS-00P8B0 01.00A01) and
installed RELEASE 8.0 thereon. When I boot to RELEASE 8.0
I find after some time, few minutes to rather more minutes
the system just powers down without warning or any obvious
cause. It seems to mostly happen when the system is relatively
quiet.
Adam's suggestion to check that temperatures, especially the CPU's, are
within spec is worth following; if you don't have any thermal zones in your
ACPI I'd be surprised, and maybe concerned. A finger on the heatsink is
next best.
Post by Malcolm Kay
Post by Malcolm Kay
hint.acpi.0.disabled=1
to loader.conf.
I then found RELEASE 8.0 would not boot -- or at least
it was unable to mount root. I get a "mountroot>" prompt
but this seemed not to accept anything I could think of,
and "?" to list available targets yielded nothing. Rebooting
and overriding this with option 2 (enable ACPI) in the boot
menu took me back to a bootable but fragile system.
debug.acpi.disabled=all
had the same effect as the hint.acpi.0.disabled=1.
As it should.
Post by Malcolm Kay
Post by Malcolm Kay
I then thought to be somewhat selective with
debug.acpi.disabled=acad button cpu lid thermal timer video
debug.acpi.disabled=acadbutton cpu lid thermal timer video
Now the RELEASE-8.0 booted but remained fragile.
I've repaired this last entry and will proceed to try it.
Meanwhile I feel I am fumbling about in the dark without
sufficient (or any real) knowledge of the range of tasks
performed by ACPI.
Is my guess that I have an interaction problem between ACPI
and RELEASE-8.0 a reasonable one? Where can I go from here?
The system uses a Gigabyte GA-M55SLI-S4 motherboard and the
processor is AMD Athlon(tm) 64 X2 Dual Core Processor 5600+
The last para may hold the primary keys to the solution set ..

cheers, Ian
Malcolm Kay
2010-04-13 05:08:33 UTC
Post by Ian Smith
In freebsd-questions Digest, Vol 306, Issue 1, Message: 18
On Mon, 12 Apr 2010 15:31:33 +0930 Malcolm Kay
Post by Malcolm Kay
I desperately need to make some progress on this issue.
Then I suggest taking it to freebsd-acpi@ without passing go
.. maybe with a bit more data to hand, as outlined in the ACPI
debugging section of the handbook.
Yes, I have now realised this, but I am somewhat reluctant to move
there now and be criticised for cross-posting.
Post by Ian Smith
Post by Malcolm Kay
Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on
the same machine.
Sounds like a real issue, but I don't know the hardware. Does
it have the latest available BIOS update? If not, that's step
one. Will it stay up long enough to get a verbose dmesg off
it? Do you have a verbose dmesg from an earlier working
release for comparison?
Probably not; I have considered it.
But the manufacturer's site warns not to upgrade unless you have
identifiable problems (or something similar).
And since earlier releases work well I'm not anxious to open a new
can of worms. If I become sufficiently desperate I'll try it.
Post by Ian Smith
Post by Malcolm Kay
debug.acpi.disabled=acad button cpu lid thermal timer
video still leaves the system crashing and powering down
debug.acpi.disabled=acad bus children button cmbat cpu ec
isa lid pci pci_link sysresource thermal timer video
does the same.
I don't really need power management but with acpi disabled
the disks are not visible to the system.
ACPI needs to work on modern hardware, no question.
Post by Malcolm Kay
Are there sysctl variables that can influence this
hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
May help to set hw.acpi.verbose=1 in /boot/loader.conf while
debugging; especially useful after verbose boot for detail in
dmesg and messages.
Looks as though it might be useful, but I'm starting to believe
acpi itself may not be the problem
Post by Ian Smith
Post by Malcolm Kay
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
Is that with acpi.thermal disabled?
No, this is run with acpi as default configured.
Boot | login as root | sysctl -a > sysctl.dump | shutdown -p now
(Get out before crash so that I don't get into trouble with
fsck on reboot, yes it runs in the background but takes forever.)

Rebooting in FreeBSD 7.0 I can now mount the 8.0 partitions and
look at the dump in my own time -- and also prepare these
emails. (Fsck also runs under 7.0 on the 8.0 partitions if 8.0
was allowed to crash.)
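
A small sh sketch for picking the relevant lines out of such a dump, so that dumps from 7.0 and 8.0 can be compared side by side. The function name acpi_idle_lines and the temporary sample file are only for illustration; the real input would be the sysctl.dump file saved above.

```sh
#!/bin/sh
# Print only the ACPI and idle-related lines from a saved "sysctl -a" dump.
acpi_idle_lines() {
    grep -E '^(hw\.acpi|debug\.acpi|machdep\.(idle|cpu_idle_hlt|hlt_cpus))' "$1"
}

# Tiny stand-in for a real dump, using values quoted in this thread.
dump=$(mktemp)
cat > "$dump" <<'EOF'
kern.ostype: FreeBSD
hw.acpi.cpu.cx_lowest: C1
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,
EOF

acpi_idle_lines "$dump"   # prints the three hw.acpi / machdep.idle lines
rm -f "$dump"
```

Running the function on one dump per release and feeding the two results to diff(1) would show at a glance which knobs appeared or changed between releases.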
Post by Ian Smith
If so, showing hw.acpi
and debug.acpi with everything enabled might provide more
clues.
OK
Post by Ian Smith
Post by Malcolm Kay
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,
However on the earlier RELEASEs that work I note we do not
have machdep.idle or machdep.idle_available. Instead I
find: machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0
Although I've not been able to relate this directly to my
problem, from Googling it seems that there are some issues with
amdc1e under BSD, Linux and perhaps Windows. But all the
references to amd c1e seem to relate to systems in 64 bit
mode, while I am running (or trying to run) i386, so I wonder
why I have:
machdep.idle: amdc1e
Maybe my problem is not acpi as such but this idle mode.
Could well be. Someone on acpi@ will know about amdc1e, I
don't, but any BIOS setting re C1E could be relevant to this.
Post by Malcolm Kay
My thought is to change this to
machdep.idle: hlt
or even
machdep.idle: acpi
Maybe try setting it to acpi first (without any disabled
parts) and try? Can't do any worse than crash the same?
I think this should be my next task.
I have on hand another machine (not mine) running release 8.0
but using an Intel Core i7 processor. This shows
machdep.idle: acpi
machdep.idle_available: spin, mwait, mwait_hlt, hlt, acpi,
Post by Ian Smith
Post by Malcolm Kay
Any comments or ideas please!
Thank you for your attention.
Malcolm Kay
Post by Malcolm Kay
My machine had two SATA 300GB drives
(WDC WD3200KS-00PFB0 21.00M21) one carrying FreeBSD
RELEASE-6.3 and the other RELEASE-7.0 all of which worked
OK.
Recently added SATA 1TB (WDC WD10EADS-00P8B0 01.00A01)
and installed RELEASE 8.0 thereon. When I boot to RELEASE
8.0 I find after some time, few minutes to rather more
minutes the system just powers down without warning or
any obvious cause. It seems to mostly happen when the
system is relatively quiet.
Adam's suggestion to check that esp. CPU temperature is within
spec is worth checking; if you don't have any thermal zones in
your ACPI I'd be surprised, and maybe concerned. A finger on
the heatsink is next best.
See my response to Adam.
Post by Ian Smith
Post by Malcolm Kay
Post by Malcolm Kay
hint.acpi.0.disabled=1
to loader.conf.
I then found RELEASE 8.0 would not boot -- or at least
it was unable to mount root. I get a "mountroot>" prompt
but this seemed not to accept anything I could think of,
and "?" to list available targets yielded nothing.
Rebooting and overriding this with option 2 (enable ACPI)
in the boot menu took me back to a bootable but fragile
system.
debug.acpi.disabled=all
had the same effect as the hint.acpi.0.disabled=1.
As it should.
I guess so, but I wondered whether 'all' meant all the individually
selectable parts while still leaving some essential parts of acpi
active.
Post by Ian Smith
Post by Malcolm Kay
Post by Malcolm Kay
I then thought to be somewhat selective with
debug.acpi.disabled=acad button cpu lid thermal timer
video only now as I write this I discover I actually
entered: debug.acpi.disabled=acadbutton cpu lid thermal
timer video
Now the RELEASE-8.0 booted but remained fragile.
I've repaired this last entry and will proceed to try it.
Meanwhile I feel I am fumbling about in the dark without
sufficient (or any real) knowledge of the range of tasks
performed by ACPI.
Is my guess that I have an interaction problem between
ACPI and RELEASE-8.0 a reasonable one? Where can I go
from here?
The system uses a Gigabyte GA-M55SLI-S4 motherboard and
the processor is AMD Athlon(tm) 64 X2 Dual Core Processor
5600+
The last para may hold the primary keys to the solution set ..
cheers, Ian
I'll report (for posterity) whether changing machdep.idle works.

Thanks for your attention and thoughts,

Malcolm
