Boot loop after upgrade

Michel D'HOOGE
Hello,

I believe I have the same problem as Duncan...
I'm also using ArchLinux and when I upgraded the kernel from 4.11.9-1
to 4.12.8-2, it stopped booting through Xen. I have a black screen and
the only working option is to power off. I do have the "noreboot"
option set, but because (I guess) of an NVidia Optimus GPU, I can't
see anything. It's also EFI-based.

I can still boot any kernel version if I skip Xen, and I can still
boot through Xen if I use the LTS kernel (4.9.44-1, then 4.9.45-1).

So it must have something to do with the 4.12 version of the kernel,
but it is very hard to debug (and I don't have any serial output on my
laptop). I tried many different configurations, both at Xen and kernel
levels (e.g. acpi=off, earlyprintk=efi) but nothing works.

I'll try to diff the kernel configurations when I have some spare time!

Michel

_______________________________________________
Xen-users mailing list
[hidden email]
https://lists.xen.org/xen-users

Re: Boot loop after upgrade

Roger Pau Monné
On Tue, Aug 29, 2017 at 12:14:11PM +0200, Michel D'HOOGE wrote:

> Hello,
>
> I believe I have the same problem as Duncan...
> I'm also using ArchLinux and when I upgraded the kernel from 4.11.9-1
> to 4.12.8-2, it stopped booting through Xen. I have a black screen and
> the only working option is to power off. I do have the "noreboot"
> option set, but because (I guess) of an NVidia Optimus GPU, I can't
> see anything. It's also EFI-based.
>
> I can still boot any kernel version if I skip Xen, and I can still
> boot through Xen if I use the LTS kernel (4.9.44-1, then 4.9.45-1).
>
> So it must have something to do with the 4.12 version of the kernel,
> but it is very hard to debug (and I don't have any serial output on my
> laptop). I tried many different configurations, both at Xen and kernel
> levels (e.g. acpi=off, earlyprintk=efi) but nothing works.
>
> I'll try to diff the kernel configurations when I have some spare time!

I don't know much about EFI, but it seems 4.12 (and probably older
kernels) has a bug when running on Xen with EFI. The following patch
should fix the bug:

https://lkml.org/lkml/2017/6/23/391

Roger.


Re: Boot loop after upgrade

Michel D'HOOGE
Roger,
Thank you for your link, I'll see how I can test it...

Meanwhile I dug a bit into the ArchLinux kernel packages (if the
problem were in the Linux kernel itself, I believe there'd be more
people complaining!). But this is totally crazy when one doesn't know
what to look for :-(

The packaging history is here:
https://git.archlinux.org/svntogit/packages.git/log/trunk?h=packages/linux
Because Duncan posted his first message on 2017-08-06, I started from
that date and worked backwards to 4.11.9.

@Duncan: Can you check in pacman.log which version you upgraded to?
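
For anyone who wants to answer that question quickly, here is a minimal
sketch that lists kernel upgrades recorded in pacman.log. It assumes the
usual "[timestamp] [ALPM] upgraded <pkg> (<old> -> <new>)" line layout and
the default log path /var/log/pacman.log; the function name
kernel_upgrades is only illustrative.

```python
import re
from pathlib import Path

# Assumed layout of an ALPM upgrade entry. The timestamp format differs
# between pacman releases, so it is captured loosely as "anything in brackets".
UPGRADE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[ALPM\] upgraded (?P<pkg>\S+) "
    r"\((?P<old>\S+) -> (?P<new>\S+)\)$"
)


def kernel_upgrades(lines, package="linux"):
    """Return (timestamp, old_version, new_version) for each upgrade of `package`."""
    found = []
    for line in lines:
        match = UPGRADE_RE.match(line.strip())
        # Compare the package name exactly so "linux-lts" is not mistaken for "linux".
        if match and match.group("pkg") == package:
            found.append((match.group("when"), match.group("old"), match.group("new")))
    return found


if __name__ == "__main__":
    log_path = Path("/var/log/pacman.log")  # assumed default location
    if log_path.exists():
        with log_path.open(encoding="utf-8", errors="replace") as log:
            for when, old, new in kernel_upgrades(log):
                print(f"{when}: linux {old} -> {new}")
```

Passing package="linux-lts" gives the same history for the LTS kernel.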

Michel



2017-07-13  4.12.1-2: FS#54788 KASLR, no SCSI_MQ (unstable)

  -# CONFIG_RANDOMIZE_BASE is not set
  +CONFIG_RANDOMIZE_BASE=y
  +CONFIG_X86_NEED_RELOCS=y
   CONFIG_PHYSICAL_ALIGN=0x1000000
  +CONFIG_RANDOMIZE_MEMORY=y
  +CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa

  -CONFIG_SCSI_MQ_DEFAULT=y
  +# CONFIG_SCSI_MQ_DEFAULT is not set

  -# CONFIG_OPTIMIZE_INLINING is not set
  +CONFIG_OPTIMIZE_INLINING=y


2017-07-08  4.12-2: more modules (FS#54603)
  This one has a lot of config changes, but none with "XEN" in their names.


2017-07-05  4.12
   Also a lot of changes, with some XEN-related:

   CONFIG_PARAVIRT_SPINLOCKS=y
   # CONFIG_QUEUED_LOCK_STAT is not set
   CONFIG_XEN=y
  +CONFIG_XEN_PV=y
  +CONFIG_XEN_PV_SMP=y
   CONFIG_XEN_DOM0=y
   CONFIG_XEN_PVHVM=y
  +CONFIG_XEN_PVHVM_SMP=y
   CONFIG_XEN_512GB=y
   CONFIG_XEN_SAVE_RESTORE=y
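
The comparison above was done by reading the packaging commits by hand.
A rough way to automate it, assuming two kernel config files are
available as text, is sketched below. The helper names parse_config and
diff_configs are hypothetical, and the "(absent)" marker is just a
convention chosen for this sketch.

```python
def parse_config(text):
    """Map each CONFIG_ option to its value; '# ... is not set' lines map to 'n'."""
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old_text, new_text, keyword=""):
    """Return sorted (option, before, after) tuples for options whose value changed.

    Only options whose name contains `keyword` are reported; the empty
    default keyword matches every option.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        if keyword not in name:
            continue
        before = old.get(name, "(absent)")
        after = new.get(name, "(absent)")
        if before != after:
            changes.append((name, before, after))
    return changes
```

Calling diff_configs(old, new, keyword="XEN") restricts the output to
options with "XEN" in their names, which is the filter used informally
above.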


Re: Boot loop after upgrade

John Thomson
Hi Michel,

On Tue, 29 Aug 2017, at 22:52, Michel D'HOOGE wrote:
> if the problem was in the Linux kernel, I believe there'd be more people complaining!

There does seem to be a Linux 4.12 kernel bug caused by
CONFIG_INTEL_ATOMISP:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711298

I tested an Arch Linux PV domU guest:
fails to boot with 4.12.3-1-ARCH, 4.12.8-2-ARCH, 4.12.10-1-ARCH
4.12.10-1-ARCH built with CONFIG_INTEL_ATOMISP=n boots and works

The Arch Linux 201707 install disk uses 4.11 and boots
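
To see whether a given kernel was built with the option, something like
the following sketch may help. It assumes the running kernel exposes its
configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC);
the function name option_state is only illustrative.

```python
import gzip
from pathlib import Path


def option_state(config_text, option):
    """Return the option's value ('y', 'm', ...), 'n' if explicitly unset, None if absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "n"
        # Require the "=" so that a longer option name with the same prefix does not match.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    proc_config = Path("/proc/config.gz")  # only present with CONFIG_IKCONFIG_PROC
    if proc_config.exists():
        with gzip.open(proc_config, "rt", encoding="utf-8") as handle:
            state = option_state(handle.read(), "CONFIG_INTEL_ATOMISP")
        print("CONFIG_INTEL_ATOMISP:", state)
```

A result of "n" or None would correspond to the rebuilt kernel that
booted in the test above.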

John


Re: Boot loop after upgrade

Michel D'HOOGE
Hi John,

Many thanks for your answer!
I recompiled a kernel with CONFIG_INTEL_ATOMISP=n and now I can boot
version 4.12.8.

So the problem is now identified... but I'm not sure where to file a
bug report. I guess it'll be addressed globally but maybe I can ask
the Arch maintainer to change that precise option.

Thanks again
--
Michel


Re: Boot loop after upgrade

Michel D'HOOGE
This is getting weirder!
This morning, the screen was black again when using the configuration
that worked yesterday...

I tried to just boot into the plain 4.12.8 kernel, and reboot when it
asked for the cryptroot password. But that wasn't enough.

So I did the same as yesterday: fully boot into Arch 4.12.8 and then
reboot into my recompiled kernel through Xen. And that works.
I'll try different scenarios and keep you informed of the results.

:)
Michel


Re: Boot loop after upgrade

John Thomson
Hi Michel,

On Fri, 1 Sep 2017, at 17:57, Michel D'HOOGE wrote:
> the screen was black again when using the configuration that worked yesterday
> So I did like yesterday: fully boot into Arch 4.12.8 and then reboot
> into my recompiled kernel through xen. And that works.

I also currently have this problem, but have not looked into it.
It is separate from the Arch Linux 4.12 Xen PV CONFIG_INTEL_ATOMISP issue.

I boot using UEFI -> GRUB multiboot2 -> xen on Arch Linux 4.12.10 dom0
From a cold boot Xen dom0 does not start.
If I boot UEFI -> GRUB multiboot2 -> Arch Linux 4.12.10 without xen
Then reboot, Xen on Arch Linux 4.12.10 dom0 works.

John


Re: Boot loop after upgrade

Michel D'HOOGE
> I boot using UEFI -> GRUB multiboot2 -> xen on Arch Linux 4.12.10 dom0
> From a cold boot Xen dom0 does not start.
> If I boot UEFI -> GRUB multiboot2 -> Arch Linux 4.12.10 without xen
> Then reboot, Xen on Arch Linux 4.12.10 dom0 works.

I'm quite sure I already tried that kind of sequence with kernel
4.12.8, but without success.
My sequence is UEFI -> systemd -> xen -> arch, with an NVidia Optimus Laptop.
Maybe it has something to do with the bbswitch kernel module (because
I switched to the dkms package more or less at the same time, to have
it with the LTS kernel).

However, now that I know how to boot successfully, I'll do more
accurate experiments from cold start.
:-)

M.


Re: Boot loop after upgrade

Michel D'HOOGE
Hi,

This morning, I directly tried a cold boot with Linux version 4.13.4-1 and that worked. So it seems the problem has been somewhat solved!

For those who forgot what the problem was: after a kernel upgrade, it was impossible to cold-boot directly into Xen. I first had to boot into plain Linux and then reboot into Xen. I suspect the Nvidia Optimus hardware is the culprit, but I have no evidence so far.

Michel
