Crashes on image file backed VMs

Andre Fucs
Hi

I am running Xen 4.5.1 (built on Arch from the AUR) and noticed that all of my Linux and Windows VMs backed by image files (qcow2, raw, you name it) crash after a few seconds to a few minutes:

sudo coredumpctl info 2756
           PID: 2756 (qemu-system-i38)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 7 (BUS)
     Timestamp: Fri 2015-12-18 21:44:25 AEDT (12min ago)
  Command Line: /usr/lib/xen/bin/qemu-system-i386 -xen-domid 6 -chardev socket,id=libxl-cmd,path=/run/xen/qmp-libxl-6,server,nowait -mon chardev=libxl-cmd,mode=control -nodefaults -name test.hvm -vnc 127.0.0.1:0,to=99 -display none -serial pty -device cirrus-vga,vgamem_mb=8 -boot order=cd -smp 2,maxcpus=2 -device rtl8139,id=nic0,netdev=net0,mac=00:aa:aa:aa:aa:aa -netdev type=tap,id=net0,ifname=vif6.0-emu,script=no,downscript=no -machine xenfv -m 760 -drive file=/mnt/test.raw,if=ide,index=0,media=disk,format=raw,cache=writeback
    Executable: /usr/lib/xen/bin/qemu-system-i386
 Control Group: /user.slice/user-1000.slice/session-c2.scope
          Unit: session-c2.scope
         Slice: user-1000.slice
       Session: c2
     Owner UID: 1000 (xafucs)
       Boot ID: 923836bc18c248a084ed286a6c9c7d5d
    Machine ID: 0812fdb213c0401e8e53e442256a04e0
      Hostname: xdemo
       Message: Process 2756 (qemu-system-i38) of user 0 dumped core.

                Stack trace of thread 2756:
                #0  0x000055b2a1e155d3 blk_handle_requests (qemu-system-i386)
                #1  0x000055b2a1d99fb3 aio_bh_poll (qemu-system-i386)
                #2  0x000055b2a1d99bdc aio_poll (qemu-system-i386)
                #3  0x000055b2a1d99de3 aio_ctx_dispatch (qemu-system-i386)
                #4  0x00007fef67d30dc7 g_main_context_dispatch (libglib-2.0.so.0)
                #5  0x000055b2a1eba35e glib_pollfds_poll (qemu-system-i386)
                #6  0x000055b2a1f2bab2 main_loop (qemu-system-i386)
                #7  0x00007fef6420f610 __libc_start_main (libc.so.6)
                #8  0x000055b2a1d997b9 _start (qemu-system-i386)

                Stack trace of thread 2761:
                #0  0x00007fef6607faf7 do_sigwait (libpthread.so.0)
                #1  0x00007fef6607fb6d sigwait (libpthread.so.0)
                #2  0x000055b2a1f30fcc qemu_dummy_cpu_thread_fn (qemu-system-i386)
                #3  0x00007fef660764a4 start_thread (libpthread.so.0)
                #4  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2762:
                #0  0x00007fef6607faf7 do_sigwait (libpthread.so.0)
                #1  0x00007fef6607fb6d sigwait (libpthread.so.0)
                #2  0x000055b2a1f30fcc qemu_dummy_cpu_thread_fn (qemu-system-i386)
                #3  0x00007fef660764a4 start_thread (libpthread.so.0)
                #4  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2964:
                #0  0x00007fef6607e4a5 do_futex_wait (libpthread.so.0)
                #1  0x00007fef6607e56f __new_sem_wait_slow (libpthread.so.0)
                #2  0x00007fef6607e622 sem_timedwait (libpthread.so.0)
                #3  0x000055b2a200341e qemu_sem_timedwait (qemu-system-i386)
                #4  0x000055b2a1f02a8c worker_thread (qemu-system-i386)
                #5  0x00007fef660764a4 start_thread (libpthread.so.0)
                #6  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2963:
                #0  0x00007fef6607e4a5 do_futex_wait (libpthread.so.0)
                #1  0x00007fef6607e56f __new_sem_wait_slow (libpthread.so.0)
                #2  0x00007fef6607e622 sem_timedwait (libpthread.so.0)
                #3  0x000055b2a200341e qemu_sem_timedwait (qemu-system-i386)
                #4  0x000055b2a1f02a8c worker_thread (qemu-system-i386)
                #5  0x00007fef660764a4 start_thread (libpthread.so.0)
                #6  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2758:
                #0  0x00007fef6607efad read (libpthread.so.0)
                #1  0x00007fef65b2655a read_all (libxenstore.so.3.0)
                #2  0x00007fef65b26676 read_message (libxenstore.so.3.0)
                #3  0x00007fef65b26f86 read_thread (libxenstore.so.3.0)
                #4  0x00007fef660764a4 start_thread (libpthread.so.0)
                #5  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2961:
                #0  0x00007fef6607e4a5 do_futex_wait (libpthread.so.0)
                #1  0x00007fef6607e56f __new_sem_wait_slow (libpthread.so.0)
                #2  0x00007fef6607e622 sem_timedwait (libpthread.so.0)
                #3  0x000055b2a200341e qemu_sem_timedwait (qemu-system-i386)
                #4  0x000055b2a1f02a8c worker_thread (qemu-system-i386)
                #5  0x00007fef660764a4 start_thread (libpthread.so.0)
                #6  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2769:
                #0  0x00007fef6607c07f pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x000055b2a20032d4 qemu_cond_wait (qemu-system-i386)
                #2  0x000055b2a1f1c587 vnc_worker_thread_loop (qemu-system-i386)
                #3  0x000055b2a1f1c967 vnc_worker_thread (qemu-system-i386)
                #4  0x00007fef660764a4 start_thread (libpthread.so.0)
                #5  0x00007fef642d813d __clone (libc.so.6)

                Stack trace of thread 2958:
                #0  0x00007fef6607e4a5 do_futex_wait (libpthread.so.0)
                #1  0x00007fef6607e56f __new_sem_wait_slow (libpthread.so.0)
                #2  0x00007fef6607e622 sem_timedwait (libpthread.so.0)
                #3  0x000055b2a200341e qemu_sem_timedwait (qemu-system-i386)
                #4  0x000055b2a1f02a8c worker_thread (qemu-system-i386)
                #5  0x00007fef660764a4 start_thread (libpthread.so.0)
                #6  0x00007fef642d813d __clone (libc.so.6)


Has anyone seen something like that?

_______________________________________________
Xen-users mailing list
[hidden email]
http://lists.xen.org/xen-users

Re: Crashes on image file backed VMs

Ian Campbell-10
On Fri, 2015-12-18 at 11:02 +0000, Andre Fucs wrote:

>                 Stack trace of thread 2756:
>                 #0  0x000055b2a1e155d3 blk_handle_requests (qemu-system-i386)
>                 #1  0x000055b2a1d99fb3 aio_bh_poll (qemu-system-i386)
>                 #2  0x000055b2a1d99bdc aio_poll (qemu-system-i386)
>                 #3  0x000055b2a1d99de3 aio_ctx_dispatch (qemu-system-i386)
>                 #4  0x00007fef67d30dc7 g_main_context_dispatch (libglib-2.0.so.0)
>                 #5  0x000055b2a1eba35e glib_pollfds_poll (qemu-system-i386)
>                 #6  0x000055b2a1f2bab2 main_loop (qemu-system-i386)
>                 #7  0x00007fef6420f610 __libc_start_main (libc.so.6)
>                 #8  0x000055b2a1d997b9 _start (qemu-system-i386)

This thread looks like the culprit (the rest are all basically idling).

If you run gdb on the binary + core combination you ought to be able to get
more info, like line numbers and perhaps even local variable states which
would maybe point towards the culprit. I don't know if you will also need
to install any debug symbols packages on Arch or not.
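
For readers following along, here is a minimal sketch of that gdb step for a core captured by systemd-coredump, as in the report above. The helper name core_backtrace is made up for illustration, and the default executable path is simply the one shown in the coredumpctl output. Without debug symbols the backtrace will not include line numbers, and reading the core may require root.

```shell
# Hypothetical helper: dump the core for one PID and print backtraces.
# $1: PID as recorded by systemd-coredump
# $2: path of the crashed executable (assumed default taken from the report)
core_backtrace() {
    pid=$1
    exe=${2:-/usr/lib/xen/bin/qemu-system-i386}
    core=$(mktemp) || return 1

    # Write the stored core to a temporary file, then let gdb run
    # non-interactively: "bt full" includes local variables of each frame.
    coredumpctl dump "$pid" --output="$core" &&
        gdb -batch -ex 'bt full' -ex 'thread apply all bt' "$exe" "$core"

    rm -f "$core"
}
```

Calling `core_backtrace 2756` would print the frames of the crashing thread with their locals, followed by the stacks of all other threads.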

On the face of it, it doesn't seem to be Xen specific, i.e. it looks like a
QEMU issue (albeit in the version of QEMU shipped along with the Xen
release).

> Has anyone seen something like that?

FWIW I don't recall any similar reports and our automated tests include
qcow and they seem pretty happy on the 4.5 branch:
http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qcow2/xen-4.5-testing.html
(the failures at the bottom, resolved in flight 62168, were a machine
specific timeout, not a crash like this one)

Those tests are passing on that branch after bbbd29a25d09 AKA
RELEASE-4.5.2~37, which does suggest you might want to try updating from
4.5.1, just in case. (These tests are new enough not to have tested 4.5.1.)

Ian.

Re: Crashes on image file backed VMs

Andre Fucs
Ian,

Thanks for the reply.

coredumpctl gdb PID_OF_PROCESS printed the following:


GNU gdb (GDB) 7.10.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/xen/bin/qemu-system-i386...done.

warning: core file may not match specified executable file.
[New LWP 10134]
[New LWP 10282]
[New LWP 10147]
[New LWP 10295]
[New LWP 10138]
[New LWP 10285]
[New LWP 10284]
[New LWP 10279]
[New LWP 10146]
[New LWP 10281]
[New LWP 10301]
[New LWP 10307]
[New LWP 10292]
[New LWP 10293]
[New LWP 10306]
[New LWP 10286]
[New LWP 10308]
[New LWP 10304]
[New LWP 10300]
[New LWP 10287]
[New LWP 10303]
[New LWP 10290]
[New LWP 10289]
[New LWP 10291]
[New LWP 10288]
[New LWP 10298]
[New LWP 10294]
[New LWP 10302]
[New LWP 10299]
[New LWP 10296]
[New LWP 10280]
[New LWP 10278]
[New LWP 10136]
[New LWP 10140]
[New LWP 10283]
[New LWP 10297]
[New LWP 10305]

warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/lib/xen/bin/qemu-system-i386 -xen-domid 11 -chardev socket,id=libxl-cmd,pa'.
Program terminated with signal SIGBUS, Bus error.
#0  0x000055b276a618fd in blk_handle_requests (blkdev=0x55b277c49ab0)
    at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/hw/block/xen_disk.c:699
699     rp = blkdev->rings.common.sring->req_prod;
[Current thread is 1 (Thread 0x7f68ed0909c0 (LWP 10134))]
(gdb) bt
#0  0x000055b276a618fd in blk_handle_requests (blkdev=0x55b277c49ab0) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/hw/block/xen_disk.c:699
#1  blk_bh (opaque=0x55b277c49ab0) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/hw/block/xen_disk.c:738
#2  0x000055b2769e62d3 in aio_bh_poll (ctx=ctx@entry=0x55b277c34600) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/async.c:81
#3  0x000055b2769e5efc in aio_poll (ctx=0x55b277c34600, blocking=blocking@entry=false) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/aio-posix.c:188
#4  0x000055b2769e6103 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
    at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/async.c:211
#5  0x00007f68ebf98dc7 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#6  0x000055b276b066b1 in glib_pollfds_poll () at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/main-loop.c:190
#7  os_host_main_loop_wait (timeout=<optimized out>) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/main-loop.c:235
#8  main_loop_wait (nonblocking=<optimized out>) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/main-loop.c:484
#9  0x000055b276b77e09 in main_loop () at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/vl.c:2056
#10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/vl.c:4535
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f68ed0909c0 (LWP 10134))]
#0  0x000055b276a618fd in blk_handle_requests (blkdev=0x55b277c49ab0) at /home/xafucs/xen/src/xen-4.5.1/tools/qemu-xen/hw/block/xen_disk.c:699
699     rp = blkdev->rings.common.sring->req_prod;



Re: Crashes on image file backed VMs

Ian Campbell-10
On Sat, 2015-12-19 at 11:18 +0000, Andre Fucs wrote:
> Ian,
>
> Thanks for the reply.
>
> coredumpctl gdb PID_OF_PROCESS printed the following:

A SIGBUS in QEMU's blk_handle_requests, how exciting!

Copying xen-devel and some relevant maintainers. The start of the thread is
http://lists.xen.org/archives/html/xen-users/2015-12/msg00073.html
From that, this is Xen 4.5.1 on Arch Linux.

Andre, could you take a look under /var/log/xen for logs relating to a
domain to which this has happened, in particular the qemu log? It would
also be useful to see the corresponding guest cfg file, I expect.
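
One possible way to gather those logs is sketched below. It assumes the log files carry the guest name in their file names (libxl typically writes files such as qemu-dm-<name>.log and xl-<name>.log, but check what is actually present on the host). The helper name show_guest_logs is made up for illustration.

```shell
# Hypothetical helper: print the tail of every Xen log belonging to one guest.
# $1: guest name (test.hvm in the report above)
# $2: log directory (defaults to /var/log/xen)
show_guest_logs() {
    name=$1
    logdir=${2:-/var/log/xen}

    for f in "$logdir"/*"$name"*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        printf '==> %s <==\n' "$f"
        tail -n 50 "$f"
    done
}
```

Running `show_guest_logs test.hvm` right after a crash should show the last lines of the device model log together with any other logs for that guest.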

Ian.

Re: Crashes on image file backed VMs

Stefano Stabellini-3
On Mon, 4 Jan 2016, Ian Campbell wrote:

> On Sat, 2015-12-19 at 11:18 +0000, Andre Fucs wrote:
> > Ian,
> >
> > Thanks for the reply.
> >
> > coredumpctl gdb PID_OF_PROCESS printed the following:
>
> A SIGBUS in QEMU's blk_handle_requests, how exciting!
>
> Copying xen-devel and some relevant maintainers, start of thread is
> http://lists.xen.org/archives/html/xen-users/2015-12/msg00073.html
> From that, this is Xen 4.5.1 on arch Linux.
>
> Andre, could you take a look under /var/log/xen for logs relating to a
> domain to which this has happened, in particular the qemu log. It would
> also be useful to see the corresponding guest cfg file I expect.

It's good that you can reproduce the bug.

What kernel are you using in Dom0?
What underlying storage are you using for the guest VMs (local disk,
nfs, iscsi, etc)?
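
A small sketch of how both answers could be collected in Dom0 follows. The helper name dom0_report is made up for illustration, and `df -T` is the GNU coreutils option that adds a filesystem type column, so it may behave differently with other df implementations.

```shell
# Hypothetical helper: report the running kernel and the filesystem
# that holds a given disk image.
# $1: path to the guest image (e.g. /mnt/test.raw from the QEMU command line)
dom0_report() {
    img=$1
    printf 'kernel: %s\n' "$(uname -r)"
    # The Type column shows ext4, nfs, etc. for the mount containing the image.
    df -T "$img"
}
```

For the guest in the original report this would be `dom0_report /mnt/test.raw`.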

Can you reproduce the bug with a more recent QEMU? For example:

git://git.qemu.org/qemu.git v2.5.0

There is no need to update Xen for this test; just compile QEMU separately. I
just do:

./configure --enable-xen --target-list=i386-softmmu --disable-kvm
make
cp i386-softmmu/qemu-system-i386 /usr/lib/xen/bin
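
Put together, the whole test could look roughly like the sketch below. The clone step, the checkout directory name, the backup copy of the packaged binary and the function name build_qemu_for_xen are additions for illustration and are not part of the commands above. The final copy needs write access to /usr/lib/xen/bin.

```shell
# Hypothetical wrapper around the build steps quoted above.
# $1: QEMU tag to build (defaults to v2.5.0 as suggested)
# The body runs in a subshell so the caller's working directory is unchanged.
build_qemu_for_xen() (
    tag=${1:-v2.5.0}
    xenbin=/usr/lib/xen/bin

    # Fetch only the requested tag from the repository named above.
    git clone --branch "$tag" --depth 1 git://git.qemu.org/qemu.git "qemu-$tag" &&
    cd "qemu-$tag" &&
    ./configure --enable-xen --target-list=i386-softmmu --disable-kvm &&
    make &&
    # Assumption: keep the packaged binary so it can be restored later.
    cp "$xenbin/qemu-system-i386" "$xenbin/qemu-system-i386.orig" &&
    cp i386-softmmu/qemu-system-i386 "$xenbin"
)
```

Restoring the original device model afterwards is then a matter of copying qemu-system-i386.orig back over qemu-system-i386.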

Thanks for your help!

