heavy write crashing, was: process limit

heavy write crashing, was: process limit

Tim Freeman
On Fri, 06 Jan 2006 15:51:18 -0600
Charles Duffy <[hidden email]> wrote:

> Anand wrote:
> > Yes, it makes sense; however, I was wondering whether something like this
> > is possible or not. Let's say a heavy disk I/O process goes wild, keeps
> > on writing to disk and spawns multiple processes: the dom0 can come to a
> > grinding halt. (For that matter, there is no way to do disk I/O
> > scheduling like CPU scheduling :( )
>
> IIRC, disk I/O scheduling is a TODO for Xen and should be supported in
> the future.
>
> That said, have you actually seen this case (where the Dom0 comes to a
> complete halt, I/O blocked on account of disk usage by the DomUs)?

You can experience some nasty hangs with Xen 3.0.0 when running certain stress
invocations in domU, particularly with the --hdd option: "spawn N workers
spinning on write()/unlink()" (default is 1GB write()).

This is just some casual experience:

In a loopback-mounted domU, I ran "stress -v --hdd 1" and it almost
immediately killed the machine.
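
For readers unfamiliar with the workload, here is a rough Python sketch of the
write()/unlink() pattern that the --hdd option is described as running. It is
not the source of stress itself: the function name hdd_worker and its
parameters are made up for illustration, and the only figure taken from the
thread is the 1GB default write size.

```python
import os
import tempfile


def hdd_worker(directory, total_bytes=1 << 30, chunk_bytes=1 << 20, iterations=1):
    """Illustrative stand-in for one --hdd worker (hypothetical helper).

    Each iteration writes total_bytes to a temporary file in `directory`
    in chunks of chunk_bytes, then unlinks the file.
    Returns the total number of bytes written.
    """
    chunk = b"\0" * chunk_bytes
    written_total = 0
    for _ in range(iterations):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            remaining = total_bytes
            while remaining > 0:
                # os.write may write fewer bytes than requested, so track the count.
                n = os.write(fd, chunk[:min(chunk_bytes, remaining)])
                remaining -= n
                written_total += n
        finally:
            os.close(fd)
            os.unlink(path)
    return written_total


if __name__ == "__main__":
    # Deliberately tiny sizes so the sketch is safe to try anywhere.
    with tempfile.TemporaryDirectory() as scratch:
        print(hdd_worker(scratch, total_bytes=4 << 20, iterations=2))
```

The real tool keeps spinning until it is stopped; the sketch uses a bounded
iteration count so that it cannot take a machine down by accident.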

I increased dom0's RAM to try to help dom0's kernel with the writing, and after
giving dom0 a whole gigabyte, it lasted longer but still locked up.

I gave dom0 much more CPU (almost half), kept its high RAM allocation, and it
did a lot better in this situation, lasted a lot longer, but eventually the
system became unresponsive.  Here, console toggling from the keyboard of the
real machine still worked up until the time my patience ran out...

Tim


_______________________________________________
Xen-users mailing list
[hidden email]
http://lists.xensource.com/xen-users

Re: heavy write crashing, was: process limit

Anand Gupta
On 1/12/06, Tim Freeman <[hidden email]> wrote:
> You can experience some nasty hangs with Xen 3.0.0 when running certain stress
> invocations in domU, particularly with the --hdd option: "spawn N workers
> spinning on write()/unlink()" (default is 1GB write()).
>
> This is just some casual experience:
>
> In a loopback mounted domU, I ran "stress -v --hdd 1" and it will almost
> immediately kill the machine.
>
> I increased dom0's RAM to try and help dom0's kernel with the writing and after
> giving dom0 a whole gigabyte, it lasted longer but still locks up.
>
> I gave dom0 much more CPU (almost half), kept its high RAM allocation, and it
> did a lot better in this situation, lasted a lot longer, but eventually the
> system became unresponsive.  Here, console toggling from the keyboard of the
> real machine still worked up until the time my patience ran out...


So my guess wasn't wrong after all: a process doing heavy I/O can bring the entire machine down, even with CPU scheduling in place. This could be prevented if we could somehow manage disk I/O between the domains.

--

regards,

Anand

Re: heavy write crashing, was: process limit

Tim Freeman
On Sat, 14 Jan 2006 03:01:13 +0530
Anand <[hidden email]> wrote:

> On 1/12/06, Tim Freeman <[hidden email]> wrote:
> >
> > You can experience some nasty hangs with Xen 3.0.0 when running certain
> > stress
> > invocations in domU, particularly with the --hdd option: "spawn N workers
> > spinning on write()/unlink()" (default is 1GB write()).
> >
> > This is just some casual experience:
> >
> > In a loopback mounted domU, I ran "stress -v --hdd 1" and it will almost
> > immediately kill the machine.
> >
> > I increased dom0's RAM to try and help dom0's kernel with the writing and
> > after
> > giving dom0 a whole gigabyte, it lasted longer but still locks up.
> >
> > I gave dom0 much more CPU (almost half), kept its high RAM allocation, and
> > it
> > did a lot better in this situation, lasted a lot longer, but eventually
> > the
> > system became unresponsive.  Here, console toggling from the keyboard of
> > the
> > real machine still worked up until the time my patience ran out...
> >
> >
> So my guess wasn't wrong after all, a process which tries to do heavy IO
> even with cpu scheduling could cause the entire machine to go down. This
> could be prevented if we could somehow do disk io management between the
> domains.

Yes, the disk I/O scheduling entry on the Xen roadmap is very interesting to us!

http://www.cl.cam.ac.uk/Research/SRG/netos/xen/roadmap.html

It's listed under the "Enhanced QoS features" bullet.  I wonder if "providing
better tools" means an implementation from scratch or a mix of current tools
integrated with Xen?  

Has there been much thinking on any directions this is going to go in?

Thanks!
Tim





Re: heavy write crashing, was: process limit

Anand Gupta
On 1/14/06, Tim Freeman <[hidden email]> wrote:
> Yes, the disk I/O scheduling entry on the Xen roadmap is very interesting to us!
>
> http://www.cl.cam.ac.uk/Research/SRG/netos/xen/roadmap.html
>
> It's listed under the "Enhanced QoS features" bullet.  I wonder if "providing
> better tools" means an implementation from scratch or a mix of current tools
> integrated with Xen?

I read that as well. This is really needed, since any I/O-intensive process can create havoc for the entire host and the other domUs.

I noticed that the domU boot messages contain information about I/O schedulers; however, they don't seem to be working as of now. You can pass elevator=io_scheduler_name to the kernel to change the I/O scheduler (I posted this information in another message to the list).
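
One way to check which scheduler a kernel actually picked after booting with elevator=... is to read the per-device sysfs file, where the active scheduler is shown in square brackets. The sketch below is only an illustration: it assumes a 2.6 kernel with sysfs mounted at /sys, the device name "sda" is a placeholder (a domU may name its disks differently), and the function names are made up here.

```python
def parse_schedulers(text):
    """Parse the contents of a queue/scheduler sysfs file.

    The kernel lists the available schedulers on one line and marks the
    active one with square brackets, e.g. "noop anticipatory deadline [cfq]".
    Returns (active, available); active is None if nothing is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_schedulers(device="sda"):
    """Read and parse the scheduler list for a block device (placeholder name)."""
    with open("/sys/block/%s/queue/scheduler" % device) as f:
        return parse_schedulers(f.read())


if __name__ == "__main__":
    # Sample line only; on a real system call read_schedulers() instead.
    print(parse_schedulers("noop anticipatory deadline [cfq]\n"))
```

This only reports what the guest kernel selected; it says nothing about whether that scheduler has any effect on how dom0 services the requests.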

My guess is that they should use the Linux I/O scheduler code directly, since it is already available. On top of that, more features could be added to help restrict I/O on some domains.

--

regards,

Anand

Re: heavy write crashing, was: process limit

Tim Freeman
On Sat, 14 Jan 2006 03:12:45 +0530
Anand <[hidden email]> wrote:

> On 1/14/06, Tim Freeman <[hidden email]> wrote:
> >
> > Yes, the disk I/O scheduling entry on the Xen roadmap is very interesting
> > to us!
> >
> > http://www.cl.cam.ac.uk/Research/SRG/netos/xen/roadmap.html
> >
> > It's listed under the "Enhanced QoS features" bullet.  I wonder if
> > "providing
> > better tools" means an implementation from scratch or a mix of current
> > tools
> > integrated with Xen?
>
>
> I read that as well. This is something which is really required since any IO
> intensive process can create havoc for the entire host and other domU's.

That applies if you are using loopback images, which means that for many
datacenters and deployments this is not so much of an issue.  For us and grid
computing it is more of an issue, unless a site has network filesystems that
are faster than local disk (a situation that is not that hard to come by).

Tim

