[Xen-devel] Essay on an important Xen decision (long)


[Xen-devel] Essay on an important Xen decision (long)

Dan Magenheimer
A fundamental architectural decision has to be made for
Xen regarding handling of physical/machine memory; at a high
level, the question is:

        Should Xen drivers be made more flexible to accommodate
        different approaches to managing physical memory, or
        should other architectures be required to conform to
        the Xen/x86 model?

A more detailed description of the specific decision is below.
The Xen/ia64 community would like to make this decision soon --
possibly at the Xen summit -- as next steps of Xen/ia64
functionality are significantly affected.  Since either choice
has an impact on common code and on future Xen architecture,
this decision must involve core Xen developers and the broader
Xen community rather than just Xen/ia64 developers.

While this may seem to be a trivial matter, such fundamental
choices often have a way of pre-selecting future design and
implementation directions that can have major negative or positive
impacts -- possibly unexpected -- on different parties.  For example,
a decision might make a Xen developer's life easier but create
headaches for a distro or a Linux maintainer.  If nothing else,
discussing fundamental decision points often helps to
bring out and codify/document hidden assumptions about
the future.

This is a lengthy document but I hope to touch on most of
the various issues and tradeoffs.  Understanding -- or, at
a minimum, reading -- this document should probably be
a prerequisite for involvement in discussions to resolve this.
I would encourage all readers to give the issues and tradeoffs
some thought as the "obvious x86" answer may not be the best
answer for the future of Xen.

First a little terminology and background:

In a virtualized environment, the resources of the physical
machine must be subdivided and/or shared between multiple virtual
machines.  Just as an OS manages memory for its applications, one of
the primary roles of a hypervisor is to provide the illusion to
each guest OS that it owns some amount of "RAM" in the system.
Thus there are two kinds of physical memory addresses: the
addresses that a guest believes to be physical addresses and
the addresses that actually refer to RAM (e.g. bus addresses).
The literature (and Xen) confusingly labels these as "physical"
addresses and "machine" addresses.  In a virtualized environment,
there must be some way of maintaining the relationship -- or
"mapping" -- between physical addresses and machine addresses.

In Xen (across all architectures), there are currently three
different approaches for mapping physical addresses to machine
addresses:

1) P==M: The guest is given a subset of machine memory that it
   can access "directly".  Accesses to machine memory addresses
   outside of this range must somehow be restricted (but not
   necessarily disallowed) by Xen.

2) guest-aware p!=m (P2M): The guest is given max_pages of
   contiguous physical memory starting at zero and the knowledge
   that physical addresses are different than machine addresses.
   The guest must understand the difference between a physical
   address and a machine address and utilize the correct one in
   different situations.

3) virtual physical (VP): The guest is given max_pages of
   contiguous physical memory starting at zero.  Xen provides
   the illusion to the guest that this is machine memory;
   any physical-to-machine translation required for functional
   correctness is handled invisibly by Xen.  VP cannot be used
   by guests that directly program DMA-based I/O devices
   because a DMA device requires a machine address and, by
   definition, the guest knows only about physical addresses.

Xen/x86 and Xen/x86_64 use P2M, but switch to VP (aka "shadow
mode") for an unprivileged guest when a migration is underway.
Xen/ia64 currently uses P==M for domain0 and VP for unprivileged
guests.  Xen/ppc intends to use VP only.
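The guest-visible difference between the first two models can be sketched in C. This is a purely illustrative toy -- the table contents, sizes, and function names are invented, not Xen's actual structures -- and under VP the guest performs no translation at all, so there is nothing to show on the guest side:

```c
#include <stdint.h>

#define GUEST_PAGES 4   /* toy max_pages */

/* P==M: a physical address *is* a machine address; Xen merely checks
 * that the frame lies in a range the guest is allowed to touch. */
#define GUEST_FIRST_MFN 0x100
static int p_eq_m_access_ok(uint64_t mfn)
{
    return mfn >= GUEST_FIRST_MFN && mfn < GUEST_FIRST_MFN + GUEST_PAGES;
}

/* P2M: the guest sees contiguous pseudo-physical frames 0..max_pages-1
 * and must look up the (possibly scattered) machine frame itself. */
static const uint64_t p2m[GUEST_PAGES] = { 0x2a0, 0x13f, 0x771, 0x205 };
static uint64_t pfn_to_mfn(uint64_t pfn)
{
    return p2m[pfn];
}
```

The point of the sketch is that P2M pushes the lookup into the guest, while P==M needs only a range check in the hypervisor.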

There is an architectural proposal to change Xen/ia64 so that
domain0 uses P2M instead of P==M.  We will call this choice P2M
and the choice to stay on the current path P==M.

Here's what I think are the key issues/tradeoffs:

XEN CODE IMPACT

Some Xen drivers, such as the blkif driver, have been "converted"
to accommodate P==M. Others have not.  For example, the balloon driver
currently assumes domain0 is P2M and thus does not currently work
on Xen/ia64 or Xen/ppc.  The word "converted" is quoted because
nobody is particularly satisfied with the current state of the
converted drivers.  Many apparently significant function calls are
define'd out of existence by macros.  Other code does radically
different things depending on the architecture or on whether it
is being executed by dom0 or an unprivileged domain.  And a few
ifdef's are sprinkled about.  In short, what's done works but is
an ugly hack.  Some believe that the best way to solve this mess
is for other architectures to do things more like Xen/x86.  Others
believe there is an advantage to defining clear abstractions and
making the drivers truly more architecture-independent.

P2M will require some rewriting of existing Xen/ia64 core code and the
addition of significant changes to Xenlinux/ia64 code but will allow
much easier porting of Xen's balloon/networking/migration drivers
and also enable some simplifying changes in the Xen block driver.
It is fair to guess that it will take at least several weeks/months
to rewrite and debug the core and Xenlinux code to get Xen/ia64 back
to where it is today, but future driver work will be much faster.
Fewer differences from Xen/x86 means less maintenance work for Xen
core and Xen/ia64 developers.  I'd imagine also that more code will
be shared between Xen/VT-i and Xen/VT-x.

P==M will require Xen's balloon/networking/migration drivers to
evolve to incorporate non-P2M models.  This can be done, but is most
likely to end up (at least in the short term) as a collection of
unpalatable hacks like with the Xen block driver.  However, making
Xen drivers more tolerant of different approaches may be a good
thing in the long run for Xen.

XENLINUX IMPACT

Today's operating systems are not implemented with an understanding
that a physical address and a machine address might be different.
Building this awareness into an OS requires non-trivial source
code change.  For example, Xenlinux/x86 maintains a "p2m" mapping
table for quick translation and provides a "m2p" hypercall to keep
Xen in sync.  OS code that manipulates physical addresses must be
modified to access/manage this table and make hypercalls when
appropriate.  Macros can hide much of the complexity but much OS/driver
code exists that does not use standard macros.  There is some
disagreement on how extensive the required source code changes are,
and how difficult it will be to maintain these changes across future
versions of guest OS's.  One illustrative example, however: in
paravirtualizing Xenlinux/ia64, seven header files are changed;
it is closer to 40 for Xenlinux/x86.
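The bookkeeping described above can be sketched as follows; the names and the hypercall stub here are hypothetical stand-ins rather than the real Xenlinux interfaces.  Whenever a P2M-aware guest rebinds a pseudo-physical frame to a different machine frame, it must update its own table and tell Xen so the global machine-to-physical view stays consistent:

```c
#include <stdint.h>

#define MAX_PFNS 8
#define MAX_MFNS 0x1000

static uint64_t phys_to_machine_mapping[MAX_PFNS]; /* the guest's "p2m" table */
static uint64_t machine_to_phys_mapping[MAX_MFNS]; /* stands in for Xen's global m2p */

/* Stand-in for the hypercall that keeps Xen's m2p view in sync. */
static void hypercall_update_m2p(uint64_t mfn, uint64_t pfn)
{
    machine_to_phys_mapping[mfn] = pfn;
}

/* Every OS path that rebinds a physical frame must go through
 * something like this instead of assuming phys == machine. */
static void set_phys_to_machine(uint64_t pfn, uint64_t mfn)
{
    phys_to_machine_mapping[pfn] = mfn;
    hypercall_update_m2p(mfn, pfn);
}

static uint64_t pfn_to_mfn(uint64_t pfn) { return phys_to_machine_mapping[pfn]; }
static uint64_t mfn_to_pfn(uint64_t mfn) { return machine_to_phys_mapping[mfn]; }
```

It is exactly this kind of call that must be threaded through all the OS code paths that today manipulate raw physical addresses.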

Related, some would assert that pushing a small number of changes into
Linux (or any OS, open source or not) is far easier than pushing a
large number of changes into Linux.  Until all the Xen/x86 changes are
in, it remains to be seen whether this is true or not.  There is
a reasonable concern that the broad review required for such
an extensive set of changes will involve a large number of people
with a large number of agendas and force a number of Xen design
issues to be revisited -- at least clearly justified if not changed.
This is especially true if Xen's foes have any influence in the
process.

Transparent paravirtualization (also called "shared binary") is the
ability for the same binary to be used both as a Xen guest and
natively on real hardware.  Xenlinux/ia64 currently supports this;
indeed, ignoring a couple of existing bugs, the same Xenlinux/ia64
binary can be used natively, as domain0, and as an unprivileged
domain.  There have been proposals to do the same for Xenlinux/x86,
but the degree of code changed is much much higher.  There is debate
about the cost/benefit of transparent paravirtualization, but the
primary beneficiaries -- distros and end customers -- are not very
well represented here.
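The mechanism behind transparent paravirtualization can be sketched as a run-time dispatch; this is a hypothetical illustration, and the real Xenlinux/ia64 detection and hypercall paths are more involved:

```c
#include <string.h>

/* One binary, two environments: privileged operations test a flag set
 * once at boot and pick either the native instruction path or the
 * hypercall path.  The flag and result strings are invented for
 * illustration. */

static int running_on_xen;   /* would be set by boot-time detection */

static void set_running_on_xen(int v)
{
    running_on_xen = v;
}

static const char *flush_tlb(void)
{
    if (running_on_xen)
        return "hypercall";  /* would trap to Xen for the flush */
    return "ptc.e";          /* would execute the native ia64 purge */
}
```

Because the choice is made at run time rather than compile time, the same kernel image boots natively, as domain0, or as a domU.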

With P2M, it is unlikely that Xenlinux/ia64 will ever again be
transparently paravirtualizable.  As with Xenlinux/x86, the changes
will probably be pushed into a subarch (mach-xen).  Since Linux/ia64
has a more diverse set of subarch's, there may be additional work
to ensure that Xen is orthogonal to (and thus works with) all the
subarch's.

P==M would continue to allow transparent paravirtualization.
This plus the reduced number of changes should make it easier to
get Xen/ia64 support into Linux/ia64 (assuming Xen/x86 support
gets included in Linux/x86).

DRIVER DOMAINS

Driver domains are "coming soon" and support of driver domains is a
"must"; however, support for hybrid driver domains (i.e. domains that
utilize both backend and frontend drivers) is open to debate.  It can
be assumed however that all driver domains will require DMA access.

P2M should make driver domains easier to implement (once the initial
Xenlinux/ia64 work is completed) and able to support a broader range
of functionality.  P==M may disallow hybrid driver domains and
create other restrictions, though some creative person may be able
to solve these.

FUTURE XEN FEATURE SUPPORT

None of the approaches have been "design-tested" significantly for
support or compatibility with future Xen functionality such as
oversubscription or machine-memory hot-plug, nor for exotic
machine memory topologies such as NUMA or discontig (sparsely
populated).  Such functionalities and topologies are much more
likely to be encountered in high-end server architectures rather
than widely-available PCs and low-end servers.  There is some
debate as to whether the existing Xen memory architecture will easily
evolve to accommodate these future changes or if more fundamental
changes will be required.  Architectural decisions and restrictions
should be made with these uncertainties in mind.

Some believe that discovery and policy for machine memory will
eventually need to move out of Xen into domain0, leaving only
enforcement mechanism in Xen.  For example, oversubscription, NUMA
or hot-plug memory support are likely to be fairly complicated
and a commonly stated goal is to move unnecessary complexity out
of Xen.  And the plethora of recent changes in Linux/ia64
involving machine memory models indicates there are still many
unknowns.  P==M more easily supports a model where domain0
owns ALL of machine memory *except* a small amount reserved for
and protected by Xen itself.  If this is all true, Xen/x86 may
eventually need to move to a dom0 P==M model, in which case it
would be silly for Xen/ia64 to move to P2M and then back to P==M.

Others think these features will be easy to implement in Xen and,
with minor changes, entirely compatible with P2M.  And that
P2M is the once and future model for domain0.

SUMMARY

I'm sure there are more issues and tradeoffs that will come up
in discussion, but let me summarize these:

Move domain0 to P2M:
+ Fewer differences in Xen drivers between Xen/x86 and Xen/ia64
+ Fewer differences in Xen drivers between Xen/VT-x and Xen/VT-i
+ Easier to implement remaining Xen drivers for Xen/ia64
- Major changes may require months for Xen/ia64 to regain stability
- Many more changes to Xenlinux/ia64; more difficulty pushing upstream
- No attempt to make Xen more resilient for future architectures

Leave domain0 as P==M:
+ Fewer changes in Xenlinux; easier to push upstream
+ Making Xen more flexible is a good thing
? May provide better foundation for future features (oversubscr, NUMA)
- More restrictions on driver domains
- More hacks required for some Xen drivers, or
- More work to better abstract and define a portable driver
  architecture

_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel

Re: [Xen-devel] Essay on an important Xen decision (long)

Mark Williamson
Dan,

Thanks for the summary; it's nice to see all the arguments presented together.

> 3) virtual physical (VP): The guest is given max_pages of
>    contiguous physical memory starting at zero.  Xen provides
>    the illusion to the guest that this is machine memory;
>    any physical-to-machine translation required for functional
>    correctness is handled invisibly by Xen.  VP cannot be used
>    by guests that directly program DMA-based I/O devices
>    because a DMA device requires a machine address and, by
>    definition, the guest knows only about physical addresses.
>
> Xen/x86 and Xen/x86_64 use P2M, but switch to VP (aka "shadow
> mode") for an unprivileged guest when a migration is underway.
> Xen/ia64 currently uses P==M for domain0 and VP for unprivileged
> guests.  Xen/ppc intends to use VP only.

NB. the shadow mode for migration (logdirty) doesn't actually virtualise the
physical <-> machine mapping - a paravirt guest on x86 always knows where all
its pages are in machine memory.  All that's being hidden in this case is
that the pagetables are being shadowed (so that pages can be transparently
write protected).

> Driver domains are "coming soon" and support of driver domains is a
> "must", however support for hybrid driver domains (i.e. domains that
> utilize both backend and frontend drivers) is open to debate.  It can
> be assumed however that all driver domains will require DMA access.
>
> P2M should make driver domains easier to implement (once the initial
> Xenlinux/ia64 work is completed) and able to support a broader range
> of functionality.  P==M may disallow hybrid driver domains and
> create other restrictions, though some creative person may be able
> to solve these.

I'd think that driver domains themselves would be quite attractive on IA64 -
for big boxes, it allows you to partition the hardware devices *and*
potentially improve uptime by isolating driver faults.

For what you call "hybrid" domains, there are people using this for virtual
DMZ functionality...  I guess it'd be nice to enable it.  Presumably the
problem is that the backend does some sort of P-to-M translation itself?

Do you have a plan for how you would implement P==M driver domains?

Cheers,
Mark


Re: [Xen-devel] Essay on an important Xen decision (long)

Hollis Blanchard
In reply to this post by Dan Magenheimer
On Tue, 2006-01-10 at 11:26 -0800, Magenheimer, Dan (HP Labs Fort
Collins) wrote:

>
> 1) P==M: The guest is given a subset of machine memory that it
>    can access "directly".  Accesses to machine memory addresses
>    outside of this range must somehow be restricted (but not
>    necessarily disallowed) by Xen.
>
> 2) guest-aware p!=m (P2M): The guest is given max_pages of
>    contiguous physical memory starting at zero and the knowledge
>    that physical addresses are different than machine addresses.
>    The guest must understand the difference between a physical
>    address and a machine address and utilize the correct one in
>    different situations.
>
> 3) virtual physical (VP): The guest is given max_pages of
>    contiguous physical memory starting at zero.  Xen provides
>    the illusion to the guest that this is machine memory;
>    any physical-to-machine translation required for functional
>    correctness is handled invisibly by Xen.  VP cannot be used
>    by guests that directly program DMA-based I/O devices
>    because a DMA device requires a machine address and, by
>    definition, the guest knows only about physical addresses.
>
> Xen/x86 and Xen/x86_64 use P2M, but switch to VP (aka "shadow
> mode") for an unprivileged guest when a migration is underway.
> Xen/ia64 currently uses P==M for domain0 and VP for unprivileged
> guests.  Xen/ppc intends to use VP only.
>
> There is an architectural proposal to change Xen/ia64 so that
> domain0 uses P2M instead of P==M.  We will call this choice P2M
> and the choice to stay on the current path P==M.

So ia64 dom0 physical 0 is machine 0? Where does Xen live in machine
space?

PowerPC exception handlers are architecturally hardcoded to the first
couple pages of memory, so Xen needs to live there. Linux expects it is
booting at 0 of course, so dom0 runs in an offset physical address
space.

The trouble then comes when dom0 needs to access IO or domU memory;
obviously dom0 must have some awareness of the machine space.
Accordingly, I'm thinking I'm going to need to install p2m tables in
dom0, and once they're there, why not have domU use them too?

--
Hollis Blanchard
IBM Linux Technology Center



RE: [Xen-devel] Essay on an important Xen decision (long)

Dan Magenheimer
In reply to this post by Dan Magenheimer
> NB. the shadow mode for migration (logdirty) doesn't actually
> virtualise the
> physical <-> machine mapping - a paravirt guest on x86 always
> knows where all
> its pages are in machine memory.  All that's being hidden in
> this case is
> that the pagetables are being shadowed (so that pages can be
> transparently
> write protected).

Thanks for the clarification!

> I'd think that driver domains themselves would be quite
> attractive on IA64 -
> for big boxes, it allows you to partition the hardware devices *and*
> potentially improve uptime by isolating driver faults.

Probably true, but I think most "big box" customers are looking
for partition isolation beyond what is possible with Xen (at
least near-term).

> For what you call "hybrid" domains, there are people using
> this for virtual
> DMZ functionality...  I guess it'd be nice to enable it.  
> Presumably the
> problem is that the backend does some sort of P-to-M
> translation itself?
>
> Do you have a plan for how you would implement P==M driver domains?

Only roughly.  Detailed design and implementation was to wait
until after driver domain support gets back into Xen/x86 (and until
after this P?M decision is made).

Dan


RE: [Xen-devel] Essay on an important Xen decision (long)

Dan Magenheimer
In reply to this post by Dan Magenheimer
> So ia64 dom0 physical 0 is machine 0? Where does Xen live in machine
> space?
>
> PowerPC exception handlers are architecturally hardcoded to the first
> couple pages of memory, so Xen needs to live there. Linux
> expects it is
> booting at 0 of course, so dom0 runs in an offset physical address
> space.

On ia64, Xen (and Linux when booting natively) is relocatable.
Machine address 0 is not special on ia64 like it is on PowerPC.
 
> The trouble then comes when dom0 needs to access IO or domU memory;
> obviously dom0 must have some awareness of the machine space.
> Accordingly, I'm thinking I'm going to need to install p2m tables in
> dom0, and once they're there, why not have domU use them too?

On ia64, machine memory is exposed to a native OS via EFI (firmware)
tables.  (I think these are similar to e820 on x86 machines and
don't know how this is done on PowerPC.)  When Xen/ia64 starts domain0
(or a domU), it passes a faked EFI table.  This table is faked
differently for domain0 and domU's.  One solution, for example,
would be for Xen to "give" all machine memory to dom0, protecting
only a small portion for itself.  Then when other domains are
created, all the memory for domUs would be "ballooned" from dom0.
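A toy model of that scheme (all numbers and names invented): dom0 initially owns every machine page except Xen's reserve, and memory for new domains is ballooned out of dom0.

```c
/* Page accounting for the "dom0 owns everything" model sketched above.
 * Purely illustrative; real ballooning moves actual page frames via
 * hypercalls, not just counters. */

#define TOTAL_PAGES 1000u
#define XEN_RESERVE 50u

static unsigned dom0_pages = TOTAL_PAGES - XEN_RESERVE;

/* Reclaim up to n pages from dom0 to populate a new domU;
 * returns the number of pages actually obtained. */
static unsigned balloon_from_dom0(unsigned n)
{
    if (n > dom0_pages)
        n = dom0_pages;      /* dom0 cannot give up more than it holds */
    dom0_pages -= n;
    return n;
}
```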

Per the previous exchange with Anthony, there are many advantages
to being able to move memory around invisibly to domains, which
is easy with VP and much harder with P2M.  The current debate on
Xen/ia64 is just for domain0 but it could expand...



RE: [Xen-devel] Essay on an important Xen decision (long)

Tian, Kevin
In reply to this post by Dan Magenheimer
>From: Magenheimer, Dan
>Sent: January 11, 2006 3:26

Hi, Dan,
        Good background for discussion.

>[...]
>an ugly hack.  Some believe that the best way to solve this mess
>is for other architectures to do things more like Xen/x86.  Others
>believe there is an advantage to defining clear abstractions and
>making the drivers truly more architecture-independent.

I would say the two options above don't actually conflict. ;-) Move toward Xen/x86 for the things that are really common, with clearer abstractions for the architectural differences. We need to carefully distinguish which parts of the mess really stem from architectural differences, and which parts are common but were simply missed due to the early quick-bring-up requirement. I don't think this has received enough attention so far. Xen, as a well-formed product, needs to have common policies and common features on all architectures. Implementing the same feature may be more difficult, and may even bring some performance impact, on some architectures, but it is a must-have requirement from the customer's point of view if the customer values it. I just raise it here as an important factor when considering the final cross-architecture solution.

>[...]
>XENLINUX IMPACT
>
>Xen in sync.  OS code that manipulates physical addresses must be
>modified to access/manage this table and make hypercalls when
>appropriate.  Macros can hide much of the complexity but much OS/driver
>code exists that does not use standard macros.  There is some

This seems to be an issue: driver modules would need to be re-compiled... ;-(

>Transparent paravirtualization (also called "shared binary") is the
>ability for the same binary to be used both as a Xen guest and
>natively on real hardware.  Xenlinux/ia64 currently support this;
>indeed, ignoring a couple of existing bugs, the same Xenlinux/ia64
>binary can be used natively, and as domain0 and as an unprivileged
>domain. There have been proposals to do the same for Xenlinux/x86,
>but the degree of code changed is much much higher.  There is debate
>about the cost/benefit of transparent paravirtualization, but the
>primary beneficiaries -- distros and end customers -- are not very
>well represented here.

Transparency is welcome, but that doesn't mean we should conservatively restrict ourselves from modifying Xenlinux. Transparency with good performance is the goal to pursue, though Xenlinux/x86 does need more effort to make that happen.

>
>With P2M, it is unlikely that Xenlinux/ia64 will ever again be
>transparently paravirtualizable.  As with Xenlinux/x86, the changes
>will probably be pushed into a subarch (mach-xen).  

First a sub-arch, and then later a configurable feature with negligible impact on native execution? ;-)

>[...]
>
>Some believe that discovery and policy for machine memory will
>eventually need to move out of Xen into domain0, leaving only
>enforcement mechanism in Xen.  For example, oversubscription, NUMA
>or hot-plug memory support are likely to be fairly complicated
>and a commonly stated goal is to move unnecessary complexity out
>of Xen.  And the plethora of recent changes in Linux/ia64
>involving machine memory models indicates there are still many
>unknowns.  P==M more easily supports a model where domain0
>owns ALL of machine memory *except* a small amount reserved for
>and protected by Xen itself.  If this is all true, Xen/x86 may
>eventually need to move to a dom0 P==M model, in which case it
>would be silly for Xen/ia64 to move to P2M and then back to P==M.

I don't think a complete takeover by dom0 is a good design choice. Moving ownership to dom0 doesn't mean a simple move, since the memory sub-system is the core/base of Xen. Extra context switches are added for every page-related operation. Also, with the P==M model, how do you ensure a scalable allocation environment after a long run? Any activity within dom0 that consumes physical frames actually eats machine frames. Security may be another issue, though I can't come up with a clear example immediately...

>
>SUMMARY
>[...]

This summary is good.

Thanks,
Kevin


Re: [Xen-devel] Essay on an important Xen decision (long)

Harry Butterworth
In reply to this post by Dan Magenheimer
On Tue, 2006-01-10 at 11:26 -0800, Magenheimer, Dan (HP Labs Fort
Collins) wrote:
> A fundamental architectural decision has to be made for
> Xen regarding handling of physical/machine memory; at a high
> level, the question is:
>
> Should Xen drivers be made more flexible to accommodate
> different approaches to managing physical memory, or
> should other architectures be required to conform to
> the Xen/x86 model?

I believe the right approach is to decouple the driver implementation
from the memory management architecture by defining a high level API to
build the drivers on.  The API should be expressed in terms of the
operations that the drivers need to perform rather than in terms of the
underlying primitives that are actually used to perform those
operations.

Such an API would allow decisions about memory management to be made
independent of the drivers and would allow the memory management
architecture to be changed relatively easily at a later date since the
resulting damage would be contained within the core library that
implemented the driver infrastructure API.

I think this is the right approach because:

o - Decoupling the drivers from the memory management architecture
reduces the cost of future memory management architecture changes and
keeps our options open, so is a lower risk approach than choosing a
memory management architecture now and trying to stick with it.

o - A good high level driver infrastructure API will clean up the
drivers considerably.

o - Containing the code which performs low-level memory manipulations
within a core driver infrastructure library written by an expert will
result in higher overall quality across all the drivers.

o - As a driver author, given a high level driver infrastructure API
which decouples me from the memory management architecture, the choice
of P==M, P2M or VP is no longer my concern.
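To make the shape of such an API concrete, here is one invented illustration in C. This is *not* the xenidc interface (which is in the posted patch); it only shows how an operation-level API can hide whether the transport underneath uses grant tables, P2M lookups, or copying:

```c
#include <stddef.h>
#include <string.h>

/* A channel exposes what drivers need ("send this buffer to the peer
 * domain"), not the memory-management primitives used to do it. */
struct idc_channel {
    int (*send)(struct idc_channel *c, const void *buf, size_t len);
};

/* One possible backing: copy through a bounce buffer.  A P==M or P2M
 * platform could plug in a zero-copy page-flipping variant instead,
 * with no change to any driver. */
static char bounce[64];
static size_t bounced;

static int copy_send(struct idc_channel *c, const void *buf, size_t len)
{
    (void)c;
    if (len > sizeof bounce)
        return -1;           /* request too large for this transport */
    memcpy(bounce, buf, len);
    bounced = len;
    return 0;
}

static struct idc_channel chan = { copy_send };

/* A split driver written against the API never names a machine
 * address, so P==M vs P2M vs VP is no longer its concern. */
static int driver_transmit(const void *req, size_t len)
{
    return chan.send(&chan, req, len);
}
```

The design choice being illustrated is that all memory-model damage from a future architecture change would be contained in the implementation behind `send`, not spread across every driver.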

I have made a first attempt at defining a high level driver
infrastructure API for general use by xen split drivers.  This is the
xenidc API and, whilst it is designed for general use, it currently has
one client: the split USB driver.

I believe that xenidc completely decouples its clients from the memory
management architecture such that, for example, there should be no
changes required in the USB driver code when porting it from x86 to ia64
and PPC (this will be true whether or not the memory management
architecture for those platforms is changed to be more like x86).

All required changes ought to be contained within the xenidc
implementation and therefore would only need to be implemented once for
all clients of xenidc.

The choice of a common memory management architecture or different
memory management architectures across platforms or different options
for memory management architectures for a particular platform or
different options for memory management architecture at run-time for
transparent virtualization can all be contained within the xenidc
implementation.

In addition to decoupling the client driver code from the memory
management architecture, the xenidc API provides:

o - Convenient inter-domain communication primitives which encapsulate
the rather complicated state machine required for correct set-up and
tear down of inter-domain communication channels for (un)loadable driver
modules.

o - A convenient inter-domain bulk transport.

o - An up-front-reservation resource management strategy.

o - Driver forwards-compatibility with a network transparent xenidc
implementation.

I have attached the latest xenidc patch which includes documentation of
the xenidc API (added by the patch to the Xen interface document).

I have also attached the latest USB patch as an example of a client of
the xenidc API.

(Since the last time I posted these patches I have fixed a couple of
compiler warnings for the X86_64 build).

A few points to note:

o - xenidc is an infrastructure for the Xen-specific split drivers.
Xenidc doesn't directly address the issue of making the native drivers
work correctly under virtualization but does allow you to do that
however you like across different architectures whilst maintaining
common code for all the split drivers.

o - This is just a first attempt, written mainly to decouple the USB
driver from churn in the underlying infrastructure.  The API is
generally useful but only covers the operations that were actually
required for the USB driver.  There is already enough in the API to
base other drivers on, but it would need to be fleshed out with some
different kinds of operations before all drivers could be implemented
with the same efficient primitives that are used today.

o - Unfortunately I didn't get funding to attend the Xen summit so I
won't be there to present on Xenidc.  I'm not concerned about whether
xenidc gets accepted as-is but I do hope it will be useful as an example
of the kind of API that we could have.  I'll be happy to answer any
questions on the list.

Harry.

latest-xenidc-patch.gz (55K)

RE: [Xen-devel] Essay on an important Xen decision (long)

Hollis Blanchard-2
In reply to this post by Dan Magenheimer
On Tue, 2006-01-10 at 16:39 -0800, Magenheimer, Dan (HP Labs Fort
Collins) wrote:

> > So ia64 dom0 physical 0 is machine 0? Where does Xen live in machine
> > space?
> >
> > PowerPC exception handlers are architecturally hardcoded to the first
> > couple pages of memory, so Xen needs to live there. Linux
> > expects it is
> > booting at 0 of course, so dom0 runs in an offset physical address
> > space.
>
> On ia64, Xen (and Linux when booting natively) is relocatable.
> Machine address 0 is not special on ia64 like it is on PowerPC.

Right, so P==M for dom0 (or any domain) will not work on PowerPC.
 
> Per the previous exchange with Anthony, there are many advantages
> to being able to move memory around invisibly to domains, which
> is easy with VP and much harder with P2M.  The current debate on
> Xen/ia64 is just for domain0 but it could expand...

As far as I can see, dom0 must be aware of the machine address space, so
that means P2M for PowerPC. dom0 is a special case: do you really need
to worry about migrating dom0, or memory compacting with other domains?

As for the question of domU being VP or P2M, I see no reason it
shouldn't be VP. IO-capable domUs (driver domains) could be VP with
proper IOMMU support. The PowerPC PAPR and Xen/ia64 implementations
demonstrate that this works...

--
Hollis Blanchard
IBM Linux Technology Center


_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel

RE: [Xen-devel] Essay on an important Xen decision (long)

Dan Magenheimer
In reply to this post by Dan Magenheimer
> > On ia64, Xen (and Linux when booting natively) is relocatable.
> > Machine address 0 is not special on ia64 like it is on PowerPC.
>
> Right, so P==M for dom0 (or any domain) will not work on PowerPC.

Are machine addresses 0-n the only range that is special?
And can one safely assume that DMA will never occur in this
range?  If so, then a single "special" mapping in the hypervisor
could get around this.  While I suppose this is more P~=M than
strictly P==M, it would seem a reasonable alternative to major Linux
changes.
 

> > Per the previous exchange with Anthony, there are many advantages
> > to being able to move memory around invisibly to domains, which
> > is easy with VP and much harder with P2M.  The current debate on
> > Xen/ia64 is just for domain0 but it could expand...
>
> As far as I can see, dom0 must be aware of the machine
> address space, so
> that means P2M for PowerPC. dom0 is a special case: do you really need
> to worry about migrating dom0, or memory compacting with
> other domains?

No, migrating dom0 or any driver domain with direct device
access is unreasonable, at least unless all device access
is virtualized (e.g. InfiniBand?).  I view domain0 as closer to
a semi-privileged extension of Xen.

Not sure what you mean by memory compacting...

> As for the question of domU being VP or P2M, I see no reason it
> shouldn't be VP. IO-capable domUs (driver domains) could be VP with
> proper IOMMU support. The PowerPC PAPR and Xen/ia64 implementations
> demonstrate that this works...

Ignoring the page table problems on x86 (which VMware demonstrates
is more of a performance issue than a functional issue), if DMA can
be invisibly handled, I think everyone agrees that VP has significant
advantages over either P==M or P2M.

But to clarify, Xen/ia64 domU is currently VP only because it doesn't
do DMA. Driver domains will complicate this.

Dan



