Skip to content

Commit

Permalink
iommu: IOMMU Groups
Browse files Browse the repository at this point in the history
IOMMU device groups are currently a rather vague associative notion
with assembly required by the user or user level driver provider to
do anything useful.  This patch intends to grow the IOMMU group concept
into something a bit more consumable.

To do this, we first create an object representing the group, struct
iommu_group.  This structure is allocated (iommu_group_alloc) and
filled (iommu_group_add_device) by the iommu driver.  The iommu driver
is free to add devices to the group using it's own set of policies.
This allows inclusion of devices based on physical hardware or topology
limitations of the platform, as well as soft requirements, such as
multi-function trust levels or peer-to-peer protection of the
interconnects.  Each device may only belong to a single iommu group,
which is linked from struct device.iommu_group.  IOMMU groups are
maintained using kobject reference counting, allowing for automatic
removal of empty, unreferenced groups.  It is the responsibility of
the iommu driver to remove devices from the group
(iommu_group_remove_device).

IOMMU groups also include a userspace representation in sysfs under
/sys/kernel/iommu_groups.  When allocated, each group is given a
dynamically assign ID (int).  The ID is managed by the core IOMMU group
code to support multiple heterogeneous iommu drivers, which could
potentially collide in group naming/numbering.  This also keeps group
IDs to small, easily managed values.  A directory is created under
/sys/kernel/iommu_groups for each group.  A further subdirectory named
"devices" contains links to each device within the group.  The iommu_group
file in the device's sysfs directory, which formerly contained a group
number when read, is now a link to the iommu group.  Example:

$ ls -l /sys/kernel/iommu_groups/26/devices/
total 0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:00:1e.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.1 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1

$ ls -l  /sys/kernel/iommu_groups/26/devices/*/iommu_group
[truncating perms/owner/timestamp]
/sys/kernel/iommu_groups/26/devices/0000:00:1e.0/iommu_group ->
					../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.0/iommu_group ->
					../../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.1/iommu_group ->
					../../../../kernel/iommu_groups/26

Groups also include several exported functions for use by user level
driver providers, for example VFIO.  These include:

iommu_group_get(): Acquires a reference to a group from a device
iommu_group_put(): Releases reference
iommu_group_for_each_dev(): Iterates over group devices using callback
iommu_group_[un]register_notifier(): Allows notification of device add
        and remove operations relevant to the group
iommu_group_id(): Return the group number

This patch also extends the IOMMU API to allow attaching groups to
domains.  This is currently a simple wrapper for iterating through
devices within a group, but it's expected that the IOMMU API may
eventually make groups a more integral part of domains.

Groups intentionally do not try to manage group ownership.  A user
level driver provider must independently acquire ownership for each
device within a group before making use of the group as a whole.
This may change in the future if group usage becomes more pervasive
across both DMA and IOMMU ops.

Groups intentionally do not provide a mechanism for driver locking
or otherwise manipulating driver matching/probing of devices within
the group.  Such interfaces are generic to devices and beyond the
scope of IOMMU groups.  If implemented, user level providers have
ready access via iommu_group_for_each_dev and group notifiers.

iommu_device_group() is removed here as it has no users.  The
replacement is:

	group = iommu_group_get(dev);
	id = iommu_group_id(group);
	iommu_group_put(group);

AMD-Vi & Intel VT-d support re-added in following patches.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
  • Loading branch information
awilliam authored and Joerg Roedel committed Jun 25, 2012
1 parent 74416e1 commit d72e31c
Show file tree
Hide file tree
Showing 5 changed files with 663 additions and 103 deletions.
14 changes: 14 additions & 0 deletions Documentation/ABI/testing/sysfs-kernel-iommu_groups
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
What: /sys/kernel/iommu_groups/
Date: May 2012
KernelVersion: v3.5
Contact: Alex Williamson <alex.williamson@redhat.com>
Description: /sys/kernel/iommu_groups/ contains a number of sub-
directories, each representing an IOMMU group. The
name of the sub-directory matches the iommu_group_id()
for the group, which is an integer value. Within each
subdirectory is another directory named "devices" with
links to the sysfs devices contained in this group.
The group directory also optionally contains a "name"
file if the IOMMU driver has chosen to register a more
common name for the group.
Users:
21 changes: 0 additions & 21 deletions drivers/iommu/amd_iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -3227,26 +3227,6 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain,
return 0;
}

static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
{
struct iommu_dev_data *dev_data = dev->archdata.iommu;
struct pci_dev *pdev = to_pci_dev(dev);
u16 devid;

if (!dev_data)
return -ENODEV;

if (pdev->is_virtfn || !iommu_group_mf)
devid = dev_data->devid;
else
devid = calc_devid(pdev->bus->number,
PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));

*groupid = amd_iommu_alias_table[devid];

return 0;
}

static struct iommu_ops amd_iommu_ops = {
.domain_init = amd_iommu_domain_init,
.domain_destroy = amd_iommu_domain_destroy,
Expand All @@ -3256,7 +3236,6 @@ static struct iommu_ops amd_iommu_ops = {
.unmap = amd_iommu_unmap,
.iova_to_phys = amd_iommu_iova_to_phys,
.domain_has_cap = amd_iommu_domain_has_cap,
.device_group = amd_iommu_device_group,
.pgsize_bitmap = AMD_IOMMU_PGSIZES,
};

Expand Down
49 changes: 0 additions & 49 deletions drivers/iommu/intel-iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -4090,54 +4090,6 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain,
return 0;
}

/*
* Group numbers are arbitrary. Device with the same group number
* indicate the iommu cannot differentiate between them. To avoid
* tracking used groups we just use the seg|bus|devfn of the lowest
* level we're able to differentiate devices
*/
static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
{
struct pci_dev *pdev = to_pci_dev(dev);
struct pci_dev *bridge;
union {
struct {
u8 devfn;
u8 bus;
u16 segment;
} pci;
u32 group;
} id;

if (iommu_no_mapping(dev))
return -ENODEV;

id.pci.segment = pci_domain_nr(pdev->bus);
id.pci.bus = pdev->bus->number;
id.pci.devfn = pdev->devfn;

if (!device_to_iommu(id.pci.segment, id.pci.bus, id.pci.devfn))
return -ENODEV;

bridge = pci_find_upstream_pcie_bridge(pdev);
if (bridge) {
if (pci_is_pcie(bridge)) {
id.pci.bus = bridge->subordinate->number;
id.pci.devfn = 0;
} else {
id.pci.bus = bridge->bus->number;
id.pci.devfn = bridge->devfn;
}
}

if (!pdev->is_virtfn && iommu_group_mf)
id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0);

*groupid = id.group;

return 0;
}

static struct iommu_ops intel_iommu_ops = {
.domain_init = intel_iommu_domain_init,
.domain_destroy = intel_iommu_domain_destroy,
Expand All @@ -4147,7 +4099,6 @@ static struct iommu_ops intel_iommu_ops = {
.unmap = intel_iommu_unmap,
.iova_to_phys = intel_iommu_iova_to_phys,
.domain_has_cap = intel_iommu_domain_has_cap,
.device_group = intel_iommu_device_group,
.pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};

Expand Down
Loading

0 comments on commit d72e31c

Please sign in to comment.