Wednesday, September 12, 2007

Allocating physically contiguous memory in Solaris

Solaris doesn't expose an obvious API for allocating physically contiguous, custom-aligned memory on demand. It has nothing like BSD's contigmalloc() or Darwin's IOMallocContiguous(). Here's a code snippet showing how to do it:

struct ddi_dma_attr g_SolarisX86PhysMemLimits =
{
    DMA_ATTR_V0,            /* Version Number */
    (uint64_t)0,            /* lower limit */
    (uint64_t)0xffffffff,   /* high limit (32-bit PA, 4G) */
    (uint64_t)0xffffffff,   /* counter limit */
    (uint64_t)MMU_PAGESIZE, /* alignment */
    (uint64_t)MMU_PAGESIZE, /* burst size */
    (uint64_t)MMU_PAGESIZE, /* effective DMA size */
    (uint64_t)0xffffffff,   /* max DMA xfer size */
    (uint64_t)0xffffffff,   /* segment boundary */
    1,                      /* scatter-gather list length */
    1,                      /* device granularity */
    0                       /* bus-specific flags */
};


caddr_t kernVirtAddr;
int rc = i_ddi_mem_alloc(NULL, &g_SolarisX86PhysMemLimits, sizeInBytes,
    1, 0, NULL, &kernVirtAddr, NULL, NULL);

The prototype of i_ddi_mem_alloc is:
int i_ddi_mem_alloc(dev_info_t *dip, ddi_dma_attr_t *attr,
    size_t length, int cansleep, int flags,
    ddi_device_acc_attr_t *accattrp, caddr_t *kaddrp,
    size_t *real_length, ddi_acc_hdl_t *ap);

So obviously, Sun intended this to be called only from within drivers, but we can safely pass NULL for the arguments we don't have, as in the example above. Pass a non-zero value for cansleep unless you specifically need a non-blocking allocation.

The above code allocates physically contiguous, page-aligned memory. This can be used for DMA transfers or for other specific tasks that have these requirements.

The key to getting this to be contiguous is the scatter-gather list length member of the structure. Setting it to 1 forces the kernel to allocate a single physically contiguous block of memory!

The address limits in my example run from 0 up to 4 GB (0xffffffff). If you're building on 64-bit platforms or for Blade servers etc., you can safely raise the high limit beyond this.
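To make the limits a little more concrete, here is a minimal userland sketch of what the low limit, high limit and alignment members ask of the returned block. The type phys_limits_t and the function range_meets_limits() are hypothetical names made up for this illustration, not Solaris interfaces, and the sketch assumes the high limit is an inclusive address. It ignores the other members of the structure (segment boundary, counter limit and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userland stand-in for three of the limits set in the
 * attribute structure; this is not a Solaris type. */
typedef struct {
    uint64_t addr_lo;   /* lowest usable physical address */
    uint64_t addr_hi;   /* highest usable physical address (assumed inclusive) */
    uint64_t align;     /* required alignment, must be a power of two */
} phys_limits_t;

/* Returns true if the byte range [pa, pa + len) meets the limits. */
static bool
range_meets_limits(const phys_limits_t *lim, uint64_t pa, uint64_t len)
{
    /* The mask test below only works for power-of-two alignments. */
    assert(lim->align != 0 && (lim->align & (lim->align - 1)) == 0);

    if (len == 0)
        return false;
    if (pa & (lim->align - 1))      /* start address not aligned */
        return false;
    if (pa < lim->addr_lo)          /* starts below the low limit */
        return false;
    if (pa > lim->addr_hi || len - 1 > lim->addr_hi - pa)
        return false;               /* last byte lies above the high limit */
    return true;
}
```

With the values from the example (low limit 0, high limit 0xffffffff, alignment 4096), a page starting at 0xfffff000 passes, while two pages starting there would cross the 4 GB mark and fail.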

I thought I'd document this little 'trick', if you will, since as far as I know none of it is publicly documented; you have to dig deep into the bowels of the OpenSolaris kernel to find it. Even the allocation call is undocumented, though it isn't likely to change in the kernel.

To free the allocated memory you would use:
i_ddi_mem_free(kernVirtAddr, NULL);


Warning:
For almost all your needs you can probably go through the well-documented DDI functions of Solaris. Don't use what I've suggested unless you really know what you are doing and you are in a situation where the exposed DDI functions are not sufficient.

12 comments:

somnath said...

Hey but this will return a kernel virtual address in kernvirtaddr?
So how do i actually get the physicaladdress for this virt addr? (have to use a ddi_dma_addr_bind_handle() i guess?)
Is there something like a virt_to_bus() or a _pa() like in linux in solaris?

Teknomancer said...

>So how do i actually get the physicaladdress for this virt addr?

To convert this kernel virtual address to a physical address you first need to get hold of the page frame number. Then once you have that you can multiply it with the x86 memory page size to arrive at the correct physical address.

Teknomancer said...

Hey, but wait a minute...

Are you sure you want to be using these undocumented calls?

I needed them for a specific purpose where I had to allocate-on-demand and not just from mmap() calls from userland.

Personally, I think you should first try using the ddi_* functions for DMA needs.

somnath said...

Thanks for the info , yes i am indeed using the DDI calls ,but was
running into some problem while writing my driver ..just wanted to
make sure my DMA mem allocation was right..im still running into the
same problems so guess its just fine ...can go back to using the DDI
calls

just want to make sure my DMA setup is OK for more than 2 pages
..something like write into the VA and watch the contents of the PA
using debugger ..wondering how to do that in solaris?

Teknomancer said...

use hat_getpfnum and pass in the kernel address space's HAT, and your virtual address.

That will give you the page frame number. Multiply that with x86 page size.

somnath said...

Err..pardon me if this is a stupid question ,but how do i get my kernel address space's HAT ?

Teknomancer said...

It's kas.a_hat

somnath said...

Hi
Thanks a lot, ya pretty much figured that out myself soon after i posted it.. There is one thing tho, is there any particular reason you are using MMU_PAGESIZE for 3 of the attributes in the dma_attrs structure? any known issues on x86-64?
Reason im asking this is, im allocating a 'physically contiguous block' using the ddi_dma* calls and it seems (not been able to nail it down to it 100%) that the memory block is not physically contiguous beyond 1 page ... (memory block i have preallocated in my driver is 256K) .. any ideas on this ?
Thanks in advance

Teknomancer said...

MMU_PAGESIZE represents the physical page size of the hardware MMU, while PAGESIZE represents the logical page size used by the kernel. They are identical in Solaris. Both ought to be the x86 page size of 4096 bytes.

I require page-aligned addresses, hence the alignment must be a multiple of the page size. The minimum effective DMA size and the burst size can probably be changed; just keep the burst size a multiple of 2, or preferably the page size.

Of course I did not use this memory for DMA transfers to a real device so your mileage may vary with the specifications of the device. I remember reading that prior to enabling DMA for the device your driver must check the burst sizes using ddi_dma_burstsizes.

As for the x86-64, yes this snippet works on it.

Care to show how you arrived at your final physical address? Perhaps an alignment issue.
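One way to check the "contiguous beyond 1 page" question is to collect the page frame number of every page in the block and confirm that each one follows the previous one. The sketch below only shows that comparison in plain C; the function name pfns_are_contiguous() is made up for this illustration. In a driver, the array would presumably be filled by calling hat_getpfnum with kas.a_hat and the virtual address of each successive page, as suggested earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if each page frame number is exactly one more than the
 * previous one, i.e. the pages form one physically contiguous run.
 * An empty or single-page list counts as contiguous. */
static bool
pfns_are_contiguous(const uint64_t *pfns, size_t npages)
{
    assert(pfns != NULL || npages == 0);

    for (size_t i = 1; i < npages; i++) {
        if (pfns[i] != pfns[i - 1] + 1)
            return false;   /* gap or jump between page i-1 and page i */
    }
    return true;
}
```

For a 256K block with 4096-byte pages that would mean checking 64 page frame numbers.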

Anonymous said...

Thanks for the info, but:

>use hat_getpfnum and pass in the kernel address space's HAT, and your virtual address.

>That will give you the page frame number. Multiply that with x86 page size.

hat_getpfnum has been completely removed from Solaris 10 x86. This method will not work.

So while I can now allocate contiguous memory, I am still having trouble getting at the physical address of it. Similar to the other poster, I also need it for dma'ing from the pci address (physical memory) to the device on the other side of the bus.

Teknomancer said...

hat_getpfnum has not been removed from Solaris 10 x86.

Try using contig_alloc(), then use hat_getpfnum to obtain the page frame number and simply shift it left by MMU_PAGESHIFT bits (pfn << MMU_PAGESHIFT) to obtain the physical address.

There is definitely code using hat_getpfnum on Solaris 10 x86.
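The arithmetic itself can be sketched in plain C as follows. The SKETCH_* constants are stand-ins defined here on the assumption of 4096-byte x86 pages; kernel code would use the values from the system headers instead. The function pfn_to_pa() is a hypothetical helper, and the page frame number passed to it is assumed to have come from hat_getpfnum. It also adds back the offset within the page, which matters when the virtual address is not page-aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86 with 4096-byte pages; kernel code would take
 * these from the system headers rather than define them by hand. */
#define SKETCH_PAGESHIFT   12
#define SKETCH_PAGESIZE    (1ULL << SKETCH_PAGESHIFT)
#define SKETCH_PAGEOFFSET  (SKETCH_PAGESIZE - 1)

/* Combine a page frame number with the in-page offset of a virtual
 * address to form the physical address. */
static uint64_t
pfn_to_pa(uint64_t pfn, uint64_t va)
{
    /* Guard against losing high bits in the shift. */
    assert(pfn <= (UINT64_MAX >> SKETCH_PAGESHIFT));

    return (pfn << SKETCH_PAGESHIFT) | (va & SKETCH_PAGEOFFSET);
}
```

For example, page frame number 0x12345 with a virtual address ending in 0x010 gives the physical address 0x12345010.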

som said...

Hi,
Do you have an idea what is the equivalent of the get_order() of Linux in Solaris 10? i.e given a size ,it should return the order/no: of pages ? Is there one or should i have to write it by hand?
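For the number of pages alone, Solaris documents btopr(9F), which converts a byte count to pages, rounding up. For the power-of-two order that Linux's get_order() returns, writing it by hand is short. The following is a minimal sketch, assuming 4096-byte pages; bytes_to_pages() and size_to_order() are names made up for this illustration and are not Solaris functions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGESHIFT 12   /* assumes 4096-byte pages */

/* Number of pages needed to hold size bytes, rounded up. */
static size_t
bytes_to_pages(size_t size)
{
    size_t mask = ((size_t)1 << SKETCH_PAGESHIFT) - 1;

    return (size >> SKETCH_PAGESHIFT) + ((size & mask) != 0);
}

/* Smallest order such that (1 << order) pages can hold size bytes. */
static unsigned int
size_to_order(size_t size)
{
    assert(size > 0);

    size_t pages = bytes_to_pages(size);
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

With these definitions, a 256K request needs 64 pages and has order 6, while 8193 bytes needs 3 pages and rounds up to order 2.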