nvk: the kernel changes needed

The initial NVK (nouveau Vulkan) experimental driver has been merged into Mesa master[1]. There's lots of work to be done before it's application ready, but the main reason it could be merged is that the initial kernel work it needs was merged into drm-misc-next[2] and will then go to drm-next for the 6.6 merge window. (This work is separate from the GSP firmware enablement required for reclocking; that is a parallel development, needed to make nvk usable.) Faith at Collabora will have a blog post about the Mesa side; this is more about the kernel journey.

What was needed in the kernel?

The nouveau kernel API was written 10 years or more ago, and was designed around OpenGL at the time. There were two major restrictions in the current uAPI that made it unsuitable for Vulkan.

  1. Buffer objects (physical memory allocations) were allocated 1:1 with virtual memory allocations for a file descriptor. This meant the kernel managed the virtual address space. For proper Vulkan support, the BO allocation and VM allocation have to be separate, and userspace should control the virtual address space.
  2. Command submission didn't use sync objects. The nouveau command submission wasn't wired up to the modern sync objects. These are pretty much a requirement for Vulkan fencing and semaphores to work properly.
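To make the first point concrete, here is a minimal Python sketch of what "userspace controls the virtual address space" means. Everything in it (`Bo`, `UserspaceVaHeap`, `bo_alloc`, the page size and the address range) is a hypothetical stand-in for illustration, not the nouveau uAPI or the nvk code.

```python
from dataclasses import dataclass
from itertools import count

PAGE = 4096  # assumed page size for this sketch


@dataclass(frozen=True)
class Bo:
    """A physical allocation; it carries no GPU address of its own."""
    handle: int
    size: int


class UserspaceVaHeap:
    """Hypothetical bump allocator: userspace, not the kernel, picks addresses."""

    def __init__(self, start, end):
        self.next = start
        self.end = end

    def alloc(self, size):
        size = (size + PAGE - 1) // PAGE * PAGE  # round up to whole pages
        if self.next + size > self.end:
            raise MemoryError("VA space exhausted")
        addr = self.next
        self.next += size
        return addr


_handles = count(1)


def bo_alloc(size):
    """Allocating memory no longer implies getting a virtual address."""
    return Bo(next(_handles), size)


heap = UserspaceVaHeap(0x100000, 0x200000)
bo = bo_alloc(8192)

# The same BO can be placed at two different addresses chosen by userspace.
first = heap.alloc(bo.size)
second = heap.alloc(bo.size)
mappings = {first: bo.handle, second: bo.handle}
```

Under the old 1:1 model the second mapping would not be expressible, because a BO and its address came as a pair.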

How to implement these?

When we kicked off the nvk idea I made a first pass at implementing a new user API, to allow the above features. I took a look at how the GPU VMA management was done in current drivers and realized that there was scope for a common component to manage the GPU VA space. I did a hacky implementation of some common code and a nouveau implementation. Luckily at the time, Danilo Krummrich had joined my team at Red Hat and needed more kernel development experience in GPU drivers. I handed my sketchy implementation to Danilo and let him run with it. He spent a lot of time learning and writing copious code. His GPU VA manager code was merged into drm-misc-next last week and his nouveau code landed today.

What is the GPU VA manager?

The idea behind the GPU VA manager is that there is no need for every driver to implement something that should essentially not be a hardware specific problem. The manager is designed to track VA allocations from userspace, and keep track of what GEM objects they are currently bound to. The implementation went through a few twists and turns and experiments. 
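As a rough illustration of the bookkeeping involved, the sketch below tracks VA ranges and the GEM object each one is bound to. It is a toy model under stated assumptions: `ToyVaManager` and `Mapping` are made-up names, a sorted Python list stands in for the kernel's tree structure, and overlapping requests are simply rejected to keep the example short.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Mapping:
    addr: int          # start of the VA range
    length: int        # size of the VA range in bytes
    gem_handle: int    # GEM object backing this range
    bo_offset: int = 0 # offset into the GEM object

    @property
    def end(self):
        return self.addr + self.length


class ToyVaManager:
    """Hypothetical tracker: which VA ranges exist and what backs them."""

    def __init__(self):
        self._starts = []  # sorted start addresses
        self._maps = []    # Mapping objects, same order as _starts

    def insert(self, m):
        i = bisect.bisect_left(self._starts, m.addr)
        # Reject overlap with the neighbour below or above the new range.
        if i > 0 and self._maps[i - 1].end > m.addr:
            raise ValueError("overlaps an existing mapping")
        if i < len(self._maps) and self._maps[i].addr < m.end:
            raise ValueError("overlaps an existing mapping")
        self._starts.insert(i, m.addr)
        self._maps.insert(i, m)

    def find(self, addr):
        """Return the mapping containing addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0 and self._maps[i].addr <= addr < self._maps[i].end:
            return self._maps[i]
        return None

    def remove(self, addr):
        """Drop the mapping that starts exactly at addr."""
        i = bisect.bisect_left(self._starts, addr)
        if i == len(self._starts) or self._starts[i] != addr:
            raise KeyError(addr)
        self._starts.pop(i)
        return self._maps.pop(i)

    def mappings_of(self, gem_handle):
        """All VA ranges currently bound to one GEM object."""
        return [m for m in self._maps if m.gem_handle == gem_handle]
```

The two lookups are the point: by address (what lives at this VA?) and by object (where is this GEM object mapped?). None of this depends on the hardware, which is why it can live in common code.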

For a long period we considered using maple tree as the core of it, but we hit a number of messy interactions between the dma-fence locking and memory allocations required to add new nodes to the maple tree. The dma-fence critical section is a hard requirement to make others deal with. In the end Danilo used an rbtree to track things. We will revisit if we can deal with maple tree again in the future. 

We had a long discussion, and a couple of rounds of implementing it both ways to see, about whether we needed to track empty sparse VMA ranges in the manager or not. nouveau wanted these, but generically we weren't sure they were helpful, and the choice also affected the uAPI, as it needed explicit operations to create/drop these. In the end we started tracking these in the driver and left the core VA manager cleaner.

Now that the code is in tree, we will start to push future drivers to use it instead of spinning their own.

What changes are needed for nouveau?

Now that the VAs are being tracked, the nouveau API needed two new entrypoints. Since BO allocation will no longer create a VM, a new API is needed to bind BO allocations to VM addresses. This is called the VM_BIND API. It has two variants:

  1. a synchronous version that immediately maps a BO to a VM and is used for the common allocation paths.
  2. an asynchronous version that is modeled after the Vulkan sparse API, and takes in/out sync objects, which use the drm scheduler to schedule the vm/bo binding.

The VM_BIND backend then does all the page table manipulation required.
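The difference between the two variants can be sketched with a toy model. This is not the nouveau uAPI: `ToyVm`, `SyncObj`, `vm_bind` and `run_scheduler` are hypothetical names, the "page table" is a dict, and the scheduler is reduced to a FIFO queue that runs a job once everything it waits on has signaled.

```python
from collections import deque


class SyncObj:
    """Stand-in for a sync object: just a signaled flag."""

    def __init__(self):
        self.signaled = False


class ToyVm:
    def __init__(self):
        self.page_table = {}    # va -> bo handle
        self.pending = deque()  # queued asynchronous bind jobs

    def _apply(self, op):
        kind, va, handle = op
        if kind == "map":
            self.page_table[va] = handle
        else:  # "unmap"
            self.page_table.pop(va, None)

    def vm_bind(self, ops, wait=(), signal=(), run_async=False):
        if not run_async:
            # Synchronous variant: the mapping exists when the call returns.
            for op in ops:
                self._apply(op)
            return
        # Asynchronous variant: queue the job behind its in-sync objects.
        self.pending.append((list(ops), list(wait), list(signal)))

    def run_scheduler(self):
        """Run queued jobs in order while the head job's waits are signaled."""
        ran = 0
        while self.pending and all(s.signaled for s in self.pending[0][1]):
            ops, _, signal = self.pending.popleft()
            for op in ops:
                self._apply(op)
            for s in signal:
                s.signaled = True  # out-sync objects fire once the bind is done
            ran += 1
        return ran
```

A synchronous `vm.vm_bind([("map", 0x1000, 1)])` updates the table immediately, while the same call with `run_async=True` and a `wait` list only takes effect after those sync objects signal and the scheduler runs the job.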
 
The second API added was an EXEC call. This takes in/out sync objects and a set of addresses that point to command buffers to execute. This uses the drm scheduler to deal with the synchronization and hands the firmware the command buffer address to execute.

Internally for nouveau this meant having to add support for the drm scheduler, adding new internal page table manipulation APIs, and wiring up the GPU VA.
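Here is a toy sketch of the EXEC flow, again with made-up names (`ToyChannel`, `ExecJob`, `exec_submit`) and not the real interface. Each job carries a list of (address, length) pairs for its command buffers plus in/out sync objects. The scheduler hands a job's addresses to the "hardware" (a plain list here) only once all of its in-sync objects have signaled. As a simplification, out-sync objects signal at hand-off rather than when the GPU actually finishes.

```python
from dataclasses import dataclass, field


class SyncObj:
    """Stand-in for a sync object: just a signaled flag."""

    def __init__(self):
        self.signaled = False


@dataclass
class ExecJob:
    pushes: list                                # (gpu_va, length) pairs
    wait: list = field(default_factory=list)    # in-sync objects
    signal: list = field(default_factory=list)  # out-sync objects


class ToyChannel:
    def __init__(self):
        self.queue = []  # submitted jobs that have not run yet
        self.ring = []   # command buffer addresses handed over, in order

    def exec_submit(self, pushes, wait=(), signal=()):
        self.queue.append(ExecJob(list(pushes), list(wait), list(signal)))

    def run_scheduler(self):
        """Keep running any job whose dependencies are all signaled."""
        progressed = True
        while progressed:
            progressed = False
            for job in list(self.queue):
                if all(s.signaled for s in job.wait):
                    self.queue.remove(job)
                    self.ring.extend(job.pushes)
                    for s in job.signal:
                        s.signaled = True
                    progressed = True


ch = ToyChannel()
uploaded, rendered = SyncObj(), SyncObj()
# Submitted first, but it has to wait for `uploaded`.
ch.exec_submit([(0x2000, 64)], wait=[uploaded], signal=[rendered])
# Submitted second, has no dependencies, and signals `uploaded`.
ch.exec_submit([(0x1000, 128)], signal=[uploaded])
ch.run_scheduler()
```

After `run_scheduler()` the ring holds the `0x1000` buffer before the `0x2000` one, because ordering comes from the sync objects rather than from submission order. That is the property Vulkan fences and semaphores rely on.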

Shoutouts:

My input was the sketchy sketch at the start, and doing the userspace changes to the nvk codebase to allow testing.

The biggest shoutout to Danilo, who took a sketchy sketch of what things should look like, created a real implementation, did all the experimental ideas I threw at him, and threw them and others back at me, negotiated with other drivers to use the common code, and built a great foundational piece of drm kernel infrastructure.

Faith at Collabora, who has done the bulk of the work on nvk, did a code review at the end and pointed out some missing pieces of the API and the optimisations it enables.

Karol at Red Hat on the main nvk driver and Ben at Red Hat for nouveau advice on how things worked, while he smashed away at the GSP rock.

(and anyone else who has contributed to nvk, nouveau and even NVIDIA for some bits :-)

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24326

[2] https://cgit.freedesktop.org/drm-misc/log/

Comments

  1. great to see things moving forward. can't wait for the gsp code to land and finally be able to test the whole (somewhat, I'm pointing at gsp here) open stack of a proper driver that should've been there baked by NVidia from the start. mad props to you all. kudos!

  2. Thank you and everyone involved for pushing forward the difficult state of open-source NVIDIA drivers.

    Can we get more information about the rather critical GSP firmware integration ? Maybe Ben is willing to write a post like this about it ?

    Do I understand GSP will also boost the existing OpenGL driver ?

    My current understanding is Linux boot design is in desperate need of changing the existing approach to "load all the firmware in the world before mounting root fs". So would it be ok to boot NVIDIA GPUs in VESA mode (or just with the existing (old) firmware used by nouveau), and load the GSP firmware afterwards ?

    Anyway keep up the amazing work -- can't wait to hear more on where this will go next !
