android/docs/ANDROID-QEMU-PIPE.TXT - platform/external/qemu - Git at Google

 Android QEMU FAST PIPES
 =======================

 Introduction:
 -------------

 The Android emulator implements a special virtual device used to provide
 _very_ fast communication channels between the guest system and the
 emulator itself.

 From the guest, usage is simply as follows:

   1/ Open the /dev/qemu_pipe device for read+write

      NOTE: Starting with Linux 3.10, the device was renamed as
            /dev/goldfish_pipe but behaves exactly in the same way.

   2/ Write a zero-terminated string describing which service you want to
      connect.

   3/ Simply use read() and write() to communicate with the service.

 In other words:

    fd = open("/dev/qemu_pipe", O_RDWR);
    const char* pipeName = "<pipename>";
    ret = write(fd, pipeName, strlen(pipeName)+1);
    if (ret < 0) {
        // error
    }
    ... ready to go

 Where <pipename> is the name of a specific emulator service you want to use.
 Supported service names are listed later in this document.


 Implementation details:
 -----------------------

 In the emulator source tree:

     ./hw/android/goldfish/pipe.c implements the virtual driver.

     ./hw/android/goldfish/pipe.h provides the interface that must be
     implemented by any emulator pipe service.

     ./android/hw-pipe-net.c contains the implementation of the network pipe
     services (i.e. 'tcp' and 'unix'). See below for details.

 In the kernel source tree:

     drivers/misc/qemupipe/qemu_pipe.c contains the driver source code
     that will be accessible as /dev/qemu_pipe within the guest.


 Device / Driver Protocol details:
 ---------------------------------

 The device and driver use an I/O memory page and an IRQ to communicate.

   - The driver writes to various I/O registers to send commands to the
     device.

   - The device raises an IRQ to instruct the driver that certain events
     occured.

   - The driver reads I/O registers to get the status of its latest command,
     or the list of events that occured in case of interrupt.

 Each opened file descriptor to /dev/qemu_pipe in the guest corresponds to a
 32-bit 'channel' value allocated by the driver.

 The following is a description of the various commands sent by the driver
 to the device. Variable names beginning with REG_ correspond to 32-bit I/O
 registers:

   0/ Channel and address values:

      Each communication channel is identified by a unique non-zero value
      which is either 32-bit or 64-bit, depending on the guest CPU
      architecture.

      The channel value sent from the kernel to the emulator with:

         void write_channel(channel) {
         #if 64BIT_GUEST_CPU
           REG_CHANNEL_HIGH = (channel >> 32);
         #endif
           REG_CHANNEL = (channel & 0xffffffffU);
         }

     Similarly, when passing a kernel address to the emulator:

         void write_address(buffer_address) {
         #if 64BIT_GUEST_CPU
           REG_ADDRESS_HIGH = (buffer_address >> 32);
         #endif
           REG_ADDRESS = (buffer_address & 0xffffffffU);
         }

   1/ Creating a new channel:

      Used by the driver to indicate that the guest just opened /dev/qemu_pipe
      that will be identified by a named '<channel>':

         write_channel(<channel>)
         REG_CMD = CMD_OPEN

      IMPORTANT: <channel> should never be 0

   2/ Closing a channel:

      Used by the driver to indicate that the guest called 'close' on the
      channel file descriptor.

         write_channel(<channel>)
         REG_CMD = CMD_CLOSE

   3/ Writing data to the channel:

      Corresponds to when the guest does a write() or writev() on the
      channel's file descriptor. This command is used to send a single
      memory buffer:

         write_channel(<channel>)
         write_address(<buffer-address>)
         REG_SIZE    = <buffer-size>
         REG_CMD     = CMD_WRITE_BUFFER

         status = REG_STATUS

     NOTE: The <buffer-address> is the *GUEST* buffer address, not the
           physical/kernel one.

     IMPORTANT: The buffer sent through this command SHALL ALWAYS be entirely
                contained inside a single page of guest memory. This is
                enforced to simplify both the driver and the device.

                When a write() spans several pages of guest memory, the
                driver will issue several CMD_WRITE_BUFFER commands in
                succession, transparently to the client.

     The value returned by REG_STATUS should be:

        > 0    The number of bytes that were written to the pipe
        0      To indicate end-of-stream status
        < 0    A negative error code (see below).

     On important error code is PIPE_ERROR_AGAIN, used to indicate that
     writes can't be performed yet. See CMD_WAKE_ON_WRITE for more.

   4/ Reading data from the channel:

     Corresponds to when the guest does a read() or readv() on the
     channel's file descriptor.

         write_channel(<channel>)
         write_address(<buffer-address>)
         REG_SIZE    = <buffer-size>
         REG_CMD     = CMD_READ_BUFFER

         status = REG_STATUS

     Same restrictions on buffer addresses/lengths and same set of error
     codes.

   5/ Waiting for write ability:

     If CMD_WRITE_BUFFER returns PIPE_ERROR_AGAIN, and the file descriptor
     is not in non-blocking mode, the driver must put the client task on a
     wait queue until the pipe service can accept data again.

     Before this, the driver will do:

         write_channel(<channel>)
         REG_CMD     = CMD_WAKE_ON_WRITE

     To indicate to the virtual device that it is waiting and should be woken
     up when the pipe becomes writable again. How this is done is explained
     later.

   6/ Waiting for read ability:

     This is the same than CMD_WAKE_ON_WRITE, but for readability instead.

         write_channel(<channel>)
         REG_CMD     = CMD_WAKE_ON_READ

   7/ Polling for write-able/read-able state:

     The following command is used by the driver to implement the select(),
     poll() and epoll() system calls where a pipe channel is involved.

         write_channel(<channel>)
         REG_CMD     = CMD_POLL
         mask = REG_STATUS

     The mask value returned by REG_STATUS is a mix of bit-flags for
     which events are available / have occured since the last call.
     See PIPE_POLL_READ / PIPE_POLL_WRITE / PIPE_POLL_CLOSED.

   8/ Signaling events to the driver:

     The device can signal events to the driver by raising its IRQ.
     The driver's interrupt handler will then have to read a list of
     (channel,mask) pairs, terminated by a single 0 value for the channel.

     In other words, the driver's interrupt handler will do:

         for (;;) {
             channel = REG_CHANNEL
             if (channel == 0)  // END OF LIST
                 break;

             mask = REG_WAKES  // BIT FLAGS OF EVENTS
             ... process events
         }

     The events reported through this list are simply:

        PIPE_WAKE_READ   :: the channel is now readable.
        PIPE_WAKE_WRITE  :: the channel is now writable.
        PIPE_WAKE_CLOSED :: the pipe service closed the connection.

     The PIPE_WAKE_READ and PIPE_WAKE_WRITE are only reported for a given
     channel if CMD_WAKE_ON_READ or CMD_WAKE_ON_WRITE (respectively) were
     issued for it.

     PIPE_WAKE_CLOSED can be signaled at any time.


   9/ Faster read/writes through parameter blocks:

     Recent Goldfish kernels implement a faster way to perform reads and writes
     that perform a single I/O write per operation (which is useful when
     emulating x86 system through KVM or HAX).

     This uses the following structure known to both the virtual device and
     the kernel, defined in $QEMU/hw/android/goldfish/pipe.h:

     For 32-bit guest CPUs:

         struct access_params {
             uint32_t channel;
             uint32_t size;
             uint32_t address;
             uint32_t cmd;
             uint32_t result;
             /* reserved for future extension */
             uint32_t flags;
         };

     And the 64-bit variant:

         struct access_params_64 {
             uint64_t channel;
             uint32_t size;
             uint64_t address;
             uint32_t cmd;
             uint32_t result;
             /* reserved for future extension */
             uint32_t flags;
         };

     This is simply a way to pack several parameters into a single structure.
     Preliminary, e.g. at boot time, the kernel will allocate one such structure
     and pass its physical address with:

        PARAMS_ADDR_LOW  = (params & 0xffffffff);
        PARAMS_ADDR_HIGH = (params >> 32) & 0xffffffff;

     Then for each operation, it will do something like:

         params.channel = channel;
         params.address = buffer;
         params.size = buffer_size;
         params.cmd = CMD_WRITE_BUFFER (or CMD_READ_BUFFER)

         REG_ACCESS_PARAMS = <any>

         status = params.status

     The write to REG_ACCESS_PARAMS will trigger the operation, i.e. QEMU will
     read the content of the params block, use its fields to perform the
     operation then write back the return value into params.status.

   10/ v2 pipe: faster read/writes through command buffers and buffer lists

     Batch access_params still only perform one buffer transfer at a time. This is OK
     for applications that only use one page of memory in each transfer, but for
     many latency/throughput-sensitive applications like OpenGL and ADB push/pull,
     the one buffer/page per operation is not sufficient.

     Ideally, if the guest wants to transfer a buffer, it should be done in
     something as resembling one step as possible.

     But, what can make this difficult is that guest pages tend to fragment all
     over the place and not be physically contiguous.

     The v2 Goldfish pipe driver and device change the register set and
     device/pipe structs to allow for more buffers to be transferred per loop of
     goldfish_pipe_read_write, increasing performance for all pipe users
     in throughput-limited situations.

     There are two key new structures to consider.

     1. The v2 pipe adds a struct goldfish_pipe_command that represents the delivery of
     multiple buffers in one I/O transaction:

     /* A per-pipe command structure, shared with the host */
     struct goldfish_pipe_command {
     	s32 cmd;		/* PipeCmdCode, guest -> host */
     	s32 id;			/* pipe id, guest -> host */
     	s32 status;		/* command execution status, host -> guest */
     	s32 reserved;	/* to pad to 64-bit boundary */
     	union {
     		/* Parameters for PIPE_CMD_{READ,WRITE} */
     		struct {
     			u64 ptrs[MAX_BUFFERS_PER_COMMAND]; 	/* buffer pointers, guest -> host */
     			u32 sizes[MAX_BUFFERS_PER_COMMAND];	/* buffer sizes, guest -> host */
     			u32 buffers_count;					/* number of buffers, guest -> host */
     			s32 consumed_size;					/* number of consumed bytes, host -> guest */
     		} rw_params;
     	};
     };

     For each pipe fd, there is a corresponding goldfish_pipe_command structure
     that can hold up to MAX_BUFFERS_PER_COMMAND (concretely, order of 10^2 so
     far) buffers. The buffers comprising the data on the guest are tracked in
     rw_params. Some effort is taken to ensure that the list is short as
     possible (i.e., merge physically contiguous pages).

     2. Previously, for host-triggered transfers, we walked through a linked
     list of pipes that had pending actions (PIPE_WAKE_***), called the
     "signaled pipes" list, and the interrupt handler would process roughly 1-3
     such pipes each time a transfer was occurring. This is not ideal because a)
     we are doing work in the interrupt handler and b) there can be a lot of
     transitions between running the kernel driver in the guest versus running
     the virtual device in the host (transitions are slow).

     Hence, pipe v2 keeps a set of pipes to process from host IRQ raises:

     /* Device-level set of buffers shared with the host */
     struct goldfish_pipe_dev_buffers {
     	struct open_command_param open_command_params;
     	struct signalled_pipe_buffer signalled_pipe_buffers[MAX_SIGNALLED_PIPES];
     };

     and each interrupt handler can then process up to MAX_SIGNALLED_PIPES at a
     time.  In addition, the work of waking up the signaled pipes is left to a
     bottom-half handler and not done in interrupt context.

   11/ goldfish_dma: directly accessing memory on guest from host

     Sometimes, for applications that need both very high throughput and very
     low latency, such as video playback, it can be useful to skip host/guest
     transitions and gathering discontiguous buffers and write directly memory
     that is also visible on the host machine.

     To do this, goldfish_dma allows any pipe fd to have ioctls to allocate
     a contiguous physical region in kernel space, and to mmap() from the guest
     userspace into that physical region. These regions can also be shared
     across processes, such as with gralloc.

     The goldfish_dma kernel driver adds struct goldfish_dma_context to
     all struct goldfish_pipe instances:

     struct goldfish_dma_context {
         struct kref kref; /* kref to safely free dma region */
         struct device* pdev_dev; /* pointer to feed to dma_***_coherent */
         void* dma_vaddr; /* kernel vaddr of dma region */
         dma_addr_t dma_bus_addr; /* kernel dma_addr_t of dma region */
         u64 dma_size; /* size of dma region */
     	u64 phys_begin; /* paddr of dma region */
     	unsigned long pfn; /* pfn of dma region */
     	u64 phys_end; /* paddr of dma region + dma_size */
     	bool locked; /* marks whether the region is currently in use */
         unsigned long refcount; /* For debugging purposes only. */
         struct goldfish_pipe_command* command_buffer; /* For safe freeing of DMA regions,
         we need another pipe command buffer that lives longer than the pipe instance */
     };

     Interface (guest side):

     The guest user calls goldfish_dma_alloc (ioctls) and then mmap() on a
     goldfish pipe fd, which means that it wants high-speed access to
     host-visible memory.

     The guest can then write into the pointer returned by mmap(), and these
     writes become immediately visible on the host without BQL or context
     switching.

     The main data structure tracking state is struct goldfish_dma_context,
     which is included as an extra pointer field in struct goldfish_pipe.  Each
     such context is associated with possibly one physical address and size
     describing the allocated DMA region, and only one allocation is allowed for
     each pipe fd. Further allocations require more open()'s of pipe fd's.

     dma_alloc_coherent() is used to obtain contiguous physical memory regions,
     and we allocate and interact with this region on both guest and host
     through the following ioctls:

     - LOCK: Lock the DMA region for data access.
     - UNLOCK: Unlock the DMA region. This may also be done from the host
       through WAKE_ON_UNLOCK_DMA.
     - CREATE_REGION: initialize size info for a DMA region.
     - GETOFF: send physical address to guest driver.
     - (UN)MAPHOST: uses goldfish_pipe_cmd to tell the host to (un)map to the
       guest physical address associated with the current dma context. This
       makes the physically contiguous memory (in)visible to the host.

     Guest userspace obtains a pointer to the DMA memory using mmap(),
     which also lazily allocates the memory with dma_alloc_coherent().
     (On last pipe close(), the region is freed).

     The mmaped() region can handle very high bandwidth transfers, and pipe
     operations can be used at the same time to handle synchronization and command
     communication.

     The virtual device has these hooks into the kernel driver:

     1. UNLOCK_DMA: This unlocks the DMA region from the host.
     2. MAP/UNMAP_HOST: This is a pipe command used to tell the host machine
     to obtain a host void* pointing at a particular guest RAM physical address.
     It is triggered when a new DMA buffer is allocated in the guest kernel
     with the ioctl(), and makes it so the host can also see into the same region.

     The host will then call cpu_physical_memory_(un)map and track all currently
     allocated regions in a global data structure (DmaMap). If we take
     snapshots, we will need to remap those regions, something also handled by
     DmaMap.

     2. UNLOCK_DMA: This pipe command signals the guest kernel that the
     host is done processing the DMA region. In most simple use cases, the guest
     will first lock its DMA region and write to it, assuming that
     the unlock will be triggered by the host after the processing is done.
     UNLOCK_DMA is how the host triggers this unlocking.

 Available services:
 -------------------

   tcp:<port>

      Open a TCP socket to a given localhost port. This provides a very fast
      pass-through that doesn't depend on the very slow internal emulator
      NAT router. Note that you can only use the file descriptor with read()
      and write() though, send() and recv() will return an ENOTSOCK error,
      as well as any socket ioctl().

      For security reasons, it is not possible to connect to non-localhost
      ports.

   unix:<path>

      Open a Unix-domain socket on the host.

   opengles

      Connects to the OpenGL ES emulation process. For now, the implementation
      is equivalent to tcp:22468, but this may change in the future.

   qemud

      Connects to the QEMUD service inside the emulator. This replaces the
      connection that was performed through /dev/ttyS1 in older Android platform
      releases. See $QEMU/docs/ANDROID-QEMUD.TXT for details.
	Android QEMU FAST PIPES
	=======================

	Introduction:
	-------------

	The Android emulator implements a special virtual device used to provide
	_very_ fast communication channels between the guest system and the
	emulator itself.

	From the guest, usage is simply as follows:

	1/ Open the /dev/qemu_pipe device for read+write

	NOTE: Starting with Linux 3.10, the device was renamed as
	/dev/goldfish_pipe but behaves exactly in the same way.

	2/ Write a zero-terminated string describing which service you want to
	connect.

	3/ Simply use read() and write() to communicate with the service.

	In other words:

	fd = open("/dev/qemu_pipe", O_RDWR);
	const char* pipeName = "<pipename>";
	ret = write(fd, pipeName, strlen(pipeName)+1);
	if (ret < 0) {
	// error
	}
	... ready to go

	Where <pipename> is the name of a specific emulator service you want to use.
	Supported service names are listed later in this document.


	Implementation details:
	-----------------------

	In the emulator source tree:

	./hw/android/goldfish/pipe.c implements the virtual driver.

	./hw/android/goldfish/pipe.h provides the interface that must be
	implemented by any emulator pipe service.

	./android/hw-pipe-net.c contains the implementation of the network pipe
	services (i.e. 'tcp' and 'unix'). See below for details.

	In the kernel source tree:

	drivers/misc/qemupipe/qemu_pipe.c contains the driver source code
	that will be accessible as /dev/qemu_pipe within the guest.


	Device / Driver Protocol details:
	---------------------------------

	The device and driver use an I/O memory page and an IRQ to communicate.

	- The driver writes to various I/O registers to send commands to the
	device.

	- The device raises an IRQ to instruct the driver that certain events
	occured.

	- The driver reads I/O registers to get the status of its latest command,
	or the list of events that occured in case of interrupt.

	Each opened file descriptor to /dev/qemu_pipe in the guest corresponds to a
	32-bit 'channel' value allocated by the driver.

	The following is a description of the various commands sent by the driver
	to the device. Variable names beginning with REG_ correspond to 32-bit I/O
	registers:

	0/ Channel and address values:

	Each communication channel is identified by a unique non-zero value
	which is either 32-bit or 64-bit, depending on the guest CPU
	architecture.

	The channel value sent from the kernel to the emulator with:

	void write_channel(channel) {
	#if 64BIT_GUEST_CPU
	REG_CHANNEL_HIGH = (channel >> 32);
	#endif
	REG_CHANNEL = (channel & 0xffffffffU);
	}

	Similarly, when passing a kernel address to the emulator:

	void write_address(buffer_address) {
	#if 64BIT_GUEST_CPU
	REG_ADDRESS_HIGH = (buffer_address >> 32);
	#endif
	REG_ADDRESS = (buffer_address & 0xffffffffU);
	}

	1/ Creating a new channel:

	Used by the driver to indicate that the guest just opened /dev/qemu_pipe
	that will be identified by a named '<channel>':

	write_channel(<channel>)
	REG_CMD = CMD_OPEN

	IMPORTANT: <channel> should never be 0

	2/ Closing a channel:

	Used by the driver to indicate that the guest called 'close' on the
	channel file descriptor.

	write_channel(<channel>)
	REG_CMD = CMD_CLOSE

	3/ Writing data to the channel:

	Corresponds to when the guest does a write() or writev() on the
	channel's file descriptor. This command is used to send a single
	memory buffer:

	write_channel(<channel>)
	write_address(<buffer-address>)
	REG_SIZE = <buffer-size>
	REG_CMD = CMD_WRITE_BUFFER

	status = REG_STATUS

	NOTE: The <buffer-address> is the GUEST buffer address, not the
	physical/kernel one.

	IMPORTANT: The buffer sent through this command SHALL ALWAYS be entirely
	contained inside a single page of guest memory. This is
	enforced to simplify both the driver and the device.

	When a write() spans several pages of guest memory, the
	driver will issue several CMD_WRITE_BUFFER commands in
	succession, transparently to the client.

	The value returned by REG_STATUS should be:

	> 0 The number of bytes that were written to the pipe
	0 To indicate end-of-stream status
	< 0 A negative error code (see below).

	On important error code is PIPE_ERROR_AGAIN, used to indicate that
	writes can't be performed yet. See CMD_WAKE_ON_WRITE for more.

	4/ Reading data from the channel:

	Corresponds to when the guest does a read() or readv() on the
	channel's file descriptor.

	write_channel(<channel>)
	write_address(<buffer-address>)
	REG_SIZE = <buffer-size>
	REG_CMD = CMD_READ_BUFFER

	status = REG_STATUS

	Same restrictions on buffer addresses/lengths and same set of error
	codes.

	5/ Waiting for write ability:

	If CMD_WRITE_BUFFER returns PIPE_ERROR_AGAIN, and the file descriptor
	is not in non-blocking mode, the driver must put the client task on a
	wait queue until the pipe service can accept data again.

	Before this, the driver will do:

	write_channel(<channel>)
	REG_CMD = CMD_WAKE_ON_WRITE

	To indicate to the virtual device that it is waiting and should be woken
	up when the pipe becomes writable again. How this is done is explained
	later.

	6/ Waiting for read ability:

	This is the same than CMD_WAKE_ON_WRITE, but for readability instead.

	write_channel(<channel>)
	REG_CMD = CMD_WAKE_ON_READ

	7/ Polling for write-able/read-able state:

	The following command is used by the driver to implement the select(),
	poll() and epoll() system calls where a pipe channel is involved.

	write_channel(<channel>)
	REG_CMD = CMD_POLL
	mask = REG_STATUS

	The mask value returned by REG_STATUS is a mix of bit-flags for
	which events are available / have occured since the last call.
	See PIPE_POLL_READ / PIPE_POLL_WRITE / PIPE_POLL_CLOSED.

	8/ Signaling events to the driver:

	The device can signal events to the driver by raising its IRQ.
	The driver's interrupt handler will then have to read a list of
	(channel,mask) pairs, terminated by a single 0 value for the channel.

	In other words, the driver's interrupt handler will do:

	for (;;) {
	channel = REG_CHANNEL
	if (channel == 0) // END OF LIST
	break;

	mask = REG_WAKES // BIT FLAGS OF EVENTS
	... process events
	}

	The events reported through this list are simply:

	PIPE_WAKE_READ :: the channel is now readable.
	PIPE_WAKE_WRITE :: the channel is now writable.
	PIPE_WAKE_CLOSED :: the pipe service closed the connection.

	The PIPE_WAKE_READ and PIPE_WAKE_WRITE are only reported for a given
	channel if CMD_WAKE_ON_READ or CMD_WAKE_ON_WRITE (respectively) were
	issued for it.

	PIPE_WAKE_CLOSED can be signaled at any time.


	9/ Faster read/writes through parameter blocks:

	Recent Goldfish kernels implement a faster way to perform reads and writes
	that perform a single I/O write per operation (which is useful when
	emulating x86 system through KVM or HAX).

	This uses the following structure known to both the virtual device and
	the kernel, defined in $QEMU/hw/android/goldfish/pipe.h:

	For 32-bit guest CPUs:

	struct access_params {
	uint32_t channel;
	uint32_t size;
	uint32_t address;
	uint32_t cmd;
	uint32_t result;
	/* reserved for future extension */
	uint32_t flags;
	};

	And the 64-bit variant:

	struct access_params_64 {
	uint64_t channel;
	uint32_t size;
	uint64_t address;
	uint32_t cmd;
	uint32_t result;
	/* reserved for future extension */
	uint32_t flags;
	};

	This is simply a way to pack several parameters into a single structure.
	Preliminary, e.g. at boot time, the kernel will allocate one such structure
	and pass its physical address with:

	PARAMS_ADDR_LOW = (params & 0xffffffff);
	PARAMS_ADDR_HIGH = (params >> 32) & 0xffffffff;

	Then for each operation, it will do something like:

	params.channel = channel;
	params.address = buffer;
	params.size = buffer_size;
	params.cmd = CMD_WRITE_BUFFER (or CMD_READ_BUFFER)

	REG_ACCESS_PARAMS = <any>

	status = params.status

	The write to REG_ACCESS_PARAMS will trigger the operation, i.e. QEMU will
	read the content of the params block, use its fields to perform the
	operation then write back the return value into params.status.

	10/ v2 pipe: faster read/writes through command buffers and buffer lists

	Batch access_params still only perform one buffer transfer at a time. This is OK
	for applications that only use one page of memory in each transfer, but for
	many latency/throughput-sensitive applications like OpenGL and ADB push/pull,
	the one buffer/page per operation is not sufficient.

	Ideally, if the guest wants to transfer a buffer, it should be done in
	something as resembling one step as possible.

	But, what can make this difficult is that guest pages tend to fragment all
	over the place and not be physically contiguous.

	The v2 Goldfish pipe driver and device change the register set and
	device/pipe structs to allow for more buffers to be transferred per loop of
	goldfish_pipe_read_write, increasing performance for all pipe users
	in throughput-limited situations.

	There are two key new structures to consider.

	1. The v2 pipe adds a struct goldfish_pipe_command that represents the delivery of
	multiple buffers in one I/O transaction:

	/* A per-pipe command structure, shared with the host */
	struct goldfish_pipe_command {
	s32 cmd; /* PipeCmdCode, guest -> host */
	s32 id; /* pipe id, guest -> host */
	s32 status; /* command execution status, host -> guest */
	s32 reserved; /* to pad to 64-bit boundary */
	union {
	/* Parameters for PIPE_CMD_{READ,WRITE} */
	struct {
	u64 ptrs[MAX_BUFFERS_PER_COMMAND]; /* buffer pointers, guest -> host */
	u32 sizes[MAX_BUFFERS_PER_COMMAND]; /* buffer sizes, guest -> host */
	u32 buffers_count; /* number of buffers, guest -> host */
	s32 consumed_size; /* number of consumed bytes, host -> guest */
	} rw_params;
	};
	};

	For each pipe fd, there is a corresponding goldfish_pipe_command structure
	that can hold up to MAX_BUFFERS_PER_COMMAND (concretely, order of 10^2 so
	far) buffers. The buffers comprising the data on the guest are tracked in
	rw_params. Some effort is taken to ensure that the list is short as
	possible (i.e., merge physically contiguous pages).

	2. Previously, for host-triggered transfers, we walked through a linked
	list of pipes that had pending actions (PIPE_WAKE_***), called the
	"signaled pipes" list, and the interrupt handler would process roughly 1-3
	such pipes each time a transfer was occurring. This is not ideal because a)
	we are doing work in the interrupt handler and b) there can be a lot of
	transitions between running the kernel driver in the guest versus running
	the virtual device in the host (transitions are slow).

	Hence, pipe v2 keeps a set of pipes to process from host IRQ raises:

	/* Device-level set of buffers shared with the host */
	struct goldfish_pipe_dev_buffers {
	struct open_command_param open_command_params;
	struct signalled_pipe_buffer signalled_pipe_buffers[MAX_SIGNALLED_PIPES];
	};

	and each interrupt handler can then process up to MAX_SIGNALLED_PIPES at a
	time. In addition, the work of waking up the signaled pipes is left to a
	bottom-half handler and not done in interrupt context.

	11/ goldfish_dma: directly accessing memory on guest from host

	Sometimes, for applications that need both very high throughput and very
	low latency, such as video playback, it can be useful to skip host/guest
	transitions and gathering discontiguous buffers and write directly memory
	that is also visible on the host machine.

	To do this, goldfish_dma allows any pipe fd to have ioctls to allocate
	a contiguous physical region in kernel space, and to mmap() from the guest
	userspace into that physical region. These regions can also be shared
	across processes, such as with gralloc.

	The goldfish_dma kernel driver adds struct goldfish_dma_context to
	all struct goldfish_pipe instances:

	struct goldfish_dma_context {
	struct kref kref; /* kref to safely free dma region */
	struct device* pdev_dev; /* pointer to feed to dma_**_coherent /
	void* dma_vaddr; /* kernel vaddr of dma region */
	dma_addr_t dma_bus_addr; /* kernel dma_addr_t of dma region */
	u64 dma_size; /* size of dma region */
	u64 phys_begin; /* paddr of dma region */
	unsigned long pfn; /* pfn of dma region */
	u64 phys_end; /* paddr of dma region + dma_size */
	bool locked; /* marks whether the region is currently in use */
	unsigned long refcount; /* For debugging purposes only. */
	struct goldfish_pipe_command* command_buffer; /* For safe freeing of DMA regions,
	we need another pipe command buffer that lives longer than the pipe instance */
	};

	Interface (guest side):

	The guest user calls goldfish_dma_alloc (ioctls) and then mmap() on a
	goldfish pipe fd, which means that it wants high-speed access to
	host-visible memory.

	The guest can then write into the pointer returned by mmap(), and these
	writes become immediately visible on the host without BQL or context
	switching.

	The main data structure tracking state is struct goldfish_dma_context,
	which is included as an extra pointer field in struct goldfish_pipe. Each
	such context is associated with possibly one physical address and size
	describing the allocated DMA region, and only one allocation is allowed for
	each pipe fd. Further allocations require more open()'s of pipe fd's.

	dma_alloc_coherent() is used to obtain contiguous physical memory regions,
	and we allocate and interact with this region on both guest and host
	through the following ioctls:

	- LOCK: Lock the DMA region for data access.
	- UNLOCK: Unlock the DMA region. This may also be done from the host
	through WAKE_ON_UNLOCK_DMA.
	- CREATE_REGION: initialize size info for a DMA region.
	- GETOFF: send physical address to guest driver.
	- (UN)MAPHOST: uses goldfish_pipe_cmd to tell the host to (un)map to the
	guest physical address associated with the current dma context. This
	makes the physically contiguous memory (in)visible to the host.

	Guest userspace obtains a pointer to the DMA memory using mmap(),
	which also lazily allocates the memory with dma_alloc_coherent().
	(On last pipe close(), the region is freed).

	The mmaped() region can handle very high bandwidth transfers, and pipe
	operations can be used at the same time to handle synchronization and command
	communication.

	The virtual device has these hooks into the kernel driver:

	1. UNLOCK_DMA: This unlocks the DMA region from the host.
	2. MAP/UNMAP_HOST: This is a pipe command used to tell the host machine
	to obtain a host void* pointing at a particular guest RAM physical address.
	It is triggered when a new DMA buffer is allocated in the guest kernel
	with the ioctl(), and makes it so the host can also see into the same region.

	The host will then call cpu_physical_memory_(un)map and track all currently
	allocated regions in a global data structure (DmaMap). If we take
	snapshots, we will need to remap those regions, something also handled by
	DmaMap.

	2. UNLOCK_DMA: This pipe command signals the guest kernel that the
	host is done processing the DMA region. In most simple use cases, the guest
	will first lock its DMA region and write to it, assuming that
	the unlock will be triggered by the host after the processing is done.
	UNLOCK_DMA is how the host triggers this unlocking.

	Available services:
	-------------------

	tcp:<port>

	Open a TCP socket to a given localhost port. This provides a very fast
	pass-through that doesn't depend on the very slow internal emulator
	NAT router. Note that you can only use the file descriptor with read()
	and write() though, send() and recv() will return an ENOTSOCK error,
	as well as any socket ioctl().

	For security reasons, it is not possible to connect to non-localhost
	ports.

	unix:<path>

	Open a Unix-domain socket on the host.

	opengles

	Connects to the OpenGL ES emulation process. For now, the implementation
	is equivalent to tcp:22468, but this may change in the future.

	qemud

	Connects to the QEMUD service inside the emulator. This replaces the
	connection that was performed through /dev/ttyS1 in older Android platform
	releases. See $QEMU/docs/ANDROID-QEMUD.TXT for details.