virCommand: docs for usage of new command APIs

* docs/internals/command.html.in: New file.
* docs/Makefile.am: Build new docs.
* docs/subsite.xsl: New glue file.
* docs/internals.html.in, docs/sitemap.html.in: Update glue.
2025-04-26 07:04:42 +00:00 · 2010-05-25 14:14:46 +01:00 · 2010-05-25 14:14:46 +01:00 · a317c50a04
commit a317c50a04
parent f16ad06fb2
5 changed files with 598 additions and 1 deletions
--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@ -60,7 +60,8 @@ gif = \
  architecture.gif \
  node.gif

-dot_html_in = $(notdir $(wildcard $(srcdir)/*.html.in)) todo.html.in
+dot_html_in = $(notdir $(wildcard $(srcdir)/*.html.in)) todo.html.in \
+      $(patsubst $(srcdir)/%,%,$(wildcard $(srcdir)/internals/*.html.in))
 dot_html = $(dot_html_in:%.html.in=%.html)

 patches = $(wildcard api_extension/*.patch)
@ -113,6 +114,14 @@ todo:
 %.png: %.fig
 	convert -rotate 90 $< $@

+internals/%.html.tmp: internals/%.html.in subsite.xsl page.xsl sitemap.html.in
+	@if [ -x $(XSLTPROC) ] ; then \
+	  echo "Generating $@"; \
+	  name=`echo $@ | sed -e 's/.tmp//'`; \
+	  $(XSLTPROC) --stringparam pagename $$name --nonet --html \
+	    $(top_srcdir)/docs/subsite.xsl $< > $@ \
+	    || { rm $@ && exit 1; }; fi
+
 %.html.tmp: %.html.in site.xsl page.xsl sitemap.html.in
 	@if [ -x $(XSLTPROC) ] ; then \
 	  echo "Generating $@"; \
--- a/docs/internals.html.in
+++ b/docs/internals.html.in
@ -7,5 +7,14 @@
      internals, adding new public APIs, new hypervisor drivers or extending
      the libvirtd daemon code.
    </p>
+
+    <ul>
+      <li>Introduction to basic rules and guidelines for <a href="hacking.html">hacking</a>
+	    on libvirt code</li>
+      <li>Guide to adding <a href="api_extension.html">public APIs</a></li>
+      <li>Approach for <a href="internals/command.html">spawning commands</a> from
+	libvirt driver code</li>
+    </ul>
+
  </body>
 </html>
--- a/docs/internals/command.html.in
+++ b/docs/internals/command.html.in
@ -0,0 +1,550 @@
+<html>
+  <body>
+    <h1>Spawning processes / commands from libvirt drivers</h1>
+
+    <ul id="toc"></ul>
+
+    <p>
+      This page describes the usage of libvirt APIs for
+      spawning processes / commands from libvirt drivers.
+      All code is required to use these APIs.
+    </p>
+
+    <h2><a name="posix">Problems with standard POSIX APIs</a></h2>
+
+    <p>
+      The POSIX specification includes a number of APIs for
+      spawning processes / commands, but each of them suffers
+      from several flaws:
+    </p>
+
+    <ul>
+      <li><code>fork+exec</code>: The lowest &amp; most flexible
+	level, but very hard to use correctly / safely. It
+	is easy to leak file descriptors, have unexpected
+	signal handler behaviour and not handle edge cases.
+	Furthermore, it is not portable to mingw.
+	</li>
+      <li><code>system</code>: Convenient if you don't care
+	about capturing command output, but has the serious
+	downside that the command string is interpreted by
+	the shell. This makes it very dangerous to use, because
+	improperly validated user input can lead to exploits
+	via shell meta characters.
+      </li>
+      <li><code>popen</code>: Inherits the flaws of
+	<code>system</code>, and has no option for bi-directional
+	communication.
+      </li>
+      <li><code>posix_spawn</code>: A half-way house between
+	the simplicity of system() and the flexibility of fork+exec.
+	It lacks a couple of important features
+	though, such as running a hook between the fork and exec
+	stages, or closing all open file descriptors.</li>
+    </ul>
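+
+    <p>
+      The shell meta character hazard described for <code>system</code>
+      and <code>popen</code> can be illustrated with a minimal sketch.
+      The helper names (<code>lines_via_shell</code>,
+      <code>lines_via_exec</code>) are hypothetical and use plain POSIX
+      calls only; this is not libvirt code. The same untrusted string
+      yields two lines of output when a shell parses it, but only one
+      when it is passed as a single <code>argv[]</code> element.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count the lines that can be read from a stream. */
static int count_lines(FILE *fp)
{
    char buf[256];
    int lines = 0;

    while (fgets(buf, sizeof(buf), fp))
        lines++;
    return lines;
}

/* Unsafe: the input is pasted into a command line that a shell parses. */
static int lines_via_shell(const char *user_input)
{
    char cmdline[256];
    FILE *fp;
    int lines;

    assert(user_input != NULL);
    snprintf(cmdline, sizeof(cmdline), "echo %s", user_input);
    fp = popen(cmdline, "r");
    if (!fp)
        return -1;
    lines = count_lines(fp);
    pclose(fp);
    return lines;
}

/* Safer: the input is one argv[] element and no shell ever sees it. */
static int lines_via_exec(const char *user_input)
{
    int pipefd[2];
    int lines;
    pid_t pid;
    FILE *fp;

    assert(user_input != NULL);
    if (pipe(pipefd) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: route stdout into the pipe, then exec without a shell. */
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);
        execlp("echo", "echo", user_input, (char *)NULL);
        _exit(127);
    }
    close(pipefd[1]);
    fp = fdopen(pipefd[0], "r");
    if (!fp) {
        close(pipefd[0]);
        waitpid(pid, NULL, 0);
        return -1;
    }
    lines = count_lines(fp);
    fclose(fp);
    waitpid(pid, NULL, 0);
    return lines;
}
```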
+
+    <p>
+      Due to the problems mentioned with each of these,
+      libvirt driver code <strong>must not use</strong> any
+      of the above APIs. Historically libvirt provided a
+      higher level API known as virExec. This was a wrapper
+      around fork+exec, in a similar style to posix_spawn,
+      but with a few more features.
+    </p>
+
+    <p>
+      This wrapper still suffered from a number of problems.
+      Handling command cleanup via waitpid() is overly
+      complex &amp; error prone for most usage. Building up the
+      argv[] + env[] string arrays is quite cumbersome and
+      error prone, particularly with respect to memory leak / OOM handling.
+    </p>
+
+    <h2><a name="api">The libvirt command execution API</a></h2>
+
+    <p>
+      There is now a high level API that provides a safe and
+      flexible way to spawn commands, which prevents the most
+      common errors &amp; is easy to code against.  This
+      code is provided in the <code>src/util/command.h</code>
+      header which can be imported using <code>#include "command.h"</code>.
+    </p>
+
+    <h3><a name="initial">Defining commands in libvirt</a></h3>
+
+    <p>
+      The first step is to declare what command is to be
+      executed. The command name can be either a fully
+      qualified path, or a bare command name. In the latter
+      case it will be resolved using the <code>$PATH</code>
+      environment variable.
+    </p>
+
+<pre>
+  virCommandPtr cmd = virCommandNew("/usr/bin/dnsmasq");
+</pre>
+
+    <p>
+      There is no need to check for allocation failure after
+      <code>virCommandNew</code>. This will be detected and
+      reported at a later time.
+    </p>
+
+    <h3><a name="args">Adding arguments to the command</a></h3>
+
+    <p>
+      There are a number of APIs for adding arguments to a
+      command. To add a direct string arg:
+    </p>
+
+<pre>
+  virCommandAddArg(cmd, "--strict-order");
+</pre>
+
+    <p>
+      If an argument takes an attached value of the form
+      <code>-arg=val</code>, then this can be done using:
+    </p>
+
+<pre>
+  virCommandAddArgPair(cmd, "--conf-file", "/etc/dnsmasq.conf");
+</pre>
+
+    <p>
+      If an argument needs to be formatted as if by
+      <code>printf</code>:
+    </p>
+
+<pre>
+  virCommandAddArgFormat(cmd, "%d", count);
+</pre>
+
+    <p>
+      To add an entire NULL terminated array of arguments in one go,
+      there are two options.
+    </p>
+
+<pre>
+  const char *const args[] = {
+      "--strict-order", "--except-interface", "lo", NULL
+  };
+  virCommandAddArgSet(cmd, args);
+  virCommandAddArgList(cmd, "--domain", "localdomain", NULL);
+</pre>
+
+    <p>
+      This can also be done at the time of initial construction of
+      the <code>virCommandPtr</code> object:
+    </p>
+
+<pre>
+  const char *const args[] = {
+      "/usr/bin/dnsmasq",
+      "--strict-order", "--except-interface",
+      "lo", "--domain", "localdomain", NULL
+  };
+  virCommandPtr cmd1 = virCommandNewArgs(args);
+  virCommandPtr cmd2 = virCommandNewArgList("/usr/bin/dnsmasq",
+                                            "--domain", "localdomain", NULL);
+</pre>
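+
+    <p>
+      The "no need to check for allocation failure" behaviour can be
+      pictured as a sticky error flag inside the command object. The
+      following is a minimal sketch of that idea, assuming a
+      hypothetical <code>struct cmd_builder</code>; it is not the
+      actual <code>virCommand</code> implementation. Every call
+      silently becomes a no-op after the first failure, and a single
+      check before running reports the problem.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical builder; NOT libvirt's virCommand implementation. */
struct cmd_builder {
    char **argv;      /* NULL-terminated argument vector */
    size_t nargs;     /* number of entries before the NULL */
    bool has_error;   /* sticky flag set on the first failure */
};

/* Append one argument; after any failure all later calls are no-ops. */
static void builder_add_arg(struct cmd_builder *b, const char *arg)
{
    char **tmp;

    if (!b || b->has_error)
        return;
    if (!arg) {
        b->has_error = true;
        return;
    }
    tmp = realloc(b->argv, (b->nargs + 2) * sizeof(*tmp));
    if (!tmp) {
        b->has_error = true;
        return;
    }
    b->argv = tmp;
    b->argv[b->nargs] = strdup(arg);
    if (!b->argv[b->nargs]) {
        b->has_error = true;
        return;
    }
    b->argv[++b->nargs] = NULL;
    assert(b->argv[b->nargs] == NULL);
}

/* Allocation failure is not reported here; it is picked up later. */
static struct cmd_builder *builder_new(const char *binary)
{
    struct cmd_builder *b = calloc(1, sizeof(*b));

    builder_add_arg(b, binary);   /* tolerates b == NULL */
    return b;
}

/* The single place where earlier failures are reported: 0 or -1. */
static int builder_check(const struct cmd_builder *b)
{
    return (b && !b->has_error) ? 0 : -1;
}

/* Safe to call with NULL, mirroring the free-without-check convention. */
static void builder_free(struct cmd_builder *b)
{
    size_t i;

    if (!b)
        return;
    for (i = 0; i < b->nargs; i++)
        free(b->argv[i]);
    free(b->argv);
    free(b);
}
```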
+
+    <h3><a name="env">Setting up the environment</a></h3>
+
+    <p>
+      By default a command will inherit all environment variables
+      from the current process. Generally this is not desirable
+      and a customized environment will be more suitable. Any
+      customization done via the following APIs will prevent
+      inheritance of any existing environment variables unless
+      explicitly allowed. The first step is usually to pass through
+      a small number of variables from the current process.
+    </p>
+
+<pre>
+  virCommandAddEnvPassCommon(cmd);
+</pre>
+
+    <p>
+      This has now set up a clean environment for the child, passing
+      through <code>PATH</code>, <code>LD_PRELOAD</code>,
+      <code>LD_LIBRARY_PATH</code>, <code>HOME</code>,
+      <code>USER</code>, <code>LOGNAME</code> and <code>TMPDIR</code>.
+      Furthermore it will explicitly set <code>LC_ALL=C</code> to
+      avoid unexpected localization of command output. Further
+      variables can be passed through from parent explicitly:
+    </p>
+
+<pre>
+  virCommandAddEnvPass(cmd, "DISPLAY");
+  virCommandAddEnvPass(cmd, "XAUTHORITY");
+</pre>
+
+    <p>
+      To define an environment variable in the child with a
+      separate key / value:
+    </p>
+
+<pre>
+  virCommandAddEnvPair(cmd, "TERM", "xterm");
+</pre>
+
+    <p>
+      If the key/value pair is already formatted as
+      <code>KEY=value</code>, it can be set directly:
+    </p>
+
+<pre>
+  virCommandAddEnvString(cmd, "TERM=xterm");
+</pre>
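+
+    <p>
+      As a rough illustration of what a "clean" environment means,
+      the sketch below builds an <code>envp</code> array from an
+      allow-list of variables and then forces <code>LC_ALL=C</code>.
+      The function names (<code>build_clean_env</code>,
+      <code>env_lookup</code>, <code>free_env</code>) are hypothetical,
+      and this is an assumption about the general technique rather
+      than the code libvirt uses. Anything not on the allow-list is
+      simply never copied.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated environment array. */
static void free_env(char **env)
{
    size_t i;

    if (!env)
        return;
    for (i = 0; env[i]; i++)
        free(env[i]);
    free(env);
}

/* Copy only the variables named in 'pass' (NULL-terminated) from the
 * current process, then append LC_ALL=C. Unset variables are skipped. */
static char **build_clean_env(const char *const *pass)
{
    size_t n = 0, count = 0, i;
    char **env;

    assert(pass != NULL);
    while (pass[n])
        n++;
    env = calloc(n + 2, sizeof(*env));   /* + LC_ALL + terminating NULL */
    if (!env)
        return NULL;
    for (i = 0; i < n; i++) {
        const char *val = getenv(pass[i]);
        size_t len;

        if (!val)
            continue;
        len = strlen(pass[i]) + strlen(val) + 2;   /* '=' and NUL */
        env[count] = malloc(len);
        if (!env[count]) {
            free_env(env);
            return NULL;
        }
        snprintf(env[count], len, "%s=%s", pass[i], val);
        count++;
    }
    env[count] = strdup("LC_ALL=C");
    if (!env[count]) {
        free_env(env);
        return NULL;
    }
    return env;
}

/* Return the value for 'key' in 'env', or NULL if it is absent. */
static const char *env_lookup(char *const *env, const char *key)
{
    size_t klen = strlen(key), i;

    for (i = 0; env[i]; i++)
        if (strncmp(env[i], key, klen) == 0 && env[i][klen] == '=')
            return env[i] + klen + 1;
    return NULL;
}
```

+    <p>
+      The resulting array could be handed to <code>execve</code> as
+      its third argument.
+    </p>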
+
+    <h3><a name="misc">Miscellaneous other options</a></h3>
+
+    <p>
+      Normally the spawned command will retain the current
+      process and process group as its parent. If the current
+      process dies, the child will then (usually) be terminated
+      too. If this cleanup is not desired, then the command
+      should be marked as daemonized:
+    </p>
+
+<pre>
+  virCommandDaemonize(cmd);
+</pre>
+
+    <p>
+      When daemonizing a command, the PID visible from the
+      caller will be that of the intermediate process, not
+      the actual daemonized command. If the PID of the real
+      command is required then a pidfile can be requested:
+    </p>
+
+<pre>
+  virCommandSetPidFile(cmd, "/var/run/dnsmasq.pid");
+</pre>
+
+    <p>
+      This PID file is guaranteed to be written before
+      the intermediate process exits.
+    </p>
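+
+    <p>
+      The relationship between the intermediate process, the real
+      command and the pidfile can be sketched with a classic double
+      fork. This is only an illustration under stated assumptions:
+      <code>spawn_daemon</code> and <code>read_pidfile</code> are
+      hypothetical helpers, and the real libvirt code may differ in
+      detail. The point shown is the ordering: the intermediate
+      process writes the pidfile and only then exits, so the caller
+      can read it as soon as the wait returns.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv detached from the caller; returns 0 once the intermediate
 * process has written the real command's PID to 'pidfile' and exited. */
static int spawn_daemon(char *const argv[], const char *pidfile)
{
    pid_t mid, child;
    int status;
    FILE *fp;

    assert(argv != NULL && argv[0] != NULL && pidfile != NULL);
    mid = fork();
    if (mid < 0)
        return -1;
    if (mid == 0) {
        /* Intermediate process: new session, then fork the real command. */
        if (setsid() < 0)
            _exit(1);
        child = fork();
        if (child < 0)
            _exit(1);
        if (child == 0) {
            if (chdir("/") < 0)
                _exit(126);
            execvp(argv[0], argv);
            _exit(127);
        }
        /* Write the pidfile BEFORE exiting, so the caller can rely on it. */
        fp = fopen(pidfile, "w");
        if (!fp || fprintf(fp, "%ld\n", (long)child) < 0 || fclose(fp) != 0)
            _exit(1);
        _exit(0);
    }
    /* Caller: only the intermediate process is waited for. */
    while (waitpid(mid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Read back the PID of the real command, or -1 on failure. */
static long read_pidfile(const char *pidfile)
{
    long pid = -1;
    FILE *fp = fopen(pidfile, "r");

    if (!fp)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1)
        pid = -1;
    fclose(fp);
    return pid;
}
```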
+
+    <h3><a name="privs">Reducing command privileges</a></h3>
+
+    <p>
+      Normally a command will inherit all privileges of
+      the current process. To restrict what a command can
+      do, it is possible to request that all its capabilities
+      are cleared. With this done it will only be able to
+      access resources for which it has explicit DAC permissions:
+    </p>
+
+<pre>
+  virCommandClearCaps(cmd);
+</pre>
+
+    <h3><a name="fds">Managing file handles</a></h3>
+
+    <p>
+      To prevent unintended resource leaks to child processes, the
+      child defaults to closing all open file handles, and setting
+      stdin/out/err to <code>/dev/null</code>.  It is possible to
+      allow an open file handle to be passed into the child, while
+      controlling whether that handle remains open in the parent or
+      guaranteeing that the handle will be closed in the parent after
+      either virCommandRun or virCommandFree.
+    </p>
+
+<pre>
+  int sharedfd = open("cmd.log", O_RDWR | O_CREAT | O_TRUNC, 0644);
+  int childfd = open("conf.txt", O_RDONLY);
+  virCommandPreserveFD(cmd, sharedfd);
+  virCommandTransferFD(cmd, childfd);
+  if (VIR_CLOSE(sharedfd) &lt; 0)
+      goto cleanup;
+</pre>
+
+    <p>
+      With this, both file descriptors sharedfd and childfd in the
+      current process remain open as the same file descriptors in the
+      child. Meanwhile, after the child is spawned, sharedfd remains
+      open in the parent, while childfd is closed.  For stdin/out/err
+      it is usually necessary to map a file handle. To attach file
+      descriptor 7 in the current process to stdin in the child:
+    </p>
+
+<pre>
+  virCommandSetInputFD(cmd, 7);
+</pre>
+
+    <p>
+      Similarly, to redirect stdout or stderr in the child,
+      pass in a pointer to the desired handle:
+    </p>
+
+<pre>
+  int outfd = open("out.log", O_RDWR | O_CREAT | O_TRUNC, 0644);
+  int errfd = open("err.log", O_RDWR | O_CREAT | O_TRUNC, 0644);
+  virCommandSetOutputFD(cmd, &amp;outfd);
+  virCommandSetErrorFD(cmd, &amp;errfd);
+</pre>
+
+    <p>
+      Alternatively it is possible to request that a pipe be
+      created to fetch stdout/err in the parent, by initializing
+      the FD to -1.
+    </p>
+
+<pre>
+  int outfd = -1;
+  int errfd = -1;
+  virCommandSetOutputFD(cmd, &amp;outfd);
+  virCommandSetErrorFD(cmd, &amp;errfd);
+</pre>
+
+    <p>
+      Once the command is running, <code>outfd</code>
+      and <code>errfd</code> will be initialized with
+      valid file handles that can be read from.  It is
+      permissible to pass the same pointer for both outfd
+      and errfd, in which case both standard streams in
+      the child will share the same fd in the parent.
+    </p>
+
+    <p>
+      Normally, file descriptors opened to collect output from a child
+      process perform blocking I/O, but the parent process can request
+      non-blocking mode:
+    </p>
+
+<pre>
+  virCommandNonblockingFDs(cmd);
+</pre>
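+
+    <p>
+      For readers unfamiliar with non-blocking mode, the following
+      sketch shows what it means for a descriptor that collects child
+      output: a read on an empty pipe returns immediately instead of
+      waiting. It uses only <code>fcntl</code> and <code>read</code>;
+      the helpers <code>set_nonblocking</code> and
+      <code>read_available</code>, and the
+      <code>READ_WOULD_BLOCK</code> constant, are hypothetical names
+      introduced for this illustration.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical return value meaning "no data yet, try again later". */
#define READ_WOULD_BLOCK (-2)

/* Switch a descriptor to non-blocking mode, keeping its other flags. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns bytes read (> 0), 0 at end of file, READ_WOULD_BLOCK if the
 * descriptor has no data right now, or -1 on any other error. */
static ssize_t read_available(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return READ_WOULD_BLOCK;
    return n;
}
```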
+
+    <h3><a name="buffers">Feeding &amp; capturing strings to/from the child</a></h3>
+
+    <p>
+      Often dealing with file handles for stdin/out/err
+      is unnecessarily complex. It is possible to specify
+      a string buffer to act as the data source for the
+      child's stdin, if there are no embedded NUL bytes,
+      and if the command will be run with virCommandRun:
+    </p>
+
+<pre>
+  const char *input = "Hello World\n";
+  virCommandSetInputBuffer(cmd, input);
+</pre>
+
+    <p>
+      Similarly it is possible to request that the child's
+      stdout/err be redirected into a string buffer, if the
+      output is not expected to contain NUL bytes, and if
+      the command will be run with virCommandRun:
+    </p>
+
+<pre>
+  char *output = NULL, *errors = NULL;
+  virCommandSetOutputBuffer(cmd, &amp;output);
+  virCommandSetErrorBuffer(cmd, &amp;errors);
+</pre>
+
+    <p>
+      Once the command has finished executing, these buffers
+      will contain the output. It is the caller's responsibility
+      to free these buffers.
+    </p>
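+
+    <p>
+      To show the plumbing that string buffers hide, here is a
+      minimal POSIX sketch that feeds a string to a child's stdin and
+      collects its stdout into a heap buffer. The helper
+      <code>run_filter</code> is hypothetical, not a libvirt API. It
+      also assumes the input is small enough to fit in the pipe
+      buffer and that the child reads its stdin; a robust version
+      would interleave reads and writes and handle
+      <code>SIGPIPE</code>.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv with 'input' on stdin; return its stdout as a malloc'd,
 * NUL-terminated string (caller frees), or NULL on failure. */
static char *run_filter(char *const argv[], const char *input)
{
    int in[2], out[2];
    size_t len, off = 0, cap = 256, used = 0;
    char *buf;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && input != NULL);
    if (pipe(in) < 0)
        return NULL;
    if (pipe(out) < 0) {
        close(in[0]); close(in[1]);
        return NULL;
    }
    pid = fork();
    if (pid < 0) {
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        return NULL;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout and exec. */
        dup2(in[0], STDIN_FILENO);
        dup2(out[1], STDOUT_FILENO);
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        execvp(argv[0], argv);
        _exit(127);
    }
    close(in[0]);
    close(out[1]);

    /* Feed the whole input, then close to signal end of file. */
    len = strlen(input);
    while (off < len) {
        ssize_t w = write(in[1], input + off, len - off);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        off += (size_t)w;
    }
    close(in[1]);

    /* Collect output until the child closes its stdout. */
    buf = malloc(cap);
    while (buf) {
        ssize_t r;
        if (used + 1 >= cap) {
            char *tmp = realloc(buf, cap * 2);
            if (!tmp) { free(buf); buf = NULL; break; }
            buf = tmp;
            cap *= 2;
        }
        r = read(out[0], buf + used, cap - used - 1);
        if (r < 0 && errno == EINTR)
            continue;
        if (r <= 0)
            break;
        used += (size_t)r;
    }
    close(out[0]);
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    if (buf)
        buf[used] = '\0';
    return buf;
}
```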
+
+    <h3><a name="directory">Setting working directory</a></h3>
+
+    <p>
+      Daemonized commands are always run with "/" as the current
+      working directory.  All other commands default to running in the
+      same working directory as the parent process, but an alternate
+      directory can be specified:
+    </p>
+
+<pre>
+  virCommandSetWorkingDirectory(cmd, LOCALSTATEDIR);
+</pre>
+
+    <h3><a name="hooks">Any additional hooks</a></h3>
+
+    <p>
+      If anything else is needed, it is possible to request a hook
+      function that is called in the child after the fork, as the
+      last thing before changing directories, dropping capabilities,
+      and executing the new process.  If hook(opaque) returns
+      non-zero, then the child process will not be run.
+    </p>
+
+<pre>
+  virCommandSetPreExecHook(cmd, hook, opaque);
+</pre>
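+
+    <p>
+      The hook contract (run in the child after the fork, with a
+      non-zero return preventing the exec) can be sketched in plain
+      POSIX as follows. The names <code>run_with_hook</code>,
+      <code>hook_chdir</code> and <code>HOOK_FAILED_EXIT</code> are
+      hypothetical, and the exit code used to signal a failed hook is
+      an assumption of this sketch, not something libvirt defines.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exit code this sketch uses when the hook vetoes the exec. */
#define HOOK_FAILED_EXIT 125

typedef int (*exec_hook)(void *opaque);

/* Example hook: change directory to the path passed as opaque data. */
static int hook_chdir(void *opaque)
{
    return chdir((const char *)opaque) < 0 ? -1 : 0;
}

/* Fork, run the hook in the child, and exec only if it returned 0.
 * Returns the child's exit code, or -1 if it could not be obtained. */
static int run_with_hook(char *const argv[], exec_hook hook, void *opaque)
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the hook runs after fork() and before exec(). */
        if (hook && hook(opaque) != 0)
            _exit(HOOK_FAILED_EXIT);
        execvp(argv[0], argv);
        _exit(127);
    }
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```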
+
+    <h3><a name="logging">Logging commands</a></h3>
+
+    <p>
+      Sometimes, it is desirable to log what command will be run, or
+      even to use virCommand solely for creation of a single
+      consolidated string without running anything.
+    </p>
+
+<pre>
+  int logfd = ...;
+  char *timestamp = virTimestamp();
+  char *string = NULL;
+
+  dprintf(logfd, "%s: ", timestamp);
+  VIR_FREE(timestamp);
+  virCommandWriteArgLog(cmd, logfd);
+
+  string = virCommandToString(cmd);
+  if (string)
+      VIR_DEBUG("about to run %s", string);
+  VIR_FREE(string);
+  if (virCommandRun(cmd, NULL) &lt; 0)
+      return -1;
+</pre>
+
+    <h3><a name="sync">Running commands synchronously</a></h3>
+
+    <p>
+      For most commands, the desired behaviour is to spawn
+      the command, wait for it to complete &amp; exit and then
+      check that its exit status is zero:
+    </p>
+
+<pre>
+  if (virCommandRun(cmd, NULL) &lt; 0)
+     return -1;
+</pre>
+
+    <p>
+      <strong>Note:</strong> if the command has been daemonized
+      this will only block &amp; wait for the intermediate process,
+      not the real command. <code>virCommandRun</code> will
+      report on any errors that have occurred up to this point
+      with all previous API calls. If the command fails to
+      run, or exits with non-zero status, an error will be
+      reported via the normal libvirt error infrastructure. If a
+      non-zero exit status can represent a success condition,
+      it is possible to request the exit status and perform
+      that check manually instead of letting <code>virCommandRun</code>
+      raise the error:
+    </p>
+
+<pre>
+  int status;
+  if (virCommandRun(cmd, &amp;status) &lt; 0)
+     return -1;
+
+  if (WEXITSTATUS(status) ...) {
+    ...do stuff...
+  }
+</pre>
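+
+    <p>
+      The two calling styles (pass NULL and let any non-zero exit be
+      an error, or pass a pointer and inspect the result yourself)
+      can be mimicked with a small POSIX sketch. The helper
+      <code>run_command</code> is hypothetical and, unlike the
+      snippet above, hands back the already decoded exit code rather
+      than a raw wait status; it also shows the
+      <code>WIFEXITED</code> check that should precede
+      <code>WEXITSTATUS</code>.
+    </p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv and wait for it. With exitcode == NULL, any non-zero exit
 * is treated as failure (-1). With a pointer, the decoded exit code is
 * stored and 0 is returned so the caller can judge it. */
static int run_command(char *const argv[], int *exitcode)
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);   /* exec failed */
    }
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    if (!WIFEXITED(status))
        return -1;    /* e.g. killed by a signal: no exit code exists */
    if (exitcode) {
        *exitcode = WEXITSTATUS(status);
        return 0;
    }
    return WEXITSTATUS(status) == 0 ? 0 : -1;
}
```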
+
+    <h3><a name="async">Running commands asynchronously</a></h3>
+
+    <p>
+      In certain complex scenarios, particularly when special
+      I/O handling is required for the child's stdin/out/err,
+      it will be necessary to run the command asynchronously
+      and wait for completion separately.
+    </p>
+
+<pre>
+  pid_t pid;
+  if (virCommandRunAsync(cmd, &amp;pid) &lt; 0)
+     return -1;
+
+  ... do something while pid is running ...
+
+  int status;
+  if (virCommandWait(cmd, &amp;status) &lt; 0)
+     return -1;
+
+  if (WEXITSTATUS(status)...) {
+     ..do stuff..
+  }
+</pre>
+
+    <p>
+      As with <code>virCommandRun</code>, the <code>status</code>
+      arg for <code>virCommandWait</code> can be omitted, in which
+      case it will validate that exit status is zero and raise an
+      error if not.
+    </p>
+
+
+    <h3><a name="release">Releasing resources</a></h3>
+
+    <p>
+      Once the command has been executed, or if execution
+      has been abandoned, it is necessary to release
+      resources associated with the <code>virCommandPtr</code>
+      object. This is done with:
+    </p>
+
+<pre>
+  virCommandFree(cmd);
+</pre>
+
+    <p>
+      There is no need to check if <code>cmd</code> is NULL
+      before calling <code>virCommandFree</code>. This scenario
+      is handled automatically. If the command is still running,
+      it will be forcibly killed and cleaned up (via waitpid).
+    </p>
+
+    <h2><a name="example">Complete examples</a></h2>
+
+    <p>
+      This shows a complete example usage of the APIs, roughly
+      based on the libvirt source src/util/hooks.c:
+    </p>
+
+<pre>
+int runhook(const char *drvstr, const char *id,
+            const char *opstr, const char *subopstr,
+            const char *extra, const char *input) {
+  int ret;
+  char *path;
+  virCommandPtr cmd;
+
+  ret = virBuildPath(&amp;path, LIBVIRT_HOOK_DIR, drvstr);
+  if ((ret &lt; 0) || (path == NULL)) {
+      virHookReportError(VIR_ERR_INTERNAL_ERROR,
+                         _("Failed to build path for %s hook"),
+                         drvstr);
+      return -1;
+  }
+
+  cmd = virCommandNew(path);
+  VIR_FREE(path);
+
+  virCommandAddEnvPassCommon(cmd);
+
+  virCommandAddArgList(cmd, id, opstr, subopstr, extra, NULL);
+
+  virCommandSetInputBuffer(cmd, input);
+
+  ret = virCommandRun(cmd, NULL);
+
+  virCommandFree(cmd);
+
+  return ret;
+}
+</pre>
+
+    <p>
+      In this example, the command is being run synchronously.
+      A pre-formatted string is being fed to the command as
+      its stdin. The command takes four arguments, and has a
+      minimal set of environment variables passed down. In
+      this example, the code does not require any error checking.
+      All errors are reported by the <code>virCommandRun</code>
+      method, and the exit status from this is returned to
+      the caller to handle as desired.
+    </p>
+
+  </body>
+</html>
--- a/docs/sitemap.html.in
+++ b/docs/sitemap.html.in
@ -260,6 +260,10 @@
                <a href="api_extension.html">API extensions</a>
                <span>Adding new public libvirt APIs</span>
              </li>
+              <li>
+                <a href="internals/command.html">Spawning commands</a>
+                <span>Spawning commands from libvirt driver code</span>
+              </li>
            </ul>
          </li>
          <li>
--- a/docs/subsite.xsl
+++ b/docs/subsite.xsl
@ -0,0 +1,25 @@
+<?xml version="1.0"?>
+<xsl:stylesheet
+  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+  xmlns:exsl="http://exslt.org/common"
+  exclude-result-prefixes="xsl exsl"
+  version="1.0">
+
+  <xsl:import href="page.xsl"/>
+
+  <xsl:output
+    method="xml"
+    encoding="UTF-8"
+    indent="yes"
+    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
+    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
+
+  <xsl:variable name="href_base" select="'../'"/>
+
+  <xsl:template match="/">
+    <xsl:apply-templates select="." mode="page">
+      <xsl:with-param name="pagename" select="$pagename"/>
+    </xsl:apply-templates>
+  </xsl:template>
+
+</xsl:stylesheet>