|  | =================================================================== | 
|  | How To Add Your Build Configuration To LLVM Buildbot Infrastructure | 
|  | =================================================================== | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | This document contains information about adding a build configuration and | 
|  | buildbot-worker to private worker builder to LLVM Buildbot Infrastructure. | 
|  |  | 
|  | Buildmasters | 
|  | ============ | 
|  |  | 
|  | There are two buildmasters running. | 
|  |  | 
|  | * The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders | 
|  | attached to this machine will notify commit authors every time they break | 
|  | the build. | 
|  | * The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders | 
|  | attached to this machine will be completely silent by default when the build | 
|  | is broken. This buildmaster is reconfigured every two hours with any new | 
|  | commits from the llvm-zorg repository. | 
|  |  | 
|  | In order to remain connected to the main buildmaster (and thus notify | 
|  | developers of failures), a builbot must: | 
|  |  | 
|  | * Be building a supported configuration.  Builders for experimental backends | 
|  | should generally be attached to staging buildmaster. | 
|  | * Be able to keep up with new commits to the main branch, or at a minimum | 
|  | recover to tip of tree within a couple of days of falling behind. | 
|  |  | 
|  | Additionally, we encourage all bot owners to point their bots towards the | 
|  | staging master during maintenance windows, instability troubleshooting, and | 
|  | such. | 
|  |  | 
|  | Roles & Expectations | 
|  | ==================== | 
|  |  | 
|  | Each buildbot has an owner who is the responsible party for addressing problems | 
|  | which arise with said buildbot.  We generally expect the bot owner to be | 
|  | reasonably responsive. | 
|  |  | 
|  | For some bots, the ownership responsibility is split between a "resource owner" | 
|  | who provides the underlying machine resource, and a "configuration owner" who | 
|  | maintains the build configuration.  Generally, operational responsibility lies | 
|  | with the "config owner".  We do expect "resource owners" - who are generally | 
|  | the contact listed in a workers attributes - to proxy requests to the relevant | 
|  | "config owner" in a timely manner. | 
|  |  | 
|  | Most issues with a buildbot should be addressed directly with a bot owner | 
|  | via email.  Please CC `Galina Kistanova <mailto:gkistanova@gmail.com>`_. | 
|  |  | 
|  | Steps To Add Builder To LLVM Buildbot | 
|  | ===================================== | 
|  | Volunteers can provide their build machines to work as build workers to | 
|  | public LLVM Buildbot. | 
|  |  | 
|  | Here are the steps you can follow to do so: | 
|  |  | 
|  | #. Check the existing build configurations to make sure the one you are | 
|  | interested in is not covered yet or gets built on your computer much | 
|  | faster than on the existing one. We prefer faster builds so developers | 
|  | will get feedback sooner after changes get committed. | 
|  |  | 
|  | #. The computer you will be registering with the LLVM buildbot | 
|  | infrastructure should have all dependencies installed and be able to | 
|  | build your configuration successfully. Please check what degree | 
|  | of parallelism (-j param) would give the fastest build.  You can build | 
|  | multiple configurations on one computer. | 
|  |  | 
|  | #. Install buildbot-worker (currently we are using buildbot version 2.8.4). | 
|  | This specific version can be installed using ``pip``, with a command such | 
|  | as ``pip3 install buildbot-worker==2.8.4``. | 
|  |  | 
|  | #. Create a designated user account, your buildbot-worker will be running under, | 
|  | and set appropriate permissions. | 
|  |  | 
|  | #. Choose the buildbot-worker root directory (all builds will be placed under | 
|  | it), buildbot-worker access name and password the build master will be using | 
|  | to authenticate your buildbot-worker. | 
|  |  | 
|  | #. Create a buildbot-worker in context of that buildbot-worker account. Point it | 
|  | to the **lab.llvm.org** port **9994** (see `Buildbot documentation, | 
|  | Creating a worker | 
|  | <http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_ | 
|  | for more details) by running the following command: | 
|  |  | 
|  | .. code-block:: bash | 
|  |  | 
|  | $ buildbot-worker create-worker <buildbot-worker-root-directory> \ | 
|  | lab.llvm.org:9994 \ | 
|  | <buildbot-worker-access-name> \ | 
|  | <buildbot-worker-access-password> | 
|  |  | 
|  | Only once a new worker is stable, and | 
|  | approval from Galina has been received (see last step) should it | 
|  | be pointed at the main buildmaster. | 
|  |  | 
|  | Now start the worker: | 
|  |  | 
|  | .. code-block:: bash | 
|  |  | 
|  | $ buildbot-worker start <buildbot-worker-root-directory> | 
|  |  | 
|  | This will cause your new worker to connect to the staging buildmaster | 
|  | which is silent by default. | 
|  |  | 
|  | Try this once then check the log file | 
|  | ``<buildbot-worker-root-directory>/worker/twistd.log``. If your settings | 
|  | are correct you will see a refused connection. This is good and expected, | 
|  | as the credentials have not been established on both ends. Now stop the | 
|  | worker and proceed to the next steps. | 
|  |  | 
|  | #. Fill the buildbot-worker description and admin name/e-mail.  Here is an | 
|  | example of the buildbot-worker description:: | 
|  |  | 
|  | Windows 7 x64 | 
|  | Core i7 (2.66GHz), 16GB of RAM | 
|  |  | 
|  | g++.exe (TDM-1 mingw32) 4.4.0 | 
|  | GNU Binutils 2.19.1 | 
|  | cmake version 2.8.4 | 
|  | Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86 | 
|  |  | 
|  | See `here <http://docs.buildbot.net/current/manual/installation/worker.html>`_ | 
|  | for which files to edit. | 
|  |  | 
|  | #. Send a patch which adds your build worker and your builder to | 
|  | `zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM | 
|  | `workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_. | 
|  |  | 
|  | * workers are added to ``buildbot/osuosl/master/config/workers.py`` | 
|  | * builders are added to ``buildbot/osuosl/master/config/builders.py`` | 
|  |  | 
|  | Please make sure your builder name and its builddir are unique through the | 
|  | file. | 
|  |  | 
|  | All new builders should default to using the "'collapseRequests': False" | 
|  | configuration.  This causes the builder to build each commit individually | 
|  | and not merge build requests.  To maximize quality of feedback to developers, | 
|  | we *strongly prefer* builders to be configured not to collapse requests. | 
|  | This flag should be removed only after all reasonable efforts have been | 
|  | exhausted to improve build times such that the builder can keep up with | 
|  | commit flow. | 
|  |  | 
|  | It is possible to allow email addresses to unconditionally receive | 
|  | notifications on build failure; for this you'll need to add an | 
|  | ``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``. | 
|  | This is particularly useful for the staging buildmaster which is silent | 
|  | otherwise. | 
|  |  | 
|  | #. Send the buildbot-worker access name and the access password directly to | 
|  | `Galina Kistanova <mailto:gkistanova@gmail.com>`_, and wait until she | 
|  | lets you know that your changes are applied and buildmaster is | 
|  | reconfigured. | 
|  |  | 
|  | #. Make sure you can start the buildbot-worker and successfully connect | 
|  | to the silent buildmaster. Then set up your buildbot-worker to start | 
|  | automatically at the start up time.  See the buildbot documentation | 
|  | for help.  You may want to restart your computer to see if it works. | 
|  |  | 
|  | #. Check the status of your buildbot-worker on the `Waterfall Display (Staging) | 
|  | <http://lab.llvm.org/staging/#/waterfall>`_ to make sure it is | 
|  | connected, and the `Workers Display (Staging) | 
|  | <http://lab.llvm.org/staging/#/workers>`_ to see if administrator | 
|  | contact and worker information are correct. | 
|  |  | 
|  | #. At this point, you have a working builder connected to the staging | 
|  | buildmaster.  You can now make sure it is reliably green and keeps | 
|  | up with the build queue.  No notifications will be sent, so you can | 
|  | keep an unstable builder connected to staging indefinitely. | 
|  |  | 
|  | #. (Optional) Once the builder is stable on the staging buildmaster with | 
|  | several days of green history, you can choose to move it to the production | 
|  | buildmaster to enable developer notifications.  Please email `Galina | 
|  | Kistanova <mailto:gkistanova@gmail.com>`_ for review and approval. | 
|  |  | 
|  | To move a worker to production (once approved), stop your worker, edit the | 
|  | buildbot.tac file to change the port number from 9994 to 9990 and start it | 
|  | again. | 
|  |  | 
|  | Best Practices for Configuring a Fast Builder | 
|  | ============================================= | 
|  |  | 
|  | As mentioned above, we generally have a strong preference for | 
|  | builders which can build every commit as they come in.  This section | 
|  | includes best practices and some recommendations as to how to achieve | 
|  | that end. | 
|  |  | 
|  | The goal | 
|  | In 2020, the monorepo had just under 35 thousand commits.  This works | 
|  | out to an average of 4 commits per hour.  Already, we can see that a | 
|  | builder must cycle in less than 15 minutes to have a hope of being | 
|  | useful.  However, those commits are not uniformly distributed.  They | 
|  | tend to cluster strongly during US working hours.  Looking at a couple | 
|  | of recent (Nov 2021) working days, we routinely see ~10 commits per | 
|  | hour during peek times, with occasional spikes as high as ~15 commits | 
|  | per hour.  Thus, as a rule of thumb, we should plan for our builder to | 
|  | complete ~10-15 builds an hour. | 
|  |  | 
|  | Resource Appropriately | 
|  | At 10-15 builds per hour, we need to complete a new build on average every | 
|  | 4 to 6 minutes.  For anything except the fastest of hardware/build configs, | 
|  | this is going to be well beyond the ability of a single machine.  In buildbot | 
|  | terms, we likely going to need multiple workers to build requests in parallel | 
|  | under a single builder configuration.  For some rough back of the envelope | 
|  | numbers, if your build config takes e.g. 30 minutes, you will need something | 
|  | on the order of 5-8 workers.  If your build config takes ~2 hours, you'll | 
|  | need something on the order of 20-30 workers.  The rest of this section | 
|  | focuses on how to reduce cycle times. | 
|  |  | 
|  | Restrict what you build and test | 
|  | Think hard about why you're setting up a bot, and restrict your build | 
|  | configuration as much as you can.  Basic functionality is probably | 
|  | already covered by other bots, and you don't need to duplicate that | 
|  | testing.  You only need to be building and testing the *unique* parts | 
|  | of the configuration.  (e.g. For a multi-stage clang builder, you probably | 
|  | don't need to be enabling every target or building all the various utilities.) | 
|  |  | 
|  | It can sometimes be worthwhile splitting a single builder into two or more, | 
|  | if you have multiple distinct purposes for the same builder.  As an example, | 
|  | if you want to both a) confirm that all of LLVM builds with your host | 
|  | compiler, and b) want to do a multi-stage clang build on your target, you | 
|  | may be better off with two separate bots.  Splitting increases resource | 
|  | consumption, but makes it easy for each bot to keep up with commit flow. | 
|  | Additionally, splitting bots may assist in triage by narrowing attention to | 
|  | relevant parts of the failing configuration. | 
|  |  | 
|  | In general, we recommend Release build types with Assertions enabled.  This | 
|  | generally provides a good balance between build times and bug detection for | 
|  | most buildbots.  There may be room for including some debug info (e.g. with | 
|  | `-gmlt`), but in general the balance between debug info quality and build | 
|  | times is a delicate one. | 
|  |  | 
|  | Use Ninja & LLD | 
|  | Ninja really does help build times over Make, particularly for highly | 
|  | parallel builds.  LLD helps to reduce both link times and memory usage | 
|  | during linking significantly.  With a build machine with sufficient | 
|  | parallelism, link times tend to dominate critical path of the build, and are | 
|  | thus worth optimizing. | 
|  |  | 
|  | Use CCache and NOT incremental builds | 
|  | Using ccache materially improves average build times.  Incremental builds | 
|  | can be slightly faster, but introduce the risk of build corruption due to | 
|  | e.g. state changes, etc...  At this point, the recommendation is not to | 
|  | use incremental builds and instead use ccache as the latter captures the | 
|  | majority of the benefit with less risk of false positives. | 
|  |  | 
|  | One of the non-obvious benefits of using ccache is that it makes the | 
|  | builder less sensitive to which projects are being monitored vs built. | 
|  | If a change triggers a build request, but doesn't change the build output | 
|  | (e.g. doc changes, python utility changes, etc..), the build will entirely | 
|  | hit in cache and the build request will complete in just the testing time. | 
|  |  | 
|  | With multiple workers, it is tempting to try to configure a shared cache | 
|  | between the workers.  Experience to date indicates this is difficult to | 
|  | well, and that having local per-worker caches gets most of the benefit | 
|  | anyways.  We don't currently recommend shared caches. | 
|  |  | 
|  | CCache does depend on the builder hardware having sufficient IO to access | 
|  | the cache with reasonable access times - i.e. a fast disk, or enough memory | 
|  | for a RAM cache, etc..  For builders without, incremental may be your best | 
|  | option, but is likely to require higher ongoing involvement from the | 
|  | sponsor. | 
|  |  | 
|  | Enable batch builds | 
|  | As a last resort, you can configure your builder to batch build requests. | 
|  | This makes the build failure notifications markedly less actionable, and | 
|  | should only be done once all other reasonable measures have been taken. | 
|  |  | 
|  | Leave it on the staging buildmaster | 
|  | While most of this section has been biased towards builders intended for | 
|  | the main buildmaster, it is worth highlighting that builders can run | 
|  | indefinitely on the staging buildmaster.  Such a builder may still be | 
|  | useful for the sponsoring organization, without concern of negatively | 
|  | impacting the broader community.  The sponsoring organization simply | 
|  | has to take on the responsibility of all bisection and triage. | 
|  |  | 
|  |  |