
 Continuous Integration
 ======================

 GitLab CI
 ---------

 GitLab provides a convenient framework for running commands in response to Git pushes.
 We use it to test merge requests (MRs) before they are merged (pre-merge testing)
 and to test everything that lands on ``master`` (post-merge testing).
 Post-merge testing is necessary because we still allow commits to be pushed
 outside of MRs, and even for MRs the CI runs in the forked repository, which
 might have been modified and is therefore unreliable.

 The CI runs a number of tests, from trivial build-testing to complex GPU rendering:

 - Build testing for a number of build systems, configurations and platforms
 - Sanity checks (``meson test`` & ``scons check``)
 - Some drivers (softpipe, llvmpipe, freedreno and panfrost) are also tested
   using `VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__
 - Replay of application traces

 A typical run takes between 20 and 30 minutes, although it can take much longer
 if the GitLab runners are overwhelmed, which happens sometimes. When it does,
 not much can be done besides waiting it out or cancelling the run.

 Due to limited resources, we currently do not run the CI automatically
 on every push; instead, we only run it automatically once the MR has
 been assigned to ``Marge``, our merge bot.

 If you're interested in the details, the main configuration file is ``.gitlab-ci.yml``,
 and it references a number of other files in ``.gitlab-ci/``.

 If the GitLab CI doesn't seem to be running on your fork (or MRs, as they run
 in the context of your fork), you should check the "Settings" of your fork.
 Under "CI / CD" → "General pipelines", make sure "Custom CI config path" is
 empty (or set to the default ``.gitlab-ci.yml``), and that the
 "Public pipelines" box is checked.

 If you're having issues with the GitLab CI, your best bet is to ask
 about it on ``#freedesktop`` on Freenode and tag `Daniel Stone
 <https://gitlab.freedesktop.org/daniels>`__ (``daniels`` on IRC) or
 `Eric Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
 IRC).

 The three GitLab CI systems currently integrated are:


 .. toctree::
    :maxdepth: 1

    bare-metal
    LAVA
    docker

 Intel CI
 --------

 The Intel CI is not yet integrated into the GitLab CI.
 For now, special access must be granted manually (file an issue in
 `the Intel CI configuration repo <https://gitlab.freedesktop.org/Mesa_CI/mesa_jenkins>`__
 if you think you or Mesa would benefit from you having access to the Intel CI).
 Results are published on `mesa-ci.01.org <https://mesa-ci.01.org>`__
 for people who are *not* Intel employees; Intel employees
 can access a better interface on
 `mesa-ci-results.jf.intel.com <http://mesa-ci-results.jf.intel.com>`__.

 The Intel CI runs a much larger array of tests, on a number of generations
 of Intel hardware and on multiple platforms (x11, wayland, drm & android),
 with the purpose of detecting regressions.
 Tests include
 `Crucible <https://gitlab.freedesktop.org/mesa/crucible>`__,
 `VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__,
 `dEQP <https://android.googlesource.com/platform/external/deqp>`__,
 `Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__,
 `Skia <https://skia.googlesource.com/skia>`__,
 `VkRunner <https://github.com/Igalia/vkrunner>`__,
 `WebGL <https://github.com/KhronosGroup/WebGL>`__,
 and a few other tools.
 A typical run takes between 30 minutes and an hour.

 If you're having issues with the Intel CI, your best bet is to ask about
 it on ``#dri-devel`` on Freenode and tag `Clayton Craft
 <https://gitlab.freedesktop.org/craftyguy>`__ (``craftyguy`` on IRC) or
 `Nico Cortes <https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes``
 on IRC).

 .. _CI-farm-expectations:

 CI farm expectations
 --------------------

 To make sure that testing of one vendor's drivers doesn't block
 unrelated work by other vendors, we require that a given driver's test
 farm produces a spurious failure no more than once a week.  If every
 driver had CI and failed once a week, we would be seeing someone's
 code getting blocked on a spurious failure daily, which is an
 unacceptable cost to the project.
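
 As a rough check of that arithmetic, assume :math:`N` independent farms that
 each sit exactly at the limit of one spurious failure per week. The value
 :math:`N = 7` is only an illustrative count, not a figure from the project:

 .. math::

    N \times \frac{1\ \text{failure}}{7\ \text{days}}
    = \frac{N}{7}\ \text{failures per day},
    \qquad N = 7 \;\Rightarrow\; 1\ \text{failure per day}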

 Additionally, the test farm needs to provide a short enough
 turnaround time that we can get our MRs through marge-bot without the
 pipeline backing up.  As a result, we require that the test farm be
 able to handle a whole pipeline's worth of jobs in less than 5 minutes
 (for comparison, the build stage takes about 10 minutes, provided all
 your jobs get scheduled on the shared runners in time).

 If a test farm lacks the hardware to provide these guarantees, consider
 dropping tests to reduce runtime.
 ``VK-GL-CTS/scripts/log/bottleneck_report.py`` can help you find which
 tests were slow in a ``results.qpa`` file.  Alternatively, a job with
 no ``parallel`` field can set:

 .. code-block:: yaml

     variables:
       CI_NODE_INDEX: 1
       CI_NODE_TOTAL: 10

 to just run 1/10th of the test list.
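
 For contrast, here is a sketch of what the ``parallel`` keyword does when it
 *is* set: GitLab starts that many instances of the job and sets
 ``CI_NODE_INDEX`` and ``CI_NODE_TOTAL`` in each one, so the instances together
 cover the whole list. The job name and script are hypothetical, and the sketch
 assumes the test script reads the two variables to pick its share:

 .. code-block:: yaml

     # Hypothetical job; the name and script are placeholders.
     example-driver-test:
       # GitLab creates 10 instances of this job and sets CI_NODE_INDEX
       # (1 through 10) and CI_NODE_TOTAL (10) in each of them.
       parallel: 10
       script:
         - ./run-tests.sh

 Setting the two variables by hand, as in the fragment above, in effect runs
 only one of those ten shares.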

 If a hardware CI farm goes offline (the network dies and all CI pipelines
 end up stalled) or its runners are consistently failing spuriously (disk
 full?), and the maintainer is not immediately available to fix the
 issue, please push through an MR that disables that farm's jobs by adding
 ``.`` to the front of the job names until the maintainer can bring
 things back up.  If this happens, the farm maintainer should send a
 report to mesa-dev@lists.freedesktop.org after the fact, explaining
 what happened and what the mitigation plan is for that failure next
 time.
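
 As an illustration of that renaming: GitLab treats any job whose name starts
 with ``.`` as hidden and does not run it. The job name and script below are
 hypothetical:

 .. code-block:: yaml

     # Before: the job runs as part of the pipeline.
     # example-farm-gles2:
     #   script:
     #     - ./run-tests.sh

     # After: the leading '.' hides the job, so GitLab skips it.
     .example-farm-gles2:
       script:
         - ./run-tests.sh

 Any job that refers to the old name, for example through ``needs``, would
 need the same update.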