blob: ef11bb41560a2fe3dc4c81cf9edf72733593f6ba [file] [log] [blame]
<html devsite>
<head>
<title>VTS Dashboard Database</title>
<meta name="project_path" value="/_project.yaml" />
<meta name="book_path" value="/_book.yaml" />
</head>
<body>
<!--
Copyright 2017 The Android Open Source Project
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>
To support a continuous integration dashboard that is scalable, performant, and
flexible, the VTS Dashboard backend must be carefully designed with a strong
understanding of database functionality.
<a href="https://cloud.google.com/datastore/docs/" class="external">Google Cloud
Datastore</a> is a NoSQL database that offers transactional ACID guarantees and
eventual consistency as well as strong consistency within entity groups.
However, the structure is very different than SQLdatabases (and even Cloud
Bigtable); instead of tables, rows, and cells there are kinds, entities, and
properties.
</p>
<p>
The following sections outline the data structure and querying patterns for
creating an effective backend for the VTS Dashboard web service.
</p>
<h2 id=entities>Entities</h2>
<p>
The following entities store summaries and resources from VTS test runs:
</p>
<ul>
<li><strong>Test Entity</strong>. Stores metadata about test runs of a
particular test. Its key is the test name and its properties include the failure
count, passing count, and list of test case breakages from when the alert jobs
update it.</li>
<li><strong>Test Run Entity</strong>. Contains metadata from runs of a
particular test. It must store the test start and end timestamps, the test build
ID, the number of passing and failing test cases, the type of run (e.g.
pre-submit, post-submit, or local), a list of log links, the host machine name,
and coverage summary counts.</li>
<li><strong>Device Information Entity</strong>. Contains details about the
devices used during the test run. It includes the device build ID, product name,
build target, branch, and ABI information. This is stored separately from the
test run entity to support multi-device test runs in a one-to-many fashion.</li>
<li><strong>Profiling Point Run Entity</strong>. Summarizes the data gathered
for a particular profiling point within a test run. It describes the axis
labels, profiling point name, values, type, and regression mode of the profiling
data.</li>
<li><strong>Coverage Entity</strong>. Describes the coverage data gathered for
one file. It contains the Git project information, file path, and the list of
coverage counts per line in the source file.</li>
<li><strong>Test Case Run Entity</strong>. Describes the outcome of a particular
test case from a test run, including the test case name and its result.</li>
<li><strong>User Favorites Entity</strong>. Each user subscription can be
represented in an entity containing a reference to the test and the user ID
generated from the App Engine user service. This allows for efficient
bi-directional querying (i.e. for all users subscribed to a test and for all
tests favorited by a user).</li>
</ul>
<h2 id=entity-grouping>Entity grouping</h2>
<p>
Each test module represents the root of an entity group. Test run entities
are both children of this group and parents for device entities, profiling point
entities, and coverage entities relevant to the respective test and test run
ancestor.
</p>
<img src="images/treble_vts_dash_entity_ancestry.png">
<figcaption><strong>Figure 1</strong>. Test entity ancestry.</figcaption>
<p class="key-point"><strong>Key Point:</strong> When designing ancestry
relationships, you must balance the need to provide effective and consistent
querying mechanisms against the limitations enforced by the database.
</p>
<h3 id=benefits>Benefits</h3>
<p>
The consistency requirement ensures that future operations will not see the
effects of a transaction until it commits, and that transactions in the past are
visible to present operations. In Cloud Datastore, entity grouping creates
islands of strong read and write consistency within the group, which in this
case is all of test runs and data related to a test module. This offers the
following benefits:
</p>
<ul>
<li>Reads and updates to test module state by alert jobs can be treated as
atomic</li>
<li>Guaranteed consistent view of test case results within test modules</li>
<li>Faster querying within ancestry trees</li>
</ul>
<h3 id=limitations>Limitations</h3>
<p>
Writing to an entity group at a rate faster than one entity per second is not
advised as some writes may be rejected. As long as the alert jobs and the
uploading does not happen at a rate faster than one write per second, the
structure is solid and guarantees strong consistency.
</p>
<p>
Ultimately, the cap of one write per test module per second is reasonable because
test runs usually take at least one minute including the overhead of the VTS
framework; unless a test is consistently being executed simultaneously on more
than 60 different hosts, there cannot be a write bottleneck. This becomes even
more unlikely given that each module is part of a test plan which often takes
longer than one hour. Anomalies can easily be handled if hosts run the tests at
the same time, causing short bursts of writes to the same hosts (e.g. by
catching write errors and trying again).
</p>
<h3 id=scaling>Scaling considerations</h3>
<p>
A test run doesn't necessarily need to have the test as its parent (e.g. it
could take some other key and have test name, test start time as properties);
however, this will exchange strong consistency for eventual consistency. For
instance, the alert job may not see a mutually consistent snapshot of the most
recent test runs within a test module, which means that the global state may not
depict a fully accurate representation of sequence of test runs. This may also
impact the display of test runs within a single test module, which may not
necessarily be a consistent snapshot of the run sequence. Eventually the
snapshot will be consistent, but there are no guarantees the freshest data
will be.
</p>
<h2 id=test-cases>Test cases</h2>
<p>
Another potential bottleneck is large tests with many test cases. The two
operative constraints are the write throughput maximum within of an entity group
of one per second, along with a maximum transaction size of 500 entities.
</p>
<p>
One approach would be to specify a test case that has a test run as an ancestor
(similar to how coverage data, profiling data, and device information
are stored):
</p>
<img src="images/treble_vts_descend_not.png">
<figcaption><strong>Figure 2</strong>. Test Cases descend from Test Runs (NOT
RECOMMENDED).</figcaption>
<p>While this approach offers atomicity and consistency, it imposes strong
limitations on tests: If a transaction is limited to 500 entities, then a test
can have no more than 498 test cases (assuming no coverage or profiling data).
If a test were to exceed this, then a single transaction could not write all of
the test case results at once, and dividing the test cases into separate
transactions could exceed the maximum entity group write throughput of one
iteration per second. As this solution will not scale well without sacrificing
performance, it is not recommended.
</p>
<p>
However, instead of storing the test case results as children of the test run,
the test cases can be stored independently and their keys provided to the test
run (a test run contains a list of identifiers to its test cases entities):
</p>
<img src="images/treble_vts_descend.png">
<figcaption><strong>Figure 3</strong>. Test Cases stored independently
(RECOMMENDED).</figcaption>
<p>
At first glance, this may appear to break the strong consistency guarantee.
However, if the client has a test run entity and a list of test case
identifiers, it doesn't need to construct a query; it can instead directly get
the test cases by their identifiers, which is always guaranteed to be
consistent. This approach vastly alleviates the constraint on the number of test
cases a test run may have while gaining strong consistency without threatening
excessive writing within an entity group.
</p>
<h2 id=patterns>Data access patterns</h2>
<p>
The VTS Dashboard uses the following data access patterns:
</p>
<ul>
<li><strong>User favorites</strong>. Can be queried for by using an equality
filter on user favorites entities having the particular App Engine User object
as a property.</li>
<li><strong>Test listing</strong>. Simple query of test entities. To reduce
bandwidth to render the home page, a projection can be used on passing and
failing counts so as to omit the potentially long listing of failed test case
IDs and other metadata used by the alerting jobs.</li>
<li><strong>Test runs</strong>. Querying for test run entities requires a sort
on the key (timestamp) and possible filtering on the test run properties such as
build ID, passing count, etc. By performing an ancestor query with a test entity
key, the read is strongly consistent. At this point, all of the test case
results can be retrieved using the list of IDs stored in a test run property;
this also is guaranteed to be a strongly consistent outcome by the nature of
datastore get operations.</li>
<li><strong>Profiling and coverage data</strong>. Querying for profiling or
coverage data associated with a test can be done without also retrieving any
other test run data (such as other profiling/coverage data, test case data,
etc.). An ancestor query using the test test and test run entity keys will
retrieve all profiling points recorded during the test run; by also filtering on
the profiling point name or filename, a single profiling or coverage entity can
be retrieved. By the nature of ancestor queries, this operation is strongly
consistent.</li>
</ul>
<p>
For details on the UI and screenshots of these data patterns in action, see
<a href="ui.html">VTS Dashboard UI</a>.
</p>
</body>
</html>