gopls/internal/lsp/cache: make type-checking incremental

In this CL type-checked packages are made entirely independent of each
other, and package export data and indexes are stored in a file cache.
As a result, gopls uses significantly less memory, and (with a warm
cache) starts significantly faster. Other benchmarks have regressed
slightly due to the additional I/O and export data loading, but not
significantly so, and we have some ideas for how to further narrow or
even close the performance gap.

In the benchmarks below, based on the x/tools repository, we can see
that in-use memory was reduced by 88%, and startup time with a warm
cache by 65% (this is the best case where nothing has changed). Other
benchmarks regressed by 10-50%, much of which can be addressed by
improvements to the objectpath package (golang/go#51017), and by making
package data serialization asynchronous to type-checking.

Notably, we observe larger regressions in implementations, references,
and rename because the index implementations (by Alan Donovan) preceded
this change to type-checking, and so these benchmark statistics compare
in-memory index performance to on-disk index performance. Again, we can
optimize these if necessary by keeping certain index information in
memory, or by decoding more selectively.

name                              old in_use_bytes  new in_use_bytes  delta
InitialWorkspaceLoad/tools-12            432M ± 2%          50M ± 2%    -88.54%  (p=0.000 n=10+10)

name                              old time/op       new time/op       delta
StructCompletion/tools-12              27.2ms ± 5%       31.8ms ± 9%    +16.99%  (p=0.000 n=9+9)
ImportCompletion/tools-12              2.07ms ± 8%       2.21ms ± 6%     +6.64%  (p=0.004 n=9+9)
SliceCompletion/tools-12               29.0ms ± 5%       32.7ms ± 5%    +12.78%  (p=0.000 n=10+9)
FuncDeepCompletion/tools-12            39.6ms ± 6%       39.3ms ± 3%       ~     (p=0.853 n=10+10)
CompletionFollowingEdit/tools-12       72.7ms ± 7%      108.1ms ± 7%    +48.59%  (p=0.000 n=9+9)
Definition/tools-12                     525µs ± 6%        601µs ± 2%    +14.33%  (p=0.000 n=9+10)
DidChange/tools-12                     6.17ms ± 7%       6.77ms ± 2%     +9.64%  (p=0.000 n=10+10)
Hover/tools-12                         2.11ms ± 5%       2.61ms ± 3%    +23.87%  (p=0.000 n=10+10)
Implementations/tools-12               4.04ms ± 3%      60.19ms ± 3%  +1389.77%  (p=0.000 n=9+10)
InitialWorkspaceLoad/tools-12           3.84s ± 4%        1.33s ± 2%    -65.47%  (p=0.000 n=10+9)
References/tools-12                    9.72ms ± 6%      24.28ms ± 6%   +149.83%  (p=0.000 n=10+10)
Rename/tools-12                         121ms ± 8%        168ms ±12%    +38.92%  (p=0.000 n=10+10)
WorkspaceSymbols/tools-12              14.4ms ± 6%       15.6ms ± 3%     +8.76%  (p=0.000 n=9+10)

This CL is one step closer to the end* of a long journey to reduce
memory usage and statefulness in gopls, so that it can be more
performant and reliable.

Specifically, this CL implements a new type-checking pass that loads and
stores export data, cross references, serialized diagnostics, and method
set indexes in the file system. Concurrent type-checking passes may
share in-progress work, but after type-checking only active packages are
kept in memory. Consequently, there can be no global relationship
between type-checked packages. The work to break any dependence on
global relationships was done over a long time leading up to this CL.

In order to approach the previous type-checking performance, the
following new optimizations are made:
 - the global FileSet is completely removed: repeatedly importing from
   export data resulted in a tremendous amount of unnecessary token.File
   information, and so FileSets had to be scoped to packages
 - files are parsed as a batch and stored in the LRU cache implemented
   in the preceding CL
 - type-checking is also turned into a batch process, so that
   overlapping nodes in the package graph may be shared during large
   type-checking operations such as the initial workspace load

This new execution model enables several simplifications:
 - We no longer need to trim the AST before type-checking:
   TypeCheckMode and ParseExported are gone.
 - We no longer need to do careful bookkeeping around parsed files: all
   parsing uses the LRU parse cache.
 - It is no longer necessary to estimate cache heap usage in debug
   information.

There is still much more to do. This new model for gopls's execution
requires significant testing and experimentation. There may be new bugs
in the complicated new algorithms that enable this change, or bugs
related to the new reliance on export data (this may be the first time
export data for packages with type errors is significantly exercised).
There may be new environments where the new execution model does not
have the same beneficial effect. (On the other hand, there may be
some where it has an even more beneficial effect, such as resource
limited environments like dev containers.) There are also a lot of new
opportunities for optimization now that we are no longer tied to a rigid
structure of in-memory data.

*Furthermore, the following planned work is simply not done yet:
 - Implement precise pruning based on "deep" hash of imports.
 - Rewrite unimported completion, now that we no longer have cached
   import paths.

For golang/go#57987

Change-Id: Iedfc16656f79e314be448b892b710b9e63f72551
Reviewed-on: https://go-review.googlesource.com/c/tools/+/466975
Run-TryBot: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
57 files changed
tree: 6fdeca77bd2b0aaf4b863edb11ae539e37b2ed08
  1. benchmark/
  2. blog/
  3. cmd/
  4. container/
  5. copyright/
  6. cover/
  7. go/
  8. godoc/
  9. gopls/
  10. imports/
  11. internal/
  12. playground/
  13. present/
  14. refactor/
  15. txtar/
  16. .gitattributes
  17. .gitignore
  18. .prettierrc
  19. codereview.cfg
  20. CONTRIBUTING.md
  21. go.mod
  22. go.sum
  23. LICENSE
  24. PATENTS
  25. README.md
README.md

Go Tools

PkgGoDev

This repository provides the golang.org/x/tools module, comprising various tools and packages mostly for static analysis of Go programs, some of which are listed below. Use the “Go reference” link above for more information about any package.

It also contains the golang.org/x/tools/gopls module, whose root package is a language-server protocol (LSP) server for Go. An LSP server analyses the source code of a project and responds to requests from a wide range of editors such as VSCode and Vim, allowing them to support IDE-like functionality.

Selected commands:

  • cmd/goimports formats a Go program like go fmt and additionally inserts import statements for any packages required by the file after it is edited.
  • cmd/callgraph prints the call graph of a Go program.
  • cmd/digraph is a utility for manipulating directed graphs in textual notation.
  • cmd/stringer generates declarations (including a String method) for “enum” types.
  • cmd/toolstash is a utility to simplify working with multiple versions of the Go toolchain.

These commands may be fetched with a command such as

go install golang.org/x/tools/cmd/goimports@latest

Selected packages:

  • go/ssa provides a static single-assignment form (SSA) intermediate representation (IR) for Go programs, similar to a typical compiler, for use by analysis tools.

  • go/packages provides a simple interface for loading, parsing, and type checking a complete Go program from source code.

  • go/analysis provides a framework for modular static analysis of Go programs.

  • go/callgraph provides call graphs of Go programs using a variety of algorithms with different trade-offs.

  • go/ast/inspector provides an optimized means of traversing a Go parse tree for use in analysis tools.

  • go/cfg provides a simple control-flow graph (CFG) for a Go function.

  • go/expect reads Go source files used as test inputs and interprets special comments within them as queries or assertions for testing.

  • go/gcexportdata and go/gccgoexportdata read and write the binary files containing type information used by the standard and gccgo compilers.

  • go/types/objectpath provides a stable naming scheme for named entities (“objects”) in the go/types API.

Numerous other packages provide more esoteric functionality.

Contributing

This repository uses Gerrit for code changes. To learn how to submit changes, see https://golang.org/doc/contribute.html.

The main issue tracker for the tools repository is located at https://github.com/golang/go/issues. Prefix your issue with “x/tools/(your subdir):” in the subject line, so it is easy to find.

JavaScript and CSS Formatting

This repository uses prettier to format JS and CSS files.

The version of prettier used is 1.18.2.

It is encouraged that all JS and CSS code be run through this before submitting a change. However, it is not a strict requirement enforced by CI.