| // Copyright 2016 The Kythe Authors. All rights reserved. |
| // |
| // Licensed under the Apache License, Version 2.0 (the "License"); |
| // you may not use this file except in compliance with the License. |
| // You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, software |
| // distributed under the License is distributed on an "AS IS" BASIS, |
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| // See the License for the specific language governing permissions and |
| // limitations under the License. |
| |
| = Indexing Generated Code |
| |
| :Revision: 1.0 |
| :toc2: |
| :toclevels: 3 |
| :priority: 999 |
| |
| Source code generators like link:https://www.gnu.org/software/flex/[Flex], |
| link:https://www.gnu.org/software/bison/[GNU Bison], and |
| link:http://www.swig.org/[SWIG] take a high-level description of a software |
| component and generate the code necessary to realize that component in a |
| lower-level or general-purpose programming language. Users browsing projects |
| that use these components usually want cross-references to take them |
| from use sites of a generated interface to the high-level code that brought |
| that interface into being. They do not normally want to see the generated |
| implementation, as this is often difficult (or uninteresting) to read. This |
| document describes how to encode information about generated code to permit |
| cross-language links. |
| |
| To make the discussion concrete, suppose we are working with two languages, |
| SourceLang and TargetLang. SourceLang has `.source` files and TargetLang has |
| `.target` files. We also have a tool (the Generator) that produces a |
| `foo.target` file from a `foo.source` file. We have the following components: |
| |
| * Source Indexer - Kythe indexer that takes `.source` files and outputs index |
| data. |
| * Target Indexer - Kythe indexer that takes `.target` files and outputs index |
| data. |
| * Generator - tool that produces `.target` files from `.source` files. |
| * Post-processor - Kythe tool that takes the index data produced by all |
| indexers, processes it, and outputs the final Kythe graph that contains data |
| for both SourceLang and TargetLang. |
| |
| Now we want to teach Kythe how to create cross-references between the generated |
| `foo.target` file and the original `foo.source` file. The main idea is simple: |
| the Generator outputs extra data that maps elements in `foo.target` to the |
| original elements in `foo.source`. When the Target Indexer indexes |
| `foo.target`, it uses that mapping to output *generates* or *imputes* edges. |
| These edges connect nodes from `foo.target` with nodes in `foo.source`. |
| |
| Kythe doesn't require implementors to use one particular approach for passing |
| mapping metadata and outputting *generates* and *imputes* edges. Below we |
| describe two different approaches, each with its own pros and cons. Both |
| assume that implementors can change the Generator and the Target Indexer. |
| Where possible, the *generates* approach is preferred because it requires less |
| post-processing work. |
| |
| TIP: You can find an example implementation at |
| link:https://github.com/kythe/kythe/tree/master/kythe/examples/proto[GitHub]. |
| The current sample web UI does not interpret the parts of the schema we will |
| use; this is a work in progress. |
| |
| == Java To JavaScript with *imputes* edges |
| |
| This approach is generic and works for any combination of SourceLang and |
| TargetLang. In this example we generate JavaScript files from Java files, so |
| SourceLang is Java and TargetLang is JavaScript. Given `Color.java`: |
| |
| [source,java] |
| ------------------------------------------------------------------------------- |
| public enum Color { |
|   RED; |
| } |
| ------------------------------------------------------------------------------- |
| |
| The Generator produces `color.js`: |
| |
| [source,javascript] |
| ------------------------------------------------------------------------------- |
| const Color = { |
|   RED: 0, |
| }; |
| ------------------------------------------------------------------------------- |
| |
| === Changes to Generator |
| |
| To support cross-references between `color.js` and `Color.java` we need to |
| update the Generator to output the following mapping data for the `Color` and |
| `RED` elements. |
| |
| [source,json] |
| ------------------------------------------------------------------------------- |
| { |
| "type": "kythe0", |
| "meta": [{ |
| "type": "anchor_anchor", |
| "source_begin": 12, |
| "source_end": 17, |
| "target_begin": 6, |
| "target_end": 11, |
| "edge": "/kythe/edge/imputes", |
| "source_vname": { |
| "corpus": "corpus", |
| "path": "path/to/Color.java" |
| } |
| }, { |
| "type": "anchor_anchor", |
| "source_begin": 22, |
| "source_end": 25, |
| "target_begin": 18, |
| "target_end": 21, |
| "edge": "/kythe/edge/imputes", |
| "source_vname": { |
| "corpus": "corpus", |
| "path": "path/to/Color.java" |
| } |
| }] |
| } |
| ------------------------------------------------------------------------------- |
| |
| This mapping has two `meta` entries: the first for `Color`, the second for |
| `RED`. Note: |
| |
| * The entries do not contain element names, only the positions of elements |
| in the source (`Color.java`) and target (`color.js`) files. |
| * Each position is a byte offset within the file, not a line/column pair. |
| Kythe anchors are defined by byte offsets, and the JavaScript indexer that |
| processes this mapping must output an *anchor* for `Color.java` without |
| having access to that file (it sees only JS files), so it cannot translate |
| line/column positions into byte offsets itself. |
| * The entries contain positions rather than the VNames of elements in |
| `Color.java` or `color.js`. The VNames of nodes are internal details of each |
| indexer and are subject to change. The Generator is usually a standalone |
| tool that doesn't know the rules for producing VNames in a specific language, |
| so it generally cannot output the VNames of nodes. If in your case VNames are |
| stable and well-specified, you can use the simpler *generates* approach |
| described in the Protocol Buffers section below. |
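As a minimal sketch of the byte-offset bookkeeping described above, the following Python snippet builds the two `meta` entries for the `Color` example. It assumes two-space indentation in both files and locates tokens by searching the text; a real Generator would more likely record offsets as it writes its output. The helper names `byte_span` and `meta_entry` are hypothetical and not part of Kythe.

```python
import json

# Contents of the two files from the example (two-space indentation assumed).
JAVA_SRC = "public enum Color {\n  RED;\n}\n"
JS_OUT = "const Color = {\n  RED: 0,\n};\n"


def byte_span(text, token):
    """Return (begin, end) byte offsets of the first occurrence of token."""
    data = text.encode("utf-8")
    needle = token.encode("utf-8")
    begin = data.index(needle)
    return begin, begin + len(needle)


def meta_entry(token, corpus, source_path):
    """Build one anchor_anchor entry (hypothetical helper, not a Kythe API)."""
    source_begin, source_end = byte_span(JAVA_SRC, token)
    target_begin, target_end = byte_span(JS_OUT, token)
    return {
        "type": "anchor_anchor",
        "source_begin": source_begin,
        "source_end": source_end,
        "target_begin": target_begin,
        "target_end": target_end,
        "edge": "/kythe/edge/imputes",
        "source_vname": {"corpus": corpus, "path": source_path},
    }


metadata = {
    "type": "kythe0",
    "meta": [
        meta_entry("Color", "corpus", "path/to/Color.java"),
        meta_entry("RED", "corpus", "path/to/Color.java"),
    ],
}
print(json.dumps(metadata, indent=2))
```

Offsets are computed on the UTF-8 encoding rather than on Python string indices, since the two differ as soon as a file contains non-ASCII characters.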
| |
| To pass this mapping to the JavaScript Indexer, the Generator appends it |
| as a comment at the end of `color.js`: |
| |
| [source,javascript] |
| ------------------------------------------------------------------------------- |
| const Color = { |
|   RED: 0, |
| }; |
| |
| // Kythe Indexing Metadata: |
| // {"type":"kythe0","meta":[{"type":"anchor_anchor","source_begin":13,... |
| ------------------------------------------------------------------------------- |
| |
| Inlining the metadata in `color.js` avoids passing extra files to the Indexer. |
| All the Indexer needs to know is that some JavaScript files may contain |
| metadata on the last line, and how to parse it. |
| |
| One downside is that it adds noise to `color.js`, but generated files are |
| usually invisible to developers, so this is not a big concern. |
| |
| === Changes to JavaScript Indexer |
| |
| On the JavaScript Indexer side we need to parse the metadata and output |
| *imputes* edges. To find the metadata, the indexer can check the last two lines |
| of each `.js` file: if the second-to-last line is |
| `// Kythe Indexing Metadata:`, it parses the last line as JSON. |
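The round trip between the Generator and the Indexer could look roughly like the Python sketch below. The function names are hypothetical, and the exact comment layout (a marker line followed by a single `//` line holding compact JSON) is an assumption based on the example above.

```python
import json

MARKER = "// Kythe Indexing Metadata:"


def append_metadata(js_text, metadata):
    """Generator side: append the mapping as a trailing two-line comment."""
    payload = json.dumps(metadata, separators=(",", ":"))  # always one line
    if not js_text.endswith("\n"):
        js_text += "\n"
    return f"{js_text}\n{MARKER}\n// {payload}\n"


def extract_metadata(js_text):
    """Indexer side: return the parsed mapping, or None if there is none."""
    lines = js_text.rstrip("\n").split("\n")
    if len(lines) < 2 or lines[-2].strip() != MARKER:
        return None
    last = lines[-1].strip()
    if not last.startswith("//"):
        return None
    return json.loads(last[2:])


generated = "const Color = {\n  RED: 0,\n};\n"
mapping = {"type": "kythe0", "meta": []}
with_meta = append_metadata(generated, mapping)
print(extract_metadata(with_meta) == mapping)
print(extract_metadata(generated))
```

Because the comment is appended after the generated code, the `target_begin` and `target_end` offsets recorded in the mapping stay valid.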
| |
| For each `meta` entry the indexer should do the following: |
| |
| 1. Output an *anchor* using `source_begin` and `source_end`. `source_vname` |
| identifies the file containing the anchor. |
| 2. Find the JavaScript node that has a *defines/binding* anchor at the same |
| `target_begin`/`target_end` position. |
| 3. Output one *imputes* edge from the *anchor* created in step 1 to the node |
| found in step 2. |
| |
| Note that this only applies to meta entries with type `anchor_anchor`. For other |
| types structure might be different. See link:https://github.com/kythe/kythe/issues/3711[issue #3711]. |
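The three steps above can be sketched in Python as follows. This is an illustration only: anchors, nodes, and edges are modeled as plain tuples and strings rather than real Kythe entries, and `js_bindings` stands in for whatever table a real indexer keeps of its *defines/binding* anchors.

```python
def imputes_edges(metadata, js_bindings):
    """Turn anchor_anchor entries into (anchor, edge kind, node) triples.

    js_bindings maps (begin, end) byte spans of defines/binding anchors in the
    generated file to the nodes they bind (a hypothetical structure).
    """
    edges = []
    for entry in metadata.get("meta", []):
        if entry.get("type") != "anchor_anchor":
            continue  # other entry types may have a different structure
        # Step 2: find the JS node bound at the target span.
        node = js_bindings.get((entry["target_begin"], entry["target_end"]))
        if node is None:
            continue  # this sketch emits nothing when no binding matches
        # Step 1: an anchor in the source file named by source_vname.
        vname = entry["source_vname"]
        anchor = (vname["corpus"], vname["path"],
                  entry["source_begin"], entry["source_end"])
        # Step 3: one edge from the source anchor to the JS node.
        edges.append((anchor, entry["edge"], node))
    return edges


source_vname = {"corpus": "corpus", "path": "path/to/Color.java"}
metadata = {"type": "kythe0", "meta": [
    {"type": "anchor_anchor", "source_begin": 12, "source_end": 17,
     "target_begin": 6, "target_end": 11,
     "edge": "/kythe/edge/imputes", "source_vname": source_vname},
    {"type": "anchor_anchor", "source_begin": 22, "source_end": 25,
     "target_begin": 18, "target_end": 21,
     "edge": "/kythe/edge/imputes", "source_vname": source_vname},
]}
js_bindings = {(6, 11): "Color node in JS", (18, 21): "RED node in JS"}

for edge in imputes_edges(metadata, js_bindings):
    print(edge)
```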
| |
| Here is what the JavaScript indexer outputs for the `Color` and `RED` elements |
| using the rules above: |
| |
| [kythe,dot,"JavaScript Indexer graph",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7"; |
| coloranchorjava [label="anchor\nColor.java:0:12-17"]; |
| redanchorjava [label="anchor\nColor.java:1:2-5"]; |
| coloranchorjs [label="anchor\ncolor.js:0:6-11"]; |
| redanchorjs [label="anchor\ncolor.js:1:2-5"]; |
| colornode [label="Color node\nin JS"]; |
| rednode [label="RED node\nin JS"]; |
| |
| coloranchorjs -> colornode [label = "defines/binding"]; |
| redanchorjs -> rednode [label = "defines/binding"]; |
| coloranchorjava -> colornode [label = "imputes"]; |
| redanchorjava -> rednode [label = "imputes"]; |
| } |
| -------------------------------------------------------------------------------- |
| |
| The output of the Java Indexer looks like this: |
| |
| [kythe,dot,"Java Indexer graph",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7"; |
| coloranchorjava [label="anchor\nColor.java:0:12-17"]; |
| redanchorjava [label="anchor\nColor.java:1:2-5"]; |
| colornodejava [label="Color node\nin Java"]; |
| rednodejava [label="RED node\nin Java"]; |
| |
| coloranchorjava -> colornodejava [label = "defines/binding"]; |
| redanchorjava -> rednodejava [label = "defines/binding"]; |
| } |
| -------------------------------------------------------------------------------- |
| |
| === Post-processor |
| |
| Once the Java and JavaScript Indexers have finished, their output is merged. |
| The post-processor then finds all anchors that have both a *defines/binding* |
| edge and an *imputes* edge and, for each, creates a *generates* edge from the |
| node the anchor defines to the node it imputes: |
| |
| [kythe,dot,"Processed final graph",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7"; |
| coloranchorjava [label="anchor\nColor.java:0:12-17"]; |
| redanchorjava [label="anchor\nColor.java:1:2-5"]; |
| coloranchorjs [label="anchor\ncolor.js:0:6-11"]; |
| redanchorjs [label="anchor\ncolor.js:1:2-5"]; |
| colornode [label="Color node\nin JS"]; |
| rednode [label="RED node\nin JS"]; |
| colornodejava [label="Color node\nin Java"]; |
| rednodejava [label="RED node\nin Java"]; |
| |
| coloranchorjs -> colornode [label = "defines/binding"]; |
| redanchorjs -> rednode [label = "defines/binding"]; |
| coloranchorjava -> colornode [label = "imputes"]; |
| redanchorjava -> rednode [label = "imputes"]; |
| coloranchorjava -> colornodejava [label = "defines/binding"]; |
| redanchorjava -> rednodejava [label = "defines/binding"]; |
| colornodejava -> colornode [label = "generates"]; |
| rednodejava -> rednode [label = "generates"]; |
| } |
| -------------------------------------------------------------------------------- |
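The post-processing step amounts to a join on anchors. The Python sketch below shows the idea on the edges from the graphs above, with nodes and anchors modeled as plain strings; it is not the actual Kythe post-processor, and `add_generates` is a hypothetical name.

```python
from collections import defaultdict


def add_generates(edges):
    """Derive generates edges from anchors with defines/binding and imputes."""
    by_anchor = defaultdict(lambda: {"defines/binding": [], "imputes": []})
    for source, kind, target in edges:
        if kind in ("defines/binding", "imputes"):
            by_anchor[source][kind].append(target)
    generated = []
    for targets in by_anchor.values():
        # The node an anchor defines generates every node the anchor imputes.
        for defined in targets["defines/binding"]:
            for imputed in targets["imputes"]:
                generated.append((defined, "generates", imputed))
    return generated


merged = [
    # From the JavaScript indexer.
    ("color.js:6-11", "defines/binding", "Color (JS)"),
    ("color.js:18-21", "defines/binding", "RED (JS)"),
    ("Color.java:12-17", "imputes", "Color (JS)"),
    ("Color.java:22-25", "imputes", "RED (JS)"),
    # From the Java indexer.
    ("Color.java:12-17", "defines/binding", "Color (Java)"),
    ("Color.java:22-25", "defines/binding", "RED (Java)"),
]

for edge in add_generates(merged):
    print(edge)
```

Anchors that carry only one of the two edge kinds, such as the anchors in `color.js`, contribute nothing to the result.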
| |
| This is the end state. Tools using the Kythe graph can now see that the `Color` |
| enum in JS is generated from the `Color` enum in Java and act accordingly (for |
| example, an IDE can go to the definition of the `Color` enum in the Java file |
| when the user clicks on `Color` in a JS file). |
| |
| == Protocol Buffers with *generates* edges |
| |
| This approach is easier to implement than the *imputes* approach described |
| above, but it requires tighter integration between the Indexer and the |
| Generator. When the Generator outputs code it also adds a mapping, as in the |
| *imputes* approach, but instead of mapping location to location it outputs the |
| VNames of nodes from `foo.source`. This requires the Generator to know exactly |
| which VNames the Source Indexer will produce. The approach is feasible when |
| VNames either have a simple, stable form or the Generator can reuse code from |
| the Source Indexer to generate them. |
| |
| In this example we generate $$C++$$ files from Protocol Buffers definitions, so |
| SourceLang is Protocol Buffers and TargetLang is $$C++$$. |
| |
| The Kythe project uses |
| link:https://developers.google.com/protocol-buffers/[protocol buffers] for |
| data interchange. The `protoc` compiler reads a domain-specific language |
| that describes messages and synthesizes code that serializes, deserializes, |
| and manipulates these messages. It can generate code in a number of different |
| target languages by swapping out backend components. These accept an encoding |
| of the message descriptions in the original source file and emit source text. |
| |
| [kythe,dot,"protoc architecture",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7"; |
| protosrc [label=".proto", shape=note]; |
| frontend [label="protoc", shape=rectangle]; |
| descriptor [label="descriptor", shape=note]; |
| backend [label="C++ language backend", shape=rectangle]; |
| ccsrc [label=".pb.h", shape=note]; |
| protosrc -> frontend; |
| frontend -> descriptor; |
| descriptor -> backend; |
| backend -> ccsrc; |
| } |
| -------------------------------------------------------------------------------- |
| |
| === Indexing `.proto` definitions |
| |
| `.proto` files are written in a domain-specific programming language for |
| describing various properties about messages and other data. It is interesting |
| to index these on their own, as messages in one `.proto` file may be used in |
| another `.proto` file. Here is a very simple example of the language: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| syntax = "proto3"; |
| package kythe.examples.proto.example; |
| |
| // A single proto message. |
| message Foo { |
| } |
| -------------------------------------------------------------------------------- |
| |
| This file describes the empty message `kythe.examples.proto.example.Foo` |
| using features from version 3 of the language. When run through `protoc` |
| with the appropriate options set, it will generate the interface `example.pb.h` |
| and the implementation `example.pb.cc`. These may be used to interact with |
| `Foo` messages in $$C++$$. |
| |
| As it turns out, `protoc` can be coerced into saving the descriptor that it |
| passes to its backends. Ordinarily, this descriptor would merely be an |
| abstract version of the `.proto` input file that discards syntax and records |
| only the details necessary to generate source code. If asked, `protoc` will |
| also keep track of source locations (`--include_source_info`) and data about |
| the `.proto` files that are (transitively) imported (`--include_imports`). |
| This information is sufficient to build a Kythe graph for a given `.proto` |
| definition file. It will become important later that every object that the |
| descriptor describes has an address, like "4.0", that corresponds (roughly) |
| to its position in the descriptor's AST. These addresses are used as keys into |
| the table that keeps track of source locations in the original `.proto` file. |
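To make the "4.0" address concrete: such a path alternates field numbers from `descriptor.proto` with indices into repeated fields, so `[4, 0]` names element 0 of `FileDescriptorProto.message_type` (field number 4). The Python sketch below decodes a handful of repeated fields for illustration. The tables cover only the fields listed, and the function names are hypothetical rather than part of any protobuf or Kythe API.

```python
# Field number -> (field name, table for the element type). Only a few
# repeated fields of descriptor.proto are modeled in this sketch.
MESSAGE = {}
MESSAGE.update({
    2: ("field", {}),
    3: ("nested_type", MESSAGE),
    4: ("enum_type", {}),
})
FILE = {
    4: ("message_type", MESSAGE),
    5: ("enum_type", {}),
    6: ("service", {}),
}


def describe_path(path):
    """Render a descriptor path in readable form, as far as the tables go."""
    table, parts, i = FILE, [], 0
    while i + 1 < len(path) and path[i] in table:
        name, table = table[path[i]]
        parts.append(f"{name}[{path[i + 1]}]")
        i += 2
    # Anything the sketch does not model is kept as a raw number.
    parts.extend(f"<{number}>" for number in path[i:])
    return ".".join(parts)


def address(path):
    """Join path components with dots, giving addresses such as "4.0"."""
    return ".".join(str(number) for number in path)


print(address([4, 0]), "=", describe_path([4, 0]))
print(address([4, 0, 3, 1]), "=", describe_path([4, 0, 3, 1]))
```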
| |
| [kythe,dot,"protoc architecture with indexer",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7"; |
| protosrc [label=".proto", shape=note]; |
| frontend [label="protoc", shape=rectangle]; |
| descriptor [label="descriptor", shape=note]; |
| descriptorfile [label="FileDescriptorSet", shape=note, color=blue]; |
| indexer [label="Kythe proto_indexer", shape=rectangle, color=blue]; |
| backend [label="C++ language backend", shape=rectangle]; |
| ccsrc [label=".pb.h", shape=note]; |
| entries [label="Kythe entries", shape=note, color=blue]; |
| protosrc -> frontend; |
| frontend -> descriptor; |
| frontend -> descriptorfile [color=blue]; |
| protosrc -> indexer [color=blue]; |
| descriptorfile -> indexer [color=blue]; |
| descriptor -> backend; |
| backend -> ccsrc; |
| indexer -> entries [color=blue]; |
| } |
| -------------------------------------------------------------------------------- |
| |
| This extra information is stored as a file that contains a |
| `proto2.FileDescriptorSet` message, which in turn is a list of the |
| `proto2.FileDescriptorProto` messages used in the course of processing `.proto` |
| input. Note that this message does not contain `.proto` source text, so the |
| `proto_indexer` must have access to the original source files. |
| |
| We can add a verifier assertion to check that `Foo` declares a Kythe node: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| syntax = "proto3"; |
| package kythe.examples.proto.example; |
| |
| // A single proto message. |
| //- @Foo defines/binding MessageFoo? |
| message Foo { |
| } |
| -------------------------------------------------------------------------------- |
| |
| and see that it was unified with the appropriate VName: |
| |
| .Output |
| ---- |
| MessageFoo: EVar(... = App(vname, |
| (4.0, kythe, "", kythe/examples/proto/example.proto, protobuf))) |
| ---- |
| |
| === Using generated source code |
| |
| Imagine that we have a simple $$C++$$ user of our generated source code for |
| `Foo`. Its code, with a verifier assertion, looks like this: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| #include "kythe/examples/proto/example.pb.h" |
| |
| //- @Foo ref CxxFooDecl? |
| void UseProto(kythe::examples::proto::example::Foo* foo) { |
| } |
| -------------------------------------------------------------------------------- |
| |
| The Kythe pipeline for indexing our combined program looks like this: |
| |
| [kythe,dot,"first indexing pipeline",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7!"; |
| usersrc [label="proto_user.cc", shape=note, color=blue]; |
| ccextractor [label="C++ extractor", shape=rectangle, color=blue]; |
| kindex [label="proto_user.kindex", shape=note, color=blue]; |
| protosrc [label=".proto", shape=note]; |
| frontend [label="protoc", shape=rectangle]; |
| descriptor [label="descriptor", shape=note]; |
| descriptorfile [label="FileDescriptorSet", shape=note]; |
| indexer [label="Kythe proto_indexer", shape=rectangle]; |
| ccindexer [label="Kythe C++ indexer", shape=rectangle, color=blue]; |
| backend [label="C++ language backend", shape=rectangle]; |
| ccsrc [label=".pb.h", shape=note]; |
| entries [label="Kythe entries", shape=note]; |
| protosrc -> frontend; |
| frontend -> descriptor; |
| frontend -> descriptorfile; |
| protosrc -> indexer; |
| descriptorfile -> indexer; |
| descriptor -> backend; |
| backend -> ccsrc; |
| indexer -> entries; |
| usersrc -> ccextractor [color=blue]; |
| ccsrc -> ccextractor [color=blue]; |
| usersrc -> ccsrc [color=blue]; |
| ccextractor -> kindex [color=blue]; |
| kindex -> ccindexer [color=blue]; |
| ccindexer -> entries [color=blue]; |
| } |
| -------------------------------------------------------------------------------- |
| |
| When we use the verifier to inspect the resulting `CxxFooDecl`, we see that |
| it has not been unified with the VName for `Foo`: |
| |
| .Output |
| ---- |
| CxxFooDecl: EVar(... = |
| App(vname, (srl0y/pwih+G6wsjFLMTVKQPC7lLH3/9MVK2d2aJHeE=, |
| kythe, bazel-out/genfiles, kythe/examples/proto/example.pb.h, |
| c++))) |
| ---- |
| |
| This is because the `kythe::examples::proto::example::Foo` type is a $$C++$$ |
| type defined in `example.pb.h`. That it was defined in some original `.proto` |
| file has no meaning to the $$C++$$ compiler. Furthermore, the Kythe $$C++$$ |
| indexer has no understanding of the `protobuf` language or of the VNames that |
| the Kythe proto_indexer produces. |
| |
| Our goal is to add edges in the graph between `CxxFooDecl` and `MessageFoo` |
| so that clients can take into account their relationship when displaying |
| cross-references or answering other queries. We do not want to unify them in the |
| same node, as they are legitimately different objects. Users may wish to |
| navigate to the generated $$C++$$ code for `CxxFooDecl` or to view uses of |
| `MessageFoo` in other languages. To support these different uses, we will emit |
| a link:/docs/schema#generates[generates] edge such that `MessageFoo` |
| *generates* `CxxFooDecl`. Clients can choose to follow the edge or to disregard |
| it. |
| |
| Observe that the $$C++$$ indexer and the `protoc` backend both see the same |
| content in the `.pb.h` file; therefore, both programs see the same offsets |
| for various tokens. If the `protoc` backend were to link those offsets back |
| to the objects in the `FileDescriptorProto` using well-known names--and if the |
| Kythe proto_indexer guaranteed a particular mechanism for generating VNames |
| from those well-known names--we could close the loop in the $$C++$$ indexer by |
| emitting *generates* edges to the proto_indexer's nodes whenever the $$C++$$ |
| indexer trips over the `protoc` backend's marked offsets. |
| |
| In other words, if the `.pb.h` contained code like: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| ... |
| class Foo { |
| ... |
| -------------------------------------------------------------------------------- |
| |
| and the `protoc` backend that generated it reported that the text range |
| `Foo` was associated with an object in its original `FileDescriptorProto` at |
| some location encoded as "4.0" (and the proto_indexer guaranteed it would |
| always emit objects with signatures based on their descriptor locations), the |
| $$C++$$ indexer would only need to watch for *defines/binding* edges starting at |
| that text range. Should such an edge be emitted, the $$C++$$ indexer would also |
| emit a *generates* edge from the `protobuf` node. |
| |
| === Annotations in `protoc` backends |
| |
| We have already seen how to command the `protoc` frontend to emit location |
| information for `.proto` source files. The frontend does not, however, know |
| anything about the source code that its various backends emit. We must pass |
| additional flags to these backends to get them to produce location information |
| as `proto2.GeneratedCodeInfo` messages. These messages connect byte offsets |
| in generated source code with paths in the `proto2.FileDescriptorProto` AST. |
| These paths are the same ones used by the `proto2.SourceCodeInfo` message that |
| the Kythe proto_indexer consumes; they are the paths we will use to link up |
| `protobuf` language nodes with the nodes for generated source code. |
| |
| Each `protoc` backend must be individually instrumented to produce |
| `proto2.GeneratedCodeInfo` messages. To turn annotation on for the $$C++$$ |
| backend, you can pass `--cpp_out=annotate_headers=1:normal/output/path` to |
| `protoc`. In practice, you will also need to provide an `annotation_pragma_name` |
| and an `annotation_guard_name`, so the full `cpp_out` value may look like |
| `annotate_headers=1,annotation_pragma_name=kythe_metadata,annotation_guard_name=KYTHE_IS_RUNNING:normal/output/path`. |
| |
| When `annotate_headers=1` is passed to the $$C++$$ backend, it will generate |
| `.meta` files alongside any files with annotations. For example, in the same |
| directory as `example.pb.h`, you will find an `example.pb.h.meta` file. This |
| file contains a serialized `proto2.GeneratedCodeInfo` message. This message |
| contains a series of spans in `example.pb.h`, the filenames of the `.proto` |
| files that caused those spans to be generated, and the AST paths in the |
| `FileDescriptorProto` for those `.proto` files. `example.pb.h` explicitly |
| depends on `example.pb.h.meta` using a pragma and a preprocessor symbol: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| // Generated by the protocol buffer compiler. DO NOT EDIT! |
| // source: kythe/examples/proto/example.proto |
| |
| ... |
| |
| #ifdef KYTHE_IS_RUNNING |
| #pragma kythe_metadata "kythe/examples/proto/example.pb.h.meta" |
| #endif // KYTHE_IS_RUNNING |
| |
| ... |
| -------------------------------------------------------------------------------- |
| |
| The Kythe $$C++$$ extractor and indexer both understand what to do with this |
| pragma (and both define `KYTHE_IS_RUNNING`). The extractor will add the `.meta` |
| file to the `kindex` it produces; the indexer will load the `.meta` file, |
| translate it from `protoc` annotations to generic Kythe metadata, and use it |
| to append `generates` edges for `defines/binding` edges emitted from |
| `example.pb.h`. |
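The indexer-side logic can be illustrated with a small Python model. It is a sketch under several assumptions: `Annotation` mirrors the `path`, `source_file`, `begin`, and `end` fields of `GeneratedCodeInfo.Annotation`, the `Emitter` class is a hypothetical stand-in for the indexer's output stage, the byte offsets are made up, and the VName layout follows the verifier output shown earlier (the descriptor path as signature, `protobuf` as language).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One span of generated code and the descriptor path that produced it."""
    path: tuple
    source_file: str
    begin: int
    end: int


@dataclass(frozen=True)
class VName:
    signature: str
    corpus: str
    root: str
    path: str
    language: str


class Emitter:
    """Hypothetical output stage that consults metadata on every binding."""

    def __init__(self, annotations, corpus):
        self._by_span = {(a.begin, a.end): a for a in annotations}
        self._corpus = corpus
        self.edges = []

    def defines_binding(self, begin, end, node):
        self.edges.append(((begin, end), "defines/binding", node))
        annotation = self._by_span.get((begin, end))
        if annotation is None:
            return
        # Rebuild the VName the proto indexer is assumed to have emitted.
        signature = ".".join(str(n) for n in annotation.path)
        source = VName(signature, self._corpus, "",
                       annotation.source_file, "protobuf")
        self.edges.append((source, "generates", node))


# Offsets 1200-1203 are invented for illustration.
meta = [Annotation((4, 0), "kythe/examples/proto/example.proto", 1200, 1203)]
emitter = Emitter(meta, corpus="kythe")
emitter.defines_binding(1200, 1203, "CxxFooDecl")
emitter.defines_binding(1300, 1310, "SomethingElse")
for edge in emitter.edges:
    print(edge)
```

Only the binding whose span matches an annotation gains a `generates` edge; other bindings are emitted unchanged.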
| |
| [kythe,dot,"indexing pipeline with metadata",0] |
| -------------------------------------------------------------------------------- |
| digraph G { |
| size="7,7!"; |
| usersrc [label="proto_user.cc", shape=note]; |
| ccextractor [label="C++ extractor", shape=rectangle]; |
| kindex [label="proto_user.kindex", shape=note]; |
| protosrc [label=".proto", shape=note]; |
| frontend [label="protoc", shape=rectangle]; |
| descriptor [label="descriptor", shape=note]; |
| descriptorfile [label="FileDescriptorSet", shape=note]; |
| indexer [label="Kythe proto_indexer", shape=rectangle]; |
| ccindexer [label="Kythe C++ indexer", shape=rectangle]; |
| backend [label="C++ language backend", shape=rectangle]; |
| ccsrc [label=".pb.h", shape=note]; |
| ccmeta [label=".pb.h.meta", shape=note, color=blue]; |
| entries [label="Kythe entries", shape=note]; |
| protosrc -> frontend; |
| frontend -> descriptor; |
| frontend -> descriptorfile; |
| protosrc -> indexer; |
| descriptorfile -> indexer; |
| descriptor -> backend; |
| backend -> ccsrc; |
| backend -> ccmeta [color=blue]; |
| ccsrc -> ccmeta [color=blue]; |
| indexer -> entries; |
| usersrc -> ccsrc; |
| usersrc -> ccextractor; |
| ccsrc -> ccextractor; |
| ccmeta -> ccextractor [color=blue]; |
| ccextractor -> kindex; |
| kindex -> ccindexer; |
| ccindexer -> entries; |
| } |
| -------------------------------------------------------------------------------- |
| |
| Now we can write verifier assertions that show we have established a link |
| between the proto source and use sites of its generated code: |
| |
| [source,c] |
| -------------------------------------------------------------------------------- |
| #include "kythe/examples/proto/example.pb.h" |
| |
| //- @Foo ref CxxFooDecl |
| //- MessageFoo? generates CxxFooDecl |
| //- vname(_, "kythe", "", "kythe/examples/proto/example.proto", "protobuf") |
| //- defines/binding MessageFoo |
| void UseProto(kythe::examples::proto::example::Foo* foo) { |
| } |
| -------------------------------------------------------------------------------- |
| |
| .Output |
| ---- |
| MessageFoo: EVar(... = App(vname, |
| (4.0, kythe, "", kythe/examples/proto/example.proto, protobuf))) |
| ---- |
| |
| Of course, Kythe clients need to understand that *generates* edges should be |
| followed. Solving this problem is out of this document's scope. |
| |
| ==== Providing annotations for other languages |
| |
| To generate metadata for a different language backend, you must determine or |
| implement the following: |
| |
| * The `protoc` backend for the language must be able to produce |
| `proto2.GeneratedCodeInfo` buffers. |
| * There must be some way to signal to your indexer and extractor that a |
| `.meta` file is associated with a different source file. |
| * That `.meta` file must be made available to the extractor during extraction. |
| For hermetic build systems, this means that the target driving `protoc` must |
| list the `.meta` file as an output. Any target that uses that `protoc` |
| target must require the `.meta` file as an input. |
| * The indexer must read the `.meta` file and use it to emit `generates` |
| edges that connect up to the nodes produced by the Kythe proto_indexer. |
| |
| The method for annotating source code is designed such that it can |
| be implemented purely at the output stage; for example, if you have an |
| abstraction for emitting *defines/binding* edges from anchors, you can |
| check at every edge (starting from a file with loaded metadata) whether you |
| should emit an additional `generates` edge. |