@LegalizeAdulthood
Created November 11, 2025 05:18

Clangd Matcher Generation LSP Extension

Overview

This document outlines a plan for a clangd-specific extension to the Language Server Protocol (LSP) that lets editors request a clang-query matcher string for the code at the cursor position. The generated matcher expressions can serve as a starting point for clang-tidy checks or standalone clang-query queries.

Table of Contents

  1. Motivation
  2. Design Goals
  3. LSP Extension Design
  4. Implementation Strategy
  5. Testing Approach
  6. Usage Examples
  7. Future Enhancements

Motivation

Problem Statement

Writing AST matchers for clang-query or clang-tidy checks requires:

  • Deep knowledge of the Clang AST structure
  • Manual inspection of AST dumps to understand node types
  • Trial and error to construct correct matcher expressions
  • Time-consuming iteration to match specific code patterns

Solution

An LSP extension that automatically generates clang-query matcher strings from cursor positions would:

  • Enable developers to instantly see the matcher for any code construct
  • Provide a starting point for writing custom clang-tidy checks
  • Reduce the learning curve for working with AST matchers
  • Accelerate the development of code analysis tools

Design Goals

Primary Goals

  1. Precision: Generate matchers that uniquely identify the AST node at the cursor position
  2. Usability: Provide matcher strings that work directly with clang-query
  3. Context-Awareness: Include sufficient context to make matchers specific but not overly rigid
  4. Performance: Leverage clangd's existing AST infrastructure for efficiency

Secondary Goals

  1. Configurability: Allow users to control matcher specificity and depth
  2. Educational: Help users learn AST matcher syntax through examples
  3. Integration: Work seamlessly with existing clangd features
  4. Extensibility: Support custom matcher generation strategies

LSP Extension Design

Custom LSP Request Method

Request

Method: clangd/generateMatcher

Parameters:

interface GenerateMatcherParams {
  // Standard LSP document identifier
  textDocument: TextDocumentIdentifier;
  
  // Cursor position in the document
  position: Position;
  
  // Optional: Control matcher specificity
  matcherMode?: 'exact' | 'contextual' | 'pattern';
  // - 'exact': Generate highly specific matcher (default)
  // - 'contextual': Include parent/ancestor context
  // - 'pattern': Generate more general matcher template
  
  // Optional: Include location constraints
  includeLocation?: boolean;
  
  // Optional: Include parent context in matcher
  includeParentContext?: boolean;
  
  // Optional: Maximum depth for nested matchers
  maxDepth?: number;
}

Response

interface GenerateMatcherResult {
  // The generated clang-query matcher string
  matcher: string;
  
  // The AST node kind as reported by Clang (e.g., "Function", "CallExpr")
  nodeType: string;
  
  // Optional: Human-readable description
  description?: string;
  
  // Optional: Additional matcher variants
  alternatives?: string[];
  
  // Optional: Confidence score (0-1)
  confidence?: number;
}

Error Response

interface GenerateMatcherError {
  code: number;
  message: string;
  // Possible error codes:
  // -1: No AST node at position
  // -2: Unsupported node type
  // -3: AST not available
}
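
On the server side, the error codes listed in the comment above could be given names rather than used as bare integers. The following is a minimal sketch under that assumption; `MatcherErrorCode`, `MatcherError`, and `makeMatcherError` are hypothetical names and are not part of clangd.

```cpp
#include <string>

// Hypothetical names for the error codes listed in GenerateMatcherError.
enum class MatcherErrorCode : int {
  NoNodeAtPosition = -1,
  UnsupportedNodeType = -2,
  ASTNotAvailable = -3,
};

// Plain-data mirror of the GenerateMatcherError interface.
struct MatcherError {
  int code;
  std::string message;
};

// Builds the error payload for a given code.
inline MatcherError makeMatcherError(MatcherErrorCode Code) {
  switch (Code) {
  case MatcherErrorCode::NoNodeAtPosition:
    return {static_cast<int>(Code), "No AST node at position"};
  case MatcherErrorCode::UnsupportedNodeType:
    return {static_cast<int>(Code), "Unsupported node type"};
  case MatcherErrorCode::ASTNotAvailable:
    return {static_cast<int>(Code), "AST not available"};
  }
  return {static_cast<int>(Code), "Unknown error"};
}
```

Keeping the numeric values in one enum makes it harder for the server and the documentation to drift apart.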

Server Capability Advertisement

During initialization, clangd advertises support for the extension:

{
  "capabilities": {
    "experimental": {
      "generateMatcher": true,
      "generateMatcherOptions": {
        "supportedModes": ["exact", "contextual", "pattern"],
        "supportsAlternatives": true,
        "supportsLocationConstraints": false
      }
    }
  }
}
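
A client can use the advertised `supportedModes` to decide which `matcherMode` to send. The sketch below shows one possible fallback policy; the struct and the function `chooseMatcherMode` are hypothetical, and falling back to "exact" is an assumption based on it being the documented default.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Plain-data mirror of the "generateMatcherOptions" object above.
struct GenerateMatcherCapabilities {
  std::vector<std::string> supportedModes;
  bool supportsAlternatives = false;
  bool supportsLocationConstraints = false;
};

// Returns Requested if the server advertises it; otherwise falls back to
// "exact", which the request parameters document as the default mode.
inline std::string chooseMatcherMode(const GenerateMatcherCapabilities &Caps,
                                     const std::string &Requested) {
  const auto &Modes = Caps.supportedModes;
  if (std::find(Modes.begin(), Modes.end(), Requested) != Modes.end())
    return Requested;
  return "exact";
}
```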

Implementation Strategy

Architecture Overview

+-----------------------------------------+
| Editor/Client                           |
| (VS Code, Neovim, etc.)                 |
+-----------------------------------------+
                    |
                    | LSP Request
                    | clangd/generateMatcher
                    v
+-----------------------------------------+
| ClangdLSPServer                         |
|   - Request validation                  |
|   - Response formatting                 |
+-----------------------------------------+
                    |
                    v
+-----------------------------------------+
| ClangdServer                            |
|   - AST retrieval                       |
|   - Coordinate matcher generation       |
+-----------------------------------------+
                    |
                    v
+-----------------------------------------+
| MatcherGenerator                        |
|   - Node identification                 |
|   - Matcher string generation           |
|   - Context analysis                    |
+-----------------------------------------+

Implementation in clangd

1. ClangdServer Extension

File: clang-tools-extra/clangd/ClangdServer.h

class ClangdServer {
public:
  // Existing methods...
  
  /// Generate a clang-query matcher string for the AST node at the
  /// specified position.
  void generateMatcher(
      PathRef File,
      Position Pos,
      const GenerateMatcherOptions &Options,
      Callback<GenerateMatcherResult> CB);
};

File: clang-tools-extra/clangd/ClangdServer.cpp

void ClangdServer::generateMatcher(
    PathRef File,
    Position Pos,
    const GenerateMatcherOptions &Options,
    Callback<GenerateMatcherResult> CB) {

  auto Action = [Pos, Options, CB = std::move(CB)](
      Expected<InputsAndAST> InpAST) mutable {
    if (!InpAST)
      return CB(InpAST.takeError());

    auto &AST = InpAST->AST;
    auto Offset = positionToOffset(InpAST->Inputs.Contents, Pos);
    if (!Offset)
      return CB(Offset.takeError());

    // Find AST node at position.
    // findNodeAtPosition is a helper to be written for this proposal
    // (for example, on top of clangd's SelectionTree).
    auto SelectedNode = findNodeAtPosition(
        AST.getASTContext(),
        AST.getSourceManager(),
        *Offset);

    if (!SelectedNode)
      return CB(error("No AST node at position"));

    // The generator takes a SourceLocation, not an LSP Position
    const SourceManager &SM = AST.getSourceManager();
    SourceLocation CursorLoc =
        SM.getComposedLoc(SM.getMainFileID(), *Offset);

    // Generate matcher string
    MatcherGenerator Generator(AST.getASTContext(), Options);
    auto Result = Generator.generate(*SelectedNode, CursorLoc);

    CB(std::move(Result));
  };

  WorkScheduler->runWithAST("GenerateMatcher", File, std::move(Action));
}
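
The position-to-offset step is what ties the zero-based LSP `line`/`character` pair to a byte offset in the file. The sketch below is a simplified stand-in for illustration only: it assumes ASCII text with `\n` line endings, whereas a real server also has to handle multi-byte characters. `asciiPositionToOffset` is a hypothetical name, and the `Position` struct here is a local stand-in for the LSP type.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Zero-based line/character pair, as in the LSP Position type.
struct Position {
  unsigned line = 0;
  unsigned character = 0;
};

// Simplified position-to-offset conversion: assumes ASCII text and '\n'
// line endings, so one character is one byte. Returns std::nullopt when
// the position lies outside the text.
inline std::optional<std::size_t>
asciiPositionToOffset(const std::string &Text, Position Pos) {
  std::size_t LineStart = 0;
  for (unsigned Line = 0; Line < Pos.line; ++Line) {
    std::size_t Newline = Text.find('\n', LineStart);
    if (Newline == std::string::npos)
      return std::nullopt; // Fewer lines than requested.
    LineStart = Newline + 1;
  }
  std::size_t LineEnd = Text.find('\n', LineStart);
  if (LineEnd == std::string::npos)
    LineEnd = Text.size();
  if (Pos.character > LineEnd - LineStart)
    return std::nullopt; // Past the end of the line.
  return LineStart + Pos.character;
}
```

For the text `void foo() { int x = 42; }`, position line 0, character 17 maps to the byte holding `x`.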

2. Matcher Generator Implementation

File: clang-tools-extra/clangd/MatcherGenerator.h

#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_MATCHER_GENERATOR_H
#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_MATCHER_GENERATOR_H

#include "clang/AST/ASTContext.h"
#include "clang/AST/ASTTypeTraits.h"
#include "clang/Basic/SourceLocation.h"
#include <string>
#include <vector>

namespace clang {
namespace clangd {

enum class MatcherMode {
  Exact,       // Highly specific matcher
  Contextual,  // Include parent context
  Pattern      // General pattern matcher
};

struct GenerateMatcherOptions {
  MatcherMode Mode = MatcherMode::Exact;
  bool IncludeLocation = false;
  bool IncludeParentContext = true;
  unsigned MaxDepth = 3;
};

struct GenerateMatcherResult {
  std::string Matcher;
  std::string NodeType;
  std::string Description;
  std::vector<std::string> Alternatives;
  float Confidence = 1.0f;
};

class MatcherGenerator {
public:
  MatcherGenerator(ASTContext &Ctx, const GenerateMatcherOptions &Options)
      : Ctx(Ctx), Options(Options) {}

  /// Generate a matcher for the given AST node
  GenerateMatcherResult generate(
      const DynTypedNode &Node,
      SourceLocation CursorLoc);

private:
  // Node-specific matcher generation
  std::string generateForDecl(const Decl *D);
  std::string generateForStmt(const Stmt *S);
  std::string generateForType(const Type *T);

  // Helper methods
  std::string generateFunctionDeclMatcher(const FunctionDecl *FD);
  std::string generateCallExprMatcher(const CallExpr *CE);
  std::string generateVarDeclMatcher(const VarDecl *VD);
  std::string generateCXXRecordDeclMatcher(const CXXRecordDecl *RD);

  // Context generation
  std::string addParentContext(
      const DynTypedNode &Node,
      const std::string &BaseMatcher);

  // Uniqueness constraints
  std::string addUniqueConstraints(
      const DynTypedNode &Node);
  
  ASTContext &Ctx;
  GenerateMatcherOptions Options;
};

} // namespace clangd
} // namespace clang

#endif

File: clang-tools-extra/clangd/MatcherGenerator.cpp

#include "MatcherGenerator.h"
#include "clang/AST/Decl.h"
#include "clang/AST/DeclCXX.h"
#include "clang/AST/Expr.h"
#include "clang/AST/ExprCXX.h"
#include "clang/AST/Stmt.h"
#include "clang/Index/USRGeneration.h"
#include "llvm/Support/raw_ostream.h"

namespace clang {
namespace clangd {

GenerateMatcherResult MatcherGenerator::generate(
    const DynTypedNode &Node,
    SourceLocation CursorLoc) {
  
  GenerateMatcherResult Result;
  
  // Determine node type and generate appropriate matcher
  if (const auto *D = Node.get<Decl>()) {
    Result.Matcher = generateForDecl(D);
    Result.NodeType = D->getDeclKindName();
  } else if (const auto *S = Node.get<Stmt>()) {
    Result.Matcher = generateForStmt(S);
    Result.NodeType = S->getStmtClassName();
  } else if (const auto *T = Node.get<Type>()) {
    Result.Matcher = generateForType(T);
    Result.NodeType = "Type";
  } else {
    Result.Matcher = ""; // Unsupported node type
    Result.Confidence = 0.0f;
    return Result;
  }
  
  // Add parent context if requested
  if (Options.IncludeParentContext && !Result.Matcher.empty()) {
    Result.Matcher = addParentContext(Node, Result.Matcher);
  }
  
  // Generate alternative matchers
  if (Options.Mode == MatcherMode::Pattern) {
    // Generate more general alternatives
    // (Implementation details...)
  }
  
  return Result;
}

std::string MatcherGenerator::generateForDecl(const Decl *D) {
  if (const auto *FD = dyn_cast<FunctionDecl>(D))
    return generateFunctionDeclMatcher(FD);
  if (const auto *VD = dyn_cast<VarDecl>(D))
    return generateVarDeclMatcher(VD);
  if (const auto *RD = dyn_cast<CXXRecordDecl>(D))
    return generateCXXRecordDeclMatcher(RD);
  
  // Generic decl matcher
  return "decl()";
}

std::string MatcherGenerator::generateFunctionDeclMatcher(
    const FunctionDecl *FD) {
  
  std::string Matcher = "functionDecl(";
  bool NeedComma = false;
  
  // Match by name
  if (FD->getDeclName().isIdentifier()) {
    Matcher += "hasName(\"" + FD->getNameAsString() + "\")";
    NeedComma = true;
  }
  
  // Match return type
  if (Options.Mode == MatcherMode::Exact) {
    if (NeedComma) Matcher += ", ";
    Matcher += "returns(asString(\"" +
               FD->getReturnType().getAsString() + "\"))";
    NeedComma = true;
  }
  
  // Match parameter count
  if (Options.Mode == MatcherMode::Exact) {
    if (NeedComma) Matcher += ", ";
    Matcher += "parameterCountIs(" + 
               std::to_string(FD->getNumParams()) + ")";
    NeedComma = true;
  }
  
  Matcher += ")";
  return Matcher;
}

std::string MatcherGenerator::generateCallExprMatcher(
    const CallExpr *CE) {
  
  std::string Matcher = "callExpr(";
  bool NeedComma = false;
  
  // Match callee function name
  if (const auto *Callee = CE->getDirectCallee()) {
    if (Callee->getDeclName().isIdentifier()) {
      Matcher += "callee(functionDecl(hasName(\"" +
                 Callee->getNameAsString() + "\")))";
      NeedComma = true;
    }
  }

  // Match argument count
  if (Options.Mode == MatcherMode::Exact) {
    if (NeedComma) Matcher += ", ";
    Matcher += "argumentCountIs(" +
               std::to_string(CE->getNumArgs()) + ")";
    NeedComma = true;
  }

  // Match specific argument types
  if (Options.Mode == MatcherMode::Exact && CE->getNumArgs() > 0) {
    for (unsigned i = 0; i < CE->getNumArgs(); ++i) {
      if (NeedComma) Matcher += ", ";
      Matcher += "hasArgument(" + std::to_string(i) + ", ";
      Matcher += "hasType(asString(\"" +
                 CE->getArg(i)->getType().getAsString() + "\")))";
      NeedComma = true;
    }
  }
  
  Matcher += ")";
  return Matcher;
}

std::string MatcherGenerator::generateVarDeclMatcher(const VarDecl *VD) {
  std::string Matcher = "varDecl(";
  bool NeedComma = false;
  
  // Match variable name
  if (VD->getDeclName().isIdentifier()) {
    Matcher += "hasName(\"" + VD->getNameAsString() + "\")";
    NeedComma = true;
  }
  
  // Match type
  if (Options.Mode == MatcherMode::Exact) {
    if (NeedComma) Matcher += ", ";
    Matcher += "hasType(asString(\"" +
               VD->getType().getAsString() + "\"))";
    NeedComma = true;
  }
  
  // Match initializer presence
  if (VD->hasInit()) {
    if (NeedComma) Matcher += ", ";
    Matcher += "hasInitializer(expr())";
    NeedComma = true;
  }
  
  Matcher += ")";
  return Matcher;
}

std::string MatcherGenerator::generateCXXRecordDeclMatcher(
    const CXXRecordDecl *RD) {
  
  std::string Matcher = "cxxRecordDecl(";
  bool NeedComma = false;
  
  // Match class name
  if (RD->getDeclName().isIdentifier()) {
    Matcher += "hasName(\"" + RD->getNameAsString() + "\")";
    NeedComma = true;
  }
  
  // Match if it's a struct/class/union
  if (Options.Mode == MatcherMode::Exact) {
    const char *Kind = RD->isClass()    ? "isClass()"
                       : RD->isStruct() ? "isStruct()"
                       : RD->isUnion()  ? "isUnion()"
                                        : nullptr;
    // Only emit the separator when a kind constraint follows it
    if (Kind) {
      if (NeedComma) Matcher += ", ";
      Matcher += Kind;
      NeedComma = true;
    }
  }
  
  Matcher += ")";
  return Matcher;
}

std::string MatcherGenerator::generateForStmt(const Stmt *S) {
  if (const auto *CE = dyn_cast<CallExpr>(S))
    return generateCallExprMatcher(CE);
  
  // Add more statement types as needed
  if (isa<IfStmt>(S))
    return "ifStmt()";
  if (isa<ForStmt>(S))
    return "forStmt()";
  if (isa<WhileStmt>(S))
    return "whileStmt()";
  if (isa<ReturnStmt>(S))
    return "returnStmt()";
  
  // Generic statement matcher
  return "stmt()";
}

std::string MatcherGenerator::generateForType(const Type *T) {
  // asString() is a QualType matcher, so it is wrapped in qualType()
  return "qualType(asString(\"" +
         QualType(T, 0).getAsString() + "\"))";
}

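std::string MatcherGenerator::addParentContext(
    const DynTypedNode &Node,
    const std::string &BaseMatcher) {

  // Append a constraint inside the outermost parentheses of BaseMatcher,
  // adding a comma only when the matcher already has arguments.
  auto WithAncestor = [&](const char *Kind, const NamedDecl *ND) {
    std::string Inner = BaseMatcher.substr(0, BaseMatcher.size() - 1);
    if (Inner.back() != '(')
      Inner += ", ";
    return Inner + "hasAncestor(" + Kind + "(hasName(\"" +
           ND->getNameAsString() + "\"))))";
  };

  // Walk up the parent chain (at most MaxDepth levels) to the nearest
  // named function or class. A local variable, for example, reaches its
  // FunctionDecl only through a DeclStmt and a CompoundStmt.
  DynTypedNode Current = Node;
  for (unsigned Depth = 0; Depth < Options.MaxDepth; ++Depth) {
    auto Parents = Ctx.getParents(Current);
    if (Parents.empty())
      break;
    Current = Parents[0];

    if (const auto *ParentFunc = Current.get<FunctionDecl>())
      if (ParentFunc->getDeclName().isIdentifier())
        return WithAncestor("functionDecl", ParentFunc);

    if (const auto *ParentClass = Current.get<CXXRecordDecl>())
      if (ParentClass->getDeclName().isIdentifier())
        return WithAncestor("cxxRecordDecl", ParentClass);
  }

  return BaseMatcher;
}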

std::string MatcherGenerator::addUniqueConstraints(
    const DynTypedNode &Node) {
  // Add constraints to make matcher more specific
  // (Implementation based on Options.Mode)
  return "";
}

} // namespace clangd
} // namespace clang
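
Each generator function above repeats the same `NeedComma` bookkeeping. One way to factor it out is a small builder that joins constraints with `", "`. This is a sketch only, using the standard library alone; `MatcherBuilder` is a hypothetical helper and not part of the proposal.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: collects constraints for one node matcher and
// joins them with ", " so callers need no separator bookkeeping.
class MatcherBuilder {
public:
  explicit MatcherBuilder(std::string NodeMatcher)
      : Text(std::move(NodeMatcher) + "(") {}

  // Appends one constraint, inserting a separator when needed.
  MatcherBuilder &add(const std::string &Constraint) {
    if (HasArgs)
      Text += ", ";
    Text += Constraint;
    HasArgs = true;
    return *this;
  }

  // Returns the finished matcher with its closing parenthesis.
  std::string str() const { return Text + ")"; }

private:
  std::string Text;
  bool HasArgs = false;
};
```

With such a builder, `MatcherBuilder("functionDecl").add("hasName(\"foo\")").add("parameterCountIs(0)").str()` yields `functionDecl(hasName("foo"), parameterCountIs(0))`, and a builder with no constraints yields `decl()` rather than a string with a stray comma.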

3. LSP Handler Integration

File: clang-tools-extra/clangd/ClangdLSPServer.h

class ClangdLSPServer {
  // Existing methods...
  
  void onGenerateMatcher(const GenerateMatcherParams &Params,
                         Callback<GenerateMatcherResult> Reply);
};

File: clang-tools-extra/clangd/ClangdLSPServer.cpp

void ClangdLSPServer::onInitialize(const InitializeParams &Params) {
  // Existing initialization...
  
  // Advertise matcher generation capability
  ServerCaps.experimental["generateMatcher"] = true;
}

void ClangdLSPServer::onGenerateMatcher(
    const GenerateMatcherParams &Params,
    Callback<GenerateMatcherResult> Reply) {
  
  GenerateMatcherOptions Options;
  if (Params.matcherMode) {
    if (*Params.matcherMode == "exact")
      Options.Mode = MatcherMode::Exact;
    else if (*Params.matcherMode == "contextual")
      Options.Mode = MatcherMode::Contextual;
    else if (*Params.matcherMode == "pattern")
      Options.Mode = MatcherMode::Pattern;
  }
  
  Options.IncludeLocation = Params.includeLocation.value_or(false);
  Options.IncludeParentContext = 
      Params.includeParentContext.value_or(true);
  Options.MaxDepth = Params.maxDepth.value_or(3);
  
  Server->generateMatcher(
      Params.textDocument.uri.file(),
      Params.position,
      Options,
      std::move(Reply));
}

// Register the handler
void ClangdLSPServer::registerCustomMethods() {
  MsgHandler->bind("clangd/generateMatcher",
                 &ClangdLSPServer::onGenerateMatcher);
}
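
The handler above leaves `Options.Mode` at its default when `matcherMode` holds an unrecognized string. If rejecting such requests is preferred, the string-to-enum step could return an optional instead. A minimal sketch, assuming a hypothetical `parseMatcherMode` helper; the enum is repeated here only so the example compiles on its own.

```cpp
#include <optional>
#include <string>

// Repeated from MatcherGenerator.h so this sketch is self-contained.
enum class MatcherMode { Exact, Contextual, Pattern };

// Maps the wire-format mode string to the enum. Returns std::nullopt for
// unknown strings so the caller can report an error instead of silently
// using the default.
inline std::optional<MatcherMode> parseMatcherMode(const std::string &Name) {
  if (Name == "exact")
    return MatcherMode::Exact;
  if (Name == "contextual")
    return MatcherMode::Contextual;
  if (Name == "pattern")
    return MatcherMode::Pattern;
  return std::nullopt;
}
```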

Testing Approach

Unit Tests

File: clang-tools-extra/clangd/unittests/MatcherGeneratorTests.cpp

#include "MatcherGenerator.h"
#include "TestTU.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"

namespace clang {
namespace clangd {
namespace {

using ::testing::HasSubstr;

TEST(MatcherGeneratorTest, FunctionDecl) {
  const char *Code = R"cpp(
    void foo() {}
    void bar() {}
  )cpp";
  
  auto TU = TestTU::withCode(Code);
  auto AST = TU.build();
  
  // Find "foo" function
  const auto &FD = cast<FunctionDecl>(findDecl(AST, "foo"));

  DynTypedNode Node = DynTypedNode::create(FD);

  GenerateMatcherOptions Options;
  MatcherGenerator Generator(AST.getASTContext(), Options);

  auto Result = Generator.generate(Node, FD.getLocation());
  
  EXPECT_EQ(Result.NodeType, "Function");
  EXPECT_THAT(Result.Matcher, HasSubstr("functionDecl"));
  EXPECT_THAT(Result.Matcher, HasSubstr("hasName(\"foo\")"));
}

TEST(MatcherGeneratorTest, CallExprWithContext) {
  const char *Code = R"cpp(
    void callee() {}
    void caller() {
      callee();
      callee(); // Second call
    }
  )cpp";
  
  auto TU = TestTU::withCode(Code);
  auto AST = TU.build();
  
  // Find first callee() call
  // (Implementation to find specific call expr...)
  // Node and CallLoc below are produced by that elided lookup.

  GenerateMatcherOptions Options;
  Options.IncludeParentContext = true;
  MatcherGenerator Generator(AST.getASTContext(), Options);

  // Generate matcher for first call
  auto Result = Generator.generate(Node, CallLoc);

  EXPECT_THAT(Result.Matcher, HasSubstr("callExpr"));
  EXPECT_THAT(Result.Matcher,
              HasSubstr("callee(functionDecl(hasName(\"callee\")))"));
  EXPECT_THAT(Result.Matcher,
              HasSubstr("hasAncestor(functionDecl(hasName(\"caller\")))"));
}

TEST(MatcherGeneratorTest, VarDeclInSpecificScope) {
  const char *Code = R"cpp(
    void foo() {
      int x = 1;
    }
    void bar() {
      int x = 2;
    }
  )cpp";
  
  auto TU = TestTU::withCode(Code);
  auto AST = TU.build();
  
  // Find 'x' in foo()
  // (Implementation...)
  // Node and VarLoc below are produced by that elided lookup.

  GenerateMatcherOptions Options;
  Options.IncludeParentContext = true;
  MatcherGenerator Generator(AST.getASTContext(), Options);

  auto Result = Generator.generate(Node, VarLoc);

  EXPECT_THAT(Result.Matcher, HasSubstr("varDecl"));
  EXPECT_THAT(Result.Matcher, HasSubstr("hasName(\"x\")"));
  EXPECT_THAT(Result.Matcher,
              HasSubstr("hasAncestor(functionDecl(hasName(\"foo\")))"));
}

TEST(MatcherGeneratorTest, ClassDeclaration) {
  const char *Code = R"cpp(
    class MyClass {
      void method();
    };
  )cpp";
  
  auto TU = TestTU::withCode(Code);
  auto AST = TU.build();
  
  const auto &RD = cast<CXXRecordDecl>(findDecl(AST, "MyClass"));

  DynTypedNode Node = DynTypedNode::create(RD);

  GenerateMatcherOptions Options;
  MatcherGenerator Generator(AST.getASTContext(), Options);

  auto Result = Generator.generate(Node, RD.getLocation());
  
  EXPECT_THAT(Result.Matcher, HasSubstr("cxxRecordDecl"));
  EXPECT_THAT(Result.Matcher, HasSubstr("hasName(\"MyClass\")"));
}

} // namespace
} // namespace clangd
} // namespace clang

Integration Tests with clang-query

File: clang-tools-extra/test/clangd/generate-matcher.test

# RUN: clangd -lit-test < %s | FileCheck %s

{"jsonrpc":"2.0","id":0,"method":"initialize","params":{"capabilities":{}}}
---
{"jsonrpc":"2.0","method":"textDocument/didOpen","params":{"textDocument":{"uri":"test:///main.cpp","languageId":"cpp","version":1,"text":"void foo() { int x = 42; }"}}}
---
# Generate matcher for function declaration
{"jsonrpc":"2.0","id":1,"method":"clangd/generateMatcher","params":{"textDocument":{"uri":"test:///main.cpp"},"position":{"line":0,"character":5}}}
# CHECK: "id": 1
# CHECK: "matcher": "functionDecl(hasName(\"foo\"){{.*}})"
# CHECK: "nodeType": "Function"
---
# Generate matcher for variable declaration
{"jsonrpc":"2.0","id":2,"method":"clangd/generateMatcher","params":{"textDocument":{"uri":"test:///main.cpp"},"position":{"line":0,"character":17}}}
# CHECK: "id": 2
# CHECK: "matcher": "varDecl(hasName(\"x\"){{.*}}hasAncestor(functionDecl(hasName(\"foo\"))))"
# CHECK: "nodeType": "Var"
---
{"jsonrpc":"2.0","id":3,"method":"shutdown"}
---
{"jsonrpc":"2.0","method":"exit"}

Manual Testing with clang-query

Test Script: test-matcher-generation.sh

#!/bin/bash

# Test file
cat > test.cpp << 'EOF'
void bar(int);

void foo() {
  int x = 42;
  bar(x);
}

class MyClass {
  void method();
};
EOF

# Start clangd and test matcher generation
# (Requires a test client that can send LSP requests)

# Fill this array with the matcher strings returned by the test client
MATCHERS=()

echo "Testing function declaration matcher..."
# Generate matcher for 'foo' at line 3, column 6
# Expected: functionDecl(hasName("foo"), ...)

echo "Testing variable declaration matcher..."
# Generate matcher for 'x' at line 4, column 7
# Expected: varDecl(hasName("x"), ..., hasAncestor(functionDecl(hasName("foo"))))

echo "Testing call expression matcher..."
# Generate matcher for 'bar(x)' at line 5, column 3
# Expected: callExpr(callee(functionDecl(hasName("bar"))), ...)

echo "Testing class declaration matcher..."
# Generate matcher for 'MyClass' at line 8, column 7
# Expected: cxxRecordDecl(hasName("MyClass"), isClass())

# Verify each generated matcher works with clang-query
for matcher in "${MATCHERS[@]}"; do
  echo "match $matcher" | clang-query test.cpp -- 2>&1 | \
    grep "1 match" || echo "FAIL: $matcher"
done

Property-Based Tests

TEST(MatcherGeneratorTest, GeneratedMatcherProperties) {
  // For every supported AST node type
  for (auto NodeType : SupportedNodeTypes) {
    auto Code = generateCodeWithNode(NodeType);
    auto TU = TestTU::withCode(Code);
    auto AST = TU.build();
    
    // Assumes findNodeOfType returns a DynTypedNode, which has no
    // getLocation(); the start of its source range is used instead.
    auto Node = findNodeOfType(AST, NodeType);
    SourceLocation NodeLoc = Node.getSourceRange().getBegin();

    GenerateMatcherOptions Options;
    MatcherGenerator Generator(AST.getASTContext(), Options);
    auto Result = Generator.generate(Node, NodeLoc);

    // Property 1: Matcher must be syntactically valid
    EXPECT_TRUE(isValidMatcherSyntax(Result.Matcher));

    // Property 2: Matcher must match at least one node
    auto Matches = runMatcher(Result.Matcher, Code);
    EXPECT_GE(Matches.size(), 1u);

    // Property 3: In 'exact' mode, should match small number of nodes
    if (Options.Mode == MatcherMode::Exact) {
      EXPECT_LE(Matches.size(), 3u);
    }

    // Property 4: Matched node should contain original position
    bool ContainsOriginalPos = false;
    for (const auto &Match : Matches) {
      if (nodeContainsLocation(Match, NodeLoc)) {
        ContainsOriginalPos = true;
        break;
      }
    }
    EXPECT_TRUE(ContainsOriginalPos);
  }
}
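
Property 1 relies on a syntax check that the test leaves undefined. A full check would hand the string to the matcher parser; a much cheaper necessary condition is that parentheses balance outside string literals. The sketch below implements only that condition, and `hasBalancedParens` is a hypothetical name. It does not validate matcher names or argument types.

```cpp
#include <cstddef>
#include <string>

// Returns true when every '(' has a matching ')' outside string literals
// and every double-quoted string is closed. This is a necessary condition
// only; it does not check matcher names or argument types.
inline bool hasBalancedParens(const std::string &Matcher) {
  int Depth = 0;
  bool InString = false;
  for (std::size_t I = 0; I < Matcher.size(); ++I) {
    char C = Matcher[I];
    if (InString) {
      if (C == '\\')
        ++I; // Skip the escaped character.
      else if (C == '"')
        InString = false;
      continue;
    }
    if (C == '"')
      InString = true;
    else if (C == '(')
      ++Depth;
    else if (C == ')' && --Depth < 0)
      return false; // Closing parenthesis without an opener.
  }
  return Depth == 0 && !InString;
}
```

A check like this catches truncated output such as `decl(` early, before the more expensive step of running the matcher.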

Usage Examples

VS Code Extension

File: vscode-extension/src/extension.ts

import * as vscode from 'vscode';
import { LanguageClient } from 'vscode-languageclient/node';

export function activate(context: vscode.ExtensionContext) {
  // Existing clangd client setup...
  
  // Register command to generate matcher at cursor
  let disposable = vscode.commands.registerCommand(
    'clangd.generateMatcher',
    async () => {
      const editor = vscode.window.activeTextEditor;
      if (!editor) return;
      
      const position = editor.selection.active;
      const document = editor.document;
      
      try {
        // GenerateMatcherResult is the response interface defined earlier;
        // client is the LanguageClient from the existing clangd setup
        const result = await client.sendRequest<GenerateMatcherResult>(
          'clangd/generateMatcher',
          {
            textDocument: { uri: document.uri.toString() },
            position: {
              line: position.line,
              character: position.character
            },
            matcherMode: 'contextual',
            includeParentContext: true
          }
        );

        // Display the matcher in various ways:

        // 1. Show in a notification
        vscode.window.showInformationMessage(
          `Matcher: ${result.matcher}`
        );

        // 2. Copy to clipboard
        await vscode.env.clipboard.writeText(result.matcher);
        vscode.window.showInformationMessage(
          'Matcher copied to clipboard!'
        );

        // 3. Insert in new editor window
        const doc = await vscode.workspace.openTextDocument({
          content: `// Matcher for ${result.nodeType}\n${result.matcher}`,
          language: 'cpp'
        });
        await vscode.window.showTextDocument(doc);

      } catch (error) {
        vscode.window.showErrorMessage(
          `Failed to generate matcher: ${error}`
        );
      }
    }
  );
  
  context.subscriptions.push(disposable);
  
  // Add keybinding: Ctrl+Shift+M (or Cmd+Shift+M on Mac)
  // This goes in package.json
}

File: vscode-extension/package.json

{
  "contributes": {
    "commands": [
      {
        "command": "clangd.generateMatcher",
        "title": "Generate Clang-Query Matcher"
      }
    ],
    "keybindings": [
      {
        "command": "clangd.generateMatcher",
        "key": "ctrl+shift+m",
        "mac": "cmd+shift+m",
        "when": "editorTextFocus && editorLangId == cpp"
      }
    ],
    "menus": {
      "editor/context": [
        {
          "command": "clangd.generateMatcher",
          "when": "editorLangId == cpp",
          "group": "clangd"
        }
      ]
    }
  }
}

Neovim Plugin

File: nvim-plugin/lua/clangd-matcher.lua

local M = {}

function M.generate_matcher()
  local params = vim.lsp.util.make_position_params()
  params.matcherMode = 'contextual'
  params.includeParentContext = true
  
  vim.lsp.buf_request(
    0,
    'clangd/generateMatcher',
    params,
    function(err, result, ctx, config)
      if err then
        vim.notify('Error generating matcher: ' .. err.message, vim.log.levels.ERROR)
        return
      end

      if not result then
        vim.notify('No matcher generated', vim.log.levels.WARN)
        return
      end

      -- Show in floating window
      local lines = {
        '-- Matcher for ' .. result.nodeType,
        result.matcher,
        '',
        '-- Description:',
        result.description or 'N/A'
      }

      local buf = vim.api.nvim_create_buf(false, true)
      vim.api.nvim_buf_set_lines(buf, 0, -1, false, lines)

      local width = 80
      local height = #lines
      local opts = {
        relative = 'cursor',
        width = width,
        height = height,
        row = 1,
        col = 0,
        style = 'minimal',
        border = 'rounded'
      }

      local win = vim.api.nvim_open_win(buf, true, opts)

      -- Copy to clipboard
      vim.fn.setreg('+', result.matcher)
      vim.notify('Matcher copied to clipboard', vim.log.levels.INFO)
    end
  )
end

function M.setup()
  vim.keymap.set('n', '<leader>cm', M.generate_matcher, {
    desc = 'Generate clang-query matcher'
  })
end

return M

Command-Line Tool

File: clang-tools-extra/clang-query/GenerateMatcherTool.cpp

// Standalone tool to generate matchers from source locations

#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

using namespace clang;
using namespace clang::tooling;

static llvm::cl::OptionCategory GenerateMatcherCategory(
    "generate-matcher options");

static llvm::cl::opt<std::string> SourceFile(
    "file",
    llvm::cl::desc("Source file to analyze"),
    llvm::cl::Required,
    llvm::cl::cat(GenerateMatcherCategory));

static llvm::cl::opt<unsigned> Line(
    "line",
    llvm::cl::desc("Line number (1-based)"),
    llvm::cl::Required,
    llvm::cl::cat(GenerateMatcherCategory));

static llvm::cl::opt<unsigned> Column(
    "column",
    llvm::cl::desc("Column number (1-based)"),
    llvm::cl::Required,
    llvm::cl::cat(GenerateMatcherCategory));

static llvm::cl::opt<std::string> Mode(
    "mode",
    llvm::cl::desc("Matcher mode: exact, contextual, pattern"),
    llvm::cl::init("exact"),
    llvm::cl::cat(GenerateMatcherCategory));

int main(int argc, const char **argv) {
  auto ExpectedParser = CommonOptionsParser::create(
      argc, argv, GenerateMatcherCategory);
  
  if (!ExpectedParser) {
    llvm::errs() << ExpectedParser.takeError();
    return 1;
  }
  
  CommonOptionsParser &OptionsParser = ExpectedParser.get();
  ClangTool Tool(OptionsParser.getCompilations(),
                 OptionsParser.getSourcePathList());
  
  // Run tool and generate matcher
  // (Implementation...)
  
  return Tool.run(newFrontendActionFactory<GenerateMatcherAction>().get());
}

Example Workflow

  1. Developer opens C++ file in editor

    class MyService {
    public:
      void processRequest(const Request& req) {
        if (req.isValid()) {
          handleValid(req);
        } else {
          handleInvalid(req);
        }
      }
    };
  2. Places cursor on handleValid(req) call

  3. Triggers matcher generation (Ctrl+Shift+M)

  4. Extension displays generated matcher:

    // Matcher for CallExpr
    callExpr(
      callee(functionDecl(hasName("handleValid"))),
      argumentCountIs(1),
      hasArgument(0, hasType(asString("const Request &"))),
      hasAncestor(cxxMethodDecl(
        hasName("processRequest"),
        ofClass(cxxRecordDecl(hasName("MyService")))
      ))
    )
  5. Developer copies matcher to create clang-tidy check:

    // In custom clang-tidy check
    Finder->addMatcher(
        callExpr(
            callee(functionDecl(hasName("handleValid")))
            // Add custom conditions here, each preceded by a comma...
        ).bind("call"),
        this
    );

Future Enhancements

Short-term Enhancements

  1. Interactive Matcher Refinement

    • Allow users to adjust matcher specificity interactively
    • Provide slider for "specificity level"
    • Show real-time match count as user adjusts
  2. Matcher Templates

    • Pre-defined templates for common patterns
    • Template library for standard checks (nullptr checks, bounds checks, etc.)
  3. Visual Feedback

    • Highlight all code locations that match the generated matcher
    • Show match count in status bar
    • Color-code matches by confidence level
  4. Documentation Integration

    • Include links to AST matcher documentation
    • Show examples of similar matchers
    • Provide inline help for matcher functions

Medium-term Enhancements

  1. Multi-Node Matchers

    • Generate matchers for code selections (multiple nodes)
    • Support relationship matchers (e.g., "all calls to this function")
  2. Matcher Optimization

    • Suggest more efficient equivalent matchers
    • Detect and eliminate redundant constraints
    • Performance profiling for matchers
  3. Code Action Integration

    • "Generate clang-tidy check from matcher" code action
    • "Save matcher to query file" code action
    • "Test matcher on codebase" code action
  4. Matcher Library

    • Save and organize custom matchers
    • Share matcher libraries across team
    • Import/export matcher collections

Long-term Enhancements

  1. AI-Assisted Matcher Generation

    • Use ML to suggest matchers based on code patterns
    • Learn from user refinements
    • Suggest related patterns to match
  2. Collaborative Features

    • Share matchers via cloud service
    • Comment and rate matchers
    • Discover popular matchers for common patterns
  3. Cross-Language Support

    • Extend to other languages with AST representations
    • Unified matcher syntax across languages
  4. IDE-Specific Features

    • Visual matcher builder GUI
    • Drag-and-drop matcher construction
    • Interactive AST explorer with matcher generation

Research Directions

  1. Automatic Bug Pattern Detection

    • Analyze codebase to find common bug patterns
    • Auto-generate matchers for detected patterns
    • Suggest custom checks based on codebase analysis
  2. Matcher Inference from Examples

    • User marks multiple code examples
    • System infers general matcher pattern
    • Active learning to refine matcher
  3. Context-Aware Suggestions

    • Suggest matchers based on current task
    • Integrate with issue tracking systems
    • Learn team-specific patterns

Extensibility Points

Custom Matcher Strategies

// Allow plugins to register custom matcher generation strategies
class MatcherGenerationStrategy {
public:
  virtual ~MatcherGenerationStrategy() = default;
  
  virtual bool canHandle(const DynTypedNode &Node) = 0;

  virtual std::string generateMatcher(
      const DynTypedNode &Node,
      const GenerateMatcherOptions &Options,
      ASTContext &Ctx) = 0;
};

// Register custom strategy
MatcherGenerator::registerStrategy(
    std::make_unique<CustomMatcherStrategy>());
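
How registered strategies are consulted is left open above. One simple policy is "first strategy that can handle the node wins, in registration order". The sketch below illustrates that policy with the standard library only. `NodeInfo` is a stand-in for the AST node, and `Strategy`, `EnumDeclStrategy`, and `StrategyRegistry` are hypothetical names, so this shows the dispatch idea rather than the actual interface.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the AST node; the proposed interface takes a DynTypedNode.
struct NodeInfo {
  std::string Kind; // e.g. "EnumDecl"
  std::string Name; // e.g. "Color"
};

class Strategy {
public:
  virtual ~Strategy() = default;
  virtual bool canHandle(const NodeInfo &Node) const = 0;
  virtual std::string generateMatcher(const NodeInfo &Node) const = 0;
};

// Example strategy for a node kind the built-in generator does not cover.
class EnumDeclStrategy : public Strategy {
public:
  bool canHandle(const NodeInfo &Node) const override {
    return Node.Kind == "EnumDecl";
  }
  std::string generateMatcher(const NodeInfo &Node) const override {
    return "enumDecl(hasName(\"" + Node.Name + "\"))";
  }
};

class StrategyRegistry {
public:
  void registerStrategy(std::unique_ptr<Strategy> S) {
    Strategies.push_back(std::move(S));
  }

  // Consults strategies in registration order; returns an empty string
  // when none of them can handle the node.
  std::string generate(const NodeInfo &Node) const {
    for (const auto &S : Strategies)
      if (S->canHandle(Node))
        return S->generateMatcher(Node);
    return "";
  }

private:
  std::vector<std::unique_ptr<Strategy>> Strategies;
};
```

An empty result lets the caller fall back to the built-in generation functions.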

Matcher Post-Processing

// Allow plugins to post-process generated matchers
class MatcherPostProcessor {
public:
  virtual ~MatcherPostProcessor() = default;
  
  virtual std::string process(
      const std::string &Matcher,
      const GenerateMatcherResult &Result) = 0;
};

Custom Output Formats

// Support different output formats
enum class OutputFormat {
  ClangQuery,      // Default clang-query format
  ClangTidyCheck,  // Ready-to-use clang-tidy check
  Documentation,   // Documented matcher with examples
  Python,          // Python binding format
  JSON             // Structured JSON representation
};
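
To make the difference between formats concrete, the sketch below renders one matcher string for two of the formats listed above. It covers only that subset, and `renderMatcher`, the reduced enum, and the bind ID `"node"` are assumptions made for illustration.

```cpp
#include <string>

// Subset of the proposed OutputFormat enum, repeated so the sketch
// compiles on its own.
enum class OutputFormat { ClangQuery, ClangTidyCheck };

// Renders a generated matcher string for the chosen output format.
inline std::string renderMatcher(const std::string &Matcher,
                                 OutputFormat Format) {
  switch (Format) {
  case OutputFormat::ClangQuery:
    // clang-query runs a matcher with its "match" command.
    return "match " + Matcher;
  case OutputFormat::ClangTidyCheck:
    // Registration call as written inside a check's registerMatchers().
    return "Finder->addMatcher(" + Matcher + ".bind(\"node\"), this);";
  }
  return Matcher;
}
```

For `ifStmt()`, the first format gives `match ifStmt()` and the second gives `Finder->addMatcher(ifStmt().bind("node"), this);`.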

Conclusion

This LSP extension targets C++ developers who work with Clang's AST matchers. By automatically generating matcher strings from cursor positions, it:

  • Reduces friction in learning AST matcher syntax
  • Accelerates development of custom static analysis tools
  • Improves accuracy by generating correct matchers directly from code
  • Enhances productivity through seamless editor integration

The implementation reuses clangd's existing AST infrastructure and does work only when a client sends a request, which makes it a practical addition to the LLVM toolchain.


Document Version: 1.0
Last Updated: 08-Nov-2025
Status: Design Proposal
