@LegalizeAdulthood
Last active November 6, 2025 07:03
Configurable Replacement Check Plan for clang-tidy

Overview

This document outlines a comprehensive plan for implementing a configurable clang-tidy check that combines:

  1. AST Matching: Uses clang-query-style matcher expressions to select AST nodes
  2. Node Binding: Binds matched AST nodes to named identifiers
  3. Template Replacement: Uses Transformer's Stencil system for code generation
  4. Fix-it Generation: Generates suggested fix-its for matched code patterns

This check type enables users to define custom refactoring rules through configuration without writing C++ code.

Motivation

Currently, clang-tidy has two related but separate mechanisms:

  • QueryCheck (custom/QueryCheck.cpp): Allows AST matching via clang-query syntax with diagnostic messages
  • TransformerClangTidyCheck (utils/TransformerClangTidyCheck.cpp): Provides AST rewriting using the Transformer library

This new check combines the configurability of QueryCheck with the replacement capabilities of TransformerClangTidyCheck, allowing users to:

  • Define matchers in configuration files (no C++ code needed)
  • Bind AST nodes to identifiers
  • Specify replacement templates using simplified syntax
  • Generate automated fix-its for matched patterns
  • Leverage the proven Transformer/Stencil infrastructure for code generation

Architecture

1. Configuration Schema

The check will be configured through .clang-tidy YAML configuration with the following structure:

CheckOptions:
  - key: custom.ReplacementChecks
    value: |
      - name: prefer-emplace-back
        matcher: |
          cxxMemberCallExpr(
            on(expr(hasType(cxxRecordDecl(hasName("std::vector")))).bind("vec")),
            callee(cxxMethodDecl(hasName("push_back"))),
            hasArgument(0, cxxConstructExpr().bind("construct"))
          ).bind("call")
        replacement: "${vec}.emplace_back(${construct.args})"
        message: "prefer emplace_back over push_back for efficiency"
        
      - name: nullptr-instead-of-null
        matcher: |
          implicitCastExpr(
            hasCastKind("CK_NullToPointer"),
            has(integerLiteral(equals(0)).bind("zero"))
          ).bind("cast")
        replacement: "nullptr"
        message: "use nullptr instead of NULL or 0"

2. Key Components

A. Configuration Parser (ConfigurableReplacementCheck.h/cpp)

Header Structure:

namespace clang::tidy::custom {

struct ReplacementRule {
  std::string Name;                 // Rule identifier
  std::string MatcherStr;           // AST matcher string (clang-query syntax)
  std::string ReplacementTemplate;  // Template with ${binding} placeholders
  std::string Message;              // Diagnostic message
  DiagnosticIDs::Level Severity;    // Warning, Error, Note
};

// Inherits from TransformerClangTidyCheck to leverage existing infrastructure
class ConfigurableReplacementCheck : public TransformerClangTidyCheck {
public:
  ConfigurableReplacementCheck(StringRef Name,
                               const ReplacementRule &Rule,
                               ClangTidyContext *Context);

private:
  // Convert user-friendly config to Transformer RewriteRule
  static transformer::RewriteRuleWith<std::string> 
  makeRuleFromConfig(const ReplacementRule &Rule);
  
  // Parse the matcher string into a DynTypedMatcher
  static ast_matchers::dynamic::DynTypedMatcher 
  parseMatcherString(StringRef MatcherStr, ClangTidyContext *Context);
  
  // Convert ${binding.part} template syntax to Transformer Stencil
  static transformer::Stencil 
  parseTemplateToStencil(StringRef Template);
  
  // Helper to create Stencil for node part extraction
  static transformer::Stencil 
  makeNodePartStencil(StringRef BindingName, StringRef PartSpecifier);
};

} // namespace clang::tidy::custom

B. Matcher Parsing (Leveraging existing QueryCheck)

Reuse the parsing logic from QueryCheck.cpp:

ast_matchers::dynamic::DynTypedMatcher 
ConfigurableReplacementCheck::parseMatcherString(
    StringRef MatcherStr, ClangTidyContext *Context) {

  // QueryParser expects a full clang-query command, so prefix the bare
  // matcher expression from the configuration with "match ".
  std::string QueryText = ("match " + MatcherStr).str();

  clang::query::QuerySession QS({});
  query::QueryRef Q = query::QueryParser::parse(QueryText, QS);

  if (Q->Kind == query::QK_Match) {
    const auto &MatchQuery = llvm::cast<query::MatchQuery>(*Q);
    return MatchQuery.Matcher;
  }

  // Report parse errors as configuration diagnostics
  if (Q->Kind == query::QK_Invalid) {
    const auto &InvalidQuery = llvm::cast<query::InvalidQuery>(*Q);
    Context->configurationDiag(InvalidQuery.ErrStr, DiagnosticIDs::Error);
  }

  return {}; // Empty matcher on error
}

C. Rule Construction Using Transformer

Core Implementation - Building the RewriteRule:

transformer::RewriteRuleWith<std::string> 
ConfigurableReplacementCheck::makeRuleFromConfig(const ReplacementRule &Rule) {
  
  // Parse the matcher string into DynTypedMatcher
  auto Matcher = parseMatcherString(Rule.MatcherStr, /* Context */);

  // Convert replacement template to Stencil
  auto ReplacementStencil = parseTemplateToStencil(Rule.ReplacementTemplate);
  
  // Build the transformer rule using the Transformer API
  return transformer::makeRule(
      Matcher,
      transformer::changeTo(
          // Replaces the node bound as "root"; the matcher must bind the
          // node to replace under this name (or use the first bound node).
          transformer::node("root"),
          ReplacementStencil),
      transformer::cat(Rule.Message));
}

// Constructor delegates to TransformerClangTidyCheck
ConfigurableReplacementCheck::ConfigurableReplacementCheck(
    StringRef Name, 
    const ReplacementRule &Rule,
    ClangTidyContext *Context)
    : TransformerClangTidyCheck(
          makeRuleFromConfig(Rule), Name, Context) {}

D. Template-to-Stencil Conversion

Convert user-friendly ${binding.part} syntax to Transformer Stencil:

using namespace transformer;

transformer::Stencil 
ConfigurableReplacementCheck::parseTemplateToStencil(StringRef Template) {
  
  std::vector<Stencil> Parts;
  
  // Parse the template string for ${...} placeholders
  size_t Pos = 0;
  size_t LastPos = 0;
  
  while ((Pos = Template.find("${", LastPos)) != StringRef::npos) {
    // Add literal text before placeholder
    if (Pos > LastPos) {
      Parts.push_back(cat(Template.substr(LastPos, Pos - LastPos).str()));
    }
    
    // Find end of placeholder
    size_t EndPos = Template.find("}", Pos);
    if (EndPos == StringRef::npos) {
      // Malformed template: keep the unterminated placeholder as literal
      // text. The text before it was already added above, so resume at Pos.
      LastPos = Pos;
      break;
    }
    
    // Extract placeholder content: "binding" or "binding.part"
    StringRef Placeholder = Template.substr(Pos + 2, EndPos - Pos - 2);
    
    // Parse into binding name and optional part specifier
    auto DotPos = Placeholder.find('.');
    StringRef BindingName = (DotPos == StringRef::npos) 
        ? Placeholder 
        : Placeholder.substr(0, DotPos);
    
    StringRef PartSpecifier = (DotPos == StringRef::npos)
        ? StringRef("")
        : Placeholder.substr(DotPos + 1);
    
    // Create appropriate Stencil for this placeholder
    Parts.push_back(makeNodePartStencil(BindingName, PartSpecifier));
    
    LastPos = EndPos + 1;
  }
  
  // Add any remaining literal text
  if (LastPos < Template.size()) {
    Parts.push_back(cat(Template.substr(LastPos).str()));
  }
  
  // Combine all parts into single Stencil
  return catVector(std::move(Parts));
}
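
The scan above has no way to report an unterminated `${` or an empty binding name to the user. As a rough illustration, the same scan can be written against the standard library alone and return an error message instead. This is only a sketch: `TemplatePart`, `SplitResult`, and `splitTemplate` are hypothetical names that are not part of Transformer, and a real implementation would emit Stencils rather than plain structs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One piece of a replacement template: literal text or a placeholder.
struct TemplatePart {
  bool IsPlaceholder;
  std::string Text;     // literal text (when !IsPlaceholder)
  std::string Binding;  // binding name (when IsPlaceholder)
  std::string Part;     // part specifier; empty means the whole node
};

struct SplitResult {
  std::vector<TemplatePart> Parts;
  std::string Error;    // empty on success
};

// Splits a template such as "foo(${arg1}, ${arg0.name})" into parts.
// Reports unterminated placeholders and empty binding names as errors.
SplitResult splitTemplate(const std::string &Template) {
  SplitResult Result;
  std::size_t LastPos = 0;
  for (;;) {
    std::size_t Pos = Template.find("${", LastPos);
    if (Pos == std::string::npos)
      break;
    if (Pos > LastPos)
      Result.Parts.push_back(
          {false, Template.substr(LastPos, Pos - LastPos), "", ""});

    std::size_t EndPos = Template.find('}', Pos + 2);
    if (EndPos == std::string::npos) {
      Result.Parts.clear();
      Result.Error =
          "unterminated placeholder at offset " + std::to_string(Pos);
      return Result;
    }

    std::string Placeholder = Template.substr(Pos + 2, EndPos - Pos - 2);
    std::size_t DotPos = Placeholder.find('.');
    std::string Binding = Placeholder.substr(0, DotPos);
    std::string Part = DotPos == std::string::npos
                           ? std::string()
                           : Placeholder.substr(DotPos + 1);
    if (Binding.empty()) {
      Result.Parts.clear();
      Result.Error =
          "empty binding name at offset " + std::to_string(Pos);
      return Result;
    }

    Result.Parts.push_back({true, "", Binding, Part});
    LastPos = EndPos + 1;
  }
  if (LastPos < Template.size())
    Result.Parts.push_back({false, Template.substr(LastPos), "", ""});
  return Result;
}
```

A caller would check `Error` first and turn a non-empty message into a configuration diagnostic before building any Stencil from `Parts`.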

E. Node Part Extraction Using Stencil

Create Stencil objects for different node parts:

transformer::Stencil 
ConfigurableReplacementCheck::makeNodePartStencil(
    StringRef BindingName, StringRef PartSpecifier) {
  
  using namespace transformer;
  
  // Default: full node text. node() is a RangeSelector, so wrap it in
  // cat() to obtain a Stencil.
  if (PartSpecifier.empty()) {
    return cat(node(BindingName.str()));
  }
  
  // Extract "name" - use run() with a lambda for named declarations.
  // (access() builds a member-access expression and takes no lambda.)
  if (PartSpecifier == "name") {
    return run([Id = BindingName.str()](
                   const ast_matchers::MatchFinder::MatchResult &Result)
                   -> llvm::Expected<std::string> {
      const auto *ND = Result.Nodes.getNodeAs<NamedDecl>(Id);
      if (!ND)
        return llvm::createStringError(
            llvm::inconvertibleErrorCode(),
            "node is not a NamedDecl");
      return ND->getNameAsString();
    });
  }
  
  // Extract "type" - for typed declarations
  if (PartSpecifier == "type") {
    return run([Id = BindingName.str()](
                   const ast_matchers::MatchFinder::MatchResult &Result)
                   -> llvm::Expected<std::string> {
      const auto &Nodes = Result.Nodes;
      if (const auto *VD = Nodes.getNodeAs<ValueDecl>(Id)) {
        return VD->getType().getAsString();
      }
      if (const auto *TD = Nodes.getNodeAs<TypedefNameDecl>(Id)) {
        return TD->getUnderlyingType().getAsString();
      }
      return llvm::createStringError(
          llvm::inconvertibleErrorCode(),
          "node is not a typed declaration");
    });
  }
  
  // Extract "args" - for call expressions
  if (PartSpecifier == "args") {
    return run([Id = BindingName.str()](
                   const ast_matchers::MatchFinder::MatchResult &Result)
                   -> llvm::Expected<std::string> {
      const auto &Nodes = Result.Nodes;
      const SourceManager &SM = *Result.SourceManager;
      const LangOptions &LO = Result.Context->getLangOpts();
      
      // Try CallExpr
      if (const auto *CE = Nodes.getNodeAs<CallExpr>(Id)) {
        std::vector<std::string> Args;
        for (const Expr *Arg : CE->arguments()) {
          CharSourceRange Range = CharSourceRange::getTokenRange(
              Arg->getSourceRange());
          Args.push_back(Lexer::getSourceText(Range, SM, LO).str());
        }
        return llvm::join(Args, ", ");
      }
      
      // Try CXXConstructExpr
      if (const auto *CCE = Nodes.getNodeAs<CXXConstructExpr>(Id)) {
        std::vector<std::string> Args;
        for (const Expr *Arg : CCE->arguments()) {
          CharSourceRange Range = CharSourceRange::getTokenRange(
              Arg->getSourceRange());
          Args.push_back(Lexer::getSourceText(Range, SM, LO).str());
        }
        return llvm::join(Args, ", ");
      }
      
      return llvm::createStringError(
          llvm::inconvertibleErrorCode(),
          "node is not a call/construct expression");
    });
  }
  
  // Extract "body" - for function declarations
  if (PartSpecifier == "body") {
    return run([Id = BindingName.str()](
                   const ast_matchers::MatchFinder::MatchResult &Result)
                   -> llvm::Expected<std::string> {
      const auto &Nodes = Result.Nodes;
      const SourceManager &SM = *Result.SourceManager;
      const LangOptions &LO = Result.Context->getLangOpts();
      
      if (const auto *FD = Nodes.getNodeAs<FunctionDecl>(Id)) {
        if (const Stmt *Body = FD->getBody()) {
          CharSourceRange Range = CharSourceRange::getTokenRange(
              Body->getSourceRange());
          return Lexer::getSourceText(Range, SM, LO).str();
        }
      }
      return llvm::createStringError(
          llvm::inconvertibleErrorCode(),
          "node is not a function with body");
    });
  }
  
  // Unknown part specifier - default to full node
  return cat(node(BindingName.str()));
}
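
Because the final fallback treats an unknown specifier like a missing one, a typo such as `${var.nmae}` would quietly expand to the whole node. One possible way to catch this when the configuration is loaded is to classify the specifier first. The sketch below uses only the standard library; `NodePart` and `classifyPart` are hypothetical names, and the set of specifiers is simply the one listed in this section.

```cpp
#include <string>

// The node parts this plan describes, plus a marker for anything else.
enum class NodePart { Whole, Name, Type, Args, Body, Unknown };

// Maps the text after the '.' in ${binding.part} to a NodePart.
// An empty specifier (plain ${binding}) selects the whole node.
NodePart classifyPart(const std::string &Spec) {
  if (Spec.empty())
    return NodePart::Whole;
  if (Spec == "name")
    return NodePart::Name;
  if (Spec == "type")
    return NodePart::Type;
  if (Spec == "args")
    return NodePart::Args;
  if (Spec == "body")
    return NodePart::Body;
  return NodePart::Unknown;
}
```

With this in place, `NodePart::Unknown` could trigger the warning described under Error Handling while the code still falls back to the full node text.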

Simpler Alternative Using Transformer's RangeSelector:

For common cases, use existing Transformer utilities:

// Extraction logic lives in a separate function, declared before first use
static llvm::Expected<std::string> 
extractNodePart(const ast_matchers::MatchFinder::MatchResult &Result,
                StringRef BindingName, StringRef PartSpecifier);

transformer::Stencil 
ConfigurableReplacementCheck::makeNodePartStencil(
    StringRef BindingName, StringRef PartSpecifier) {
  
  using namespace transformer;
  
  // Use built-in Transformer functions where possible. node() and name()
  // are RangeSelectors; cat() turns them into Stencils.
  if (PartSpecifier.empty()) {
    return cat(node(BindingName.str()));
  }
  
  if (PartSpecifier == "name") {
    return cat(name(BindingName.str()));
  }
  
  // For complex extractions, use run() with custom lambda
  return run([BindingName = BindingName.str(),
              PartSpecifier = PartSpecifier.str()](
                 const ast_matchers::MatchFinder::MatchResult &Result)
                 -> llvm::Expected<std::string> {
    return extractNodePart(Result, BindingName, PartSpecifier);
  });
}

static llvm::Expected<std::string> 
extractNodePart(const ast_matchers::MatchFinder::MatchResult &Result,
                StringRef BindingName, StringRef PartSpecifier) {
  // Implementation similar to the extractors in section E above,
  // returning Expected<std::string> for error handling
  // ...
}

3. Registration and Integration

A. Module Registration

Extend CustomTidyModule.cpp to support replacement rules:

void registerCustomReplacementChecks(
    const ClangTidyOptions &Options,
    ClangTidyCheckFactories &Factories) {
  
  if (!Options.CustomReplacementChecks)
    return;
  
  for (const ReplacementRule &Rule : *Options.CustomReplacementChecks) {
    std::string CheckName = "custom-" + Rule.Name;
    
    Factories.registerCheckFactory(
        CheckName,
        [Rule](StringRef Name, ClangTidyContext *Context) {
          return std::make_unique<ConfigurableReplacementCheck>(
              Name, Rule, Context);
        });
  }
}
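
The loop above derives each check name by prefixing the rule name with "custom-" and does not guard against empty or repeated rule names. A minimal sketch of such a guard, written against the standard library only, might look like the following. `buildCheckNames` is a hypothetical helper, and skipping the offending rule is an assumption about the desired policy, not something this plan specifies.

```cpp
#include <set>
#include <string>
#include <vector>

// Derives "custom-<name>" check names from rule names. For each empty or
// repeated rule name, appends a message to Errors and skips that rule.
std::vector<std::string>
buildCheckNames(const std::vector<std::string> &RuleNames,
                std::vector<std::string> &Errors) {
  std::set<std::string> Seen;
  std::vector<std::string> CheckNames;
  for (const std::string &Name : RuleNames) {
    if (Name.empty()) {
      Errors.push_back("rule name must not be empty");
      continue;
    }
    if (!Seen.insert(Name).second) {
      Errors.push_back("duplicate rule name: " + Name);
      continue;
    }
    CheckNames.push_back("custom-" + Name);
  }
  return CheckNames;
}
```

The collected messages could then be reported as configuration diagnostics before any factory is registered.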

B. Options Integration

Add to ClangTidyOptions.h:

struct CustomReplacementRule {
  std::string Name;
  std::string Matcher;
  std::string Replacement;
  std::string Message;
  std::optional<std::string> Severity; // "warning", "error", "note"
};

struct ClangTidyOptions {
  // ... existing fields ...
  std::optional<std::vector<CustomReplacementRule>> CustomReplacementChecks;
};
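
The options struct stores Severity as an optional string, while ReplacementRule in section 2.A holds a diagnostic level, so the string has to be converted somewhere. The sketch below shows one way to do that with a stand-in enum instead of DiagnosticIDs::Level. `Severity` and `parseSeverity` are hypothetical names, and treating an unset option (passed here as an empty string) as a warning is an assumption.

```cpp
#include <string>

// Stand-in for the diagnostic level; a real check would map to
// DiagnosticIDs::Level instead.
enum class Severity { Note, Warning, Error };

// Converts the configured spelling into a Severity.
// An empty string means the option was not set and defaults to Warning.
// Returns false (leaving Out untouched) for any unrecognized spelling.
bool parseSeverity(const std::string &Text, Severity &Out) {
  if (Text.empty() || Text == "warning") {
    Out = Severity::Warning;
    return true;
  }
  if (Text == "error") {
    Out = Severity::Error;
    return true;
  }
  if (Text == "note") {
    Out = Severity::Note;
    return true;
  }
  return false;
}
```

A `false` result would be a natural place to emit a configuration diagnostic naming the offending rule.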

Implementation Plan

Phase 1: Core Infrastructure

  1. Create ConfigurableReplacementCheck.h and .cpp inheriting from TransformerClangTidyCheck
  2. Implement basic configuration parsing from YAML
  3. Integrate matcher parsing (reuse QueryCheck logic)
  4. Create basic template-to-Stencil conversion for literal text

Phase 2: Stencil Conversion Engine

  1. Implement placeholder parsing in parseTemplateToStencil()
  2. Implement basic node reference (${binding}) using transformer::node()
  3. Add support for common parts using transformer::run() with custom lambdas
  4. Leverage existing Transformer utilities where applicable

Phase 3: Node Part Extractors

  1. Implement name extractor using existing Transformer functions
  2. Implement type extractor for ValueDecl/TypedefNameDecl
  3. Implement args extractor for CallExpr/CXXConstructExpr
  4. Implement body extractor for function declarations
  5. Add error handling for incompatible node types

Phase 4: Integration and Testing

  1. Wire up ConfigurableReplacementCheck to module registration
  2. Add configuration option parsing
  3. Create unit tests using Transformer test infrastructure
  4. Test with real-world replacement scenarios

Phase 5: Advanced Features

  1. Add support for more Transformer features (conditionals via ifBound())
  2. Support multiple replacement suggestions
  3. Add include insertion using transformer::addInclude()
  4. Performance profiling and optimization

Example Use Cases

1. Replace push_back with emplace_back

- name: prefer-emplace-back
  matcher: |
    cxxMemberCallExpr(
      on(expr(hasType(cxxRecordDecl(hasName("std::vector")))).bind("vec")),
      callee(cxxMethodDecl(hasName("push_back"))),
      hasArgument(0, cxxConstructExpr().bind("construct"))
    ).bind("call")
  replacement: "${vec}.emplace_back(${construct.args})"
  message: "prefer emplace_back over push_back"

2. Use nullptr instead of NULL

- name: modernize-use-nullptr
  matcher: |
    implicitCastExpr(
      hasCastKind("CK_NullToPointer"),
      has(integerLiteral(equals(0)).bind("zero"))
    ).bind("cast")
  replacement: "nullptr"
  message: "use nullptr instead of 0 or NULL"

3. Replace C-style casts with static_cast

- name: use-static-cast
  matcher: |
    cStyleCastExpr(
      unless(isExpansionInSystemHeader()),
      hasDestinationType(pointerType()),
      has(expr().bind("expr"))
    ).bind("cast")
  replacement: "static_cast<${cast.type}>(${expr})"
  message: "use static_cast instead of C-style cast"

4. Add braces to single-statement if

- name: add-braces-to-if
  matcher: |
    ifStmt(
      hasThen(stmt(unless(compoundStmt())).bind("then"))
    ).bind("if")
  replacement: "{ ${then} }"
  message: "add braces around single-statement body"

5. Replace auto_ptr with unique_ptr

- name: replace-auto-ptr
  matcher: |
    varDecl(
      hasType(classTemplateSpecializationDecl(
        hasName("std::auto_ptr"),
        hasTemplateArgument(0, templateArgument().bind("typeArg"))
      ))
    ).bind("var")
  replacement: "std::unique_ptr<${typeArg}>"
  message: "std::auto_ptr is deprecated, use std::unique_ptr"

Advantages of Using Transformer

1. Proven Infrastructure

  • Battle-tested code generation system
  • Proper source location handling
  • Conflict detection and resolution
  • Support for macros and complex source ranges

2. Rich Feature Set

  • Stencil combinators: cat(), access(), run(), plus RangeSelectors such as node() and name() used through cat()
  • Conditional generation: ifBound(), flatten()
  • RangeSelectors: Precise control over what gets replaced
  • Include management: Automatic header insertion
  • Error handling: Expected<> types for robust error reporting

3. Maintainability

  • Leverage existing Transformer bug fixes
  • Compatibility with other Transformer-based checks
  • Consistent API across clang-tidy
  • Reduced code duplication

4. Extensibility

  • Easy to add new Stencil functions
  • Custom run() lambdas for complex extractions
  • Can compose with existing Transformer utilities
  • Future-proof as Transformer evolves

Testing Strategy

Unit Tests

Leverage Transformer test utilities:

// ConfigurableReplacementCheckTest.cpp
#include "clang/Tooling/Transformer/Transformer.h"
#include "clang/Tooling/Transformer/Stencil.h"

TEST(ConfigurableReplacementCheck, BasicReplacement) {
  ReplacementRule Rule;
  Rule.Name = "test-rule";
  Rule.MatcherStr = "integerLiteral().bind(\"lit\")";
  Rule.ReplacementTemplate = "42";
  Rule.Message = "replace with 42";
  
  // Uses TransformerClangTidyCheck test infrastructure
  EXPECT_EQ(test(Rule, "int x = 0;"), "int x = 42;");
}

TEST(ConfigurableReplacementCheck, StencilSubstitution) {
  ReplacementRule Rule;
  Rule.Name = "swap-args";
  Rule.MatcherStr = R"(
    callExpr(
      callee(functionDecl(hasName("foo"))),
      hasArgument(0, expr().bind("arg0")),
      hasArgument(1, expr().bind("arg1"))
    ).bind("call")
  )";
  Rule.ReplacementTemplate = "foo(${arg1}, ${arg0})";
  Rule.Message = "swap arguments";
  
  EXPECT_EQ(test(Rule, "foo(a, b);"), "foo(b, a);");
}

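TEST(ConfigurableReplacementCheck, ExtractName) {
  ReplacementRule Rule;
  Rule.Name = "extract-name";
  // Match uses of a variable: the replacement covers the whole matched
  // node, so only the reference itself is rewritten.
  Rule.MatcherStr = "declRefExpr(to(varDecl().bind(\"var\")))";
  Rule.ReplacementTemplate = "renamed_${var.name}";
  Rule.Message = "rename variable reference";
  
  EXPECT_EQ(test(Rule, "int x = 0; int y = x;"),
            "int x = 0; int y = renamed_x;");
}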

Integration Tests

Create .clang-tidy config files with custom rules and verify they work correctly on real codebases, ensuring compatibility with TransformerClangTidyCheck behavior.

Error Handling

Configuration Errors

  • Invalid matcher syntax -> Emit configuration diagnostic (via QueryParser)
  • Unknown binding in template -> Transformer's Expected<> returns error
  • Invalid part specifier -> Fallback to full node text, emit warning
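
An unknown binding could also be caught earlier, when the configuration is loaded, by comparing the names used in the template with the names bound in the matcher string. The following is a rough sketch using std::regex. `boundNames` and `unboundPlaceholders` are hypothetical helpers, and scanning the matcher text for `.bind("...")` is only a heuristic that would be confused by, for example, a string literal that happens to contain `.bind(`.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Collects the identifiers that appear as .bind("id") in a matcher string.
std::set<std::string> boundNames(const std::string &Matcher) {
  static const std::regex BindRe(R"re(\.bind\(\s*"([^"]+)"\s*\))re");
  std::set<std::string> Names;
  for (std::sregex_iterator It(Matcher.begin(), Matcher.end(), BindRe), End;
       It != End; ++It)
    Names.insert((*It)[1].str());
  return Names;
}

// Returns the binding names referenced as ${name} or ${name.part} in
// Template that are never bound in Matcher, in order of appearance.
std::vector<std::string> unboundPlaceholders(const std::string &Matcher,
                                             const std::string &Template) {
  static const std::regex PlaceholderRe(R"re(\$\{([^}.]+)(\.[^}]*)?\})re");
  const std::set<std::string> Bound = boundNames(Matcher);
  std::vector<std::string> Missing;
  for (std::sregex_iterator It(Template.begin(), Template.end(),
                               PlaceholderRe),
       End;
       It != End; ++It) {
    std::string Name = (*It)[1].str();
    if (Bound.count(Name) == 0)
      Missing.push_back(Name);
  }
  return Missing;
}
```

Each name returned by `unboundPlaceholders` could be reported as a configuration error naming the rule, so the user sees the problem before any file is analyzed.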

Runtime Errors

  • Node type mismatch -> Stencil returns Expected<> error, skip fix-it
  • Invalid source range -> Transformer handles gracefully
  • Conflicting replacements -> Transformer's conflict detection
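
To make "conflicting replacements" concrete: two edits conflict when their source ranges overlap. The sketch below illustrates that test on plain byte offsets. It is a stand-in for explanation only and does not reflect how Transformer or clang-tidy implement conflict detection; `Edit` and `hasConflict` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// A replacement of the half-open byte range [Begin, End) with Text.
struct Edit {
  std::size_t Begin;
  std::size_t End;
  std::string Text;
};

// Returns true if any two edits cover overlapping ranges.
// After sorting by Begin, any overlap shows up between neighbours.
bool hasConflict(std::vector<Edit> Edits) {
  std::sort(Edits.begin(), Edits.end(),
            [](const Edit &A, const Edit &B) { return A.Begin < B.Begin; });
  for (std::size_t I = 1; I < Edits.size(); ++I)
    if (Edits[I].Begin < Edits[I - 1].End)
      return true;
  return false;
}
```

Ranges that merely touch, such as [0, 3) and [3, 5), are not treated as conflicting in this sketch.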

Dependencies

Required Libraries

  • Transformer library (clang/Tooling/Transformer/)
    • Stencil.h - Template system
    • RewriteRule.h - Rule construction
    • RangeSelector.h - Source range selection
    • MatchConsumer.h - Match result processing
  • Existing clang-query infrastructure (Query.h, QueryParser.h)
  • AST Matchers (clang/ASTMatchers)
  • TransformerClangTidyCheck base class
  • YAML parser for configuration

Compatibility Considerations

  • Works with C++14 and later
  • Compatible with existing QueryCheck configuration
  • Fully compatible with TransformerClangTidyCheck (shares implementation)
  • Backward compatible with existing .clang-tidy files
  • Benefits from Transformer improvements automatically

Success Criteria

  1. Users can define custom refactoring rules via YAML configuration
  2. Rules leverage Transformer's robust Stencil system
  3. Replacements support node binding substitution
  4. Fix-its are generated with proper source location handling
  5. Performance is comparable to hand-written TransformerClangTidyCheck
  6. Configuration errors are clearly reported
  7. Documentation enables users to create their own rules
  8. Code reuse: <500 lines of new code, rest leverages Transformer

Conclusion

This revised plan provides a complete blueprint for implementing a configurable replacement check that properly leverages the Transformer library instead of rebuilding its functionality. By inheriting from TransformerClangTidyCheck and converting user-friendly configuration syntax to Transformer's Stencil system, we:

  1. Avoid reinventing the wheel - Use proven code generation infrastructure
  2. Reduce implementation complexity - Focus on config parsing and Stencil generation
  3. Ensure maintainability - Benefit from Transformer bug fixes and enhancements
  4. Provide consistency - Same behavior as other Transformer-based checks
  5. Enable extensibility - Easy to add features using Transformer's rich API

The implementation acts as a bridge between user-friendly YAML configuration and Transformer's powerful but code-centric API, making complex transformations accessible through simple configuration while maintaining robustness and performance.

Key Insight: We're not building a new transformation system - we're building a configuration layer on top of Transformer's existing, battle-tested system.
