Skip to content

Project improvement opportunities — code quality, tests, docs, CI #29

Description

@Lslightly

by deepseek-v4-flash

Codebase audit result. Grouped by area for prioritization.

1. Code Quality

1.1 Silent error swallowing

Multiple places ignore exec.Cmd.Run() errors:

  • cmd/batch_clone_build/extgen.gogobuildM2, adaptEscape, genscript all use _ = cmd.Run()
  • cmd/batch_clone_build/build.gobuild() doesn't check cmd.Run() error separately from context deadline
  • cmd/codeql_qdriver/main.go:135 — query phase error written to stderr file but not propagated
  • cmd/codeql_qdriver/main.go:235 — decode phase _ = cmd.Run()

Fix: at minimum log.Warn on non-fatal failures; propagate errors upward in library code.

1.2 Busy-wait loops in CSV collection

cmd/codeql_qdriver/main.go:337-338 and 362-363:

for !headerWritten.Load() {}

Spin-wait burns CPU. Replace with sync.Cond or channel coordination.

1.3 Global mutable state

  • config/artifact.go:106var Nowstr package-level string
  • config/artifact.go:108var logDirCache global map, not concurrent-safe
  • cmd/codeql_result_parser/analyzer.go:28var cleanedDir global map
  • cmd/codeql_result_parser/main.go:34var ql2analyzer global map

Blocks parallel testing and introduces subtle race conditions.

1.4 log.Fatalf in library/non-main code

  • config/repo.go:29RemoteURL() fatally logs
  • config/repogroup.go:85reposInDir fatally logs
  • config/query.go:18,29CreateQuery fatally logs

These should return error and let the caller decide how to handle.

1.5 Regex compiled in hot loop

cmd/escape_adapter/movedtoheap.go:49,85regexp.MustCompile runs on every log line. Should be lifted to package level.

1.6 Unreachable code

cmd/batch_clone_build/utils.go:11-13bypass[T] calls log.Fatal(err) then returns zero value. If log.Fatal doesn't terminate (e.g. flags modified), returns silently wrong data.

1.7 Dead code

utils/exec.go:24cmd.Dir assigned twice (lines 17 and 24).

1.8 Primitive command parsing

cmd/batch_clone_build/extgen.go:103strings.Split(script, " ") to parse CLI commands. Breaks on quoted arguments. Use strings.Fields as minimum, or a proper shell parser.

1.9 ANSI escape codes in log output

config/artifact.go:78log.Println("\033[33mWARNING: ...") dumps raw escape codes to log files and CI output.

2. Test Coverage

2.1 Near-zero unit test coverage

  • config/ package: 0 tests
  • utils/ package: 0 tests
  • cmd/codeql_qdriver/main.go: 0 tests
  • cmd/codeql_result_parser/: 0 tests
  • cmd/pprof2qlcsv/convert/: 0 tests
  • cmd/pprof-external-verify/: 0 tests

2.2 Existing tests are integration-only

  • clone_test.go requires network + git clone
  • movedtoheap_test.go requires CodeQL CLI + Go compiler
  • Cannot run in offline/lightweight CI environments

2.3 repos/test/pkgcall package name conflict

a_te.go declares package pkgcall_test, bigInt.go declares package pkgcall. go vet reports this. Should be fixed.

3. Architecture

3.1 Overly large package main files

  • cmd/batch_clone_build/build.go — 228 lines
  • cmd/codeql_qdriver/main.go — 371 lines
  • cmd/pprof-external-verify/instance_cnt.go — 407 lines

These cannot be unit-tested individually. Extract logic into testable internal packages.

3.2 Complex CSV collection concurrency

collectCSVsForQuery uses atomic.Int64 + atomic.Bool + spin-wait to coordinate header writing. sync.Once + channel would be cleaner and safer.

3.3 PassLogDir cache has no invalidation

config/artifact.go logDirCache caches by pass name, but a second call with the same pass in a different logical run returns stale path.

4. Dependencies

4.1 gorm.io/gorm listed in go.mod but unused

Dead dependency. Should be removed.

4.2 gopkg.in/yaml.v3 listed as indirect but github.com/goccy/go-yaml is the actual yaml lib used

Verify if transitive dependency can be pruned.

5. CI & Tooling

5.1 No linter config

No .golangci.yml — CI can't enforce code quality automatically.

5.2 No Makefile / Taskfile

Only make.bash (simple wrapper). Missing targets: test, lint, vet, clean.

5.3 CI scripts lack error isolation

scripts/ci.sh uses set -x and if ! ...; then exit 1 — no cleanup, no parallel stages.

6. Documentation

6.1 Mixed languages

doc/yaml-configuration.zh.md is Chinese, codeql_rename_query/README.md and codeql_result_parser/README.md are Chinese, most others are English. No consistent convention.

6.2 Stale README notes

  • cmd/codeql_rename_query/README.md: "NOTICE: this tool is not well tested"
  • cmd/codeql_result_parser/README.md: Unimplemented todo "后续可引入csv上的QL查询"

7. CSV Format Inconsistency

escape_adapter outputs CSV with a header row; pprof2qlcsv/convert outputs CSV without a header row. The verifier (pprof-external-verify/instance_cnt.go:238) documents "The file does not contain a header row" but this assumption breaks if format ever changes.


Suggested priority: 1.1 (silent errors) > 2.1 (test coverage) > 1.3 (global state) > 1.4 (log.Fatalf) > rest.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions