8000 The `pr view` command with the `--json files` flag omits some of the files changed · Issue #9916 · cli/cli · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

The pr view command with the --json files flag omits some of the files changed #9916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
rosslh opened this issue Nov 13, 2024 · 3 comments
Labels
bug Something isn't working discuss Feature changes that require discussion primarily among the GitHub CLI team gh-pr relating to the gh pr command needs-triage needs to be reviewed

Comments

@rosslh
Copy link
rosslh commented Nov 13, 2024

Describe the bug

The pr view command with the --json files flag does not return all of the files changed if the PR has modified more than 100 files.

CLI version:

gh version 2.61.0 (2024-11-06)
https://github.com/cli/cli/releases/tag/v2.61.0

Steps to reproduce the behavior

  1. Run this: gh pr view https://github.com/sveltejs/svelte/pull/9739 --json files
  2. View the output - it's a JSON array with 100 entries
  3. Run this: gh pr view https://github.com/sveltejs/svelte/pull/9739 --json changedFiles
  4. View the output:
{
  "changedFiles": 144
}

Expected vs actual behavior

I would expect the number of files output with the --json files flag to match the integer output with the --json changedFiles flag.

If a limit is required for performance reasons, I would expect to be able to provide a --limit flag to increase the limit, like in the search command.

I don't see the current behaviour specified anywhere in the documentation for the view command, which is why I'm classifying this as a bug.

@rosslh rosslh added the bug Something isn't working label Nov 13, 2024
@cliAutomation cliAutomation added the needs-triage needs to be reviewed label Nov 13, 2024
@andyfeller andyfeller added the gh-pr relating to the gh pr command label Nov 13, 2024
@andyfeller
Copy link
Member

Thanks for raising this issue, @rosslh, let's dig into this a bit more!

What's going on behind the scenes?

I believe this is known limitation when this was introduced in #3414 due to a number of factors:

  • GitHub GraphQL API node limits
  • technical complexities of potentially pulling multiple fields that require pagination

Though not technically a bug, I recognize it is unexpected behavior much less what someone interested in the data would desire.

👉 I want to discuss this with my fellow maintainers as I'm uncertain how realistic it is to refactor how we build such complex data objects from the GitHub API given the multiple levels of nested pagination queries involved.

For more information, see detailed stack trace behind the relevant code

8000
func viewRun(opts *ViewOptions) error {
findOptions := shared.FindOptions{
Selector: opts.SelectorArg,
Fields: defaultFields,
}
if opts.BrowserMode {
findOptions.Fields = []string{"url"}
} else if opts.Exporter != nil {
findOptions.Fields = opts.Exporter.Fields()
}
pr, baseRepo, err := opts.Finder.Find(findOptions)

func (f *finder) Find(opts FindOptions) (*api.PullRequest, ghrepo.Interface, error) {
if len(opts.Fields) == 0 {
return nil, nil, errors.New("Find error: no fields specified")
}
if repo, prNumber, err := f.parseURL(opts.Selector); err == nil {
f.prNumber = prNumber
f.repo = repo
}
if f.repo == nil {
repo, err := f.baseRepoFn()
if err != nil {
return nil, nil, err
}
f.repo = repo
}
if opts.Selector == "" {
if branch, prNumber, err := f.parseCurrentBranch(); err != nil {
return nil, nil, err
} else if prNumber > 0 {
f.prNumber = prNumber
} else {
f.branchName = branch
}
} else if f.prNumber == 0 {
// If opts.Selector is a valid number then assume it is the
// PR number unless opts.BaseBranch is specified. This is a
// special case for PR create command which will always want
// to assume that a numerical selector is a branch name rather
// than PR number.
prNumber, err := strconv.Atoi(strings.TrimPrefix(opts.Selector, "#"))
if opts.BaseBranch == "" && err == nil {
f.prNumber = prNumber
} else {
f.branchName = opts.Selector
}
}
httpClient, err := f.httpClient()
if err != nil {
return nil, nil, err
}
// TODO(josebalius): Should we be guarding here?
if f.progress != nil {
f.progress.StartProgressIndicator()
defer f.progress.StopProgressIndicator()
}
fields := set.NewStringSet()
fields.AddValues(opts.Fields)
numberFieldOnly := fields.Len() == 1 && fields.Contains("number")
fields.AddValues([]string{"id", "number"}) // for additional preload queries below
if fields.Contains("isInMergeQueue") || fields.Contains("isMergeQueueEnabled") {
cachedClient := api.NewCachedHTTPClient(httpClient, time.Hour*24)
detector := fd.NewDetector(cachedClient, f.repo.RepoHost())
prFeatures, err := detector.PullRequestFeatures()
if err != nil {
return nil, nil, err
}
if !prFeatures.MergeQueue {
fields.Remove("isInMergeQueue")
fields.Remove("isMergeQueueEnabled")
}
}
var getProjectItems bool
if fields.Contains("projectItems") {
getProjectItems = true
fields.Remove("projectItems")
}
var pr *api.PullRequest
if f.prNumber > 0 {
if numberFieldOnly {
// avoid hitting the API if we already have all the information
return &api.PullRequest{Number: f.prNumber}, f.repo, nil
}
pr, err = findByNumber(httpClient, f.repo, f.prNumber, fields.ToSlice())
} else {

func findByNumber(httpClient *http.Client, repo ghrepo.Interface, number int, fields []string) (*api.PullRequest, error) {
type response struct {
Repository struct {
PullRequest api.PullRequest
}
}
query := fmt.Sprintf(`
query PullRequestByNumber($owner: String!, $repo: String!, $pr_number: Int!) {
repository(owner: $owner, name: $repo) {
pullRequest(number: $pr_number) {%s}
}
}`, api.PullRequestGraphQL(fields))
variables := map[string]interface{}{
"owner": repo.RepoOwner(),
"repo": repo.RepoName(),
"pr_number": number,
}
var resp response
client := api.NewClientFromHTTP(httpClient)
err := client.GraphQL(repo.RepoHost(), query, variables, &resp)
if err != nil {
return nil, err
}
return &resp.Repository.PullRequest, nil
}

cli/api/query_builder.go

Lines 375 to 382 in 8990b1c

// PullRequestGraphQL constructs a GraphQL query fragment for a set of pull request fields.
// It will try to sanitize the fields to just those available on pull request.
func PullRequestGraphQL(fields []string) string {
s := set.NewStringSet()
s.AddValues(fields)
s.RemoveValues(issueOnlyFields)
return IssueGraphQL(s.ToSlice())
}

cli/api/query_builder.go

Lines 313 to 373 in 8990b1c

// IssueGraphQL constructs a GraphQL query fragment for a set of issue fields.
func IssueGraphQL(fields []string) string {
var q []string
for _, field := range fields {
switch field {
case "author":
q = append(q, `author{login,...on User{id,name}}`)
case "mergedBy":
q = append(q, `mergedBy{login,...on User{id,name}}`)
case "headRepositoryOwner":
q = append(q, `headRepositoryOwner{id,login,...on User{name}}`)
case "headRepository":
q = append(q, `headRepository{id,name}`)
case "assignees":
q = append(q, `assignees(first:100){nodes{id,login,name},totalCount}`)
case "labels":
q = append(q, `labels(first:100){nodes{id,name,description,color},totalCount}`)
case "projectCards":
q = append(q, `projectCards(first:100){nodes{project{name}column{name}},totalCount}`)
case "projectItems":
q = append(q, `projectItems(first:100){nodes{id, project{id,title}, status:fieldValueByName(name: "Status") { ... on ProjectV2ItemFieldSingleSelectValue{optionId,name}}},totalCount}`)
case "milestone":
q = append(q, `milestone{number,title,description,dueOn}`)
case "reactionGroups":
q = append(q, `reactionGroups{content,users{totalCount}}`)
case "mergeCommit":
q = append(q, `mergeCommit{oid}`)
case "potentialMergeCommit":
q = append(q, `potentialMergeCommit{oid}`)
case "autoMergeRequest":
q = append(q, autoMergeRequest)
case "comments":
q = append(q, issueComments)
case "lastComment": // pseudo-field
q = append(q, issueCommentLast)
case "reviewRequests":
q = append(q, prReviewRequests)
case "reviews":
q = append(q, prReviews)
case "latestReviews":
q = append(q, prLatestReviews)
case "files":
q = append(q, prFiles)
case "commits":
q = append(q, prCommits)
case "lastCommit": // pseudo-field
q = append(q, `commits(last:1){nodes{commit{oid}}}`)
case "commitsCount": // pseudo-field
q = append(q, `commits{totalCount}`)
case "requiresStrictStatusChecks": // pseudo-field
q = append(q, `baseRef{branchProtectionRule{requiresStrictStatusChecks}}`)
case "statusCheckRollup":
q = append(q, StatusCheckRollupGraphQLWithoutCountByState(""))
case "statusCheckRollupWithCountByState": // pseudo-field
q = append(q, StatusCheckRollupGraphQLWithCountByState())
default:
q = append(q, field)
}
}
return strings.Join(q, ",")
}

cli/api/query_builder.go

Lines 104 to 112 in 8990b1c

var prFiles = shortenQuery(`
files(first: 100) {
nodes {
additions,
deletions,
path
}
}
`)

What can be done in short term?

gh api calls can be made similar to below that will pull all of this information for your use case:

QUERY='
query($endCursor: String){
  repository(name:"svelte", owner:"sveltejs"){
    pullRequest(number: 9739){
      files(first:100, after: $endCursor){
        totalCount
        pageInfo{
          endCursor
          hasNextPage
        }
        nodes{
          path
        }
      }
    }
  }
}'

gh api graphql -F query="$QUERY" --paginate --slurp | jq ".[].data.repository.pullRequest.files.nodes | length"

@andyfeller andyfeller added the discuss Feature changes that require discussion primarily among the GitHub CLI team label Nov 13, 2024
@rosslh
Copy link
Author
rosslh commented Nov 13, 2024

I appreciate the quick response, Andy! I was able to get something working with the gh api example you provided, so thanks for that 🙂

@andyfeller
Copy link
Member

Just in the context of this issue, there are several issue and PR fields that only retrieve 100 items:

Discussing this with the core maintainers, updating the documentation for the gh issue view, gh pr view commands as providing convenience to these fields with guidance to use gh api to pull complete information is the least we can do for right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working discuss Feature changes that require discussion primarily among the GitHub CLI team gh-pr relating to the gh pr command needs-triage needs to be reviewed
Projects
None yet
Development

No branches or pull requests

3 participants
0