GitHub REST API | Search files, content, pull requests, commits programmatically using Java without cloning

In this article we will build simple examples that search GitHub remotely through the Search API REST endpoints from Java code, without cloning anything locally. An earlier article covered the Tree API; it is linked here for reference.

GitHub REST API | Tree API to get remote repo files list & metadata recursively programmatically without cloning in local

Examples in this article:

  • Search for files with given file name & extension in given organization.
  • Search for code by content, i.e. a method name, in a given repository and language.
  • Search for open pull requests by date.
  • Search for commits by author and date range.

For testing, we will use the following organization, repository, and author in our search queries.

  • Organization – Apache
  • Repository – Apache commons-lang
  • Author – Gary Gregory (per the Apache commons-lang commit history, this user has the most commits. Kudos for the great work!)

GitHub Search API

We will use the Search API to search GitHub; the GitHub Search API documentation has the details. The Search API is driven by a search ‘query’, which is sent as a URL request parameter, as in the sample below.

https://api.github.com/search/code?q=filename:WordUtil+extension:java+org:apache

‘q’ is the key of the query parameter. The query syntax is generally “<qualifier>:<value>”, with terms separated by ‘+’ in the URL. The full syntax is extensive and is described in the Query Syntax Documentation. As we go through the examples, we will use different qualifiers and see how useful these searches can be.
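The “<qualifier>:<value>” syntax can be assembled in a few lines of Java. The sketch below is only an illustration: the class name SearchQuery and its url method are hypothetical helpers, not part of any GitHub library, and it assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: turns qualifier/value pairs into a Search API URL.
final class SearchQuery {
    private static final String BASE = "https://api.github.com/search/";

    /** Joins qualifier:value terms with '+' and appends them as the 'q' parameter. */
    static String url(String endpoint, Map<String, String> qualifiers) {
        String q = qualifiers.entrySet().stream()
                .map(e -> encode(e.getKey()) + ":" + encode(e.getValue()))
                .collect(Collectors.joining("+"));
        return BASE + endpoint + "?q=" + q;
    }

    // Percent-encodes characters that are unsafe in a query string (e.g. '/' or '>').
    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the qualifiers in insertion order.
        Map<String, String> qualifiers = new LinkedHashMap<>();
        qualifiers.put("filename", "WordUtil");
        qualifiers.put("extension", "java");
        qualifiers.put("org", "apache");
        System.out.println(url("code", qualifiers));
    }
}
```

Run as is, this prints the same sample URL shown above for the filename, extension, and org qualifiers.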


Let’s Code

We will use the Apache HttpComponents HttpClient library, together with its fluent API, to make REST calls to the GitHub API endpoints, and GSON to parse the JSON responses.

The code that makes the REST call is shared by all of the examples below.
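As a rough idea of what such a shared helper could look like, here is a minimal sketch. To stay free of external dependencies it swaps in the JDK's built-in java.net.http client (Java 11+) for the Apache fluent client, and it returns the raw JSON string instead of parsing it with GSON. The class name GitHubRestClient and its methods are hypothetical, and authentication is left out, although GitHub may require a token for some search endpoints.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper built on the JDK HTTP client, not the Apache fluent API.
final class GitHubRestClient {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    /** Builds a GET request; the Accept header selects the GitHub media type. */
    static HttpRequest buildGet(String url, String acceptHeader) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", acceptHeader)
                .GET()
                .build();
    }

    /** Sends the request and returns the raw JSON body; any non-200 status is an error. */
    static String get(String url, String acceptHeader)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                CLIENT.send(buildGet(url, acceptHeader), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("GitHub returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

A call such as GitHubRestClient.get(searchUrl, "application/vnd.github.v3+json") would then return the JSON body, which can be handed to a JSON parser of your choice.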



Search files by file name & extension

This can be achieved using the ‘Search Code API’. We will use these qualifiers in the query:

  • filename – Name of the file to search. We will search for “WordUtil”
  • extension – Extension of the file to search. We will search “.java” files.
  • org – GitHub organization to search in.

The Code API query documentation has the complete list of qualifiers.

The output shows the names of the matched files along with the repository and path of each file.



Search files by code/content with text matching

This is also achieved using the ‘Search Code’ API. We will use these qualifiers in this query:

  • word to query – This doesn’t need any qualifier key. We will search for code/method “containsAny” in this example.
  • in – Where to search. In this case files.
  • language – Which language to search. In this case we will search for the above code in Java files.
  • repo – GitHub repository to search in.

Note that we also pass an “Accept” request header with the value “application/vnd.github.v3.text-match+json”. This activates text matching, which returns the fragment of content that matched along with the position of each match within that fragment.
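The query and header described here can be put together roughly as follows. This is a sketch under assumptions: it uses the JDK's java.net.http request builder (Java 11+) rather than the Apache fluent client, the class and method names are hypothetical, and it only builds the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical example: content search with the text-match media type.
final class TextMatchSearch {
    // Media type quoted in the article; it asks GitHub for text-match metadata.
    static final String TEXT_MATCH = "application/vnd.github.v3.text-match+json";

    /** The bare search term comes first, followed by the qualifiers, joined by '+'. */
    static String queryUrl(String term, String language, String repo) {
        return "https://api.github.com/search/code?q=" + term
                + "+in:file+language:" + language + "+repo:" + repo;
    }

    /** Builds (but does not send) a GET request carrying the text-match Accept header. */
    static HttpRequest request(String term, String language, String repo) {
        return HttpRequest.newBuilder(URI.create(queryUrl(term, language, repo)))
                .header("Accept", TEXT_MATCH)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = request("containsAny", "java", "apache/commons-lang");
        System.out.println(req.uri());
        System.out.println(req.headers().firstValue("Accept").orElse(""));
    }
}
```

Without the text-match media type the same query still works; the response simply omits the matched fragments.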

The output shows the name of each file in which the content is found, along with the repository URL and the path of the file. Because of the text-match header passed in the request, it also includes fragments of the matched lines.



Search for open pull requests for given branch

This can be achieved using the ‘Search Issues/Pull Requests API’. We will use these qualifiers in the query:

  • Word in title – This doesn’t need a qualifier key. We will search for PRs with the word ‘number’ in their title.
  • type – Since we are looking for pull requests, we will give the value as ‘pr’.
  • state – Since we are interested only in open PRs, we give the value as ‘open’.
  • base – This is the branch into which the PR is intended to be merged. We give it as ‘master’.
  • repo – GitHub repository to search in.
  • sort & order – This defines how results should be sorted.

The Search Issues/PRs API query documentation has the complete list of qualifiers.
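Put together, the pull request query could be built along these lines. This is only a sketch: the class name PullRequestSearch is hypothetical, and the sort value ‘created’ with order ‘desc’ is an assumption chosen for illustration, passed as separate URL parameters next to ‘q’.

```java
// Hypothetical example: builds the search URL for open pull requests.
final class PullRequestSearch {

    /** Joins the title word and qualifiers with '+', then appends sort and order. */
    static String queryUrl(String titleWord, String baseBranch, String repo,
                           String sort, String order) {
        String q = String.join("+",
                titleWord,
                "type:pr",
                "state:open",
                "base:" + baseBranch,
                "repo:" + repo);
        return "https://api.github.com/search/issues?q=" + q
                + "&sort=" + sort + "&order=" + order;
    }

    public static void main(String[] args) {
        // 'created' and 'desc' are assumed values for sort and order.
        System.out.println(
                queryUrl("number", "master", "apache/commons-lang", "created", "desc"));
    }
}
```

Changing the sort and order arguments only affects how the results are ordered, not which pull requests match.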

The output shows the titles of the matched PRs along with the user details and the path of each pull request.



Search commits by author & date range

This can be achieved using the ‘Search Commits API’. As of the date of writing this article, the Search Commits API is available only as a developer preview and might not be suitable for production use.

We will use these qualifiers in the query:

  • author – GitHub author name whose commits need to be searched.
  • committer-date – Date range to search commits in. This can include < or > etc. for the date.

The Search Commits API query documentation has the complete list of qualifiers.
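A commit query with an author and a date range might be assembled like this. Treat it as a sketch: the class name CommitSearch is hypothetical, the username ‘garydgregory’ and the dates are assumed values for illustration, and the preview media type is the one GitHub used while the endpoint was in preview, so check the current documentation on whether it is still needed.

```java
import java.time.LocalDate;

// Hypothetical example: builds the search URL for commits by author and date range.
final class CommitSearch {
    // Accept header value used during the developer preview of commit search.
    static final String PREVIEW_MEDIA_TYPE = "application/vnd.github.cloak-preview";

    /** A 'from..to' range matches commits whose committer date lies between the two dates. */
    static String queryUrl(String author, LocalDate from, LocalDate to) {
        // LocalDate.toString() yields the ISO format YYYY-MM-DD.
        return "https://api.github.com/search/commits?q=author:" + author
                + "+committer-date:" + from + ".." + to;
    }

    public static void main(String[] args) {
        // Assumed GitHub username and an arbitrary one-year range.
        System.out.println(queryUrl("garydgregory",
                LocalDate.of(2019, 1, 1), LocalDate.of(2019, 12, 31)));
    }
}
```

For an open-ended range, the ‘from..to’ part would be replaced by a comparison such as ‘>2019-01-01’, with the ‘>’ percent-encoded as %3E in the URL.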

The output shows each commit message along with its date.



Complete Code

 

The complete code is also committed and available in the GitHub repository.

Further Reading

GitHub REST API | Get remote repo files list & download file content programmatically without cloning in local

GitHub REST API | Tree API to get remote repo files list & metadata recursively programmatically without cloning in local
