github_utils
Attributes
Functions
crawl_github_repo | Crawls a GitHub repository to retrieve file URLs based on specified criteria.
Module Contents
- github_utils.crawl_github_repo(url: str = GITHUB_REPO, is_sub_dir: bool = False, branch_or_commit_name: str = COMMIT_HASH, project_path: str = PROJECT_PATH_2, access_token=f'{GITHUB_TOKEN}')
Crawls a GitHub repository to retrieve file URLs based on specified criteria.
- Parameters:
url (str) – The GitHub repository URL or sub-directory URL.
is_sub_dir (bool) – Flag indicating if the current URL is a sub-directory.
branch_or_commit_name (str) – The branch name or commit hash to crawl.
project_path (str) – The path of the project in the repository.
access_token (str, optional) – GitHub access token for authentication. Defaults to GITHUB_TOKEN.
- Returns:
List of file URLs that match the criteria.
- Return type:
list
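The crawl described above can be illustrated with a minimal sketch. This is not the module's actual implementation, which is not shown on this page. It assumes the crawl walks GitHub's REST "repository contents" endpoint and that the "criteria" are a file-extension filter. The names `list_file_urls`, `fetch_json`, and `suffixes` are hypothetical and do not come from `github_utils`.

```python
import json
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def fetch_json(url, access_token=None):
    """Fetch one GitHub API URL and decode its JSON body (hypothetical helper)."""
    headers = {"Accept": "application/vnd.github+json"}
    if access_token:
        headers["Authorization"] = f"Bearer {access_token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_file_urls(owner, repo, path="", ref="main", suffixes=(".py",), fetch=fetch_json):
    """Recursively collect download URLs under `path` for files ending in `suffixes`.

    Assumption: the filter is by file extension; the real criteria used by
    crawl_github_repo are not documented here.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents"
    if path:
        url += "/" + quote(path)
    url += "?ref=" + quote(ref, safe="")

    entries = fetch(url)
    if isinstance(entries, dict):
        # The endpoint returns a single object when `path` points at a file.
        entries = [entries]

    urls = []
    for entry in entries:
        if entry["type"] == "dir":
            # Descend into sub-directories, keeping the same branch or commit.
            urls.extend(list_file_urls(owner, repo, entry["path"], ref, suffixes, fetch))
        elif entry["type"] == "file" and entry["name"].endswith(tuple(suffixes)):
            urls.append(entry["download_url"])
    return urls
```

The `fetch` argument is injectable so the traversal can be exercised without network access. To authenticate, a caller could pass something like `functools.partial(fetch_json, access_token=token)`, which plays the role that `access_token` plays in `crawl_github_repo`.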