GitHub dorks are specific search queries designed to find sensitive information exposed in public repositories on GitHub. They use GitHub’s search functionality to locate data such as passwords, API keys, private keys, and other confidential information that developers might accidentally commit to public repositories.
What is GitHub?
GitHub is like a big online library where people store their computer programs (we call these “code”). Imagine you have a big notebook where you write down all your favorite stories, and you share it with your friends so they can read or add more stories. GitHub is a place where programmers do just that but with computer programs.
What is a Dork?
Now, “dork” is a funny word, isn’t it? In this context, a “dork” means a special way of searching for something very specific. Just like you might look for a particular story in your notebook, programmers use “dorks” to find specific pieces of code on GitHub.
What are GitHub Dorks?
GitHub dorks are special searches that help find specific information on GitHub. Sometimes, people might accidentally put secret information in their code, like passwords or keys to important things. Using GitHub dorks, you can search to see if there is any such secret information that shouldn’t be there. A GitHub dork is similar to a Google dork, but it searches code hosted on GitHub instead of web pages. Typical targets include:
- Credentials: Usernames, passwords, and authentication tokens.
- API Keys: Keys for services like AWS, Google Cloud, and others.
- Configuration Files: Files containing sensitive setup details.
- Sensitive Data: Personal information, credit card numbers, etc.
Example of a GitHub Dork
Let’s say you want to find files that might have the word “password” in them. You can use a GitHub dork like this:
password filename:config
This search tells GitHub to look for files named “config” that have the word “password” in them. It’s like asking a friend, “Hey, can you find all the pages in our notebook where we wrote ‘password’?”
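To make the example concrete, here is a minimal sketch of turning that dork into a link you can open in a browser. It assumes the standard `https://github.com/search` page with the `q` and `type` parameters; the function name `github_code_search_url` is made up for illustration.

```python
from urllib.parse import urlencode


def github_code_search_url(query: str) -> str:
    """Return a github.com search URL for the given dork query."""
    # type=code asks for matches inside files rather than repositories or issues.
    return "https://github.com/search?" + urlencode({"q": query, "type": "code"})


print(github_code_search_url("password filename:config"))
# prints https://github.com/search?q=password+filename%3Aconfig&type=code
```

The space and the colon are percent-encoded by `urlencode`, so the query survives being pasted into a URL.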
Why is this Important?
Finding these secrets is important because we want to keep things safe. If bad people find these secrets, they might use them to do naughty things. So, programmers need to be very careful and make sure they don’t leave any important information lying around.
Protecting Your Own Information
Best Practices for Developers
Developers should follow best practices to avoid leaking sensitive information. This includes using .gitignore files to exclude sensitive files from being committed, regularly reviewing commits, and using environment variables.
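As a rough illustration of the environment-variable practice, the sketch below reads a secret at runtime instead of hard-coding it in a committed file. The helper `get_required_secret` and the variable name `DB_PASSWORD` are assumptions chosen for the example, not part of any particular framework.

```python
import os


def get_required_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    try:
        # DB_PASSWORD is a hypothetical variable name used for illustration.
        password = get_required_secret("DB_PASSWORD")
        print("DB_PASSWORD loaded, length:", len(password))
    except RuntimeError as err:
        print(err)
```

Because the value lives only in the environment, there is nothing in the repository for a dork to find. If the variables are kept in a local file such as .env, that file belongs in .gitignore.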
Using GitHub Secrets
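GitHub provides a feature called “Secrets”, which stores sensitive values in encrypted form at the repository, environment, or organization level and makes them available to GitHub Actions workflows. Use it to manage the API keys, passwords, and other confidential information those workflows need, instead of committing them to the repository.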
Types of Sensitive Information on GitHub
Category | Examples | Usage and Sensitivity |
---|---|---|
Filenames | config.yml , .env , .git-credentials , .npmrc _auth , .dockercfg , docker-compose.yml , settings.py , database.yml , secrets.yml , web.config , prod.exs , prod.secret.exs , config.json , credentials.json , oauth.json | Files containing configuration settings, credentials, and sensitive data. |
AWS Credentials | AWS_ACCESS_KEY_ID , AWS_SECRET_ACCESS_KEY , filenames with aws_access_key_id , aws_secret_access_key | Used for accessing AWS services, critical for security and cost control. |
Database Credentials | DB_USERNAME , DB_PASSWORD , DB_HOST , DB_NAME , DB_USER , DB_PASS , DATABASE_URL | Hostnames, usernames, and passwords for database connections; exposure can give direct access to stored data. |
API Keys and Tokens | api_key , apikey , API_KEY , APIKEY , API_TOKEN , api_token , client_secret , client_id , filenames with .json , filenames with .exs secret | Keys and tokens for authentication and authorization in APIs and services. |
Configuration Files | .ftpconfig , sftp-config.json , deployment-config.json , .ssh/config , .ssh/known_hosts , sshd_config , authorized_keys | Files that configure system behavior, access controls, and network settings. |
SSH Keys | .ssh/id_rsa , .ssh/id_rsa.pub , .ssh/authorized_keys , .ssh/known_hosts , .ssh/config | Used for secure remote access via SSH, critical for server security. |
Email and SMTP Credentials | smtp_password , smtp_username , MAIL_PASSWORD , MAIL_USERNAME , EMAIL_HOST_PASSWORD , EMAIL_HOST_USER | Credentials for sending emails via SMTP, often contain sensitive login information. |
Other Sensitive Information | jenkins.xml , jenkins.plugins.txt , secrets.xml , hubot-scripts.json , .bash_history , .cshrc , .bash_profile , .bashrc , .zshrc | Files containing historical commands, scripts, and application secrets. |
Miscellaneous | config.php dbpassword , wp-config.php , configuration.php , config.inc.php , config.inc | Configuration files for web applications often containing database passwords and other sensitive settings. |
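The table can also be used defensively. The sketch below checks a list of file paths against a small subset of the filenames above, so you can spot risky files in your own repository before someone else does. The set `SENSITIVE_FILENAMES` and the function `flag_sensitive_paths` are illustrative names, and the sample paths are made up.

```python
from pathlib import PurePosixPath

# A subset of the filenames listed in the table above.
SENSITIVE_FILENAMES = {
    ".env", ".git-credentials", ".npmrc", ".dockercfg", "secrets.yml",
    "credentials.json", "wp-config.php", "id_rsa", ".bash_history",
}


def flag_sensitive_paths(paths):
    """Return the paths whose final component matches a sensitive filename."""
    return [p for p in paths if PurePosixPath(p).name in SENSITIVE_FILENAMES]


# Made-up paths for illustration.
tracked = ["src/app.py", "deploy/.env", "home/.ssh/id_rsa", "README.md"]
print(flag_sensitive_paths(tracked))
# prints ['deploy/.env', 'home/.ssh/id_rsa']
```

In practice the list of paths could come from the output of `git ls-files`, which prints every file tracked in the repository.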
Searching for Specific Phrases
- “private_key”, “private key”, “BEGIN RSA PRIVATE KEY”, “BEGIN PGP PRIVATE KEY BLOCK”, “BEGIN EC PRIVATE KEY”: Phrases indicating the presence of encryption keys or sensitive cryptographic materials.
- “passphrase”, “password”, “secret”, “key”, “token”: Common terms revealing sensitive information like passwords, secrets, and authentication tokens.
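The same phrases can be turned into a simple local check. This is a minimal sketch, not a replacement for a real secret scanner: it flags lines that contain one of the private key headers, or one of the keywords followed by `=` or `:` and a value. The bare word “key” is left out here on the assumption that it would match too many harmless lines. The sample text and its password are invented.

```python
import re

# Header phrases from the list above that mark private key material.
PRIVATE_KEY_RE = re.compile(
    r"BEGIN (?:RSA|EC) PRIVATE KEY|BEGIN PGP PRIVATE KEY BLOCK"
)
# A keyword (optionally part of a longer name such as db_password or
# SECRET_KEY) followed by '=' or ':' and a non-empty value.
ASSIGNMENT_RE = re.compile(
    r"(passphrase|password|secret|token)\w*\s*[:=]\s*\S+", re.IGNORECASE
)


def find_suspect_lines(text):
    """Return (line_number, line) pairs that look like they contain a secret."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if PRIVATE_KEY_RE.search(line) or ASSIGNMENT_RE.search(line):
            hits.append((number, line.strip()))
    return hits


# Invented sample content for illustration.
sample = '''name = "demo"
password = "hunter2"
-----BEGIN RSA PRIVATE KEY-----
# rotate the token every month
'''
print(find_suspect_lines(sample))
# prints [(2, 'password = "hunter2"'), (3, '-----BEGIN RSA PRIVATE KEY-----')]
```

The last sample line mentions “token” but assigns nothing, so it is not flagged. Requiring an assignment is a deliberate trade-off that cuts down on noise from comments and documentation.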
Additional Dork Queries
- “SECRET_KEY”, “SECRET_TOKEN”, “client_secret”, “client_id”, “database_url”, “database_password”: Specific keywords often associated with API keys, tokens, database credentials, and configuration settings.
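When auditing your own projects, it helps to limit these keywords to repositories you are responsible for. The sketch below combines each keyword with GitHub's `org:` search qualifier. The organization name `example-org` and the function `org_scoped_dorks` are placeholders for illustration.

```python
# A few of the keywords listed above.
KEYWORDS = ["SECRET_KEY", "SECRET_TOKEN", "client_secret", "database_password"]


def org_scoped_dorks(org, keywords=KEYWORDS):
    """Build one search query per keyword, limited to a single organization."""
    # Quoting the keyword asks for an exact match; org: restricts the scope.
    return [f'"{kw}" org:{org}' for kw in keywords]


# example-org is a placeholder organization name.
for query in org_scoped_dorks("example-org"):
    print(query)
```

Each printed line can be pasted into GitHub's search box as it is.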
Conclusions:
GitHub Dorking can be both a helpful tool and a security risk. It helps find security issues but can also expose sensitive info. Protect your repositories by using best practices and tools.
FAQs:
- How can I protect my information on GitHub?
  To protect your information, use .gitignore files to exclude sensitive data, regularly audit your repositories, use environment variables, and take advantage of GitHub’s “Secrets” feature.
- What should I do if I find sensitive information on GitHub?
  If you find sensitive information, it’s best to report it responsibly. Contact the repository owner or use GitHub’s reporting tools to notify them of the issue. Avoid using or sharing the information without permission.