forked from digininja/CeWL

CeWL is a Custom Word List Generator

CeWL - Custom Word List generator

Copyright(c) 2020, Robin Wood robin@digi.ninja

Based on a discussion on PaulDotCom (episode 129) about creating custom word lists by spidering a target's website and collecting unique words, I decided to write CeWL, the Custom Word List generator. CeWL is a Ruby app which spiders a given URL to a specified depth, optionally following external links, and returns a list of words which can then be used for password crackers such as John the Ripper.

By default, CeWL sticks to just the site you have specified and will go to a depth of 2 links; this behaviour can be changed by passing arguments. Be careful if setting a large depth and allowing it to go offsite, as you could end up drifting on to a lot of other domains. All words of three characters and over are output to stdout. This length can be increased and the words can be written to a file rather than the screen so the app can be automated.
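
To make the word-length filter concrete, here is a minimal sketch of the idea in plain Ruby. It is not CeWL's own code: the helper name `extract_words` and the "any run of letters is a word" rule are assumptions made for illustration only.

```ruby
# Illustrative sketch only; this is not how CeWL itself is implemented.
# extract_words is a hypothetical helper name.
def extract_words(text, min_length: 3)
  counts = Hash.new(0)
  # Assumption: treat any run of letters as a word.
  text.scan(/[[:alpha:]]+/) do |word|
    counts[word] += 1 if word.length >= min_length
  end
  # Most frequent first; ties are broken alphabetically.
  counts.sort_by { |word, count| [-count, word] }
end

sample = "The spider visits the site and the spider collects words"
extract_words(sample, min_length: 5).each do |word, count|
  puts "#{word}, #{count}"
end
```

Raising `min_length` shortens the list in the same way that increasing the minimum word length does when running CeWL.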

CeWL also has an associated command line app, FAB (Files Already Bagged), which uses the same metadata extraction techniques to create author/creator lists from already downloaded files.

If you run CeWL with Ruby 2.7, you might get warnings like this:

.../ruby-2.7.0/gems/mime-types-3.2.2/lib/mime/types/logger.rb:30: warning: `_1' is reserved for numbered parameter; consider another name

This is due to a new feature introduced in 2.7 which conflicts with one line of code in the logger script from the mime-types gem. There is an update for it in the gem's repo, so hopefully that will be released soon. Until then, as far as I can tell, the warning does not affect CeWL in any way. If, for aesthetics, you want to hide the warning, you can run the script as follows:

ruby -W0 ./cewl.rb

Homepage: https://digi.ninja/projects/cewl.php

GitHub: https://github.com/digininja/CeWL

Pronunciation

Seeing as I was asked, CeWL is pronounced "cool".

Installation

CeWL needs the following gems to be installed:

  • mime
  • mime-types
  • mini_exiftool
  • nokogiri
  • rubyzip
  • spider

The easiest way to install these gems is with Bundler:

gem install bundler
bundle install

Alternatively, you can install them manually with:

gem install xxx

The mini_exiftool gem also requires the exiftool application to be installed.
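
If you are not sure whether everything is in place, a short script along these lines can report what is missing. It is only a sketch: the helper names `missing_gems` and `executable_on_path?` are made up for this example, and it assumes the exiftool binary is called `exiftool` and lives somewhere on your PATH.

```ruby
require "rubygems"

# Gem names taken from the list above.
REQUIRED_GEMS = %w[mime mime-types mini_exiftool nokogiri rubyzip spider].freeze

# Hypothetical helper: returns the names that RubyGems cannot find.
def missing_gems(names)
  names.reject do |name|
    begin
      Gem::Specification.find_by_name(name)
      true
    rescue Gem::LoadError
      false
    end
  end
end

# Hypothetical helper: true if an executable with this name is on PATH.
def executable_on_path?(program)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, program)
    File.file?(candidate) && File.executable?(candidate)
  end
end

missing = missing_gems(REQUIRED_GEMS)
puts missing.empty? ? "All required gems found." : "Missing gems: #{missing.join(', ')}"
puts "exiftool not found on PATH." unless executable_on_path?("exiftool")
```

Note that this checks the gems visible to the Ruby running the script; gems installed by Bundler into a separate path may need the script to be run with `bundle exec`.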

Assuming you cloned the GitHub repo, the script should be executable by default, but if not, you can make it executable with:

chmod u+x ./cewl.rb

The project page on my site gives some tips on solving common problems people have encountered while running CeWL - https://digi.ninja/projects/cewl.php

Usage

./cewl.rb

CeWL 5.4.2 (Break Out) Robin Wood (robin@digi.ninja) (https://digi.ninja/)
Usage: cewl [OPTIONS] ... <url>

    OPTIONS:
	-h, --help: Show help.
	-k, --keep: Keep the downloaded file.
	-d <x>,--depth <x>: Depth to spider to, default 2.
	-m, --min_word_length: Minimum word length, default 3.
	-o, --offsite: Let the spider visit other sites.
	-w, --write: Write the output to the file.
	-u, --ua <agent>: User agent to send.
	-n, --no-words: Don't output the wordlist.
	-a, --meta: include meta data.
	--meta_file file: Output file for meta data.
	-e, --email: Include email addresses.
	--email_file <file>: Output file for email addresses.
	--meta-temp-dir <dir>: The temporary directory used by exiftool when parsing files, default /tmp.
	-c, --count: Show the count for each word found.
	-v, --verbose: Verbose.
	--debug: Extra debug information.

	Authentication
	--auth_type: Digest or basic.
	--auth_user: Authentication username.
	--auth_pass: Authentication password.

	Proxy Support
	--proxy_host: Proxy host.
	--proxy_port: Proxy port, default 8080.
	--proxy_username: Username for proxy, if required.
	--proxy_password: Password for proxy, if required.

	Headers
	--header, -H: In format name:value - can pass multiple.

    <url>: The site to spider.
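
As a rough illustration of how these options combine, the sketch below assembles an argument list for `cewl.rb` from Ruby. The helper `cewl_argv` is hypothetical and covers only a handful of the options; it assumes that `-m` and `-w` each take a value (a length and a file name), and that the script is in the current directory.

```ruby
require "shellwords"

# Hypothetical helper: builds an argument list for cewl.rb using a few of
# the options from the help text above.
# Assumption: -m takes a number and -w takes a file name.
def cewl_argv(url, depth: nil, min_word_length: nil, write: nil, count: false, email: false)
  argv = ["./cewl.rb"]
  argv.push("-d", depth.to_s) if depth
  argv.push("-m", min_word_length.to_s) if min_word_length
  argv.push("-w", write) if write
  argv << "-c" if count
  argv << "-e" if email
  argv << url
end

argv = cewl_argv("https://example.com", depth: 3, min_word_length: 5,
                 write: "words.txt", count: true)
# Print the command for inspection; run it with system(*argv) if it looks right.
puts Shellwords.join(argv)
```

Passing the arguments as an array to `system` avoids shell quoting problems with URLs that contain characters such as `&` or `?`.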

Running CeWL in a Docker container

To quickly use CeWL on your machine with Docker, you first have to build the image:

  1. Build the container:
    docker build -t cewl .
  2. Container usage without interacting with local files:
    docker run -it --rm cewl [OPTIONS] ... <url>
  3. Container usage with local files as input or output:
    # you have to mount the current directory when calling the container
    docker run -it --rm -v "${PWD}:/host" cewl [OPTIONS] ... <url>
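
If you want to drive the container from a script, the two `docker run` forms above can be wrapped as follows. This is a sketch only: `docker_cewl_argv` is a hypothetical helper, and it assumes the image was tagged `cewl` as in step 1.

```ruby
# Hypothetical helper: wraps CeWL arguments in the docker run commands
# from the list above. Assumes the image was built with the tag "cewl".
def docker_cewl_argv(cewl_args, mount_cwd: false)
  argv = %w[docker run -it --rm]
  # Step 3: mount the current directory at /host when local files are needed.
  argv.push("-v", "#{Dir.pwd}:/host") if mount_cwd
  argv + ["cewl"] + cewl_args
end

argv = docker_cewl_argv(["-d", "1", "https://example.com"], mount_cwd: true)
# Print the command for inspection; run it with system(*argv) if it looks right.
puts argv.join(" ")
```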

I want to stress that I am not going to offer any support for this. The work was done by @loris-intergalactique, who has offered to field any questions on it and give support. I don't use or know Docker, so please don't ask me for help.

Licence

This project is released under the Creative Commons Attribution-Share Alike 2.0 UK: England & Wales licence.

http://creativecommons.org/licenses/by-sa/2.0/uk/

Alternatively, you can use GPL-3+ instead of the original licence.

http://opensource.org/licenses/GPL-3.0
