
An example of web email scanning in Python (with source code)

Views: 96 | Date: 2022-06-23 17:13:37

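Information gathering is a key part of penetration testing, and having plenty of information matters a great deal to an attacker. For example, if we know a server's version information, we can test that server against known vulnerabilities in its framework. And if we have the email address of the server's administrator, we can launch a phishing attack. Scanning a web site for email addresses is therefore a prerequisite for a phishing attack.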

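Below, we use a Python script to scan and crawl a web site for email addresses. The goal is to learn Python while building the script.

The complete code is at the end.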

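Basic idea

1. After a target site is passed to the tool, first run a basic check and analysis of the input, because the address may arrive in various forms, such as http://www.xxxx.com/ or http://www.xxxx.com/123/456/789.html. We split it into its parts so that links are easier to crawl later.
2. Fetch the content of the target address with the requests library, and search that content for email addresses with a regular expression.
3. Find the hyperlinks in the fetched page. Through these links we can reach other pages of the site and keep looking for the email addresses we want.

Getting started: the libraries this script needs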

    from bs4 import BeautifulSoup  # extracts data from web pages; converts the input document to Unicode automatically
    import requests  # a simple, easy-to-use HTTP library for Python
    import requests.exceptions
    import urllib.parse
    from collections import deque  # double-ended queue: a good fit for frequent appends and pops at both ends; for random access use a list instead
    import re  # regular expression library

Getting the scan target

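    user_url = str(input('[+] Enter Target URL to Scan: '))
    urls = deque([user_url])  # put the target address into the deque of URLs still to crawl
    scraped_urls = set()  # set() creates an unordered collection without duplicates; it also supports intersection, difference, and union
    emails = set()

Scraping the pages for email addresses (100 links)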

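First we analyze the target address and split it into its scheme, domain, and path. Then we fetch the page with the get method of requests and use a regular expression to pick out the content that looks like an email address: '[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+'. Anything that matches the email format is collected.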
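The two steps described above, splitting the target address and filtering text with the regular expression, can be tried without any network access. The following is a minimal sketch, not part of the original script: the helper names `split_target` and `extract_emails` and the sample HTML string are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# The script's pattern, with the dots and the hyphen escaped.
EMAIL_RE = re.compile(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+', re.I)


def split_target(url):
    """Return (base_url, path) derived the same way the script derives them."""
    parts = urlsplit(url)
    base_url = '{0.scheme}://{0.netloc}'.format(parts)  # scheme + domain
    # Everything up to and including the last '/', or the whole URL if it has no path.
    path = url[:url.rfind('/') + 1] if '/' in parts.path else url
    return base_url, path


def extract_emails(text):
    """Return the set of email-like strings found in text."""
    return set(EMAIL_RE.findall(text))


# Hypothetical page content, used only for illustration.
sample_html = '<p>Contact: Admin@example.com or sales+eu@mail.example.org</p>'

print(split_target('http://www.xxxx.com/123/456/789.html'))
print(sorted(extract_emails(sample_html)))
```

Because the pattern is compiled with `re.I`, addresses are matched regardless of case and are returned exactly as they appear in the page.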

    count = 0
    try:
        while len(urls):  # keep looping while urls is not empty
            count += 1  # counter that records how many links have been crawled
            if count == 101:  # stop once 100 links have been processed
                break
            url = urls.popleft()  # popleft() removes the leftmost item of urls and returns it
            scraped_urls.add(url)

            parts = urllib.parse.urlsplit(url)
            # printing parts shows: SplitResult(scheme='http', netloc='www.baidu.com', path='', query='', fragment='')
            base_url = '{0.scheme}://{0.netloc}'.format(parts)  # scheme: protocol; netloc: domain
            path = url[:url.rfind('/') + 1] if '/' in parts.path else url  # extract the path
            print('[%d] Processing %s' % (count, url))
            try:
                head = {'User-Agent': 'Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11'}
                response = requests.get(url, headers=head)
            except (requests.exceptions.MissingSchema, requests.exceptions.ConnectionError):
                continue
            # extract email addresses from the fetched page with the regular expression; re.I ignores case
            new_emails = set(re.findall(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+', response.text, re.I))
            emails.update(new_emails)  # store the addresses found in emails

Following anchors to the next page to continue the search
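The script's link handling turns each href into an absolute URL by prepending base_url or path by hand. One case to watch: when the target has no path at all (for example http://www.xxxx.com), path is the URL itself, so a relative link such as contact.html is glued on without a separating slash. As an alternative, not the author's method, the standard library's `urllib.parse.urljoin` resolves links against the current page. The sketch below assumes a hypothetical helper named `normalize_link` and made-up example links.

```python
from urllib.parse import urljoin, urlsplit


def normalize_link(page_url, href):
    """Resolve href against page_url; return None for non-HTTP(S) links."""
    absolute = urljoin(page_url, href)
    if urlsplit(absolute).scheme not in ('http', 'https'):
        return None  # e.g. mailto: or javascript: links cannot be crawled
    return absolute


# Hypothetical page URL and links, used only for illustration.
page = 'http://www.xxxx.com/123/456/789.html'
for href in ('/about.html', 'contact.html', 'http://other.example/x', 'mailto:admin@xxxx.com'):
    print(href, '->', normalize_link(page, href))
```

A link that comes back as None would simply be skipped instead of being appended to the queue.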

            soup = BeautifulSoup(response.text, features='lxml')  # the 'lxml' parser requires the lxml package
            for anchor in soup.find_all('a'):  # find the anchors; in HTML an <a> tag is a hyperlink and its href attribute is the link address
                # if the tag has an href attribute, its value is the link we want
                link = anchor.attrs['href'] if 'href' in anchor.attrs else ''
                if link.startswith('/'):
                    # a link starting with / is only a path, so prepend the scheme and domain (base_url, split out earlier)
                    link = base_url + link
                elif not link.startswith('http'):
                    # a link starting with neither / nor http needs the current path in front of it
                    link = path + link
                if link not in urls and link not in scraped_urls:
                    # queue the link if it has not been collected before
                    urls.append(link)
    except KeyboardInterrupt:
        print('[+] Closing')

    for mail in emails:
        print(mail)

Complete code

    from bs4 import BeautifulSoup
    import requests
    import requests.exceptions
    import urllib.parse
    from collections import deque
    import re

    user_url = str(input('[+] Enter Target URL to Scan: '))
    urls = deque([user_url])
    scraped_urls = set()
    emails = set()

    count = 0
    try:
        while len(urls):
            count += 1
            if count == 101:  # stop once 100 links have been processed, as in the walkthrough
                break
            url = urls.popleft()
            scraped_urls.add(url)

            parts = urllib.parse.urlsplit(url)
            base_url = '{0.scheme}://{0.netloc}'.format(parts)
            path = url[:url.rfind('/') + 1] if '/' in parts.path else url
            print('[%d] Processing %s' % (count, url))
            try:
                head = {'User-Agent': 'Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11'}
                response = requests.get(url, headers=head)
            except (requests.exceptions.MissingSchema, requests.exceptions.ConnectionError):
                continue
            new_emails = set(re.findall(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+', response.text, re.I))
            emails.update(new_emails)

            soup = BeautifulSoup(response.text, features='lxml')
            for anchor in soup.find_all('a'):
                link = anchor.attrs['href'] if 'href' in anchor.attrs else ''
                if link.startswith('/'):
                    link = base_url + link
                elif not link.startswith('http'):
                    link = path + link
                if link not in urls and link not in scraped_urls:
                    urls.append(link)
    except KeyboardInterrupt:
        print('[+] Closing')

    for mail in emails:
        print(mail)

Experiment ………………


