How to Check Website Status with Python Asyncio

This article explains how to check the status of websites using Python's asyncio module. It works through each step in turn and then builds complete sequential and concurrent examples.

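1. How to Check HTTP Status with Asyncio

The asyncio module provides support for opening socket connections and for reading and writing data via streams. We can use this capability to check the status of web pages.

This involves four steps:

Open a connection
Write a request
Read the response
Close the connection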

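2. Open an HTTP Connection

A connection can be opened in asyncio using the asyncio.open_connection() function. Among its many arguments, the function takes a string hostname and an integer port number.

This is a coroutine that must be awaited. It returns a StreamReader and a StreamWriter for reading from and writing to the socket.

This can be used to open an HTTP connection on port 80.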

...
# open a socket connection
reader, writer = await asyncio.open_connection('www.google.com', 80)

We can also open an SSL connection by passing the ssl=True argument. This can be used to open an HTTPS connection on port 443.

...
# open a socket connection
reader, writer = await asyncio.open_connection('www.google.com', 443, ssl=True)
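As a hedged sketch of this call in isolation, the snippet below connects to a throwaway local server rather than a real website, so it runs without network access. The open_with_timeout() helper, its timeout parameter, and the local server are illustrative assumptions and not part of the original example. It uses asyncio.wait_for() to bound how long the connection attempt may take.

```python
import asyncio


async def open_with_timeout(host, port, use_ssl=False, timeout=5.0):
    """Open a stream connection, giving up after `timeout` seconds.

    Hypothetical helper for illustration; `timeout` is an assumed default.
    """
    # ssl=True asks asyncio to create a default SSL context (for HTTPS)
    kwargs = {'ssl': True} if use_ssl else {}
    return await asyncio.wait_for(
        asyncio.open_connection(host, port, **kwargs), timeout)


async def demo():
    # a do-nothing local server so the demo needs no network access
    async def handle(reader, writer):
        writer.close()

    # port 0 lets the operating system pick a free port
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await open_with_timeout('127.0.0.1', port)
        kinds = (type(reader).__name__, type(writer).__name__)
        writer.close()
        await writer.wait_closed()
    return kinds


print(asyncio.run(demo()))
```

For a real HTTPS site, the same helper would be called with port 443 and use_ssl=True.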

3. Write the HTTP Request

Once the connection is open, we can write a query to the StreamWriter to make an HTTP request. An HTTP version 1.1 request is plain text. For example, a request for the file path “/” might look as follows:

GET / HTTP/1.1
Host: www.google.com

Importantly, each line must end with a carriage return and a line feed (\r\n), and the request must end with an empty line.

As Python strings, this might look as follows:

'GET / HTTP/1.1\r\n'
'Host: www.google.com\r\n'
'\r\n'

Before it can be written to the StreamWriter, this string must be encoded as bytes. This can be done by calling the encode() method on the string itself. The default 'utf-8' encoding is probably sufficient.

...
# encode string as bytes
byte_data = string.encode()
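The request format and the encoding step can be sketched together in one small function. The build_request() name and its parameters are hypothetical, introduced only for illustration.

```python
def build_request(host, path='/'):
    """Return a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        f'GET {path} HTTP/1.1',  # request line
        f'Host: {host}',         # Host header, required by HTTP/1.1
        '',                      # the empty line that ends the header block
        '',                      # so that join() leaves a trailing \r\n
    ]
    # join with CRLF, then encode the string as bytes (utf-8 by default)
    return '\r\n'.join(lines).encode()


print(build_request('www.google.com'))
```

Building the request from a list of lines keeps each \r\n terminator in one place, which makes a missing or doubled line ending easier to spot.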

The bytes can then be written to the socket via the write() method of the StreamWriter.

...
# write query to socket
writer.write(byte_data)

After writing the request, it is a good idea to wait until the bytes have been sent and the socket is ready. This can be done with the drain() method, which is a coroutine that must be awaited.

...
# wait for the socket to be ready.
await writer.drain()

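4. Read the HTTP Response

Once the HTTP request has been made, we can read the response. This is done via the StreamReader for the socket. The response can be read using the read() method, which reads a chunk of bytes, or the readline() method, which reads one line of bytes.

We might prefer the readline() method because the status line and headers of an HTTP response are line-oriented text. The readline() method is a coroutine and must be awaited.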

...
# read one line of response
line_bytes = await reader.readline()

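An HTTP/1.1 response has two parts: a header section, which ends with an empty line, followed by the body. The header reports whether the request succeeded and what type of content will be sent. The body contains the content itself, such as an HTML web page.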
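To make the header and body split concrete, here is a minimal sketch that reads lines until the first empty line. The read_headers() helper is hypothetical, and the response is a canned byte string fed into a StreamReader rather than data from a real server.

```python
import asyncio


async def read_headers(reader):
    """Read the status line and headers, stopping at the first empty line."""
    status_line = (await reader.readline()).decode().strip()
    headers = {}
    while True:
        line = (await reader.readline()).decode().strip()
        if not line:  # an empty line (or end of stream) ends the headers
            break
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


async def demo():
    # feed a canned response into a StreamReader instead of using a socket
    reader = asyncio.StreamReader()
    reader.feed_data(b'HTTP/1.1 200 OK\r\n'
                     b'Content-Type: text/html\r\n'
                     b'Content-Length: 5\r\n'
                     b'\r\n'
                     b'hello')
    reader.feed_eof()
    return await read_headers(reader)


print(asyncio.run(demo()))
```

Only the status line is needed to check a website, so the examples in this article stop after the first readline() call.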

The first line of the HTTP header contains the HTTP status for the requested page on the server. Each line must be decoded from bytes into a string.

This can be done by calling the decode() method on the bytes. Again, the default encoding is 'utf-8'.

...
# decode bytes into a string
line_data = line_bytes.decode()
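Once decoded, the status line can be split into its three fields. The following is a minimal sketch. The parse_status_line() helper is an assumption made for illustration and does not validate malformed input.

```python
def parse_status_line(line_bytes):
    """Split a raw status line into (version, code, reason).

    Hypothetical helper for illustration only.
    """
    line = line_bytes.decode().strip()
    # split at most twice: the reason phrase may itself contain spaces
    parts = line.split(' ', 2)
    version = parts[0]
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ''
    return version, code, reason


print(parse_status_line(b'HTTP/1.1 301 Moved Permanently\r\n'))
```

Having the status code as an integer makes it straightforward to treat, say, every code in the 200 to 399 range as reachable.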

5. Close the HTTP Connection

We can close the socket connection by closing the StreamWriter. This is done by calling the close() method.

...
# close the connection
writer.close()

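Calling close() does not block and may not close the socket immediately. If we need to wait until the connection is actually closed, we can await the writer's wait_closed() method. Now that we know how to make HTTP requests and read responses using asyncio, let's look at some examples of checking the status of web pages.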

6. Example of Checking HTTP Status Sequentially

We can develop an example that checks the HTTP status of multiple websites using asyncio.

In this example, we will first develop a coroutine that checks the status of a given URL. We will then call this coroutine once for each of the top 10 websites.

First, we can define a coroutine that takes a URL string and returns the HTTP status.

# get the HTTP/S status of a webpage
async def get_status(url):
	# ...

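The URL must be parsed into its component parts. We need the hostname and the file path when making the HTTP request. We also need the URL scheme (HTTP or HTTPS) to determine whether SSL is required.

This can be done with the urllib.parse.urlsplit() function, which takes a URL string and returns a named tuple of all the URL elements.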

...
# split the url into components
url_parsed = urlsplit(url)
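As a quick illustration of what urlsplit() returns, the sketch below prints the fields the status check relies on. The second URL is an assumed example, included to show that the path is an empty string when the URL has no trailing slash.

```python
from urllib.parse import urlsplit

parts = urlsplit('https://www.google.com/')
print(parts.scheme)    # the URL scheme, used to decide whether SSL is needed
print(parts.hostname)  # the host to connect to
print(parts.path)      # the path to put in the GET request
print(parts.port)      # None when the URL does not specify a port

# an assumed example URL with no trailing slash: the path is empty
bare = urlsplit('http://example.com')
print(repr(bare.path))
```

Because an empty path would produce an invalid request line, code that accepts arbitrary URLs may want to fall back to '/' in that case. The URLs used in this article all end in a slash, so the issue does not arise here.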

We can then open the HTTP connection based on the URL scheme, using the URL hostname.

...
# open the connection
if url_parsed.scheme == 'https':
    reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
else:
    reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)

Next, we can create the HTTP GET request using the hostname and file path, and write the encoded bytes to the socket using the StreamWriter.

...
# send GET request
query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
# write query to socket
writer.write(query.encode())
# wait for the bytes to be written to the socket
await writer.drain()

Next, we can read the HTTP response. We only need the first line of the response, which contains the HTTP status.

...
# read the single line response
response = await reader.readline()

The connection can then be closed.

...
# close the connection
writer.close()

Finally, we can decode the bytes read from the server, remove trailing whitespace, and return the HTTP status.

...
# decode and strip white space
status = response.decode().strip()
# return the response
return status

Tying this together, the complete get_status() coroutine is listed below. It has no error handling, such as for a host that cannot be reached or that is slow to respond. Adding these would make a good extension for the reader.

# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status
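As one possible take on the error handling mentioned above, the sketch below wraps the same steps in a timeout and returns an error string instead of raising. The get_status_safe() name, the timeout default, the fallback to '/' for an empty path, and the use of an explicit port from the URL are all assumptions added for illustration. The demo talks to a tiny local server so that it runs without network access.

```python
import asyncio
from urllib.parse import urlsplit


async def get_status_safe(url, timeout=5.0):
    """Like get_status(), but returns an error string instead of raising.

    Hypothetical variant for illustration; `timeout` is an assumed default.
    """
    url_parsed = urlsplit(url)
    use_ssl = url_parsed.scheme == 'https'
    # honour an explicit port in the URL, else use the scheme's default
    port = url_parsed.port or (443 if use_ssl else 80)
    path = url_parsed.path or '/'

    async def fetch():
        reader, writer = await asyncio.open_connection(
            url_parsed.hostname, port, ssl=use_ssl)
        try:
            query = f'GET {path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
            writer.write(query.encode())
            await writer.drain()
            response = await reader.readline()
        finally:
            # close the connection even if the request fails part-way
            writer.close()
        return response.decode().strip()

    try:
        # bound the whole exchange, not just the connection attempt
        return await asyncio.wait_for(fetch(), timeout)
    except asyncio.TimeoutError:
        return 'ERROR: timed out'
    except OSError as exc:
        # covers refused connections, DNS failures and SSL errors
        return f'ERROR: {type(exc).__name__}'


async def demo():
    # a tiny local server that always answers 204, so no network is needed
    async def handle(reader, writer):
        # consume the request up to the empty line, then reply
        while (await reader.readline()) not in (b'\r\n', b''):
            pass
        writer.write(b'HTTP/1.1 204 No Content\r\n\r\n')
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        ok = await get_status_safe(f'http://127.0.0.1:{port}/')
    # the server is closed now, so the same URL should fail to connect
    failed = await get_status_safe(f'http://127.0.0.1:{port}/')
    return ok, failed


print(asyncio.run(demo()))
```

Returning a string for failures keeps the reporting loop unchanged. Raising a custom exception would be an equally reasonable design.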

Next, we can call the get_status() coroutine for each of the web pages or websites we want to check. In this case, we will define a list of the top 10 websites in the world.

...
# list of top 10 websites to check
sites = ['https://www.google.com/',
    'https://www.youtube.com/',
    'https://www.facebook.com/',
    'https://twitter.com/',
    'https://www.instagram.com/',
    'https://www.baidu.com/',
    'https://www.wikipedia.org/',
    'https://yandex.ru/',
    'https://yahoo.com/',
    'https://www.whatsapp.com/'
    ]

We can then query each in turn using our get_status() coroutine. In this case, we will do so sequentially in a loop and report each status in turn.

...
# check the status of all websites
for url in sites:
    # get the status for the url
    status = await get_status(url)
    # report the url and its status
    print(f'{url:30}:\t{status}')

We can do better than sequential execution when using asyncio, but this provides a good starting point that we can improve on later. Tying this together, the main() coroutine below queries the status of the top 10 websites.

# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # check the status of all websites
    for url in sites:
        # get the status for the url
        status = await get_status(url)
        # report the url and its status
        print(f'{url:30}:\t{status}')

Finally, we can create the main() coroutine and use it as the entry point to the asyncio program.

...
# run the asyncio program
asyncio.run(main())

Tying this together, the complete example is listed below.

# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit
 
# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status
 
# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # check the status of all websites
    for url in sites:
        # get the status for the url
        status = await get_status(url)
        # report the url and its status
        print(f'{url:30}:\t{status}')
 
# run the asyncio program
asyncio.run(main())

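Running the example first creates the main() coroutine and uses it as the entry point to the program. The main() coroutine runs and defines the list of the top 10 websites. The list is then traversed sequentially. For each website, main() suspends and calls the get_status() coroutine to query its status.

The get_status() coroutine runs, parses the URL, and opens a connection. It constructs an HTTP GET query and writes it to the host. The response is read, decoded, and returned. The main() coroutine then resumes and reports the HTTP status of the URL.

This is repeated for each URL in the list. The program takes about 5.6 seconds to complete, or about half a second per URL on average. This highlights how we can use asyncio to query the HTTP status of web pages.

Nevertheless, it does not take full advantage of asyncio to execute tasks concurrently.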

https://www.google.com/ : HTTP/1.1 200 OK
https://www.youtube.com/ : HTTP/1.1 200 OK
https://www.facebook.com/ : HTTP/1.1 302 Found
https://twitter.com/ : HTTP/1.1 200 OK
https://www.instagram.com/ : HTTP/1.1 200 OK
https://www.baidu.com/ : HTTP/1.1 200 OK
https://www.wikipedia.org/ : HTTP/1.1 200 OK
https://yandex.ru/ : HTTP/1.1 302 Moved temporarily
https://yahoo.com/ : HTTP/1.1 301 Moved Permanently
https://www.whatsapp.com/ : HTTP/1.1 302 Found

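7. Example of Checking Website Status Concurrently

A benefit of asyncio is that we can execute many coroutines concurrently. We can query the status of websites concurrently in asyncio using the asyncio.gather() function.

This function takes one or more coroutines, suspends the caller until they have all completed, and returns their results as a list in the same order as the arguments. We can then traverse the list of URLs together with the list of return values and report the results.

This needs only a small change to the sequential version above. First, we can create a list of coroutines.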

...
# create all coroutine requests
coros = [get_status(url) for url in sites]

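Next, we can execute the coroutines and get their results using asyncio.gather().

Note that we cannot pass the list of coroutines directly. Instead, we must unpack the list into separate expressions that are provided to the function as positional arguments.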

...
# execute all coroutines and wait
results = await asyncio.gather(*coros)

This will execute all of the coroutines concurrently and retrieve their results. We can then traverse the list of URLs and the returned statuses and report each in turn.

...
# process all results
for url, status in zip(sites, results):
    # report status
    print(f'{url:30}:\t{status}')
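One caveat with this pattern is that, by default, an exception raised in any one coroutine propagates out of asyncio.gather(). As a hedged sketch of one way around this, the snippet below passes return_exceptions=True so that failures come back in the results list alongside successes. The fake_status() coroutine and the example.* URLs are stand-ins invented for illustration so the snippet runs offline.

```python
import asyncio


async def fake_status(url):
    """Stand-in for get_status() so the sketch runs offline (hypothetical)."""
    await asyncio.sleep(0.01)
    if 'bad' in url:
        raise ConnectionError(f'cannot reach {url}')
    return 'HTTP/1.1 200 OK'


async def check_all(urls):
    coros = [fake_status(url) for url in urls]
    # exceptions are returned in place of results instead of being raised
    results = await asyncio.gather(*coros, return_exceptions=True)
    report = {}
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            report[url] = f'ERROR: {result}'
        else:
            report[url] = result
    return report


print(asyncio.run(check_all(['http://ok.example/', 'http://bad.example/'])))
```

Because the results keep the order of the arguments, zip() still pairs each URL with its own outcome.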

Tying this together, the complete concurrent example is listed below.

# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit
 
# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status
 
# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # create all coroutine requests
    coros = [get_status(url) for url in sites]
    # execute all coroutines and wait
    results = await asyncio.gather(*coros)
    # process all results
    for url, status in zip(sites, results):
        # report status
        print(f'{url:30}:\t{status}')
 
# run the asyncio program
asyncio.run(main())

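Running the example executes the main() coroutine as before. In this case, a list of coroutines is created in a list comprehension.

The asyncio.gather() function is then called with the coroutines, suspending the main() coroutine until they are all complete. The coroutines execute, querying each website concurrently and returning their status.

The main() coroutine resumes and receives the list of status values. This list is then traversed together with the list of URLs using the zip() built-in function, and the statuses are reported.

This highlights a simple way to execute coroutines concurrently and report the results once all tasks are complete. It is also faster than the sequential version above, completing in about 1.4 seconds on my system.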

https://www.google.com/ : HTTP/1.1 200 OK
https://www.youtube.com/ : HTTP/1.1 200 OK
https://www.facebook.com/ : HTTP/1.1 302 Found
https://twitter.com/ : HTTP/1.1 200 OK
https://www.instagram.com/ : HTTP/1.1 200 OK
https://www.baidu.com/ : HTTP/1.1 200 OK
https://www.wikipedia.org/ : HTTP/1.1 200 OK
https://yandex.ru/ : HTTP/1.1 302 Moved temporarily
https://yahoo.com/ : HTTP/1.1 301 Moved Permanently
https://www.whatsapp.com/ : HTTP/1.1 302 Found
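The difference between the sequential and concurrent timings can be reproduced offline with a rough sketch like the one below. Here fake_status() is a hypothetical stand-in for get_status() that simply sleeps, and the 0.05 second delay and the example URLs are assumed values, so the absolute numbers will not match the measurements reported above.

```python
import asyncio
import time


async def fake_status(url, delay=0.05):
    """Stand-in for get_status(); the delay imitates network latency."""
    await asyncio.sleep(delay)
    return 'HTTP/1.1 200 OK'


async def timed(make_awaitable):
    """Return how many seconds the awaitable takes to complete."""
    start = time.perf_counter()
    await make_awaitable()
    return time.perf_counter() - start


async def run_sequential(urls):
    # one request at a time: total time is roughly the sum of the delays
    for url in urls:
        await fake_status(url)


async def run_concurrent(urls):
    # all requests at once: total time is roughly the longest delay
    await asyncio.gather(*[fake_status(url) for url in urls])


async def compare(urls):
    sequential = await timed(lambda: run_sequential(urls))
    concurrent = await timed(lambda: run_concurrent(urls))
    return sequential, concurrent


urls = [f'http://site{i}.example/' for i in range(10)]
sequential, concurrent = asyncio.run(compare(urls))
print(f'sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s')
```

With real websites the concurrent run is bounded by the slowest host rather than by a fixed delay, which is why the speed-up is smaller than the tenfold ratio this sketch suggests.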
