How to Use CSS Selectors in a Python Web Scraper

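How do you use CSS selectors in a Python web scraper? This article works through the question with analysis and examples, in the hope of helping more readers find a simpler and more practical approach.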


CSS Selectors

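This is another search method that does much the same job as find_all(). The syntax follows CSS: a tag name is written without any decoration, a class name is prefixed with ., and an id is prefixed with #.

Elements can be filtered in the same way in Beautiful Soup. The method to use is soup.select(), and it returns a list.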
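
As a rough illustration of the three selector forms, here is a minimal sketch that runs each one next to its find_all() counterpart. The markup and the variable names are made up for this example, and html.parser is used in place of lxml only so that the snippet needs no extra install.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for this illustration.
html = """
<div id="menu">
  <p class="item">tea</p>
  <p class="item">coffee</p>
  <span class="note">hot drinks</span>
</div>
"""

# html.parser ships with Python, so no extra parser has to be installed.
soup = BeautifulSoup(html, "html.parser")

by_tag = soup.select("p")        # tag name: written bare
by_class = soup.select(".item")  # class name: prefixed with a dot
by_id = soup.select("#menu")     # id: prefixed with a hash sign

# The same lookups written with find_all(), for comparison.
assert by_tag == soup.find_all("p")
assert by_class == soup.find_all(class_="item")
assert by_id == soup.find_all(id="menu")

print(isinstance(by_tag, list))          # True
print([p.get_text() for p in by_class])  # ['tea', 'coffee']
```

Because the result is a list, an empty match is simply an empty list rather than None, so it is worth checking the length before indexing into it.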

(1) Searching by tag name

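#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select("title"))

print(soup.select("b"))

print(soup.select("a"))

Output

[<title>The Dormouse's story</title>]
[<b>The Dormouse's story</b>]
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]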

(2) Searching by class name

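#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select(".title"))

Output

[<p class="title"><b>The Dormouse's story</b></p>]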

(3) Searching by id

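#!/usr/bin/python3
# -*- coding:utf-8 -*-
from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select("#link1"))

Output

[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]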

(4) Combined search

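#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

# Find the element with id "link1" anywhere inside a <p> tag
print(soup.select("p #link1"))

Output

[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]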
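
To make the role of the space in a combined selector more concrete, the sketch below compares a few variants on a small made-up document. The markup and variable names are hypothetical, html.parser stands in for lxml so nothing extra has to be installed, and the > child combinator is standard CSS that is not covered in the text above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one link nested inside a <p>, one directly under the <div>.
html = """
<div id="box">
  <p class="story">intro <a class="sister" href="#" id="link1">Elsie</a></p>
  <a class="sister" href="#" id="link2">Lacie</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Space = descendant: an element with id "link1" somewhere inside a <p>.
inside_p = soup.select("p #link1")
# No space = same node: a <p> that itself has id "link1" (there is none here).
same_node = soup.select("p#link1")
# Several conditions on one node: an <a> with class "sister" and id "link2".
compound = soup.select("a.sister#link2")
# ">" = direct child only: <a> elements whose parent is the <div>.
direct = soup.select("div > a")

print([a["id"] for a in inside_p])  # ['link1']
print(same_node)                    # []
print([a["id"] for a in compound])  # ['link2']
print([a["id"] for a in direct])    # ['link2']
```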

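(5) Attribute search

A selector can also include an attribute condition, which must be enclosed in square brackets. Note that the attribute and the tag belong to the same node, so there must be no space between them. With a space, the selector would look for a descendant node instead and would fail to match.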

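#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select("a[class='sister']"))

Output

[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]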
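
The effect of a stray space in an attribute selector can be checked with a short experiment. This is only a sketch on made-up markup (the second link and all variable names are hypothetical), again using html.parser so that no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two links that differ in class and href.
html = """
<p class="story">
  <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>
  <a class="brother" href="http://example.com/tom" id="link3">Tom</a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")

# No space: the attribute condition applies to the <a> tag itself.
attached = soup.select("a[class='sister']")
# With a space: looks for a node with class "sister" INSIDE an <a> tag.
spaced = soup.select("a [class='sister']")
# Any attribute can be tested the same way, not only class.
by_href = soup.select("a[href='http://example.com/tom']")

print([a["id"] for a in attached])  # ['link2']
print(spaced)                       # []
print([a["id"] for a in by_href])   # ['link3']
```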

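Likewise, attribute conditions can be combined with the search styles above: parts that refer to different nodes are separated by a space, and parts that refer to the same node are written without one.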

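#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select("p a[class='sister']"))

Output

[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]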

(6) Getting the content

Every select() call above returns a list, so you can iterate over the result and call get_text() on each element to get its content.

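#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

# Create a Beautiful Soup object, specifying the lxml parser
soup = BeautifulSoup(html, "lxml")

print(soup.select("p a[class='sister']"))

for item in soup.select("p a[class='sister']"):
    print(item.get_text())

Output

[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

Lacie
Tillie

Note: <!-- Elsie --> is a comment, so get_text() does not output it and the first line printed by the loop is empty.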
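
Beyond get_text(), a scraper usually also needs attribute values, and occasionally the comment text itself. The following is a minimal sketch of one way to collect all three. The markup is a shortened, made-up variant of the sample, the variable names are hypothetical, and html.parser is used so that the snippet has no dependency on lxml.

```python
from bs4 import BeautifulSoup, Comment

# Shortened, hypothetical variant of the sample document.
html = """
<p class="story">
  <a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>
  <a class="sister" href="http://example.com/lacie" id="link2"> Lacie </a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")

links = soup.select("p a.sister")

# strip=True trims surrounding whitespace; comment nodes are skipped entirely.
texts = [a.get_text(strip=True) for a in links]
# .get() returns None instead of raising if the attribute is missing.
hrefs = [a.get("href") for a in links]
# Comments are still in the tree and can be picked out by their node type.
comments = [str(node).strip() for node in links[0].children
            if isinstance(node, Comment)]

print(texts)     # ['', 'Lacie']
print(hrefs)     # ['http://example.com/elsie', 'http://example.com/lacie']
print(comments)  # ['Elsie']
```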

That concludes this answer to the question of how to use CSS selectors in a Python web scraper. Hopefully the examples above are of some help.
