This article explains how to use Boolean indexing in Python's Pandas library. The material is straightforward, so work through the examples below one step at a time.
1. Computing statistics on Boolean values
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # Read the movie dataset and use movie_title as the row index
    pd.options.display.max_columns = 50
    movie = pd.read_csv("./data/movie.csv", index_col="movie_title")

    # Test whether each movie runs longer than two hours (Figure 1)
    movie_2_hours = movie["duration"] > 120

    # Count the movies longer than two hours (True counts as 1)
    print(movie_2_hours.sum())  # result: 1039

    # Proportion of movies longer than two hours
    print(movie_2_hours.mean())

    # Proportions of False and True
    print(movie_2_hours.value_counts(normalize=True))

    # Compare two columns of the same DataFrame.
    # Drop rows with missing values first, because a comparison against NaN is False.
    actors = movie[["actor_1_facebook_likes", "actor_2_facebook_likes"]].dropna()
    print((actors["actor_1_facebook_likes"] > actors["actor_2_facebook_likes"]).mean())  # Figure 2
Output: Figure 1, Figure 2
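The idea in this section can be checked without the movie dataset. The following is a minimal sketch on made-up data (the index labels, durations, and like counts are all hypothetical) showing that `sum()` counts the `True` values, `mean()` gives their share, and that missing values silently compare as `False` unless they are dropped first.

```python
import pandas as pd

# Hypothetical durations in minutes; the labels are placeholders, not real titles.
durations = pd.Series(
    [95, 130, 121, 88, 150],
    index=["A", "B", "C", "D", "E"],
    name="duration",
)

# A comparison returns a Boolean Series with the same index.
is_long = durations > 120

# True is treated as 1 and False as 0.
long_count = int(is_long.sum())     # number of True values
long_share = float(is_long.mean())  # fraction of True values
print(long_count)  # 3
print(long_share)  # 0.6

# Hypothetical like counts with missing values.
likes = pd.DataFrame({"actor_1": [100, None, 50], "actor_2": [80, 70, None]})

# Any comparison involving NaN is False, so rows with gaps drag the share down.
raw_share = float((likes["actor_1"] > likes["actor_2"]).mean())

# Dropping incomplete rows first restricts the comparison to rows that have both values.
complete = likes.dropna()
clean_share = float((complete["actor_1"] > complete["actor_2"]).mean())
print(raw_share, clean_share)
```

Here one of three rows is `True` before `dropna()`, but the only complete row is `True` afterwards, which is why the article's code drops missing values before comparing the two columns.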
2. Building multiple Boolean conditions
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # Read the movie dataset and use movie_title as the row index
    pd.options.display.max_columns = 50
    movie = pd.read_csv("./data/movie.csv", index_col="movie_title")

    # Create several Boolean conditions
    criteria1 = movie.imdb_score > 8
    criteria2 = movie.content_rating == "PG-13"
    criteria3 = (movie.title_year < 2000) | (movie.title_year >= 2010)

    # Uncomment to inspect each condition (output: Figure 1)
    # print(criteria1.head())
    # print(criteria2.head())
    # print(criteria3.head())

    # Combine the conditions into a single one
    criteria_final = criteria1 & criteria2 & criteria3
    print(criteria_final.head())  # output: Figure 2
Output: Figure 1, Figure 2
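Two details of combining conditions are easy to get wrong: each comparison needs its own parentheses, because `&` and `|` bind more tightly than `<`, `>`, and `==` in Python, and the keywords `and`/`or` do not work element-wise on a Series. The sketch below uses a small made-up DataFrame (the values are hypothetical; only the column names follow the article) to illustrate both points.

```python
import pandas as pd

# Hypothetical rows; only the column names mirror the article's dataset.
df = pd.DataFrame(
    {
        "imdb_score": [8.5, 7.0, 9.0, 8.2],
        "content_rating": ["PG-13", "PG-13", "R", "PG-13"],
        "title_year": [1999, 2012, 2015, 2005],
    }
)

high_score = df["imdb_score"] > 8
pg13 = df["content_rating"] == "PG-13"

# Parentheses are required around each comparison before applying |.
outside_2000s = (df["title_year"] < 2000) | (df["title_year"] >= 2010)

# & is the element-wise AND for Boolean Series.
combined = high_score & pg13 & outside_2000s
print(combined.tolist())  # [True, False, False, False]

# The keyword `and` asks for a single truth value of the whole Series,
# which pandas refuses to guess and reports as a ValueError.
and_raised = False
try:
    high_score and pg13
except ValueError:
    and_raised = True
print(and_raised)  # True
```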
3. Filtering with Boolean indexing
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # Read the movie dataset and use movie_title as the row index
    pd.options.display.max_columns = 50
    movie = pd.read_csv("./data/movie.csv", index_col="movie_title")

    # First set of conditions: highly rated PG-13 movies from before 2000 or after 2009
    crit_a1 = movie.imdb_score > 8
    crit_a2 = movie.content_rating == "PG-13"
    crit_a3 = (movie.title_year < 2000) | (movie.title_year > 2009)
    final_crit_a = crit_a1 & crit_a2 & crit_a3

    # Second set of conditions: poorly rated R movies from 2000 through 2010
    crit_b1 = movie.imdb_score < 5
    crit_b2 = movie.content_rating == "R"
    crit_b3 = (movie.title_year >= 2000) & (movie.title_year <= 2010)
    final_crit_b = crit_b1 & crit_b2 & crit_b3

    # Combine the two sets with OR
    final_crit_all = final_crit_a | final_crit_b
    print(final_crit_all.head())  # Figure 1

    # Filter the data with the final Boolean condition
    print(movie[final_crit_all].head())  # Figure 2
Output: Figure 1, Figure 2
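As a quick sanity check on how a Boolean Series selects rows, here is a minimal sketch on made-up data (the film names and values are hypothetical). Passing the mask inside square brackets keeps the rows where it is `True`, and the `~` operator inverts the mask to get the rest.

```python
import pandas as pd

# Hypothetical films; the names and values are for illustration only.
movies = pd.DataFrame(
    {
        "imdb_score": [8.8, 4.2, 6.5, 8.1],
        "content_rating": ["PG-13", "R", "R", "R"],
    },
    index=["Film A", "Film B", "Film C", "Film D"],
)

# Two groups joined with OR: good PG-13 films, or bad R films.
good_pg13 = (movies["imdb_score"] > 8) & (movies["content_rating"] == "PG-13")
bad_r = (movies["imdb_score"] < 5) & (movies["content_rating"] == "R")
mask = good_pg13 | bad_r

# Rows where the mask is True are kept; ~mask selects the complement.
selected = movies[mask]
rejected = movies[~mask]

print(selected.index.tolist())  # ['Film A', 'Film B']
print(rejected.index.tolist())  # ['Film C', 'Film D']

# The number of True values in the mask equals the number of selected rows.
print(int(mask.sum()) == len(selected))  # True
```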
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # Read the movie dataset and use movie_title as the row index
    pd.options.display.max_columns = 50
    movie = pd.read_csv("./data/movie.csv", index_col="movie_title")

    # First set of conditions
    crit_a1 = movie.imdb_score > 8
    crit_a2 = movie.content_rating == "PG-13"
    crit_a3 = (movie.title_year < 2000) | (movie.title_year > 2009)
    final_crit_a = crit_a1 & crit_a2 & crit_a3

    # Second set of conditions
    crit_b1 = movie.imdb_score < 5
    crit_b2 = movie.content_rating == "R"
    crit_b3 = (movie.title_year >= 2000) & (movie.title_year <= 2010)
    final_crit_b = crit_b1 & crit_b2 & crit_b3

    # Combine the two sets with OR
    final_crit_all = final_crit_a | final_crit_b

    # Use loc to filter rows and select only the columns involved in the conditions,
    # which makes it easy to see whether the filter worked
    cols = ["imdb_score", "content_rating", "title_year"]
    movie_filtered = movie.loc[final_crit_all, cols]
    print(movie_filtered.head(10))
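Selecting only the columns used in the conditions also makes the filter easy to verify programmatically. The following sketch, again on hypothetical data, passes a Boolean mask as the row selector and a column list as the column selector of `.loc`, then confirms that every returned row satisfies the condition.

```python
import pandas as pd

# Hypothetical films; an extra column stands in for the rest of a wide dataset.
movies = pd.DataFrame(
    {
        "imdb_score": [8.8, 4.2, 6.5, 3.9],
        "content_rating": ["PG-13", "R", "R", "R"],
        "title_year": [2014, 2005, 2008, 1995],
        "budget": [100, 20, 35, 5],
    },
    index=["Film A", "Film B", "Film C", "Film D"],
)

# Poorly rated R films released from 2000 through 2010.
mask = (
    (movies["imdb_score"] < 5)
    & (movies["content_rating"] == "R")
    & (movies["title_year"] >= 2000)
    & (movies["title_year"] <= 2010)
)

# .loc takes the row selector first and the column selector second.
cols = ["imdb_score", "content_rating", "title_year"]
filtered = movies.loc[mask, cols]
print(filtered.index.tolist())  # ['Film B']

# Every remaining row should meet each part of the condition.
all_match = bool(
    (filtered["imdb_score"] < 5).all()
    and (filtered["content_rating"] == "R").all()
    and filtered["title_year"].between(2000, 2010).all()
)
print(all_match)  # True
```

The `and` keyword is safe in the final check because each `.all()` call already returns a single Boolean rather than a Series.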
Thanks for reading. That covers how to use Boolean indexing in Python's Pandas library. After working through this article you should have a firmer grasp of the topic, but the details are best confirmed by trying them on your own data.