Teacher Ma, Part 3: Making a Word Cloud

Following on from Part 2, this time we'll make a word cloud.

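Word clouds

A "word cloud" visually emphasizes the keywords that appear most often in a body of web text: the words are rendered together as a "keyword cloud", and the more frequent a word is, the more it stands out.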
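To make "appears most often" concrete, here is a minimal sketch that counts word frequencies with the standard library. It reuses the four sample sentences from the usage example later in this post; the counting approach is only an illustration, not necessarily how any particular word cloud library tokenizes text.

```python
from collections import Counter

corpus = [
    "this is the first document",
    "this document is the second document",
    "and this is the third one",
    "is this the first document",
]

# Split on whitespace and count how often each word occurs.
counts = Counter(" ".join(corpus).split())

# The most frequent words are the ones a word cloud would draw largest.
for word, n in counts.most_common(3):
    print(word, n)
```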

For example:

wordcloud

wordcloud is an excellent third-party Python library for rendering word clouds.

github: https://github.com/amueller/word_cloud

Install: pip install wordcloud

Usage is simple:

import wordcloud
corpus = ['this is the first document',
          'this document is the second document',
          'and this is the third one',
          'is this the first document']
wc = wordcloud.WordCloud(background_color='white')
wc.generate(' '.join(corpus))
wc.to_file("python_wordcloud.png")

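There are three steps in total:

  1. Configure the object: wc = wordcloud.WordCloud()

     Parameter         Description
     width             Width of the image, default 400 pixels
     height            Height of the image, default 200 pixels
     min_font_size     Smallest font size, default 4
     max_font_size     Largest font size; default None, in which case it is derived from the image height
     font_step         Step between font sizes, default 1
     font_path         Path to the font file (must be set explicitly for Chinese text), default None
     max_words         Maximum number of words shown, default 200
     stopwords         Set of words to exclude, i.e. words that are never drawn
     mask              Array that sets the shape of the cloud, default None (a rectangle); load an image with np.array(PIL.Image.open(...))
     background_color  Background color of the image, default black

  2. Load the text: wc.generate(' '.join(corpus)). The input can be the contents of a file, but it must be a str.
  3. Write the image: wc.to_file("python_wordcloud.png")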
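The three steps can also be driven from precomputed counts, which makes the effect of stopwords and max_words easier to see. The sketch below is an assumption-laden illustration: count_words and render are hypothetical helpers written for this post, and the stopword set is made up. The only wordcloud calls used are WordCloud(...), generate_from_frequencies() and to_file().

```python
from collections import Counter


def count_words(text, stopwords=frozenset(), max_words=200):
    """Count whitespace-separated words, skip stopwords, keep the top max_words."""
    words = [w.lower() for w in text.split()]
    kept = [w for w in words if w not in stopwords]
    return dict(Counter(kept).most_common(max_words))


def render(frequencies, path):
    """Draw a word cloud from a {word: count} dict and save it to path."""
    # Imported here so count_words can be tried without wordcloud installed.
    from wordcloud import WordCloud

    wc = WordCloud(width=400, height=200, background_color="white")  # step 1: configure
    wc.generate_from_frequencies(frequencies)                         # step 2: load the data
    wc.to_file(path)                                                  # step 3: write the image


text = "this is the first document and this is the second document"
freqs = count_words(text, stopwords={"this", "is", "the", "and"})
# render(freqs, "python_wordcloud.png")  # uncomment to write the image
```

Passing counts directly skips wordcloud's own tokenizing, so whatever filtering you want has to happen in your own code first.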

A bold idea

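import jieba
import numpy as np          # numpy and PIL are used for the mask further down
from PIL import Image
from wordcloud import WordCloud

# Read the whole text file; the with-block closes it afterwards.
with open('TeacherMa.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

# Segment the Chinese text into a list of words.
words = jieba.lcut(txt)

wc = WordCloud(
        font_path='/System/Library/fonts/STHeiti Medium.ttc',  # macOS; a font with Chinese glyphs is required
        background_color='white',
        width=800,
        height=600,
        max_font_size=50,
        min_font_size=5,
        mode='RGBA',
        colormap='ocean_r'
    )

# WordCloud expects a single str, so join the segmented words with spaces.
result = " ".join(words)
wc.generate(result)
wc.to_file("TeacherMa_wordcloud.png")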
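jieba.lcut typically returns punctuation and whitespace as tokens of their own, along with many single-character words, and these can clutter the cloud. The sketch below shows one possible cleanup step before joining the tokens; clean_tokens is a hypothetical helper, and the token list is a made-up stand-in for real jieba output.

```python
def clean_tokens(tokens, min_len=2):
    """Drop whitespace, punctuation-only tokens, and tokens shorter than min_len."""
    kept = []
    for tok in tokens:
        tok = tok.strip()
        # str.isalnum() is True for Chinese characters and False for punctuation.
        if len(tok) >= min_len and tok.isalnum():
            kept.append(tok)
    return kept


# Made-up example of what a segmenter might return for a short sentence.
tokens = ["我", "喜欢", "词云", ",", "词云", "很", "好看", "!", "\n"]
cleaned = clean_tokens(tokens)
result = " ".join(cleaned)
```

If repeated two-word phrases show up in the output, passing collocations=False to WordCloud may also help, since by default it pairs adjacent words into bigrams.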

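That doesn't look great, so I had another bold idea.
wordcloud accepts a mask parameter that sets the shape of the word cloud, so I made a mask image of Teacher Ma in Photoshop.
It looks roughly like this (use your imagination):

And then the image turned into this: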

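import jieba
import numpy as np
from PIL import Image
from wordcloud import WordCloud

# Load the mask image as an array; words are only drawn on its non-white pixels.
mask = np.array(Image.open("/Users/xinzipanghuang/PycharmProjects/bilibili_danmu/ma2.png"))

with open('TeacherMa.txt', 'r', encoding='utf-8') as f:
    txt = f.read()
words = jieba.lcut(txt)

wc = WordCloud(
        font_path='/System/Library/fonts/STHeiti Medium.ttc',  # macOS
        mask=mask,
        background_color='white',
        # width and height are ignored when a mask is given; the mask's shape is used
        width=800,
        height=600,
        max_font_size=50,
        min_font_size=5,
        mode='RGBA',
    )
result = " ".join(words)
wc.generate(result)
wc.to_file("TeacherMa_wordcloud_shape.png")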
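If you don't have a mask image at hand, a mask can also be built directly with numpy. The sketch below makes a circular one, assuming (as the wordcloud documentation describes) that pure white pixels (255) are masked out and everything else is free to draw on. circle_mask is a hypothetical helper name, not part of wordcloud.

```python
import numpy as np


def circle_mask(size):
    """Return a size x size uint8 array: 0 inside the circle, 255 (masked out) outside."""
    yy, xx = np.ogrid[:size, :size]
    center = (size - 1) / 2
    radius = size / 2
    outside = (xx - center) ** 2 + (yy - center) ** 2 > radius ** 2
    return (outside * 255).astype(np.uint8)


mask = circle_mask(300)
# Pass it in the same way as the image-based mask: WordCloud(mask=mask, ...)
```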
