
Hands-On: Analyzing Data and Making Charts with ChatGPT-4

3 min read · Jan 11, 2024

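Below, I use ChatGPT-4 to analyze the Global YouTube Statistics 2023 dataset. It contains information about each channel, e.g. channel rank, channel category, subscriber count, view count, and estimated earnings. Using this data, we will analyze which factors and topics make a channel popular.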

Part 1. Data Analysis

Part 2. Data Visualization

Part 3. Thoughts on Using ChatGPT

Part 1. Data Analysis

  • Please briefly introduce the data in the file.
    👉🏻 It reports the number of rows, describes the columns, shows sample records, and suggests uses for the data.
Q. Please briefly introduce the data in the file.
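The article does not reproduce the code ChatGPT ran for this overview. As a rough idea of what such a summary involves, here is a minimal standard-library sketch. The CSV text, its column names, and the `summarize` helper are all invented for illustration and are not the real schema of the dataset.

```python
import csv
import io

# Made-up rows for illustration only; the column names are assumptions,
# not the real schema of the dataset.
SAMPLE_CSV = """rank,title,channel_type,subscribers,video_views
1,Channel A,Music,300,9000
2,Channel B,Entertainment,200,7000
3,Channel C,Education,100,2000
"""

def summarize(csv_text):
    """Return the row count, column names, and first row as an example."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        "row_count": len(rows),
        "columns": reader.fieldnames,
        "example": rows[0] if rows else None,
    }

summary = summarize(SAMPLE_CSV)
print(summary["row_count"], summary["columns"])
print(summary["example"])
```

To run this against a real file, one would pass the file's contents to `summarize` instead of the toy string.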
  • As a data scientist, please analyze the three most important factors that lead to top-ranking channels.
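    👉🏻 It draws a heatmap of the correlation coefficients. Every factor is negatively correlated with rank, e.g. the more subscribers a channel has (a larger number), the higher it ranks (a smaller rank number). It then lists the three factors with the lowest (most negative) correlation coefficients: Subscribers, Video Views, Subscribers for Last 30 Days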
Q: As a data scientist, please analyze the three most important factors that lead to top-ranking channels.
A correlation coefficient heatmap by ChatGPT
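To make the negative sign concrete, the sketch below computes a Pearson correlation coefficient by hand on a few toy values. It assumes the heatmap uses Pearson correlation, which the article does not state. The numbers and the `pearson` helper are invented for illustration and are not taken from the dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy values, invented for illustration; rank 1 is the top channel.
rank = [1, 2, 3, 4, 5]
factors = {
    "subscribers": [250, 170, 165, 160, 110],
    "video_views": [230, 150, 160, 90, 95],
}

# Correlation of each factor with rank, most negative first.
rank_corr = {name: pearson(rank, values) for name, values in factors.items()}
for name, r in sorted(rank_corr.items(), key=lambda item: item[1]):
    print(f"{name}: {r:.2f}")
```

Because a better channel has a smaller rank number, factors that grow with popularity come out with negative coefficients here, which is the pattern the heatmap shows.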
  • What are the top three most important factors that lead to higher highest_monthly_earnings ?
    👉🏻 The three factors with the highest correlation to monthly earnings: Subscribers for Last 30 Days (0.65), Video Views (0.57), Subscribers (0.47)
Q: What are the top three most important factors that lead to higher highest_monthly_earnings ?

Follow-up question: please list correlation as a table

Q: Please list correlation as a table
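The table itself survives only as an image, so the sketch below rebuilds a plain-text version from the three coefficients reported above for highest_monthly_earnings. The `correlation_table` helper and the column layout are assumptions for illustration. ChatGPT's actual table may have contained more rows.

```python
# Coefficients as reported in the article for highest_monthly_earnings.
reported = {
    "Subscribers": 0.47,
    "Subscribers for Last 30 Days": 0.65,
    "Video Views": 0.57,
}

def correlation_table(coeffs):
    """Format factor/coefficient pairs as text rows, strongest first."""
    ordered = sorted(coeffs.items(), key=lambda item: item[1], reverse=True)
    width = max(len(name) for name in coeffs)
    header = f"{'Factor'.ljust(width)}  Correlation"
    rows = [f"{name.ljust(width)}  {value:.2f}" for name, value in ordered]
    return [header] + rows

table = correlation_table(reported)
print("\n".join(table))
```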
  • What are the three most popular types of channels?
    👉🏻 Entertainment, Music, People
Q: What are the three most popular types of channels?
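Finding the most popular channel types is a frequency count. A minimal sketch with `collections.Counter` follows. The list of channel types is a toy sample invented for illustration, so the counts say nothing about the real dataset.

```python
from collections import Counter

# Toy channel_type column, invented for illustration.
channel_types = [
    "Music", "Entertainment", "Music", "People", "Entertainment",
    "Entertainment", "Games", "People", "Music", "Entertainment",
]

type_counts = Counter(channel_types)

# The three most frequent types, with their counts.
top_three = type_counts.most_common(3)
print(top_three)
```

The same counts could also feed a bar chart, for example through matplotlib's `plt.bar`, if that library is available.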

Part 2. Data Visualization

  • Based on the analysis of the “Global_YouTube_Statistics.csv” data, what types of visualizations would best represent the trends and patterns in the data?
    👉🏻 It lists several visualization suggestions, e.g. bar charts, histograms…
Q: Based on the analysis of the “Global_YouTube_Statistics.csv” data, what types of visualizations would best represent the trends and patterns in the data?
  • Please calculate the number of channels in each channel_type, and create a bar plot
Number of channels in each channel_type by ChatGPT
  • Please create a pie chart with the country percentage for channels within the top 100 ranks, if the percentage is smaller than 5% then combine it as a category — other.
    plot title: Country of the top-100 rank channels
Country of the top-100 rank channels by ChatGPT
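The key step in this prompt is merging every country below 5% into a single "Other" slice before plotting. Here is a minimal sketch of that grouping, assuming the input is a list with one country label per top-100 channel. The `country_shares` helper and the toy labels are hypothetical, not ChatGPT's code.

```python
from collections import Counter

def country_shares(countries, threshold=5.0, other_label="Other"):
    """Return {label: percentage}, merging shares below threshold into other_label."""
    counts = Counter(countries)
    total = sum(counts.values())
    shares = {}
    other = 0.0
    for country, count in counts.most_common():
        pct = 100.0 * count / total
        if pct < threshold:
            other += pct          # too small for its own slice
        else:
            shares[country] = pct
    if other > 0:
        shares[other_label] = other
    return shares

# Toy input of 100 entries, invented for illustration.
toy = ["A"] * 60 + ["B"] * 30 + ["C"] * 4 + ["D"] * 3 + ["E"] * 3
shares = country_shares(toy)
print(shares)
```

With matplotlib installed, the result could be drawn with `plt.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.1f%%")` followed by `plt.title("Country of the top-100 rank channels")`.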
  • Please generate a WordCloud plot based on the title and without any special character
WordCloud of Channel Title by ChatGPT
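A word cloud is driven by word frequencies, so the cleaning step matters more than the drawing. The sketch below strips special characters from titles and counts the remaining words. It interprets "special character" as anything that is not a letter, digit, or whitespace, which is an assumption, and both the `title_word_frequencies` helper and the sample titles are invented for illustration.

```python
import re
from collections import Counter

def title_word_frequencies(titles):
    """Replace special characters with spaces, then count lowercase words."""
    counts = Counter()
    for title in titles:
        # Keep letters, digits, and whitespace; drop punctuation and underscores.
        cleaned = re.sub(r"[^\w\s]|_", " ", title)
        counts.update(word.lower() for word in cleaned.split())
    return counts

# Toy titles, invented for illustration.
freqs = title_word_frequencies(["Kids TV!", "Music & Kids", "TV-Music #1"])
print(freqs.most_common(3))
```

If the third-party wordcloud package is installed, these counts could be passed to `WordCloud().generate_from_frequencies(freqs)` to render the image.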

Part 3. Thoughts on Using ChatGPT

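Pros

  • No-code: anyone can do data analysis
  • Fast and convenient: results arrive after only a few seconds of waiting
  • It includes the code, so you can copy ChatGPT's code and modify it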

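Cons

  • Analysis sometimes fails with an "error analyzing" message
  • Service stability is an issue; "Internal Server Error" shows up fairly often
  • It still occasionally gives answers that miss the question
  • The output contains a lot of text (personally, I find some explanations too long-winded)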

Summary

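Analyzing data and making charts with ChatGPT is quite convenient. The answers sometimes differ from what I expected, so I need to try a few prompts, but its speed and convenience save me a lot of time making charts at work. Recommended 👍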


Written by Jasmine

Data Science | Data Analytics | Data Engineering — About me: https://www.linkedin.com/in/jia-min-li-jasmine/
