{"id":241,"date":"2022-03-31T02:22:00","date_gmt":"2022-03-30T18:22:00","guid":{"rendered":"https:\/\/philip.twinight.co\/portfolio\/?p=241"},"modified":"2024-03-06T09:46:55","modified_gmt":"2024-03-06T01:46:55","slug":"exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects","status":"publish","type":"post","link":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/","title":{"rendered":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects"},"content":{"rendered":"\n<p>This is an individual project of SDSC3014 &#8211; Introduction to Sharing Economy. I did the project in my year 2 2021\/22 Semester B.<\/p>\n\n\n\n<p><strong><span style=\"text-decoration: underline;\">Presentation Slides:<\/span><\/strong><\/p>\n\n\n<div class=\"ose-google-docs ose-uid-d8d49bb83f673515929fc371370e9509 ose-embedpress-responsive\" style=\"width:600px; height:550px; max-height:550px; max-width:100%; display:inline-block;\" data-embed-type=\"GoogleDocs\"><iframe loading=\"lazy\" allowFullScreen=\"true\" src=\"https:\/\/docs.google.com\/presentation\/d\/e\/2PACX-1vTy1fz8HSopDtFolw8BttaEmsU5MVGq1gUlX_eRNKzIEtd8AMaUq1K2U-pPMLYA3Q\/embed?start=false&#038;loop=false&#038;delayms=3000\" frameborder=\"0\" width=\"600\" height=\"550\" allowfullscreen=\"true\" mozallowfullscreen=\"true\" webkitallowfullscreen=\"true\"><\/iframe><\/div>\n\n\n\n<p>Course Instructor: Prof.&nbsp;KE Qing<\/p>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" 
aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Introduction\" >Introduction<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Dataset_Source\" >Dataset Source<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Module_1_Data_Exploration\" >Module 1: Data 
Exploration<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Module_2_Data_Visualization\" >Module 2: Data Visualization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Module_3_Classification\" >Module 3: Classification<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Module_4_Prediction\" >Module 4: Prediction<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#Module_5_Summary\" >Module 5: Summary<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Introduction\"><\/span>Introduction<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In this course project, I analyze a real-world Indiegogo dataset, breaking the work down into several modules.<br>Indiegogo is a peer-to-peer (P2P) crowdfunding platform that lets people raise funds for an interesting idea, a charity, or a startup.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Dataset_Source\"><\/span>Dataset Source<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Kaggle link:&nbsp;<a 
href=\"https:\/\/www.kaggle.com\/quentinmcteer\/indiegogo-crowdfunding-data\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.kaggle.com\/quentinmcteer\/indiegogo-crowdfunding-data<\/a><br>Original JSON files:&nbsp;<a href=\"https:\/\/webrobots.io\/indiegogo-dataset\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/webrobots.io\/indiegogo-dataset\/<\/a><\/p>\n\n\n\n<p><strong>Dataset Background<\/strong><\/p>\n\n\n\n<p>The information below is copied directly from the Kaggle page linked above, for easier reference.<\/p>\n\n\n\n<p><strong>Context<\/strong><\/p>\n\n\n\n<p>This is a mostly clean dataset that includes 22,000 Indiegogo crowdfunding campaigns between 2011-2020. Note that it is not a complete compilation of all Indiegogo campaigns during this time frame, just a sample. Using the original data, I created features by month, category, and country\/geography. Additionally, I added a &#8216;state&#8217; column that indicates whether or not the campaign was fully funded (i.e. was successful in achieving its goal). Finally, there are many other characteristic columns that qualify that type of campaign, including text data describing each Indiegogo project.<\/p>\n\n\n\n<p><strong>Content<\/strong><\/p>\n\n\n\n<p>This csv was created using publicly available data housed under WebRobots.io. Web Robots is an IT firm based in Lithuania that is working on next-generation web crawling technologies. 
The Indiegogo data posted here is a cleaned-up version of an early 2021 Indiegogo web scraping project that the company put together.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Module_1_Data_Exploration\"><\/span>Module 1: Data Exploration<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>I have never used this dataset before, so I do not know what it contains or what it looks like. I will first load&nbsp;<code>indiegogo.csv<\/code>&nbsp;and explore it to form my hypothesis. What does the data look like? Are there any missing values? What should I do with them, e.g. removal or imputation?<br>Are there any outliers? What should I do with them, e.g. drop them or explore their effect on the models?<br>I will record my analysis step by step through notes and comments. For example, I may have to remove some observations due to missingness and keep the others for further analysis.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;2]:\n# import all the libraries that I may need\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport pandas as pd\nimport seaborn as sns\nimport warnings\nwarnings.simplefilter(action='ignore', category=FutureWarning)\npd.set_option('display.max_columns', 20)\npd.options.display.min_rows = 115\npd.options.mode.chained_assignment = None  # silence SettingWithCopyWarning\nfrom collections import Counter\nIn &#x5B;3]:\n# load the csv file into a data frame and show the first 5 rows to take a quick look at the data\ndf = pd.read_csv('indiegogo.csv')\ndf.head()\n<\/pre><\/div>\n\n\n<p><br>Out[3]:<\/p>\n\n\n\n<figure 
class=\"wp-block-table\"><table><thead><tr><th><\/th><th>currency<\/th><th>category<\/th><th>year_end<\/th><th>month_end<\/th><th>day_end<\/th><th>time_end<\/th><th>amount_raised<\/th><th>funded_percent<\/th><th>in_demand<\/th><th>year_launch<\/th><th>&#8230;<\/th><th>apr<\/th><th>may<\/th><th>jun<\/th><th>jul<\/th><th>aug<\/th><th>sep<\/th><th>oct<\/th><th>nov<\/th><th>dec<\/th><th>tperiod<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>USD<\/td><td>Transportation<\/td><td>2010<\/td><td>5<\/td><td>12<\/td><td>23:59:00<\/td><td>840<\/td><td>16.80%<\/td><td>False<\/td><td>2010<\/td><td>&#8230;<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>1<\/td><\/tr><tr><th>1<\/th><td>USD<\/td><td>Human Rights<\/td><td>2010<\/td><td>7<\/td><td>2<\/td><td>23:59:00<\/td><td>250<\/td><td>20.83%<\/td><td>False<\/td><td>2010<\/td><td>&#8230;<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>2<\/td><\/tr><tr><th>2<\/th><td>USD<\/td><td>Human Rights<\/td><td>2010<\/td><td>7<\/td><td>10<\/td><td>23:59:00<\/td><td>200<\/td><td>16.67%<\/td><td>False<\/td><td>2010<\/td><td>&#8230;<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>3<\/td><\/tr><tr><th>3<\/th><td>USD<\/td><td>Photography<\/td><td>2010<\/td><td>10<\/td><td>9<\/td><td>23:59:00<\/td><td>500<\/td><td>25.00%<\/td><td>False<\/td><td>2010<\/td><td>&#8230;<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>4<\/td><\/tr><tr><th>4<\/th><td>USD<\/td><td>Human Rights<\/td><td>2011<\/td><td>1<\/td><td>12<\/td><td>23:59:00<\/td><td>360<\/td><td>0.65%<\/td><td>False<\/td><td>2010<\/td><td>&#8230;<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>0<\/td><td>5<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">5 rows \u00d7 74 
columns<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;4]:\n# print out the total number of rows and columns in the dataset\nprint(df.shape)\n#print out all the columns in the dataset\nprint(df.columns.tolist())\n<\/pre><\/div>\n\n\n<p>(20631, 74) [&#8216;currency&#8217;, &#8216;category&#8217;, &#8216;year_end&#8217;, &#8216;month_end&#8217;, &#8216;day_end&#8217;, &#8216;time_end&#8217;, &#8216;amount_raised&#8217;, &#8216;funded_percent&#8217;, &#8216;in_demand&#8217;, &#8216;year_launch&#8217;, &#8216;month_launch&#8217;, &#8216;day_launch&#8217;, &#8216;time_launch&#8217;, &#8216;project_id&#8217;, &#8216;tagline&#8217;, &#8216;title&#8217;, &#8216;url&#8217;, &#8216;state&#8217;, &#8216;date_launch&#8217;, &#8216;date_end&#8217;, &#8216;amount_raised_usd&#8217;, &#8216;goal_usd&#8217;, &#8216;australia&#8217;, &#8216;canada&#8217;, &#8216;switzerland&#8217;, &#8216;denmark&#8217;, &#8216;western_europe&#8217;, &#8216;great_britain&#8217;, &#8216;hong_kong&#8217;, &#8216;norway&#8217;, &#8216;sweden&#8217;, &#8216;singapore&#8217;, &#8216;united_states&#8217;, &#8216;education&#8217;, &#8216;productivity&#8217;, &#8216;energy_greentech&#8217;, &#8216;wellness&#8217;, &#8216;comics&#8217;, &#8216;fashion_wearables&#8217;, &#8216;video_games&#8217;, &#8216;photography&#8217;, &#8216;tv_shows&#8217;, &#8216;dance_theater&#8217;, &#8216;phones_accessories&#8217;, &#8216;audio&#8217;, &#8216;film&#8217;, &#8216;transportation&#8217;, &#8216;art&#8217;, &#8216;environment&#8217;, &#8216;writing_publishing&#8217;, &#8216;music&#8217;, &#8216;travel_outdoors&#8217;, &#8216;health_fitness&#8217;, &#8216;tabletop_games&#8217;, &#8216;home&#8217;, &#8216;local_business&#8217;, &#8216;food_beverage&#8217;, &#8216;culture&#8217;, &#8216;human_rights&#8217;, &#8216;podcasts_vlogs&#8217;, &#8216;camera_gear&#8217;, &#8216;jan&#8217;, &#8216;feb&#8217;, &#8216;mar&#8217;, 
&#8216;apr&#8217;, &#8216;may&#8217;, &#8216;jun&#8217;, &#8216;jul&#8217;, &#8216;aug&#8217;, &#8216;sep&#8217;, &#8216;oct&#8217;, &#8216;nov&#8217;, &#8216;dec&#8217;, &#8216;tperiod&#8217;]<\/p>\n\n\n\n<p><strong>Hypothesis: The duration, country or even the category can be used to predict whether the fundraising project is successful or not.<\/strong><\/p>\n\n\n\n<p>I formed this hypothesis after a first glance at the dataset. Next, I will clean the dataset and drop the columns I do not need before the visualization module.<\/p>\n\n\n\n<p>First, we have to find out where the missing values are and how many there are.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;6]:\n# count the missing values in every column\nobj = df.isnull().sum()\n# Series.iteritems() was removed in pandas 2.0; items() is the equivalent\nfor key, value in obj.items():\n    print(key,&quot;,&quot;,value)\n<\/pre><\/div>\n\n\n<p>currency , 0 category , 0 year_end , 0 month_end , 0 day_end , 0 time_end , 0 amount_raised , 0 funded_percent , 0 in_demand , 0 year_launch , 0 month_launch , 0 day_launch , 0 time_launch , 0 project_id , 0 tagline , 12 title , 5 url , 0 state , 0 date_launch , 0 date_end , 0 amount_raised_usd , 0 goal_usd , 0 australia , 0 canada , 0 switzerland , 0 denmark , 0 western_europe , 0 great_britain , 0 hong_kong , 0 norway , 0 sweden , 0 singapore , 0 united_states , 0 education , 0 productivity , 0 energy_greentech , 0 wellness , 0 comics , 0 fashion_wearables , 0 video_games , 0 photography , 0 tv_shows , 0 dance_theater , 0 phones_accessories , 0 audio , 0 film , 0 transportation , 0 art , 0 environment , 0 writing_publishing , 0 music , 0 travel_outdoors , 0 health_fitness , 0 tabletop_games , 0 home , 0 local_business , 0 food_beverage , 0 culture , 0 human_rights , 0 podcasts_vlogs , 0 
camera_gear , 0 jan , 0 feb , 0 mar , 0 apr , 0 may , 0 jun , 0 jul , 0 aug , 0 sep , 0 oct , 0 nov , 0 dec , 0 tperiod , 0<\/p>\n\n\n\n<p>As we can see, only tagline (12) and title (5) contain null values. Neither is a variable we need to consider, so we can leave them as they are.<\/p>\n\n\n\n<p>Next, we make a new dataframe containing only the columns we want.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;5]:\n# copy the selected columns so that later assignments do not touch df\ndf1 = df&#x5B;&#x5B;'currency', 'category', 'state', 'date_launch', 'date_end', 'goal_usd']].copy()\ndf1.head()\n<\/pre><\/div>\n\n\n<p><br>Out[5]:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th><\/th><th>currency<\/th><th>category<\/th><th>state<\/th><th>date_launch<\/th><th>date_end<\/th><th>goal_usd<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>USD<\/td><td>Transportation<\/td><td>0<\/td><td>2010-04-21<\/td><td>2010-05-12<\/td><td>5000.0<\/td><\/tr><tr><th>1<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-06-10<\/td><td>2010-07-02<\/td><td>1200.0<\/td><\/tr><tr><th>2<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-06-18<\/td><td>2010-07-10<\/td><td>1200.0<\/td><\/tr><tr><th>3<\/th><td>USD<\/td><td>Photography<\/td><td>0<\/td><td>2010-09-09<\/td><td>2010-10-09<\/td><td>2000.0<\/td><\/tr><tr><th>4<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-09-14<\/td><td>2011-01-12<\/td><td>55000.0<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;6]:\n# parse both date columns, then compute the campaign duration in days\ndf1&#x5B;&#x5B;'date_launch','date_end']] = df1&#x5B;&#x5B;'date_launch','date_end']].apply(pd.to_datetime)\ndf1&#x5B;'fund_time'] = (df1&#x5B;'date_end'] - df1&#x5B;'date_launch']).dt.days\ndf1.head()\n<\/pre><\/div>\n\n\n<p><br>Out[6]:<\/p>\n\n\n\n<figure 
class=\"wp-block-table\"><table><thead><tr><th><\/th><th>currency<\/th><th>category<\/th><th>state<\/th><th>date_launch<\/th><th>date_end<\/th><th>goal_usd<\/th><th>fund_time<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>USD<\/td><td>Transportation<\/td><td>0<\/td><td>2010-04-21<\/td><td>2010-05-12<\/td><td>5000.0<\/td><td>21<\/td><\/tr><tr><th>1<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-06-10<\/td><td>2010-07-02<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>2<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-06-18<\/td><td>2010-07-10<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>3<\/th><td>USD<\/td><td>Photography<\/td><td>0<\/td><td>2010-09-09<\/td><td>2010-10-09<\/td><td>2000.0<\/td><td>30<\/td><\/tr><tr><th>4<\/th><td>USD<\/td><td>Human Rights<\/td><td>0<\/td><td>2010-09-14<\/td><td>2011-01-12<\/td><td>55000.0<\/td><td>120<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;7]:\n# use each campaign's currency as a proxy for its country\ndf1 = df1.rename(columns={'currency': 'country'})\ndf1 = df1.replace(&quot;USD&quot;, &quot;united_states&quot;)\ndf1 = df1.replace(&quot;SGD&quot;, &quot;singapore&quot;)\ndf1 = df1.replace(&quot;SEK&quot;, &quot;sweden&quot;)\ndf1 = df1.replace(&quot;NOK&quot;, &quot;norway&quot;)\ndf1 = df1.replace(&quot;HKD&quot;, &quot;hong_kong&quot;)\ndf1 = df1.replace(&quot;GBP&quot;, &quot;britain&quot;)\ndf1 = df1.replace(&quot;EUR&quot;, &quot;europe&quot;)\ndf1 = df1.replace(&quot;DKK&quot;, &quot;denmark&quot;)\ndf1 = df1.replace(&quot;CHF&quot;, &quot;switzerland&quot;)\ndf1 = df1.replace(&quot;CAD&quot;, &quot;canada&quot;)\ndf1 = df1.replace(&quot;AUD&quot;, &quot;australia&quot;)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Module_2_Data_Visualization\"><\/span>Module 2: Data Visualization<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The data is now cleaned into the form I want. 
Next, I will plot figures to visualize my exploration and findings ahead of Module 3.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;10]:\ndf1.groupby(&#x5B;'country', 'state']).size().to_frame().reset_index()\n<\/pre><\/div>\n\n\n<p><br>Out[10]:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th><\/th><th>country<\/th><th>state<\/th><th>0<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>australia<\/td><td>0<\/td><td>272<\/td><\/tr><tr><th>1<\/th><td>australia<\/td><td>1<\/td><td>26<\/td><\/tr><tr><th>2<\/th><td>britain<\/td><td>0<\/td><td>1373<\/td><\/tr><tr><th>3<\/th><td>britain<\/td><td>1<\/td><td>106<\/td><\/tr><tr><th>4<\/th><td>canada<\/td><td>0<\/td><td>889<\/td><\/tr><tr><th>5<\/th><td>canada<\/td><td>1<\/td><td>33<\/td><\/tr><tr><th>6<\/th><td>denmark<\/td><td>0<\/td><td>13<\/td><\/tr><tr><th>7<\/th><td>denmark<\/td><td>1<\/td><td>1<\/td><\/tr><tr><th>8<\/th><td>europe<\/td><td>0<\/td><td>1485<\/td><\/tr><tr><th>9<\/th><td>europe<\/td><td>1<\/td><td>140<\/td><\/tr><tr><th>10<\/th><td>hong_kong<\/td><td>0<\/td><td>60<\/td><\/tr><tr><th>11<\/th><td>hong_kong<\/td><td>1<\/td><td>125<\/td><\/tr><tr><th>12<\/th><td>norway<\/td><td>1<\/td><td>1<\/td><\/tr><tr><th>13<\/th><td>singapore<\/td><td>0<\/td><td>30<\/td><\/tr><tr><th>14<\/th><td>singapore<\/td><td>1<\/td><td>10<\/td><\/tr><tr><th>15<\/th><td>sweden<\/td><td>0<\/td><td>9<\/td><\/tr><tr><th>16<\/th><td>sweden<\/td><td>1<\/td><td>1<\/td><\/tr><tr><th>17<\/th><td>switzerland<\/td><td>0<\/td><td>16<\/td><\/tr><tr><th>18<\/th><td>switzerland<\/td><td>1<\/td><td>5<\/td><\/tr><tr><th>19<\/th><td>united_states<\/td><td>0<\/td><td>14376<\/td><\/tr><tr><th>20<\/th><td>united_states<\/td><td>1<\/td><td>1660<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn 
&#x5B;11]:\nsns.countplot(x='state',data=df1)\n<\/pre><\/div>\n\n\n<p><br>Out[11]:&lt;AxesSubplot:xlabel=&#8217;state&#8217;, ylabel=&#8217;count&#8217;&gt;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=1480639360  fetchpriority=\"high\" loading=\"eager\" decoding=\"async\" width=\"402\" height=\"262\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-23.png\" alt=\"\" class=\"wp-image-245\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:402\/h:262\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-23.png 402w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:196\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-23.png 300w\" sizes=\"auto, (max-width: 402px) 100vw, 402px\" \/><\/figure>\n\n\n\n<p>Only around 10% of the projects (2,108 of 20,631) were fully funded.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;8]:\nnp.log10(df1&#x5B;'goal_usd']).plot.hist(bins = 35)\nplt.title('Logged Indiegogo Goals')\nplt.xlabel('log10 Goal')\n<\/pre><\/div>\n\n\n<p><br>Out[8]:Text(0.5, 0, &#8216;log10 Goal&#8217;)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=1868930819  fetchpriority=\"high\" loading=\"eager\" decoding=\"async\" width=\"395\" height=\"278\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-24.png\" alt=\"\" class=\"wp-image-246\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:395\/h:278\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-24.png 395w, 
https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:211\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-24.png 300w\" sizes=\"auto, (max-width: 395px) 100vw, 395px\" \/><\/figure>\n\n\n\n<p>On the log10 scale, the goal amounts are roughly normally distributed, i.e. the raw goals are approximately log-normal.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;13]:\ndf1&#x5B;'log_goal'] = df1.loc&#x5B;:, 'goal_usd'].apply(lambda l: np.log10(l+1))\n\nplt.figure(figsize = (6,6))\nsns.boxplot(x ='state', y = 'log_goal', data = df1)\nplt.title('Successful Fundraiser have on average lower Goals')\n\n# median goal in USD for unsuccessful (0) and successful (1) campaigns\ndf.groupby('state')&#x5B;'goal_usd'].median().T\n<\/pre><\/div>\n\n\n<p><br>Out[13]:state 0 10000.00000 1 12062.59975 Name: goal_usd, dtype: float64<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=906284972  loading=\"lazy\" decoding=\"async\" width=\"387\" height=\"387\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-25.png\" alt=\"\" class=\"wp-image-248\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:387\/h:387\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-25.png 387w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:300\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-25.png 300w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:150\/h:150\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-25.png 150w\" sizes=\"auto, (max-width: 387px) 100vw, 387px\" \/><\/figure>\n\n\n\n<p>The median unsuccessful fundraiser (state 0) had a goal of 10,000 USD, while the median successful fundraiser (state 1) had a goal of about 12,063 USD.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; 
title: ; notranslate\" title=\"\">\nIn &#x5B;9]:\n# count campaigns per country and outcome, then plot one bar per outcome\ndfg = df1.groupby(&#x5B;'country', 'state']).size()\ndfg.unstack(fill_value=0).plot.bar()\nplt.title(&quot;Number of fundraising in different country&quot;)\nplt.xlabel('Country')\nplt.ylabel('Number')\nplt.legend(&#x5B;&quot;Failure&quot;, &quot;Success&quot;]) \nplt.show()\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=2101901816  loading=\"lazy\" decoding=\"async\" width=\"402\" height=\"336\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-26.png\" alt=\"\" class=\"wp-image-250\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:402\/h:336\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-26.png 402w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:251\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-26.png 300w\" sizes=\"auto, (max-width: 402px) 100vw, 402px\" \/><\/figure>\n\n\n\n<p>We can clearly see that most of the campaigns (16,036 of 20,631) come from the United States.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;15]:\nduration_values = df1.loc&#x5B;:, 'fund_time'].unique()\nper_duration_approved = {}\nper_duration_count = {}\nfor dur in duration_values:\n    per_duration_approved&#x5B;dur] = df1.loc&#x5B;df1&#x5B;'fund_time'] == dur, 'state'].sum() \/ float((df1.loc&#x5B;:, 'fund_time']==dur).sum())\n    per_duration_count&#x5B;dur] = len(df1.loc&#x5B;df1&#x5B;'fund_time'] == dur, 'state'])\n\nplt.figure(figsize=(12, 5), dpi= 80)\nax = plt.subplot(1, 2, 2)\ndf1&#x5B;'log_goal'] = df1.loc&#x5B;:, 'goal_usd'].apply(lambda l: np.log10(l+1))\ndf1.plot(kind='scatter', x='log_goal', y='fund_time', s = 2, alpha = 0.2, ax=ax, fontsize=10, 
colormap='Paired', c='state');\n# the x axis shows the log10 goal; 'state' only sets the point colour\nax.set_xlabel('Goal (log10 $)', fontsize=12)\nax.set_ylabel('Duration (days)', fontsize=12)\nplt.title('Goal is generally independent to duration')\n\n\nplt.subplot(1, 2, 1)    \nplt.bar(list(per_duration_approved.keys()),list(per_duration_approved.values()), alpha = 0.5)\nplt.xlabel('Duration (days)', fontsize=12)\nplt.ylabel('% Funded', fontsize=12)\nplt.title('Success rate as function of duration')\n<\/pre><\/div>\n\n\n<p><br>Out[15]:Text(0.5, 1.0, &#8216;Success rate as function of duration&#8217;)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=121435306  loading=\"lazy\" decoding=\"async\" width=\"808\" height=\"371\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-27.png\" alt=\"\" class=\"wp-image-252\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:808\/h:371\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-27.png 808w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:138\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-27.png 300w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:768\/h:353\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-27.png 768w\" sizes=\"auto, (max-width: 792px) 100vw, 792px\" \/><\/figure>\n\n\n\n<p><strong>Brief conclusions from the two graphs above<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Most of the fundraising campaigns lasted only 0-100 days.<\/li>\n\n\n\n<li>The goal in USD (the amount of money sought) is generally independent of the duration.<\/li>\n\n\n\n<li>There are too few campaigns with longer durations in the sample, so we cannot 
really conclude that there is any relation between longer duration and higher success rate.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Module_3_Classification\"><\/span>Module 3: Classification<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In this step, I will evaluate the performance of several machine learning models to decide which one to use in the prediction module that follows.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;74]:\nimport sklearn\nfrom sklearn import model_selection\nfrom sklearn import preprocessing, svm\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.linear_model import LinearRegression, LogisticRegression\nfrom sklearn.naive_bayes import GaussianNB\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.neighbors import KNeighborsClassifier\nfrom sklearn.discriminant_analysis import LinearDiscriminantAnalysis\nfrom sklearn.svm import SVC\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.metrics import accuracy_score\nfrom sklearn.datasets import make_regression\nfrom sklearn.ensemble import RandomForestRegressor\nIn &#x5B;20]:\n# keep the target (state) and the four candidate features\nfinal = df1&#x5B;&#x5B;'country', 'category', 'state', 'goal_usd', 'fund_time']].copy()\nfinal.head()\n<\/pre><\/div>\n\n\n<p><br>Out[20]:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th><\/th><th>country<\/th><th>category<\/th><th>state<\/th><th>goal_usd<\/th><th>fund_time<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>united_states<\/td><td>Transportation<\/td><td>0<\/td><td>5000.0<\/td><td>21<\/td><\/tr><tr><th>1<\/th><td>united_states<\/td><td>Human Rights<\/td><td>0<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>2<\/th><td>united_states<\/td><td>Human 
Rights<\/td><td>0<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>3<\/th><td>united_states<\/td><td>Photography<\/td><td>0<\/td><td>2000.0<\/td><td>30<\/td><\/tr><tr><th>4<\/th><td>united_states<\/td><td>Human Rights<\/td><td>0<\/td><td>55000.0<\/td><td>120<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><br>First, I have to encode the categorical (string) columns as integer codes so that the models can process them.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nIn &#x5B;34]:\ndef handle_non_numerical_data(df):\n    # map every distinct value of each non-numeric column to an integer code\n    columns = df.columns.values\n\n    for column in columns:\n        text_digit_vals = {}\n        def convert_to_int(val):\n            return text_digit_vals&#x5B;val]\n\n        if df&#x5B;column].dtype != np.int64 and df&#x5B;column].dtype != np.float64:\n            column_contents = df&#x5B;column].values.tolist()\n            unique_elements = set(column_contents)\n            x = 0\n            for unique in unique_elements:\n                if unique not in text_digit_vals:\n                    text_digit_vals&#x5B;unique] = x\n                    x += 1\n            df&#x5B;column] = list(map(convert_to_int, df&#x5B;column]))\n\n    return df\nIn &#x5B;36]:\nfinal = handle_non_numerical_data(final)\nfinal.head()\n<\/pre><\/div>\n\n\n<p><br>Out[36]:<\/p>\n\n\n\n<figure 
class=\"wp-block-table\"><table><thead><tr><th><\/th><th>country<\/th><th>category<\/th><th>state<\/th><th>goal_usd<\/th><th>fund_time<\/th><\/tr><\/thead><tbody><tr><th>0<\/th><td>9<\/td><td>23<\/td><td>0<\/td><td>5000.0<\/td><td>21<\/td><\/tr><tr><th>1<\/th><td>9<\/td><td>21<\/td><td>0<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>2<\/th><td>9<\/td><td>21<\/td><td>0<\/td><td>1200.0<\/td><td>22<\/td><\/tr><tr><th>3<\/th><td>9<\/td><td>0<\/td><td>0<\/td><td>2000.0<\/td><td>30<\/td><\/tr><tr><th>4<\/th><td>9<\/td><td>21<\/td><td>0<\/td><td>55000.0<\/td><td>120<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Next, I split the dataset into training and testing sets in a 70:30 ratio. The former is used to train the models and the latter to evaluate the performance of the trained model.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;52]:\n# Features are every column except the target 'state'\nX = np.array(final.drop(columns=&#x5B;'state']))\ny = np.array(final&#x5B;'state'])\n\nprint('Shape X: ', X.shape)\nprint('Shape y: ', y.shape)\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=666)\nprint('Train Shapes: ', X_train.shape, y_train.shape)\nprint('Test Shapes: ', X_test.shape, y_test.shape)\n<\/pre><\/div>\n\n\n<p>Shape X: (20631, 4)<br>Shape y: (20631,)<br>Train Shapes: (14441, 4) (14441,)<br>Test Shapes: (6190, 4) (6190,)<\/p>\n\n\n\n<p><br><strong>Standardize the Features with StandardScaler<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;53]:\nfrom sklearn.preprocessing import StandardScaler\n# Fit the scaler on the training set only, then apply the same transform to the test set\nscaler = StandardScaler()\nX_train_scaled = scaler.fit_transform(X_train)\nX_test_scaled = scaler.transform(X_test)\n<\/pre><\/div>\n\n\n<p><strong>Apply Dimensionality Reduction Using Principal Component Analysis (PCA)<\/strong><a 
href=\"http:\/\/localhost:8888\/notebooks\/Downloads\/project.ipynb#Apply-Dimensional-Reduction-by-using-Principal-Components-Analysis-(PCA)\"><\/a><\/p>\n\n\n\n<p>I use &#8216;n_components=2&#8217;: the original data has 4 columns (country, category, goal_usd, fund_time), and projecting it onto only 2 dimensions can speed up the learning algorithms.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;55]:\nfrom sklearn.decomposition import PCA\npca = PCA(n_components=2)\nX_train_dim_red = pca.fit_transform(X_train_scaled)\nX_test_dim_red = pca.transform(X_test_scaled)\n<\/pre><\/div>\n\n\n<p><strong>Model Evaluation<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;59]:\nmodels = &#x5B;]\nmodels.append(('CART', DecisionTreeClassifier()))\nmodels.append(('NB', GaussianNB()))\nmodels.append(('LDA', LinearDiscriminantAnalysis()))\nmodels.append(('KNN', KNeighborsClassifier()))\nmodels.append(('LR', LogisticRegression()))\nmodels.append(('SVM', SVC(gamma='auto')))\nmodels.append(('RFC', RandomForestClassifier()))\n\nresults = &#x5B;]\nnames = &#x5B;]\nscoring = 'accuracy'\n# Score every model with the same 10-fold cross-validation on the training set\nfor name, model in models:\n    kfold = model_selection.KFold(n_splits = 10, shuffle = True, random_state = 666)\n    cv_results = model_selection.cross_val_score(model, X_train_dim_red, y_train, cv = kfold, scoring = scoring)\n    results.append(cv_results)\n    names.append(name)\n    result = &quot;%s: %f (%f)&quot; % (name, cv_results.mean(), cv_results.std())\n    print(result)\n<\/pre><\/div>\n\n\n<p>CART: 0.889343 (0.008686)<br>NB: 0.897307 (0.006796)<br>LDA: 0.897445 (0.007028)<br>KNN: 0.912472 (0.005327)<br>LR: 0.900907 (0.007321)<br>SVM: 0.913787 (0.007752)<br>RFC: 0.912540 (0.006731)<\/p>\n\n\n\n<p>Although Support Vector Classification achieved the highest cross-validated accuracy on our data, the speed of it 
is way too slow, so I will use RandomForestClassifier instead.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Module_4_Prediction\"><\/span>Module 4: Prediction<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In this part, I use the RandomForestClassifier model to predict whether an Indiegogo campaign succeeded or not.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;66]:\nclf = RandomForestClassifier()\nclf = clf.fit(X_train_dim_red, y_train)\n# In &#x5B;67]:\n# Predict on the PCA-reduced test features\nprediction = clf.predict(X_test_dim_red)\nprint('prediction: ', prediction)\ntype(prediction)\n<\/pre><\/div>\n\n\n<p>prediction: [0 0 0 &#8230; 0 0 0]<\/p>\n\n\n\n<p>Out[67]: numpy.ndarray<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;71]:\nmodel = RandomForestClassifier()\nmodel.fit(X_train_dim_red, y_train)\npredictions = model.predict(X_test_dim_red)\nprint(f'Model Accuracy: {accuracy_score(y_test, predictions):.2f}')\n<\/pre><\/div>\n\n\n<p>Model Accuracy: 0.91<\/p>\n\n\n\n<p><strong>Random Forest Classifier Feature Importance<\/strong><\/p>\n\n\n\n<p>Because PCA reduced the features to 2 components, I have to refit the model on the original unscaled data in order to get the importance ranking of all 4 features.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;94]:\n# define the new model\nnew_model = RandomForestClassifier()\n# fit the model\nnew_model.fit(X_train, y_train)\n# get importance\nimportance = new_model.feature_importances_\n# summarize feature 
importance\nfor i, v in enumerate(importance):\n    print('Feature: %0d, Score: %.5f' % (i, v))\n<\/pre><\/div>\n\n\n<p>Feature: 0, Score: 0.05462<br>Feature: 1, Score: 0.11225<br>Feature: 2, Score: 0.41621<br>Feature: 3, Score: 0.41691<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# In &#x5B;95]:\nfeats = {}  # a dict to hold feature_name: feature_importance\nfinal_df = final.drop(columns=&#x5B;'state'])\nfor feature, importance in zip(final_df.columns, new_model.feature_importances_):\n    feats&#x5B;feature] = importance  # add the name\/value pair\n\nimportances = pd.DataFrame.from_dict(feats, orient=&quot;index&quot;).rename(columns={0: 'Gini-importance'})\nimportances.sort_values(by='Gini-importance').plot(kind='barh',figsize=(10,8))\n<\/pre><\/div>\n\n\n<p><br>Out[95]:&lt;AxesSubplot:&gt;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=1156396944  loading=\"lazy\" decoding=\"async\" width=\"629\" height=\"466\" src=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-28.png\" alt=\"\" class=\"wp-image-256\" srcset=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:629\/h:466\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-28.png 629w, https:\/\/mlcznkdztmb6.i.optimole.com\/w:300\/h:222\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/image-28.png 300w\" sizes=\"auto, (max-width: 629px) 100vw, 629px\" \/><\/figure>\n\n\n\n<p>To make it easier to see which feature affects the model&#8217;s result the most, the graph above uses the feature names directly.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Module_5_Summary\"><\/span>Module 5: Summary<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>I am going to summarize my 
findings and draw conclusions in a Q and A format, so that this experience can be better recorded.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>What have you done and what have you learned?<\/strong><br>A: To sum up what I have done: I first had to understand the properties of the given data so that I could plan the analysis, which is everything you can see in Module 1. Then I went deeper into the data by plotting different variables against the target value. This gave me some insight, such as whether the dataset is linearly separable, which would make the result much easier to predict in the later modules. Throughout the process, you can see that I kept adjusting the dataset so that it met the requirements of each task. For example, without standardizing the features, the model may not learn the problem as easily because the feature values are on very different scales, and the algorithm may become slower. Likewise, before dimensionality reduction and classification, I applied skills I learnt from the tutorials, such as encoding all the categorical items first so that the model can process them. Apart from that, I spent quite a lot of time reading the documentation of various functions and libraries. Besides the official docs, I found that Stack Overflow and GeeksforGeeks provided many useful resources as well. This experience has certainly consolidated my coding skills and logical thinking.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>What was the biggest difficulty in this project and how did you solve it?<\/strong><br>A: Undoubtedly, whenever I have to answer this question, I would say Module 2 was the most challenging part for me. This module has to be interpreted step by step, and you cannot always plot the exact graph that gives you insight into the next step, so it takes a lot of time to discover and explore it yourself. 
The dataset itself doesn&#8217;t give me any hints about the relation between the features (variables) and the target (class). So I had to rely on my limited experience and plot graphs myself to find something useful, as there isn&#8217;t any fixed, standard workflow for it. I was stuck in this part for quite a long time after plotting many different graphs, but I came to understand that there will not always be lots of big discoveries when you explore a new dataset. Finding something simple doesn&#8217;t mean you have not done anything; it is also part of exploring and analyzing the results. There is not always a fixed solution to a problem; data only gives you insight into how you can use it to support your view.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>What do you think of annotation, and how does it help?<\/strong><br>A: I have annotated each part step by step so that readers can understand what that part is doing. It also helps me debug, as I can find problems in the code quickly with those annotations. And when I present the work to others, I can share it in clearer steps, as everything was done in order.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>How did you go about solving this project?<\/strong><br>A: My basic flow for solving questions and doing analysis is to write all the steps down on paper first. 
During the process, I further brainstorm what could go wrong or which important factors might be missing, so I always think about what I am going to do in the coming module and strictly follow the procedure so that I don&#8217;t drift off track and lose sight of what I am doing and what I have to do.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>What are the main results?<\/strong><br>A: As I stated above, the Support Vector Machine is the best-performing model among those I tested, with a cross-validated accuracy of about 91.4%, and the accuracy of the Random Forest classifier on the test set is high too, at about 91%.<\/p>\n\n\n\n<p>Q:&nbsp;<strong>From the result, what have you discovered?<\/strong><br>A: In my opinion, it seems that the funding duration and the funding goal affect the success of a fundraising campaign the most: together they account for over 80% of the Random Forest feature importance. This supports our assumption that it is possible to predict the state of a campaign using only the duration and the goal of the event.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is an individual project of SDSC3014 &#8211; Introduction to Sharing Economy. I did the project in my year 2 2021\/22 Semester B. 
Presentation Slides: Course Instructor: Prof.&nbsp;KE Qing Introduction &hellip; <a href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/\" class=\"more-link\"><span>Continue reading<span class=\"screen-reader-text\">Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects<\/span><\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":325,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[73,3],"tags":[25,13,30,29,21],"class_list":["post-241","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-analysis","category-proj","tag-2021-22-semester-b","tag-data-science","tag-introduction-to-sharing-economy","tag-sdsc3014","tag-year-2"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary\" \/>\n<meta property=\"og:description\" content=\"This is an individual project of SDSC3014 &#8211; Introduction to Sharing Economy. 
I did the project in my year 2 2021\/22 Semester B. Presentation Slides: Course Instructor: Prof.&nbsp;KE Qing Introduction &hellip; Continue readingExploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects\" \/>\n<meta property=\"og:url\" content=\"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/\" \/>\n<meta property=\"og:site_name\" content=\"Philip\u2019s Data Science Diary\" \/>\n<meta property=\"article:published_time\" content=\"2022-03-30T18:22:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-03-06T01:46:55+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Philip\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Philip\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/\"},\"author\":{\"name\":\"Philip\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#\\\/schema\\\/person\\\/ef4f7cedd9b3bde11e126c4dbe1f8414\"},\"headline\":\"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects\",\"datePublished\":\"2022-03-30T18:22:00+00:00\",\"dateModified\":\"2024-03-06T01:46:55+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/\"},\"wordCount\":2160,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#\\\/schema\\\/person\\\/ef4f7cedd9b3bde11e126c4dbe1f8414\"},\"image\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2022\\/03\\/Analyzing-Key-Success-Factors-of-Indiegogo.png\",\"keywords\":[\"2021\\\/22 Semester B\",\"Data Science\",\"Introduction to Sharing Economy\",\"SDSC3014\",\"Year 2\"],\"articleSection\":[\"Data 
Analysis\",\"Projects\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/\",\"url\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/\",\"name\":\"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2022\\/03\\/Analyzing-Key-Success-Factors-of-Indiegogo.png\",\"datePublished\":\"2022-03-30T18:22:00+00:00\",\"dateModified\":\"2024-03-06T01:46:55+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"tar
get\":[\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#primaryimage\",\"url\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2022\\/03\\/Analyzing-Key-Success-Factors-of-Indiegogo.png\",\"contentUrl\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2022\\/03\\/Analyzing-Key-Success-Factors-of-Indiegogo.png\",\"width\":1920,\"height\":1080},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/2022\\\/03\\\/31\\\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\u9996\u9801\",\"item\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#website\",\"url\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/\",\"name\":\"Philip\u2019s University Data Science Journey\",\"description\":\"Navigating Data Science: From Classroom to 
Career\",\"publisher\":{\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#\\\/schema\\\/person\\\/ef4f7cedd9b3bde11e126c4dbe1f8414\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/#\\\/schema\\\/person\\\/ef4f7cedd9b3bde11e126c4dbe1f8414\",\"name\":\"Philip\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2024\\/03\\/favicon.png\",\"url\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2024\\/03\\/favicon.png\",\"contentUrl\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2024\\/03\\/favicon.png\",\"width\":16,\"height\":16,\"caption\":\"Philip\"},\"logo\":{\"@id\":\"https:\\/\\/philip.twinight.co\\/portfolio\\/wp-content\\/uploads\\/2024\\/03\\/favicon.png\"},\"description\":\"Data Scientist &amp; Systems Engineer. Graduated from City University of Hong Kong. Previously founded Twinight Limited as CTO, developing AI investment analytics and automated trading solutions. Currently working as a Test and Integration Engineer on a Vessel Traffic Service (VTS) system in the maritime industry since December 2024.\",\"sameAs\":[\"https:\\\/\\\/philip.twinight.co\\\/portfolio\"],\"url\":\"https:\\\/\\\/philip.twinight.co\\\/portfolio\\\/index.php\\\/author\\\/philip\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/","og_locale":"en_GB","og_type":"article","og_title":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary","og_description":"This is an individual project of SDSC3014 &#8211; Introduction to Sharing Economy. I did the project in my year 2 2021\/22 Semester B. Presentation Slides: Course Instructor: Prof.&nbsp;KE Qing Introduction &hellip; Continue readingExploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects","og_url":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/","og_site_name":"Philip\u2019s Data Science Diary","article_published_time":"2022-03-30T18:22:00+00:00","article_modified_time":"2024-03-06T01:46:55+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png","type":"image\/png"}],"author":"Philip","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Philip","Estimated reading time":"12 
minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#article","isPartOf":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/"},"author":{"name":"Philip","@id":"https:\/\/philip.twinight.co\/portfolio\/#\/schema\/person\/ef4f7cedd9b3bde11e126c4dbe1f8414"},"headline":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects","datePublished":"2022-03-30T18:22:00+00:00","dateModified":"2024-03-06T01:46:55+00:00","mainEntityOfPage":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/"},"wordCount":2160,"commentCount":0,"publisher":{"@id":"https:\/\/philip.twinight.co\/portfolio\/#\/schema\/person\/ef4f7cedd9b3bde11e126c4dbe1f8414"},"image":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#primaryimage"},"thumbnailUrl":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png","keywords":["2021\/22 Semester B","Data Science","Introduction to Sharing Economy","SDSC3014","Year 2"],"articleSection":["Data 
Analysis","Projects"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/","url":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/","name":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects - Philip\u2019s Data Science Diary","isPartOf":{"@id":"https:\/\/philip.twinight.co\/portfolio\/#website"},"primaryImageOfPage":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#primaryimage"},"image":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#primaryimage"},"thumbnailUrl":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png","datePublished":"2022-03-30T18:22:00+00:00","dateModified":"2024-03-06T01:46:55+00:00","breadcrumb":{"@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaign
s-a-data-driven-analysis-of-indiegogo-projects\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#primaryimage","url":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png","contentUrl":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2022\/03\/Analyzing-Key-Success-Factors-of-Indiegogo.png","width":1920,"height":1080},{"@type":"BreadcrumbList","@id":"https:\/\/philip.twinight.co\/portfolio\/index.php\/2022\/03\/31\/exploring-the-success-factors-of-crowdfunding-campaigns-a-data-driven-analysis-of-indiegogo-projects\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\u9996\u9801","item":"https:\/\/philip.twinight.co\/portfolio\/"},{"@type":"ListItem","position":2,"name":"Exploring the Success Factors of Crowdfunding Campaigns: A Data-Driven Analysis of Indiegogo Projects"}]},{"@type":"WebSite","@id":"https:\/\/philip.twinight.co\/portfolio\/#website","url":"https:\/\/philip.twinight.co\/portfolio\/","name":"Philip\u2019s University Data Science Journey","description":"Navigating Data Science: From Classroom to 
Career","publisher":{"@id":"https:\/\/philip.twinight.co\/portfolio\/#\/schema\/person\/ef4f7cedd9b3bde11e126c4dbe1f8414"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/philip.twinight.co\/portfolio\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":["Person","Organization"],"@id":"https:\/\/philip.twinight.co\/portfolio\/#\/schema\/person\/ef4f7cedd9b3bde11e126c4dbe1f8414","name":"Philip","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/favicon.png","url":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/favicon.png","contentUrl":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/favicon.png","width":16,"height":16,"caption":"Philip"},"logo":{"@id":"https:\/\/mlcznkdztmb6.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/ig:avif\/https:\/\/philip.twinight.co\/portfolio\/wp-content\/uploads\/2024\/03\/favicon.png"},"description":"Data Scientist &amp; Systems Engineer. Graduated from City University of Hong Kong. Previously founded Twinight Limited as CTO, developing AI investment analytics and automated trading solutions. 
Currently working as a Test and Integration Engineer on a Vessel Traffic Service (VTS) system in the maritime industry since December 2024.","sameAs":["https:\/\/philip.twinight.co\/portfolio"],"url":"https:\/\/philip.twinight.co\/portfolio\/index.php\/author\/philip\/"}]}},"_links":{"self":[{"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/posts\/241","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/comments?post=241"}],"version-history":[{"count":22,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/posts\/241\/revisions"}],"predecessor-version":[{"id":339,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/posts\/241\/revisions\/339"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/media\/325"}],"wp:attachment":[{"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/media?parent=241"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/categories?post=241"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/philip.twinight.co\/portfolio\/index.php\/wp-json\/wp\/v2\/tags?post=241"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}