The State Changers discussed data mining and web scraping in detail. The conversation centered on David's project to gather public data about non-profit organizations in Austin from sites like the United Way of Austin. They also addressed the cost of outsourcing the task and the option of using no-code tools to scrape the data.
David was hesitant to spend thousands on outsourcing the task and proposed first getting a clearer understanding of what he needs, then looking for alternative offers. He also considered reaching out to the organizations directly, explaining his project, and asking them to share their databases.
The State Changers discussed several web scraping tools, such as Octoparse, and mentioned videos that might help David understand the technical side of the task. They also explored combining the free tiers of multiple tools to scrape the data, which could save money.
However, an inspection of the website David wanted to scrape showed that the data appeared to be consolidated within a single HTML file. With no structured data back end to query, the data is harder to extract. The most recommended no-code solution was therefore to strike a deal with, and get the cooperation of, the entity that controls the data, such as the United Way. The effort could have legal implications, so the group strongly suggested keeping everything on the up and up.
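To illustrate why data embedded directly in HTML is harder to work with than a structured back end, here is a minimal Python sketch that pulls records out of markup by walking the tags. It was not part of the session. The sample markup, the `org` class name, the `OrgParser` class, and the `extract_orgs` helper are all hypothetical, since the real page's structure was not described; only the standard library's `html.parser` module is assumed.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page's structure was not described in the session.
SAMPLE_HTML = """
<ul>
  <li class="org"><a href="https://example.org/a">Example Food Bank</a></li>
  <li class="org"><a href="https://example.org/b">Example Literacy Project</a></li>
  <li class="nav"><a href="/about">About</a></li>
</ul>
"""


class OrgParser(HTMLParser):
    """Collects the link text and URL of anchors inside <li class="org"> items."""

    def __init__(self):
        super().__init__()
        self.in_org = False       # True while inside an <li class="org">
        self.current_href = None  # href of the anchor currently being read
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "org":
            self.in_org = True
        elif tag == "a" and self.in_org:
            self.current_href = attrs.get("href")

    def handle_data(self, data):
        # Only keep text that sits inside an anchor within an "org" item.
        if self.current_href is not None and data.strip():
            self.records.append({"name": data.strip(), "url": self.current_href})

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None
        elif tag == "li":
            self.in_org = False


def extract_orgs(html):
    """Hypothetical helper: parse an HTML string and return the collected records."""
    parser = OrgParser()
    parser.feed(html)
    return parser.records


records = extract_orgs(SAMPLE_HTML)
for record in records:
    print(record["name"], record["url"])
```

Every rule in the sketch depends on the page's exact tag and class layout, so it breaks whenever the site changes its markup. That fragility is one reason getting the data directly from its owner may be the more dependable route.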
(Source: Office Hours 7/18)