Time
3 weeks
Location
USA
Sector
Healthcare
Description
The client’s objective was to streamline lead generation for its sales department by building a real-time data mining system. The main challenges included limitations imposed by search engines, data integrity issues, and the complexity of data matching.
Solution
To address these challenges, we developed a universal, multi-channel, scalable engine with a REST API for data access. We handled the data matching problem with a statistical approach. The system was built following best practices in test-driven development, microservices, and continuous integration to ensure both stability and speed.
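As an illustration of what a statistical approach to data matching can look like, here is a minimal Python sketch that scores two lead records by a weighted average of per-field string similarities and treats them as the same entity above a threshold. The field names, weights, threshold, and function names are all hypothetical, and the standard library's difflib is used only to keep the example self-contained. It is not the matching model used in the project.

```python
from difflib import SequenceMatcher

# Hypothetical weights: how much each field contributes to the overall score.
FIELD_WEIGHTS = {"name": 0.5, "city": 0.2, "phone": 0.3}


def normalize(value):
    """Lowercase and keep only letters and digits so formatting differences are ignored."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def field_similarity(a, b):
    """Return a similarity ratio in [0, 1] for two field values."""
    a, b = normalize(a), normalize(b)
    if not a or not b:
        return 0.0  # a missing value gives no evidence of a match
    return SequenceMatcher(None, a, b).ratio()


def match_score(record_a, record_b, weights=FIELD_WEIGHTS):
    """Weighted average of per-field similarities, in [0, 1]."""
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weight * field_similarity(record_a.get(field, ""), record_b.get(field, ""))
        for field, weight in weights.items()
    )
    return weighted_sum / total_weight


def is_same_entity(record_a, record_b, threshold=0.85):
    """Decide whether two records describe the same lead (threshold is an assumption)."""
    return match_score(record_a, record_b) >= threshold


# Example records, invented for illustration only.
scraped = {"name": "Acme Health Inc.", "city": "Boston", "phone": "(617) 555-0100"}
existing = {"name": "ACME Health, Inc", "city": "boston", "phone": "617-555-0100"}
unrelated = {"name": "Riverside Dental", "city": "Denver", "phone": "303-555-0199"}
```

In this sketch, `is_same_entity(scraped, existing)` is true because the two records differ only in formatting, while `is_same_entity(scraped, unrelated)` is false. In practice the weights and threshold would be tuned on labeled pairs rather than fixed by hand.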
Results
The client received a complete software solution that automatically collects information about potential clients from open sources in real time. This eliminated manual data collection by the sales department, which reduced costs for the company and expanded its market share.
Technologies
Flask, MongoDB, Scrapy, BeautifulSoup, Goose, Pandas, NumPy, NLTK, Luminati