Crawling Competitors

Tools: Puppeteer and MySQL
Team: Myself

Overview

To support our sales team, I built an automated system that crawled competitor websites to gather provider information and delivered it as a CSV for import into Salesforce. The solution used Puppeteer for browser automation and MySQL for data storage and workflow control. By working zip code by zip code, the system covered entire regions systematically and gave our reps a steady pipeline of prospects.


Problem

Our competitors maintained searchable provider directories on their websites, but manually searching them was inefficient and often left gaps in coverage. The sales team needed a way to consistently identify new providers across the country without dedicating hours to repetitive searches.


Approach

I designed a system that:

  1. Controlled Workflow with MySQL
    • Created a table of all U.S. zip codes, each with a processing state:
      • null → Not yet started
      • running → Currently being processed
      • done → Successfully scraped
      • failed → Attempted but error occurred
    • Built a second table to hold provider details (name, address, phone, etc.), tied back to the originating zip code.
  2. Automated Crawling with Puppeteer
    • Puppeteer launched the competitor’s provider search page.
    • For each zip code marked null, the script entered the code, executed the search, and scraped the resulting provider data.
    • Extracted information was written to the providers table, keyed to the zip code that produced it.
  3. Error Handling and Recovery
    • Any errors (timeouts, parsing failures, or site issues) marked the zip as failed for later review.
    • Completed zips were updated to done, allowing the script to safely resume where it left off if interrupted.
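
The state transitions above can be sketched in a few lines. This is a minimal in-memory illustration, not the production code: arrays stand in for the two MySQL tables, and the scrape callback stands in for the Puppeteer search step. All names here (ZipRow, claimNextZip, crawlPendingZips, requeueFailed) are hypothetical, and resetting failed zips to null for a retry is an assumption about how reprocessing could work.

```typescript
// Processing states from the write-up; null means "not yet started".
type ZipState = null | "running" | "done" | "failed";

interface ZipRow {
  zip: string;
  state: ZipState;
}

interface Provider {
  name: string;
  address: string;
  phone: string;
  zip: string; // originating zip code
}

// Hypothetical stand-in for the Puppeteer step: search one zip, return what was found.
type ScrapeFn = (zip: string) => Promise<Omit<Provider, "zip">[]>;

// Claim the first zip that has not been started and mark it running.
function claimNextZip(zips: ZipRow[]): ZipRow | undefined {
  const row = zips.find((z) => z.state === null);
  if (row) row.state = "running";
  return row;
}

// Assumption: failed zips are retried by resetting them to null.
function requeueFailed(zips: ZipRow[]): number {
  let count = 0;
  for (const z of zips) {
    if (z.state === "failed") {
      z.state = null;
      count++;
    }
  }
  return count;
}

// Process every unstarted zip. Rows already marked done are never claimed,
// so rerunning after an interruption continues with the remaining zips.
async function crawlPendingZips(
  zips: ZipRow[],
  providers: Provider[],
  scrape: ScrapeFn,
): Promise<void> {
  let row = claimNextZip(zips);
  while (row) {
    try {
      const found = await scrape(row.zip);
      for (const p of found) providers.push({ ...p, zip: row.zip });
      row.state = "done";
    } catch {
      row.state = "failed"; // timeouts, parsing failures, site issues
    }
    row = claimNextZip(zips);
  }
}
```

Because only null rows are ever claimed, a rerun skips everything already marked done or failed. Calling requeueFailed before a rerun puts the failed zips back in the queue.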

Results

  • The system processed thousands of zip codes in a structured, repeatable way.
  • Sales received a centralized, queryable database of potential providers, with new leads tied to geographic regions.
  • Failed zips were easy to reprocess, ensuring no area was missed.
  • By automating the collection, we reduced manual effort and provided more comprehensive coverage than the team had achieved previously.
  • As FDA issues arose with competitor devices, I was able to deliver a list of providers who used the affected device to our sales team within 24 hours.
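
For the CSV hand-off to Salesforce mentioned in the overview, the export step might look something like the sketch below. The column headers and the ProviderRecord shape are assumptions for illustration; the real export would use whatever field names the Salesforce import mapping expected. Quoting follows the usual RFC 4180 convention.

```typescript
// Hypothetical record shape, based on the fields named in the write-up.
interface ProviderRecord {
  name: string;
  address: string;
  phone: string;
  zip: string; // kept as a string so leading zeros survive
}

// Quote a field when it contains a comma, double quote, or line break,
// doubling any embedded double quotes.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Build the full CSV text: one header line, then one line per provider.
function providersToCsv(rows: ProviderRecord[]): string {
  const lines = [["Name", "Address", "Phone", "Zip"].join(",")];
  for (const r of rows) {
    lines.push([r.name, r.address, r.phone, r.zip].map(csvField).join(","));
  }
  return lines.join("\r\n") + "\r\n";
}
```

Quoting matters here because provider names and street addresses often contain commas, which would otherwise shift columns during import.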

Impact

This project gave the sales team a scalable, automated lead source that could be refreshed as often as needed. Instead of spending time hunting for provider information, reps could focus on outreach and conversions.