I'm wondering if this is doable with ZennoMP... I have a Pro license but have not attempted this yet.
It feels like I can probably pull this off with a GUI bot tool like ZennoMP, but I'm not sure if I should find a Python guy to bust out a script instead.
1. Scrape URLs for a specific target market --- "sports bars" + "santa monica" => via Google Maps
Return 200 results (Google limits pagination to 20 pages = 200 entries)
Regex match/extract 10 URLs per page = 200 URLs
Add/store the matches in an ARRAY, a file, or a network datastore via POST/PUT (Mongoid/MySQL)
(this looks doable -- perhaps via POST to a network datastore)
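If the Python route wins, the extract-and-store part of step 1 could look roughly like this. It is only a sketch: it assumes the result page HTML has already been fetched some other way and that the business sites show up as plain href="http..." links, which may not hold for Google Maps. The names extract_site_roots and save_lines are made up for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption: links appear as plain href="http..." attributes in the HTML.
HREF_RE = re.compile(r'href=["\'](https?://[^"\'\s>]+)["\']', re.IGNORECASE)


def extract_site_roots(html, skip_hosts=("google.com",)):
    """Return unique scheme://host roots found in html, in first-seen order."""
    roots = []
    seen = set()
    for url in HREF_RE.findall(html):
        parts = urlparse(url)
        host = parts.netloc.lower()
        # Skip links that point back at the search engine itself.
        if not host or any(host == h or host.endswith("." + h) for h in skip_hosts):
            continue
        root = f"{parts.scheme}://{host}"
        if root not in seen:
            seen.add(root)
            roots.append(root)
    return roots


def save_lines(path, lines):
    """Write one entry per line to a text file (the simplest 'datastore')."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
```

Reducing each match to its site root up front means step 2 can append the contact paths directly, and the seen set keeps the same bar from being stored twice.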
2. Find the contact page for each site --- by iterating through EACH row and APPENDing a string:
/contact
/contactus
/contact-us
/contact.html
/contactus.php
/contact-us.asp
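In script form, the append step is a small loop over the paths listed above. This is a minimal sketch; candidate_contact_urls is a hypothetical helper name, and it assumes each stored row is a site root such as http://www.getwastedsportsbar.com.

```python
# The contact paths from the list above.
CONTACT_PATHS = [
    "/contact",
    "/contactus",
    "/contact-us",
    "/contact.html",
    "/contactus.php",
    "/contact-us.asp",
]


def candidate_contact_urls(site_root, paths=CONTACT_PATHS):
    """Append each contact path to the site root, avoiding a double slash."""
    base = site_root.rstrip("/")
    return [base + path for path in paths]
```

For example, candidate_contact_urls("http://www.getwastedsportsbar.com/") yields six URLs, starting with http://www.getwastedsportsbar.com/contact.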
3. Attempt a GET request or navigate to that string/URL
http://www.getwastedsportsbar.com/contact
http://www.getwastedsportsbar.com/contact-us
http://www.getwastedsportsbar.com/contactus
http://www.getwastedsportsbar.com/contact/contact.html
etc.
Filter via HTTP status codes and save the URLs that return 200 OK
http://httpstatus.es/
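The status-code filter in step 3 can be sketched with just the Python standard library. The function names http_status and keep_ok are hypothetical, and the User-Agent header and 10-second timeout are assumptions rather than requirements.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET to url, or None if the request fails."""
    # Assumption: some servers reject the default Python User-Agent.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status  # redirects are followed, so this is the final status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 403, 500, ...
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, malformed URL


def keep_ok(urls, get_status=http_status):
    """Keep only the URLs whose status check returns 200."""
    return [url for url in urls if get_status(url) == 200]
```

One caveat: some sites answer 200 with a "page not found" template instead of a real 404, so a 200 alone does not guarantee that a contact form is actually there.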
4. Take the URLs that returned 200 and fill out the form via the IntelliSearch feature?
If there are captchas, fill 'em out too.
Anyway, I think I could maybe pull off 1-2 --- just not sure 'bout 3 and 4.
Any suggestions would be helpful.
Thanks!
Amul