Need help with proxies

zxcvbn

Newbie
Registered
17.07.2012
Messages
76
Thanks
1
Points
0
Hi,

I have a client who wants me to go to a site and check some details by filling in an input form and searching the results for some text.

The template is ready and works fine.

But the site bans a proxy after 4-5 iterations, so I have to use a lot of proxies.

The current requirement is to process 1 million URLs per day. So approximately 0.5 million proxies per day are required :p

So, can someone please point me to some proxy providers or proxy software that can get me this many proxies?

And how do I integrate these with ZennoPoster? E.g. if I use proxy software to leech and test tens of thousands of proxies daily, how do I use it with Zenno?

Thanks
 

rostonix

Well-known member
Registered
23.12.2011
Messages
29,067
Thanks
5,715
Points
113
You will not find that many proxies per day from any provider. I'm almost 100% sure.
Google it anyway.
 

zxcvbn

Newbie
Registered
17.07.2012
Messages
76
Thanks
1
Points
0

rostonix

Well-known member
Registered
23.12.2011
Messages
29,067
Thanks
5,715
Points
113
What can I say... Good luck with this :-)

Try posting on blackhatworld.com
But as I said, I'm almost sure there are no providers who offer that many proxies.
 

archel

Client
Registered
02.05.2011
Messages
175
Thanks
22
Points
18
You won't find it. And if you did, it would be too costly.

Are the proxies always banned after 4-5 uses, even if you wait, e.g., an hour between them?
 

zxcvbn

Newbie
Registered
17.07.2012
Messages
76
Thanks
1
Points
0
Actually, they ban the proxy for some 1-2 hours, after which it can be reused, but if a proxy is used more than 5-10 times in a single go, they ban it permanently.
By banning I mean that the results shown are not the expected ones and differ from what a fresh proxy would get.

Thanks
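
The ban behaviour described in this post (a temporary ban after a handful of requests, lifted after 1-2 hours) suggests a simple rotation rule: cap the uses per burst and rest each proxy afterwards. Below is a minimal sketch in Python, outside ZennoPoster, assuming a limit of 4 uses per burst and a 2-hour rest. The class name `ProxyPool`, its parameters, and the example addresses are hypothetical, and the real thresholds would have to be measured against the site.

```python
import heapq
import itertools


class ProxyPool:
    """Hands out proxies, resting each one after a fixed number of uses.

    max_uses and cooldown are assumptions based on the numbers in this
    thread (ban after 4-5 requests, lifted after 1-2 hours).
    """

    def __init__(self, proxies, max_uses=4, cooldown=2 * 3600):
        self.max_uses = max_uses
        self.cooldown = cooldown
        self._counter = itertools.count()  # unique tie-breaker for heap entries
        # Heap entries: (time the proxy is available, order, proxy, uses so far)
        self._heap = [(0.0, next(self._counter), p, 0) for p in proxies]
        heapq.heapify(self._heap)

    def acquire(self, now):
        """Return a usable proxy at time `now` (seconds), or None if all are resting."""
        if not self._heap or self._heap[0][0] > now:
            return None
        ready_at, _, proxy, uses = heapq.heappop(self._heap)
        uses += 1
        if uses >= self.max_uses:
            # Burst exhausted: rest the proxy and reset its use counter.
            ready_at, uses = now + self.cooldown, 0
        heapq.heappush(self._heap, (ready_at, next(self._counter), proxy, uses))
        return proxy


# Hypothetical addresses, small limits so the behaviour is easy to follow.
pool = ProxyPool(["10.0.0.1:8080", "10.0.0.2:8080"], max_uses=2, cooldown=3600)
handed_out = [pool.acquire(0) for _ in range(5)]
```

Passing the current time in explicitly (for example `time.time()`) keeps the pool easy to test. Permanently banned proxies would still need to be detected from the page content and dropped, which this sketch does not attempt.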
 
CaptainObvious

Registered
01.02.2011
Messages
99
Thanks
15
Points
0
How do you know it's because of the IP? There are numerous cases of big workplaces, colleges, etc. sharing one IP or a small range, with a boatload of people using services without any IP bans.

If the site has 1 million fresh pages a day to scrape, it most likely gets a shitload of traffic from workplace and school/college IPs, so what are you doing differently from the norm?

Also, has the site been indexed by Google? Cloak your user agent as the Google and Bing bots, or just scrape from the mother of all scrapers, Google.
 
  • Thanks
Reactions: zxcvbn

archel

Client
Registered
02.05.2011
Messages
175
Thanks
22
Points
18
If CaptainObvious' idea doesn't work, you can still look at using proxies, but e.g. only 20,000.
If you loop through them and use any given IP only once per hour (assuming they aren't banned at that rate), you can still check 480,000 URLs per day, which is still fair.
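
The arithmetic behind that estimate, as a quick sketch. The one-use-per-hour rate is the assumption stated in the post; the function names are made up for illustration.

```python
import math


def daily_capacity(proxies, uses_per_proxy_per_hour=1, hours=24):
    """URLs one pool can check per day at the given per-proxy rate."""
    return proxies * uses_per_proxy_per_hour * hours


def proxies_needed(urls_per_day, uses_per_proxy_per_hour=1, hours=24):
    """Smallest pool that covers urls_per_day at the given per-proxy rate."""
    return math.ceil(urls_per_day / (uses_per_proxy_per_hour * hours))


print(daily_capacity(20_000))     # 480000
print(proxies_needed(1_000_000))  # 41667
```

So at one request per proxy per hour, the full 1 million URLs per day would need roughly 41,667 proxies rather than 0.5 million, provided the site really does tolerate that rate.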
 
  • Thanks
Reactions: zxcvbn

drvosjeca

Client
Registered
26.10.2011
Messages
512
Thanks
455
Points
63
It's not so much about IPs as you think; the bigger issue is being too obvious...

What usually makes things more difficult is jumping directly to specific pages/links.
The solution is in most cases simple, and all you need is a few extra steps. You can change user agents and cloak them (as CaptainObvious said), use different referrers, run a site search before jumping directly to a link... and in the end you will have great results. You will see that you actually don't need that many IPs to make it work.
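
As a rough illustration of those extra steps outside ZennoPoster, here is a minimal Python sketch that varies the User-Agent and Referer and plans a search-page visit before the target page. The function names, the example URLs, and the short header lists are assumptions for illustration only; nothing is actually sent over the network.

```python
import random
import urllib.request

# Example values only; a real run would use a larger, current list.
USER_AGENTS = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
]
REFERERS = ["https://www.google.com/", "https://www.bing.com/"]


def build_request(url, rng=random):
    """Build (but do not send) a request with a varied User-Agent and Referer."""
    headers = {
        "User-Agent": rng.choice(USER_AGENTS),
        "Referer": rng.choice(REFERERS),
    }
    return urllib.request.Request(url, headers=headers)


def visit_plan(search_url, target_url, rng=random):
    """Plan a visit to the search page first, then the target page."""
    first = build_request(search_url, rng)
    second = build_request(target_url, rng)
    # The second request keeps the first one's User-Agent and names the
    # search page as its referrer, so it does not look like a direct jump.
    second.add_header("User-Agent", first.get_header("User-agent"))
    second.add_header("Referer", search_url)
    return [first, second]


# Hypothetical URLs on a placeholder domain.
plan = visit_plan(
    "https://example.com/search?q=widget",
    "https://example.com/item/1",
    random.Random(0),
)
```

Each request in `plan` could then be opened through whichever proxy is currently in rotation. Whether the site treats crawler user agents more leniently is something to test, not a given.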
 
  • Thanks
Reactions: zxcvbn

Thru_K

Client
Registered
04.07.2012
Messages
45
Thanks
12
Points
0

zxcvbn

Newbie
Registered
17.07.2012
Messages
76
Thanks
1
Points
0
Hi guys, thanks for the responses. I couldn't come online for a few days. Anyway, I'm still working on this. I will try to implement what CaptainObvious suggested.

Let's hope to make it work.

Thanks
 
