Ajax scraping

amul

Client
Joined: 02.07.2011
Posts: 147
Thanks: 10
Points: 18
I'm trying to scrape profile names from Quora.
They use Ajax to paginate the list, and I have that part working.

However, the list of followers is about 200k entries, and after roughly 3-5k have been loaded into memory, Zenno ProjectMaker hits a breakpoint and becomes unstable or crashes.

I've turned off images, Flash, etc.

Suggestions?

Thanks,
Amul
 

bigcajones

Client
Joined: 09.02.2011
Posts: 1 216
Thanks: 683
Points: 113
If you have the basics built in PM, Amul, then just run it in the Poster. You won't have the crashing problem there.
 

amul

Thanks, Clint!

I'll try it out.
 

amul

I got the template into Poster. It executes but hangs after the template logs in.

Is this a bug?

1) It only works when I bump the template number past 1.
2) The instance will not get past loading, although it works fine in the Maker.

Attachment: q1-1.jpg
 

amul

Update:

It looks like the 2nd instance is working; it just takes a few minutes to get going, for some reason.
 

rostonix

Well-known member
Joined: 23.12.2011
Posts: 29 067
Thanks: 5 715
Points: 113
You can use Tab - Settings - Timeout to reduce the loading time.
 

amul

It appears Quora is severely rate limiting me. Perhaps I didn't add enough pauses to stay "hidden".

I might have to set up some proxies and try again.
Thanks for the suggestions, guys!

-Amul
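
For readers hitting the same rate limiting, here is a minimal sketch of randomized pauses with exponential backoff. It is written in plain Python rather than as a ZennoPoster action, and `fetch_page`, `polite_delay`, and `fetch_with_backoff` are hypothetical names made up for illustration. The base and cap values are guesses, not limits known to apply to Quora.

```python
import random
import time


def polite_delay(attempt, base=2.0, cap=120.0, rng=random):
    """Return how many seconds to wait before the next request.

    attempt 0 is the normal pause between pages. Each consecutive
    rate-limited response raises attempt by one and doubles the ceiling,
    up to cap seconds.
    """
    ceiling = min(cap, base * (2 ** attempt))
    # Pick a random value in the upper half of the window so the pauses
    # are not perfectly regular.
    return rng.uniform(ceiling / 2, ceiling)


def fetch_with_backoff(fetch_page, page, max_attempts=6, sleep=time.sleep):
    """Call fetch_page(page) until it succeeds or attempts run out.

    fetch_page is a hypothetical callable returning (ok, data), where ok
    is False when the site answered with a rate-limit response.
    """
    for attempt in range(max_attempts):
        ok, data = fetch_page(page)
        if ok:
            return data
        if attempt < max_attempts - 1:
            # Wait longer after every consecutive failure.
            sleep(polite_delay(attempt + 1))
    raise RuntimeError(f"page {page} still rate limited after {max_attempts} attempts")
```

Between successful pages, something like `time.sleep(polite_delay(0))` would give a pause of one to two seconds with the defaults above. Proxies can be layered on top of this, but the pacing logic stays the same.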
 

amul

OK, proxies are in place.

Still, the instances are executing each step VERY slowly.
- Is there a specific setting that I need to look into?
- Also, does having "show instance" turned on slow everything down? (I'm only running 1 instance.)

Thanks,
Amul
PS: We need an instructional video on the IntelliSense (in English).
 

amul

That tab timeout worked, thanks!

Now the instance is crashing/auto-restarting when I hit 3500 records loaded on the page. My guess is that the browser instance has run out of RAM/available memory.

Is there a way to clear memory and continue pagination?
Thanks!
Amul
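
One general way to survive this kind of crash is to avoid holding the whole list in one long-lived page: write each batch of names to disk as soon as it arrives and record how far you got, so a restarted instance can pick up where the last one stopped. The sketch below shows that idea in plain Python, not as a ZennoPoster project. `fetch_page`, `scrape`, and the file names are hypothetical, and it assumes the follower list can be re-entered at a given page or offset, which this thread does not confirm for Quora.

```python
import json
import os


def load_checkpoint(path):
    """Return the next page index to fetch, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)["next_page"]


def save_checkpoint(path, next_page):
    """Write the checkpoint via a temp file so a crash cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"next_page": next_page}, fh)
    os.replace(tmp, path)


def scrape(fetch_page, out_path, ckpt_path, max_pages):
    """Append each page of names to out_path and record progress.

    fetch_page(page) is a hypothetical callable that returns a list of
    names, or an empty list when there are no more pages.
    Returns the index of the next page that would be fetched.
    """
    page = load_checkpoint(ckpt_path)
    while page < max_pages:
        names = fetch_page(page)
        if not names:
            break
        # Flush this batch to disk right away; nothing accumulates in memory.
        with open(out_path, "a", encoding="utf-8") as out:
            for name in names:
                out.write(name + "\n")
        page += 1
        save_checkpoint(ckpt_path, page)
    return page
```

Calling `scrape` with a small `max_pages`, restarting the browser instance, and calling it again would keep each session short. If a crash lands between the append and the checkpoint write, one page is fetched twice on resume, so the output file may need deduplicating afterwards.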
 
