- Registered
- 25.11.2012
- Posts
- 544
- Thanks
- 26
- Points
- 28
I'm having an issue with scraping some websites.
I've come across a site where the href of each link on the page only gives a partial URL. For example, a normal URL might be:
Code:
http://www.ebay.co.uk/itm/Viking-Traditional-Ladies-Comfort-7-Speed-town-bike-black-/271123065658?pt=UK_Bikes_GL&var=&hash=item3f2031ab3a
But when scraping a page, I can only get a partial URL:
/itm/Viking-Traditional-Ladies-Comfort-7-Speed-town-bike-black-/271123065658?pt=UK_Bikes_GL&var=&hash=item3f2031ab3a
Even when I inspect the href on the page with Firebug, it shows only the partial URL. I've tried various regex patterns, but the full URL simply isn't in the page source.
Anyway, I've saved the partial URLs to my list. How can I prefix the start of the URL, i.e. http://www.ebay.co.uk/, to the other half?
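For illustration, here is a minimal sketch of one way to do the prefixing in Python, assuming the partial URLs are already in a plain list. The names `BASE_URL`, `make_absolute` and `scraped` are made up for this example; `urljoin` comes from the standard library and resolves a relative href against a base the same way a browser would, so it also avoids doubling the slash between the two halves.

```python
from urllib.parse import urljoin

# Assumed site root, taken from the example URL in the post.
BASE_URL = "http://www.ebay.co.uk/"


def make_absolute(partial_urls, base=BASE_URL):
    """Resolve each scraped href against the base URL.

    Hrefs that are already absolute are returned unchanged.
    """
    return [urljoin(base, href) for href in partial_urls]


# Hypothetical list holding the partial URL from the post.
scraped = [
    "/itm/Viking-Traditional-Ladies-Comfort-7-Speed-town-bike-black-/271123065658?pt=UK_Bikes_GL&var=&hash=item3f2031ab3a",
]

full_urls = make_absolute(scraped)
print(full_urls[0])
```

Plain string concatenation (`BASE_URL.rstrip("/") + href`) would also work for hrefs that always start with `/`, but `urljoin` copes with mixed lists where some links are already complete.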