Please Help With Amazon Description RegEx

  • Thread starter: LBrown

LBrown

Client
Joined: 20.11.2011
Messages: 6
Reactions: 0
Points: 0
It should be really obvious and I've been able to scrape other elements on the page but I just can't pick up the description.

So, for this page:

http://www.amazon.com/gp/product/B004X6TSOG/

the whole description is between

Code:
<div class="productDescriptionWrapper"></div class="EmptyClear">

so the RegEx should be

Code:
(?<=\<div class\=\"productDescriptionWrapper\"\>).*?(?=\<div class\=\"emptyClear\"\>)

But it doesn't work when I test it.

Can someone show me where I've gone wrong and what I need to do to fix it, please?
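
One thing worth ruling out here is letter case: the delimiter is written as "EmptyClear" in the description above but as "emptyClear" in the pattern, and regex matching is case-sensitive by default. The following is only a sketch in Python's `re` module, not the tool used in this thread, and the two HTML snippets are hypothetical rather than copied from the Amazon page.

```python
import re

# The pattern from the question, unchanged. The backslashes before <, =, " and >
# are unnecessary but harmless.
PATTERN = r'(?<=\<div class\=\"productDescriptionWrapper\"\>).*?(?=\<div class\=\"emptyClear\"\>)'

# Hypothetical markup: identical except for the case of the second class name.
lower = '<div class="productDescriptionWrapper">Nice lamp<div class="emptyClear"></div></div>'
upper = '<div class="productDescriptionWrapper">Nice lamp<div class="EmptyClear"></div></div>'


def extract(html, flags=0):
    """Return the text matched by PATTERN, or None when nothing matches."""
    match = re.search(PATTERN, html, flags)
    return match.group(0) if match else None


print(extract(lower))                 # class name matches the pattern exactly
print(extract(upper))                 # case differs, so the lookahead never succeeds
print(extract(upper, re.IGNORECASE))  # ignoring case makes the same pattern match
```

If the real page source spells the class differently from the pattern, either copy the spelling exactly or switch on the engine's ignore-case option.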
 
Your regexp is wrong. You need this:
(?<=\<div class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div class\=\"EmptyClear\"\>)
 
Thank you for your answer but when I try "Test a Regular Expression" with yours nothing comes up. I've tried with DOM HTML and Source HTML.
 
You should also put a \ before the spaces:
(?<=\<div\ class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div\ class\=\"EmptyClear\"\>)

Though shifu's answer should work in test mode.
Also check whether <div class="productDescriptionWrapper"></div class="EmptyClear"> is correct in the DOM, because on my PC it's <div class=productDescriptionWrapper></div class=EmptyClear>
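
Since the quoting and tag case can differ between the page source and the DOM view, one option is a single pattern that tolerates both. The sketch below uses Python's `re` module as a stand-in for the tool in this thread; the sample strings and the `description` helper are hypothetical. It uses a capture group instead of a lookbehind because Python only accepts fixed-width lookbehinds, and the optional quotes make the prefix variable-width.

```python
import re

# Quotes around the class value are optional ("?), whitespace is flexible (\s),
# and IGNORECASE covers DIV vs div. The description is captured in group 1.
PATTERN = re.compile(
    r'<div\s+class="?productDescriptionWrapper"?\s*>(.*?)<div\s+class="?emptyClear"?\s*>',
    re.IGNORECASE,
)

# Hypothetical markup in the two shapes mentioned in the thread.
source_html = '<div class="productDescriptionWrapper">Nice lamp<div class="emptyClear"></div></div>'
dom_html = '<DIV class=productDescriptionWrapper>Nice lamp<DIV class=emptyClear></DIV></DIV>'


def description(html):
    """Return the captured description, or None when the pattern does not match."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


print(description(source_html))
print(description(dom_html))

# An escaped space matches a plain space, so "\ " is harmless here.
print(bool(re.search(r'div\ class', 'div class')))
```

This version still assumes the description sits on one line; it only addresses the quoting and case differences.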
 
  • Thanks
Reactions: LBrown
Okay, so I tried this with Source HTML:
(?<=\<div\ class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div\ class\=\"EmptyClear\"\>)

And this with DOM HTML:
(?<=\<DIV\ class\=productDescriptionWrapper\>).*(?=\<DIV\ class\=emptyClear\>)

And still nothing. Is there something wrong with the Regular Expression builder? Has anyone else been able to get the Regular Expression builder to give results from these?
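
A common reason for a pattern like this to return nothing even when both delimiters are spelled correctly is that `.` does not match line breaks by default, and a product description usually spans several lines. The thread does not say what the eventual fix was, so this is only a guess at one possible cause, shown in Python's `re` module with hypothetical markup.

```python
import re

# Hypothetical markup: the description spans several lines.
html = """<div class="productDescriptionWrapper">
  <p>Nice lamp</p>
  <div class="emptyClear"></div>
</div>"""

PATTERN = r'(?<=<div class="productDescriptionWrapper">).*?(?=<div class="emptyClear">)'

# Default mode: '.' cannot cross the line break right after the opening tag.
without_flag = re.search(PATTERN, html)

# DOTALL mode: '.' also matches newlines, so the lazy match can reach the end marker.
with_flag = re.search(PATTERN, html, re.DOTALL)

print(without_flag)
print(with_flag.group(0).strip())
```

Many engines expose the same behaviour as a "single line" or "dot matches newline" option, or accept the inline form `(?s)` at the start of the pattern; whether the builder used here offers it is not stated in the thread.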
 
drvosjeca fixed it for me. Thanks to all for the help.
 