I'm having some trouble crafting a regex and hope for some advice.
I'm scraping domains with a regex that uses lookahead and lookbehind.
I only want to capture the main domain without the www., like domain.com, but my regex captures both the www and non-www forms. I know the regex engine is eager and returns the first (leftmost) match it finds, but I would appreciate some help.
Here are some sample URLs to capture from, followed by my regex:
Code:
http://domain.com/
https://domain.com/
http://www.domain.com/
https://www.domain.com/
Code:
(?<=https?://|https?://www.).*?(?="|</a>|/)
Cheers!
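A possible way around this, as a rough sketch only. The post does not say which regex engine is in use, so Python's `re` is an assumption here. In engines that accept variable-length lookbehind, the first alternative (`https?://`) already succeeds right after the scheme, so the leftmost match starts at `www.`; putting a `(?!www\.)` directly after the lookbehind should reject that start position. Python's `re` only accepts fixed-width lookbehinds, so the sketch below shows two swapped-in variants: a plain capture group with an optional `www.`, and a lookaround-only pattern with one fixed-width lookbehind per prefix. The names `GROUP_RE`, `LOOKAROUND_RE` and `main_domain` are made up for illustration, and `[^/"<]+` is a simplification of the original `(?="|</a>|/)` lookahead (it stops at any `<`, not only at `</a>`).

```python
import re

# Sample URLs from the post.
urls = [
    "http://domain.com/",
    "https://domain.com/",
    "http://www.domain.com/",
    "https://www.domain.com/",
]

# Capture-group variant: consume the scheme and an optional "www." outside
# the group, then capture everything up to the first '/', '"' or '<'.
GROUP_RE = re.compile(r'https?://(?:www\.)?([^/"<]+)')

# Lookaround-only variant. Python's re needs fixed-width lookbehinds, so each
# prefix gets its own lookbehind; (?!www\.) rejects the start position that
# sits right after the scheme when a "www." follows.
LOOKAROUND_RE = re.compile(
    r'(?:(?<=http://)|(?<=https://)|(?<=http://www\.)|(?<=https://www\.))'
    r'(?!www\.)[^/"<]+'
)


def main_domain(url):
    """Return the host without a leading "www.", or None if nothing matches."""
    m = GROUP_RE.search(url)
    return m.group(1) if m else None


for url in urls:
    print(url, "->", main_domain(url), "|", LOOKAROUND_RE.search(url).group(0))
```

Both variants should give domain.com for all four sample URLs. The capture-group form is usually the easier one to maintain; the lookaround form is only worth it when the tool takes the whole match and cannot read a group.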