Help with some regex

shabbysquire

I'm having some trouble crafting a regex and am hoping for some advice.

I'm scraping domains with a regex that uses lookahead and lookbehind.

Here are some sample URLs to capture the domain from:

Code:
http://domain.com/
https://domain.com/

http://www.domain.com/
https://www.domain.com/

And my regex:

Code:
(?<=https?://|https?://www.).*?(?="|</a>|/)

I only want to capture the bare domain without the www. prefix, e.g. domain.com, but my regex captures both the www and non-www forms. I know the regex engine is eager and takes the first match it can find, but I'd appreciate some help.

Cheers!
 
Just remove it after the regex match :)
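
In case it helps, here is a minimal Python sketch of that idea: match the whole host first, then drop a leading www. in a separate step. The names `HOST_RE` and `extract_domains` and the sample HTML are made up for illustration, and the pattern is a simplified stand-in, not the regex from the first post.

```python
import re

# Capture the host part of an http(s) URL; stop at a quote, a slash, or '<'.
HOST_RE = re.compile(r"https?://([^\"'/<]+)")

def extract_domains(text):
    """Return every host found in text, with a leading 'www.' removed afterwards."""
    hosts = HOST_RE.findall(text)
    # Strip the prefix after matching instead of inside the regex.
    return [h[4:] if h.startswith("www.") else h for h in hosts]

# Hypothetical sample input.
sample = '<a href="http://www.domain.com/">x</a> <a href="https://domain.com/">y</a>'
print(extract_domains(sample))
```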
 
I have done that, but it's just a challenge for me to improve my regex skills. ;-)

The solution is to make the regex itself skip the www. prefix, so I need to figure out how!
 
Done:

Code:
(?<=https?://(?:www\.)?)(?!www\.).*?(?=['/"]|</a>)
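
For anyone who wants to try this in Python: the built-in re module only accepts fixed-width lookbehind, so the regex above won't compile there as written. A rough equivalent, swapping the lookarounds for an optional non-capturing group plus a capture group, might look like the sketch below. The names `DOMAIN_RE` and `bare_domain` are just for illustration, and the character class stops at any `<` rather than specifically at `</a>`, so it is an approximation of the pattern above, not an exact translation.

```python
import re

# The optional "www." is consumed outside the capture group,
# so group 1 holds only the bare domain.
# The class stops at ', ", / or '<' (an approximation of the original lookahead).
DOMAIN_RE = re.compile(r"https?://(?:www\.)?([^'\"/<]+)")

def bare_domain(url):
    """Return the domain without a leading 'www.', or None if nothing matches."""
    m = DOMAIN_RE.search(url)
    return m.group(1) if m else None

# The four sample URLs from the first post.
for url in ("http://domain.com/", "https://domain.com/",
            "http://www.domain.com/", "https://www.domain.com/"):
    print(url, "->", bare_domain(url))
```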
 