Hi,
I wrote this regex to clean up HTML and keep only the H, p, strong and b tags:
Code:
<div.*?>|<span.*?>|<figcaption.*?>|<img.*?>|<hl.*?>|</hl.*?>|</span.*?>|<picture.*?>|</picture.*?>|</div.*?>|<svg.*?>|<path.*?>|<figure.*?>|</figcaption.*?>|</figure.*?>|class.*?(?=>)|</path.*?>|</svg.*?>|<source.*?>|</source.*?>|(?<!\()<a.*?>|</a.?>(?!\))|<aside.*?>|</aside.*?>|rel=".*?"|target=".*?"|<header[\w\W]*header>|Share.*?(?=<)|Previous\ article|Next\ article
It works, but I would like to go further and remove links as well, except those whose domain name is on a whitelist.
How can I do this? Thanks for your help.
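For illustration, here is a minimal sketch of one way the whitelist idea could work. It assumes Python (the post does not say which regex engine is used), links that are not nested, and quoted href values. The names WHITELIST, is_whitelisted and strip_links, and the domains in the whitelist, are hypothetical placeholders. Instead of one large alternation, it matches each whole a element and lets a callback decide whether to keep it or replace it with its inner text.

```python
import re
from urllib.parse import urlparse

# Hypothetical whitelist: replace with the domains whose links should be kept.
WHITELIST = {"example.com", "example.org"}

# Whole <a ...>...</a> element: group 1 is the opening tag, group 2 the inner content.
# \b keeps this from matching <aside>. Assumes links are not nested.
A_TAG = re.compile(r"(<a\b[^>]*>)(.*?)</a\s*>", re.IGNORECASE | re.DOTALL)
# Quoted href value inside the opening tag (unquoted hrefs are not handled).
HREF = re.compile(r"""href\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def is_whitelisted(url):
    """True if the URL's host is a whitelisted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in WHITELIST)


def strip_links(html):
    """Remove <a> tags but keep their text, except for whitelisted domains."""
    def repl(match):
        href = HREF.search(match.group(1))
        if href and is_whitelisted(href.group(1)):
            return match.group(0)   # whitelisted: keep the link untouched
        return match.group(2)       # otherwise keep only the inner content
    return A_TAG.sub(repl, html)


sample = ('<p><a href="https://www.example.com/a">keep</a> and '
          '<a href="http://spam.test/x" rel="nofollow">drop</a></p>')
print(strip_links(sample))
```

Relative links and links without an href have no host, so this sketch treats them as not whitelisted and removes them. To combine it with the regex above, the link step would run first and the `<a.*?>` and `</a.?>` alternatives would be dropped from the cleaning pattern, since they would otherwise strip the whitelisted links too.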