How to clean a list of URLs scraped from Google: trim to the second-level path

ek3ekytop

Client
Registered
09.03.2012
Messages
111
Thanks
17
Points
18
Hello, I am trying to remove duplicate Telegram groups and channels scraped with Google.
I get results like this:
C#:
https://t.me/s/manchester_utdfc?before=13282
https://t.me/s/manchester_utdfc?before=14097
https://t.me/s/manchester_utdfc?before=14188
https://t.me/s/manchester_utdfc?before=21442
https://t.me/s/manchester_utdfc?before=27042
https://t.me/s/manchester_utdfc/16279
How can I clean this list and remove the duplicates? From these results I only need one:
C#:
https://t.me/s/manchester_utdfc
 

myndeswx

Client
Registered
15.05.2017
Messages
433
Thanks
103
Points
43
You just need 4 cubes (three of them in a loop) and a temporary list.
So you have two lists: OriginalList and tmpList.

Cube 1:
take a line from OriginalList

Cube 2:
regex replace \?.* with an empty string (this strips the ?before=... part)

Cube 3:
add the result to tmpList

Run these three in a loop.
When there are no more lines in OriginalList:

Cube 4:
remove duplicates from tmpList. Done.
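
For anyone who prefers a script over cubes, the same steps can be sketched in Python. This is only a rough sketch, not ZennoPoster code: the function name `clean_urls` is made up for illustration, and the second replace (`/\d+$`) is my own assumption, added because `\?.*` alone leaves a trailing message ID such as `/16279` in place.

```python
import re


def clean_urls(original_list):
    """Strip query strings and trailing message IDs, then drop duplicates."""
    tmp_list = []
    for line in original_list:
        url = line.strip()
        if not url:
            continue  # skip empty lines
        # Step from the reply: remove "?" and everything after it.
        url = re.sub(r"\?.*", "", url)
        # Assumption (not in the reply): also remove a trailing numeric
        # message ID such as "/16279".
        url = re.sub(r"/\d+$", "", url)
        tmp_list.append(url)
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(tmp_list))


urls = [
    "https://t.me/s/manchester_utdfc?before=13282",
    "https://t.me/s/manchester_utdfc?before=14097",
    "https://t.me/s/manchester_utdfc?before=14188",
    "https://t.me/s/manchester_utdfc?before=21442",
    "https://t.me/s/manchester_utdfc?before=27042",
    "https://t.me/s/manchester_utdfc/16279",
]
print(clean_urls(urls))  # ['https://t.me/s/manchester_utdfc']
```

`dict.fromkeys` is used for the last step because it drops duplicates and keeps the first-seen order, which a plain `set` would not guarantee.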
 

ek3ekytop

Thank you. It works.
 
