How to compare two big files?

  • Thread starter: bfirst

bfirst

Client
Registered: 05.11.2013
Messages: 56
Reactions: 0
Points: 6
Hi again

I have been working on this problem for a few days without any positive results. Please help me!

I have two files with about 2,500,000 URLs in each. I have to compare these two lists and remove from the second list all URLs that are also in the first list.

First I tried the Delete urls function in list processing: I took one record at a time from the first list and ran Delete urls with the specified value. It was far too slow.

Then I built a regexp containing all URLs from the first list and ran delete urls with a regular expression. Same result.

Finally, I created a regexp and compared the two files (not lists) to get all results which are not listed in the second file. I have been waiting 9 hours now and the project is still running.

Is there any faster solution for comparing two lists? I need something like the options in ScrapeBox:
1. Compare lists on domain level
2. Compare lists on URL level
But ScrapeBox is limited to 1,000,000 URLs.
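
For reference, both comparison modes listed above come down to a set difference, which can be done outside the project with a short script. The following is only a minimal sketch in Python, not a feature of any tool mentioned in this thread. It assumes one URL per line, UTF-8 text files, and enough RAM to hold the first list in a set. The names `subtract_lists` and `domain_of` and the file names in the usage note are made up for illustration.

```python
from urllib.parse import urlsplit


def domain_of(line):
    """Return the lowercased host of a URL line (assumption: one URL per line)."""
    url = line.strip()
    # urlsplit only recognises a host after "//", so add it when the scheme is missing.
    candidate = url if "//" in url else "//" + url
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        host = None  # malformed URL, fall back to the raw text below
    return host if host else url.lower()


def subtract_lists(first_path, second_path, out_path, key=str.strip):
    """Write every line of second_path whose key does not occur in first_path.

    key=str.strip compares on URL level, key=domain_of on domain level.
    Returns the number of lines written.
    """
    with open(first_path, encoding="utf-8", errors="replace") as f:
        seen = {key(line) for line in f if line.strip()}

    kept = 0
    with open(second_path, encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Each lookup in the set takes roughly constant time,
            # so the second file is scanned only once.
            if line.strip() and key(line) not in seen:
                dst.write(line.strip() + "\n")
                kept += 1
    return kept
```

Usage would look like `subtract_lists("first.txt", "second.txt", "result.txt")` for URL level, or the same call with `key=domain_of` for domain level. Each file is read once, so the work grows with the total number of lines instead of with the product of the two list sizes, which is what makes the record-by-record delete so slow.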
 
Have you tried gscraper?
 
Why don't you split your file into 3 parts to use with ScrapeBox?
You will need to compare each list 3 times, once against each of the 3 parts, but that will work.
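
If you go the splitting route, the parts can be produced with a few lines of Python. This is just a sketch: `split_file`, the `prefix` naming scheme and the default of 1,000,000 lines are assumptions for illustration, so lower `max_lines` if the tool's limit turns out to be stricter.

```python
from itertools import islice


def split_file(path, max_lines=1_000_000, prefix="part"):
    """Split a text file into numbered files of at most max_lines lines each.

    Returns the list of file names that were written.
    """
    parts = []
    with open(path, encoding="utf-8", errors="replace") as src:
        while True:
            # islice pulls the next max_lines lines without loading the whole file.
            chunk = list(islice(src, max_lines))
            if not chunk:
                break
            name = f"{prefix}_{len(parts) + 1}.txt"
            with open(name, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            parts.append(name)
    return parts
```

A file with 2,500,000 lines would come out as three parts: two with 1,000,000 lines and one with 500,000.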
 