Newline delimiter? How to scrape and mix articles

Tengamer

Client
Joined: 15.02.2012
Posts: 8
Thanks: 0
Points: 0
Hi

I want to build a scraper like SENuke's, one that pulls a random article and random blocks of text and assembles them into a new txt file.

I would like to pull, say, 2 blocks (paragraphs) from any text file in the directory, but I can't define the newline \n as a delimiter. What should I put there instead?
 

Tengamer

For now, because I really needed it, I ran a mass find-and-replace on all files to insert <somedelimitr>

But I need to remove that step so the flow is automatic from scraping to mixing.

So how do I tell ZP to pull a random 1 to 5 blocks?
 
Joined: 01.02.2011
Posts: 99
Thanks: 15
Points: 0

Tengamer

Tested, it works! I didn't realize you can do spintax-like entries.

As for the newline, is there no way around it except creating a custom delimiter?
 

archel

Client
Joined: 02.05.2011
Posts: 175
Thanks: 22
Points: 18
You don't have to define \n, because you can define the opposite: match the text that is not a newline. If you make a text file called test.txt with the following:

This is my first paragraph.

This is my second paragraph.

This is my third paragraph.

and you put it in the same folder as your template, then the macro:

{-RegExp.RegExp-|-{-File.GetBlock-|-{-Project.Directory-}test.txt-|--|-random-|-false-}-|-[a-zA-Z\d+].*-|-0-}
gives
This is my first paragraph.

{-RegExp.RegExp-|-{-File.GetBlock-|-{-Project.Directory-}test.txt-|--|-random-|-false-}-|-[a-zA-Z\d+].*-|-1-}
gives
This is my second paragraph.

{-RegExp.RegExp-|-{-File.GetBlock-|-{-Project.Directory-}test.txt-|--|-random-|-false-}-|-[a-zA-Z\d+].*-|-2-}
gives
This is my third paragraph.


The regexp above only works cleanly when your paragraph begins with a letter or a digit (the + inside the brackets is a literal plus sign, so a leading + is accepted too).
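
For anyone who wants to see what this pattern does outside ZennoPoster, here is a rough Python sketch. It assumes the regex engine treats `.` as "any character except a newline" (Python's default; the thread does not say which options ZennoPoster uses), and `nth_paragraph` is a hypothetical helper name, not a ZennoPoster function.

```python
import re

# Same pattern as in the macro: one starting character from the class
# (ASCII letter, digit, or a literal "+"), then the rest of the line.
PATTERN = re.compile(r"[a-zA-Z\d+].*")


def nth_paragraph(text, index):
    """Return the index-th match of PATTERN in text, or None if there is none."""
    matches = PATTERN.findall(text)
    return matches[index] if 0 <= index < len(matches) else None


sample = (
    "This is my first paragraph.\n\n"
    "This is my second paragraph.\n\n"
    "This is my third paragraph.\n"
)

print(nth_paragraph(sample, 0))
print(nth_paragraph(sample, 1))
print(nth_paragraph(sample, 2))

# A paragraph that starts with a quote loses its first character,
# because a match can only begin at a letter, digit, or "+".
print(nth_paragraph('"Quoted start." More text.', 0))
```

The blank lines between paragraphs never match because they contain no character from the class, which is why the match index lines up with the paragraph number here.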

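Another problem can arise when a paragraph contains a line break. Since .* stops at the end of a line, each line then comes back as a separate match. That happens when your paragraph is built like this: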
"This is my paragraph.
This is still the same paragraph."

instead of
"This is my paragraph. This is still the same paragraph."
 

Tengamer

Kinda complex. I'll take a look at this later. But thanks for the idea, archel.
 

BlackSun

Client
Joined: 24.01.2011
Posts: 119
Thanks: 3
Points: 0
For a new line, use the macro {-String.Enter-}


Use it like this to pull a random paragraph from a block of text:
Code:
{-String.Split-|-This is your block of text including new lines-|-{-String.Enter-}-|-{-Random.Int-|-0-|-5-}-}
Of course, paragraphs might be delimited by two new lines in a row: {-String.Enter-}{-String.Enter-}

So you could do this:

Code:
{-String.Split-|-This is your block of text including new lines-|-{-String.Enter-}{-String.Enter-}-|-{-Random.Int-|-0-|-5-}-}
You'd probably want to use String.SplitCount to check how many paragraphs there are.

You'd end up with something like this:

Code:
{-String.Split-|-This is your block of text including new lines-|-{-String.Enter-}{-String.Enter-}-|-{-Random.Int-|-0-|-{-String.SplitCount-|-This is your block of text including new lines-|-{-String.Enter-}{-String.Enter-}-}-}-}
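
For comparison, the same split, count, and pick flow can be sketched in Python. This is only an illustration under stated assumptions, not ZennoPoster code: paragraphs are assumed to be separated by one or more blank lines, files are assumed to be UTF-8, and `split_paragraphs`, `pick_blocks`, and `mix_from_directory` are hypothetical helper names.

```python
import random
import re
from pathlib import Path


def split_paragraphs(text):
    """Split text into paragraphs on blank lines (LF or CRLF line endings)."""
    normalized = text.replace("\r\n", "\n")
    parts = re.split(r"\n\s*\n", normalized)
    # Drop empty pieces left over from leading or trailing blank lines.
    return [p.strip() for p in parts if p.strip()]


def pick_blocks(text, low=1, high=5, rng=random):
    """Pick between low and high random paragraphs, capped at what exists."""
    paragraphs = split_paragraphs(text)
    if not paragraphs:
        return []
    wanted = rng.randint(low, high)  # inclusive on both ends
    count = min(wanted, len(paragraphs))
    return rng.sample(paragraphs, count)


def mix_from_directory(directory, low=1, high=5, rng=random):
    """Pick a random .txt file in directory and join random blocks from it."""
    files = sorted(Path(directory).glob("*.txt"))
    if not files:
        return ""
    chosen = rng.choice(files)
    text = chosen.read_text(encoding="utf-8")
    return "\n\n".join(pick_blocks(text, low, high, rng))
```

The sketch caps the number of blocks at the paragraph count instead of using the count as an index. Whether Random.Int in ZennoPoster includes its upper bound is worth checking, since an index equal to the paragraph count would point past the last paragraph.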
 

Tengamer

This is awesome. Thanks!
 
