Stuck removing whitespace from articles in Zenno

Harambulus

Client
Registered
16.09.2011
Messages
365
Thanks
10
Points
18
I'm making something to scrape articles and rearrange the paragraphs so they are in random order. The way I did it, so that there wouldn't be repeats, leaves a lot of whitespace and stray '.' characters in the output.

Here is an example:

And since you're shopping online, checking the upcoming trends in fashion is only a quick search away.

If you can't return an item that doesn't fit for some reason, you'll be stuck with it. Before you do too much shopping on a site, find out how they charge for shipping. The only disadvantage that comes with shopping over the internet for cheap women's clothes is that you can't try on any of your items. Internet sales are common, and they may offer great deals. Shopping online can save you a huge amount of money and still allow you to buy the newest styles so that your look is just as current as if you'd bought the clothes off the rack in a designer clothing store. Leaving out the brand and typing a simple description of the garment may return results that feature similar items you like even better.

The internet is a vast place, and countless online boutiques exist that deal in cheap women's clothes. That's why millions of women have turned to the internet as a source for their clothes. And you don't need to be completely specific with a search, either.
























.

Adding a few garments to your wardrobe can be a great way to feel good about yourself and look great. And just in case, be sure that you check the website's return policy.

The first thing to keep in mind is, of course, style. And since they're major, established stores, they're could be a bit more trustworthy than others. Finally, don't forget to consider shipping and handling charges when you're adding items to your cart.

Many people overlook online coupons when they are shopping online. Once you know what's going to be popular this year, you'll be able to shop with more confidence. Don't be afraid to check out major stores' websites, either. But buying fashionable clothes can cut into your bank account quickly. Some coupon sites offer discounts to certain online stores, and special offers are frequently offered on many internet stores. In today's tough economy, finding cheap women's clothes is important. Those cheap women's clothes you thought were saving you a fortune could come with a fortune's worth of mailing fees.If you find something you like, it is a simple matter to compare prices elsewhere. Knowing exactly what your sizes are will help you tremendously. There's no driving from store to store, you simply run a search and see what other retailers are offering your items for.
























. With the ability to compare so many different prices, you'll be guaranteed to find the lowest one for your clothes.
























.
























. Also, be sure that you know your sizes.
























.
























.
























.

Women love to shop for the newest fashions.
So how do I get rid of the dots and the whitespace? I've been having a really hard time with string replace and regex because I just don't know how to match only the whitespace rather than the spaces between words. I tried a suggested regex, .+(?=\r\n), but that didn't work... it just didn't do anything. Any other ideas?

Alternatively, maybe there are other options to prevent it on the front end? What I want to do is take, say, a 500-word article, split it into paragraphs, and save all the paragraphs in random order without repeating any of them. I think the whitespace comes from Zenno taking a random line and leaving the whitespace behind each time, so I don't know how to prevent that while still having it do what I want: take a random line without repeating it. I messed around with not deleting the file but instead using a logic operation to check whether the line already exists in the target file, but I got frustrated and gave up before I figured that out.

So help on either of those issues would be welcome.
 

rostonix

Well-known member
Registered
23.12.2011
Messages
29 067
Thanks
5 715
Points
113
If you compile that kind of text from different scraped pieces, try using text processing (Trim) in the early steps to delete all unnecessary whitespace at the beginning and end of each piece.
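
To make the idea concrete outside of ZennoPoster, here is a rough Python sketch of trimming each piece early and then shuffling the paragraphs so none repeats. It is only an illustration, not the project's actual actions: the function name shuffle_paragraphs and the seed parameter are made up for this example, and it assumes one paragraph per line.

```python
import random


def shuffle_paragraphs(article, seed=None):
    """Return the paragraphs in random order, each used exactly once.

    Hypothetical helper for illustration; assumes one paragraph per line.
    """
    # Trim every piece first so leading/trailing whitespace never reaches the output.
    pieces = [line.strip() for line in article.splitlines()]
    # Drop pieces that are empty after trimming (the source of the blank runs).
    paragraphs = [p for p in pieces if p]
    # Shuffling a list reorders it in place without repeating or losing elements.
    rng = random.Random(seed)
    rng.shuffle(paragraphs)
    # Join with exactly one blank line between paragraphs.
    return "\n\n".join(paragraphs)
```

Because the whole list is shuffled once instead of picking and deleting one random line at a time, there is nothing left behind to turn into empty lines.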
 

bigcajones

Client
Registered
09.02.2011
Messages
1 216
Thanks
683
Points
113
Use this regex to get what you want...

[W\w].+
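
As a quick way to see what that pattern picks up, here is a small Python sketch that collects every match and joins them back together. It is an illustration only: clean_article is a hypothetical name, and the assumption is that the article uses \r\n line breaks as in the original question.

```python
import re

# The pattern from the post: a word character, then the rest of the line.
LINE_WITH_TEXT = re.compile(r"[W\w].+")


def clean_article(raw):
    """Keep only the stretches of text that start with a word character."""
    # "." does not cross "\n", so every match stays on a single line.
    # Blank lines and lines holding only "." produce no match at all.
    matches = LINE_WITH_TEXT.findall(raw)
    # A match can end with "\r" when lines end in "\r\n", so strip each one.
    return "\n\n".join(m.strip() for m in matches)
```

One thing to watch: a match starts at the first word character on the line, so a line like ". Also, be sure that you know your sizes." loses its leading dot (which is what you want here), but a paragraph that opens with a quotation mark would lose that too.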
 
