Scrape data and save it into columns

arve_lek

Client
Joined
05.06.2012
Posts
14
Thanks
2
Points
3
Is it possible to take all data from scraped text and put it into columns?

1 - First, scrape the name of the user = (?<=class="ta-tab-offername">).*?(?=</a>) - and save it into the table, column A
2 - Second, scrape the price of the product = (?<=ta-price-tab\ ">).*?(?=&nbsp;zł</span>) - and save it into the table, column B
3 - Third, scrape the price of the delivery = (?<=deliverycost">).*?(?=&nbsp;zł</span>) - and save it into the table, column C

1:

paul
kate

2:

10zł
12zł

3:
11zł
11.50zł

Into:

paul;10zł;11zł
kate;12zł;11.50zł
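
To make the goal concrete, here is a minimal Python sketch that applies the three patterns above and joins the results row by row. The HTML fragment is hypothetical, invented only so the patterns have something to match; the real page markup may differ. The "zł" suffix is added back by hand because the lookaheads stop before "&nbsp;zł".

```python
import re

# Hypothetical sample markup (an assumption, not the real page)
sample_html = '''
<a class="ta-tab-offername">paul</a>
<span class="ta-price-tab ">10&nbsp;zł</span>
<span class="deliverycost">11&nbsp;zł</span>
<a class="ta-tab-offername">kate</a>
<span class="ta-price-tab ">12&nbsp;zł</span>
<span class="deliverycost">11.50&nbsp;zł</span>
'''

# The three patterns from the question, unchanged
names = re.findall(r'(?<=class="ta-tab-offername">).*?(?=</a>)', sample_html)
prices = re.findall(r'(?<=ta-price-tab\ ">).*?(?=&nbsp;zł</span>)', sample_html)
delivery_costs = re.findall(r'(?<=deliverycost">).*?(?=&nbsp;zł</span>)', sample_html)

# Pair the i-th name with the i-th price and the i-th delivery cost
rows = [f"{n};{p}zł;{d}zł" for n, p, d in zip(names, prices, delivery_costs)]
print("\n".join(rows))
```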
 

VladZen

Administrator
Forum staff
Joined
05.11.2014
Posts
22 480
Thanks
5 917
Points
113
Responded to your question in tickets.
There is no special option to save data into a specific column, but you can divide this into two steps.
First save the data to a list, then import from the list into a specific column of a table; see the operations with tables.
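
The same two-step idea (collect each field into its own list, then place each list in its own column) can also be sketched outside the program's list and table actions. The Python below is only an illustration under that assumption; `lists_to_columns` is a hypothetical helper name, and the sample values come from the question.

```python
import csv
import io

def lists_to_columns(names, prices, delivery_costs):
    """Write three parallel lists as columns A, B and C of a ';'-separated table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    # zip pairs the i-th item of each list into one row
    for row in zip(names, prices, delivery_costs):
        writer.writerow(row)
    return buffer.getvalue()

table = lists_to_columns(["paul", "kate"], ["10zł", "12zł"], ["11zł", "11.50zł"])
print(table)
```

Writing to an in-memory buffer keeps the sketch self-contained; a real file opened with `newline=""` could be passed to `csv.writer` instead.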
 

infohills

Newbie
Joined
13.03.2024
Posts
2
Thanks
0
Points
1
You'll likely need libraries for text processing (e.g., re in Python, regular expression libraries in other languages) and data manipulation (e.g., pandas in Python, data structures in other languages).

Python
import re

def process_scraped_text(text):
    """
    Extracts user names, product prices, and delivery costs from scraped text
    and returns them in a formatted table.

    Args:
        text (str): The scraped text containing product information.

    Returns:
        str: A formatted table with user names, product prices, and delivery costs.
    """
    # Single-quoted raw strings, so the double quotes in the HTML need no escaping.
    # The lookaheads stop before "&nbsp;zł", so the captured values carry no currency suffix.
    names = re.findall(r'(?<=class="ta-tab-offername">).*?(?=</a>)', text)
    prices = re.findall(r'(?<=ta-price-tab\ ">).*?(?=&nbsp;zł</span>)', text)
    delivery_costs = re.findall(r'(?<=deliverycost">).*?(?=&nbsp;zł</span>)', text)

    # Handle potential mismatches or missing data
    if len(names) != len(prices) or len(names) != len(delivery_costs):
        print("Warning: Mismatch in data length. Consider handling missing data.")
        # Implement logic to handle mismatches (e.g., use filler values, remove rows)

    # Build one "name;price;delivery" row per offer; zip stops at the shortest list
    formatted_table = ""
    for name, price, delivery in zip(names, prices, delivery_costs):
        formatted_table += f"{name};{price};{delivery}\n"

    return formatted_table

# Example usage (assuming you have scraped text in a variable called 'scraped_text')
formatted_table = process_scraped_text(scraped_text)
print(formatted_table)
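
The "filler values" option mentioned in the comment above could be sketched with `itertools.zip_longest`. This is one possible approach, not the only one; `build_rows` and its `filler` parameter are hypothetical names chosen for illustration.

```python
from itertools import zip_longest

def build_rows(names, prices, delivery_costs, filler=""):
    """Join three lists row by row, padding the shorter lists with a filler value."""
    rows = []
    # zip_longest keeps going until the longest list is exhausted
    for name, price, delivery in zip_longest(
        names, prices, delivery_costs, fillvalue=filler
    ):
        rows.append(f"{name};{price};{delivery}")
    return rows

# The delivery list is one item short, so the last row gets an empty third column
rows = build_rows(["paul", "kate"], ["10", "12"], ["11"])
print("\n".join(rows))
```

Note that padding only fills the end of a short list. If a value is missing in the middle of the page, later rows would shift out of alignment, so extracting all three fields per offer block is safer when the markup allows it.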
 
